مرکز منطقه ای اطلاع رساني علوم و فناوري - Optimizing Attack Surface and Configuration Diversity Using Multi-objective Reinforcement Learning

Abstract :

Minimizing the attack surface of a system and introducing diversity into a system are two effective ways to improve system security. However, determining how to include diversity in a system without increasing the attack surface more than necessary is a difficult problem, requiring knowledge about the system characteristics, operating environment, and available permutations that is generally not available prior to system deployment. We propose viewing a system´s components, interfaces, and communication channels as a set of states and actions that can be analyzed using a sequential decision making process, and using a multi-objective reinforcement learning algorithm to learn a set of policies that minimize a system´s attack surface and execute those policies to obtain configuration diversity while a system is operating. We describe a methodology for designing a system such that its components and behaviors can be translated into a multi-objective Markov Decision Process, demonstrate the use of multi-objective reinforcement learning to learn a set of optimal policies using three different multi-objective reinforcement learning algorithms in the context of an online file sharing application, and show that our multi-objective temporal difference afterstate algorithm outperforms the alternatives for the example problem.