Abstract
Reinforcement learning (RL) is a powerful machine learning technique in which an agent explores a given environment to identify the action, or sequence of actions, that maximizes the reward obtainable from a given state according to a properly defined reward function. In many domains, however, such as IoT, safety and security are critical, because some state transitions can pose safety or security threats to the user or to the environment at large. In this paper we introduce an RL environment that supports exploration restrictions based on safety and security policies. Initial experiments show that the framework is effective and efficient.