Dec 6, 2022
Human teammates often form conscious and subconscious expectations of each other to better interact. Teaming success is contingent on whether such expectations can be met. Similarly, for an intelligent agent to operate alongside humans, it must consider the human's expectation of its behavior; otherwise, it risks loss of trust and degraded team performance. A key challenge here is that the human's expectation may not align with the agent's optimal behavior, due to, for example, the human's partial or inaccurate understanding of the task domain. Prior work on explicable planning describes the ability of agents to respect their human teammate's expectations by trading off task performance for more expected or "explicable" behaviors. In this paper, we introduce Explicable Policy Search (EPS) to significantly extend such an ability to a reinforcement learning (RL) setting and to handle stochastic domains with continuous state and action spaces. In contrast to traditional RL methods, however, EPS must at the same time infer the human's hidden expectations. Such inference requires information about the human's belief about the domain dynamics and her reward model, but directly querying them is impractical. We demonstrate that they can be sufficiently encoded by a surrogate reward function, which can be learned from the human's feedback on the agent's behavior. The surrogate reward function is then used to reshape the agent's reward function, which is shown to be equivalent to searching for an explicable policy. We evaluate our method for EPS in a set of continuous navigation domains with synthetic human models and in an autonomous driving domain with a user study.
The results suggest that our method can generate explicable behaviors that intelligently reconcile task performance with human expectations, and that it has real-world relevance in human-agent teaming domains.
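The reward-reshaping idea above can be illustrated with a minimal sketch. Here the surrogate reward (learned from human feedback) is blended with the task reward under a trade-off weight; all function names, the weight `lam`, and the toy rewards below are illustrative assumptions, not the paper's actual formulation.

```python
# Hedged sketch of reward reshaping with a learned surrogate reward.
# The combined reward trades off task performance against explicability;
# `lam`, `r_task`, and `r_surrogate` are hypothetical stand-ins.

def reshaped_reward(r_task, r_surrogate, lam=0.5):
    """Return a reward blending the task reward with the
    human-expectation surrogate; larger lam favors explicability."""
    def r(state, action):
        return (1.0 - lam) * r_task(state, action) + lam * r_surrogate(state, action)
    return r

# Toy example: the task reward penalizes effort, while the surrogate
# penalizes deviation from the action the human expects (~1.0).
r_task = lambda s, a: -abs(a)           # e.g., fuel/effort cost
r_h = lambda s, a: -(a - 1.0) ** 2      # human-expectation surrogate
r = reshaped_reward(r_task, r_h, lam=0.5)
```

A policy optimized against `r` rather than `r_task` alone would then prefer behaviors that remain close to the human's expectation, which is the sense in which reshaping is equivalent to searching for an explicable policy.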