Envy-free Policy Teaching to Multiple Agents

Nov 28, 2022

About

We study envy-free policy teaching. A number of agents independently explore a common Markov decision process (MDP), each with its own reward function and discount factor. A teacher wants to teach a target policy to this diverse group of agents by modifying the agents' reward functions: providing additional bonuses for certain behaviors or penalizing others. These reward modifications are personalized for each agent. An important question in this setting is how to design a teaching program so that the agents think they are treated fairly. We adopt the fairness notion of envy-freeness (EF) to formalize this question and define three EF notions, each imposing stronger requirements than the previous one. Using these notions, we investigate several fundamental questions, including the existence of EF solutions in the policy teaching setting, the computation of cost-minimizing solutions, and the price of fairness (PoF), i.e., the increase in cost due to the consideration of fairness. We show that an EF solution may not exist when penalties are not allowed, but exists otherwise. Depending on the cost measure, computing a cost-minimizing EF solution can be formulated as a convex or linear program and hence solved efficiently. Asymptotically, the PoF increases, but in general at most linearly, with the geometric sum of the discount factor, the size of the MDP, and the number of agents involved. Thus, fairness can be incorporated in multi-agent teaching without significant computational or price-of-fairness burdens.

Interested in talks like this? Follow NeurIPS 2022