I’ll restrict myself to the cooperative multi-agent RL (MARL) setting. Here, multiple agents are tasked with maximizing a joint team reward that is shared by everyone. This setting has many practical applications, such as robot warehouse automation, and has garnered a significant amount of interest.

There is a lot of ground to cover with such MARL systems. Some of the research areas in this field include:

Algorithms: