I believe that emergent reciprocity is the main hump that the field of multi-agent reinforcement learning (MARL) needs to get over. Once we achieve emergent reciprocity, I expect a Cambrian explosion of social behavior in selfish agents. This makes it critical for my research goals.
"Reciprocity" means that an agent correlates how good another agent is being to them with how good they should be to the other agent. In Iterated Prisoner's Dilemma, the [Tit-for-Tat](https://en.wikipedia.org/wiki/Tit_for_tat) policy is a simple example of reciprocity. Reciprocity is very powerful, because it allows cooperation to "stick" in a population. Agents that reciprocate know how to cooperate with other agents, but they also know how to not let defectors take advantage of them.
"Emergent" means that this policy should appear of itself. Not because we coaxed our agents into it, but because they came to the conclusion that it's the best thing for them. (An antonym for emergent in this context can be either "prescribed", "hand-crafted" or "explicit".)
It's very difficult to get an RL agent to learn reciprocity, because reciprocity only pays off once the other agents in the population have also learned it. RL agents learn by trial and error, and an agent that experiments with reciprocity before the other agents have learned it will just lose points. This is a Catch-22.
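Here's a toy illustration of the problem, under assumptions of my own choosing: two independent tabular Q-learners play the Iterated Prisoner's Dilemma, each conditioning only on the opponent's last move (so Tit-for-Tat is representable in their policy space). In most runs they nonetheless converge to mutual defection, because neither agent ever meets a stable reciprocator to learn reciprocity against. This is a sketch, not a definitive experiment; all hyperparameters are arbitrary.

```python
import random

C, D = 0, 1  # cooperate, defect
# Standard PD payoffs: PAYOFF[(move0, move1)] = (reward to agent 0, reward to agent 1).
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def run(episodes=50_000, alpha=0.1, epsilon=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    # Q[agent][state][action]; the state is the opponent's previous move.
    q = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    state = [C, C]  # both act as if the opponent cooperated before round one
    for _ in range(episodes):
        moves = []
        for i in range(2):
            if rng.random() < epsilon:
                moves.append(rng.randrange(2))  # explore
            else:
                qs = q[i][state[i]]
                moves.append(qs.index(max(qs)))  # exploit
        rewards = PAYOFF[moves[0], moves[1]]
        for i in range(2):
            nxt = moves[1 - i]  # next state = the opponent's move this round
            target = rewards[i] + gamma * max(q[i][nxt])
            q[i][state[i]][moves[i]] += alpha * (target - q[i][state[i]][moves[i]])
            state[i] = nxt
    return q

# Typically, defection dominates in every state for both agents: trying
# reciprocity against a non-reciprocating opponent just loses points.
print(run())
```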
To put it differently: I'm interested in the first moment when an AI agent becomes aware that there is another agent in its environment. How does the agent decide whether to treat the other agent as friend or foe? This is a crucial juncture that determines the social dynamics of a population of agents.