Ram Rachum's AI Safety Research

![[Ram Rachum.jpg|right40]]

I'm Ram Rachum, and this is the knowledge base for my research. It contains many notes on my thoughts, which I hope will lead me to interesting results. Feel free to dive in.

My mission statement:

>**I want to ensure that Artificial Intelligence will benefit human society rather than hurt it. I'm addressing both short-term consequences of existing AI tools such as LLMs and long-term existential risk imposed by the potential development of Artificial General Intelligence.**
>
>**I propose to solve the problem of AI Safety by improving the social behavior of AI agents. Most AI research is focused on getting AI agents to solve bigger and more complex problems in the most efficient way possible; my research is focused on achieving cooperation, [[Emergent reciprocity|reciprocity]] and teamwork between AI agents.**
>
>**My strategy is to distill lessons from the study of animal and human societies and apply them to AI agents.**

In other words, I'm applying [Multi-Agent Reinforcement Learning](https://en.wikipedia.org/wiki/Multi-agent_reinforcement_learning) to [AI Safety](https://en.wikipedia.org/wiki/AI_safety). My approach is within the schools of thought of [Cooperative AI](https://www.nature.com/articles/d41586-021-01170-0) and [Multipolar AI](https://nickbostrom.com/papers/openness.pdf).

I'm conducting my research at [the GOLD lab](https://sites.google.com/site/dekelreuth/gold-lab) at Tufts University, led by [Professor Reuth Mirsky](https://sites.google.com/site/dekelreuth/welcome).

I've received funding from:

* [ALTER](https://alter.org.il/)
* [Nonlinear](https://www.nonlinear.org/)
* [The Future of Life Institute (FLI)](https://futureoflife.org)

I'd be happy to receive any comments and feedback. Email me at [email protected]

## Intro to my research

- [[About Ram Rachum|About me]]
- [[How my approach is different]] A list of the ways in which my research effort differs from other researchers'.
- [Can AI agents learn to be good?](https://futureoflife.org/ai-research/can-ai-agents-learn-to-be-good/) My guest post on FLI's blog.

## ⭐ Sign up to get updates

Sign up to my [research mailing list](http://r.rachum.com/announce) to get monthly updates about my research. Every month I outline the goals for that month, and evaluate my progress on last month's goals.

## 📜 My papers

* [[Dominance hierarchies|Emergent Dominance Hierarchies in Reinforcement Learning Agents]] (2024)
* [[Stubborn|Stubborn: An Environment for Evaluating Stubbornness between Agents with Aligned Incentives]] (2023)
* Poster: [Using Sequential Social Dilemmas to Produce Emergent Aligned AI](https://r.rachum.com/aisic2022-poster) (2022, won outstanding poster at AISIC 2022)
* [[Fruit Slots|Fruit Slots: Experimenting with an Autocurriculum of Implicit Communication in Reinforcement Learning Agents]] (2022)