![[Ram Rachum.jpg|right40]]
My mission statement:
> AGI will likely be developed soon. It could have an extremely good or extremely bad effect on the human race. We are in a race to figure out how to make that effect a good one.
>
> Right now my main angle on AI Safety is explainability: Developing tools that answer the question "why did the AI system make this decision?"
I'm a research intern at the [Center for Human-Compatible Artificial Intelligence (CHAI)](https://humancompatible.ai/) and an affiliate at [the GOLD lab](https://sites.google.com/site/dekelreuth/gold-lab) at Tufts University, led by [Professor Reuth Mirsky](https://sites.google.com/site/dekelreuth/welcome).
I've received funding / fellowships / internships from:
* [ALTER](https://alter.org.il/)
* [CHAI](https://humancompatible.ai/)
* [The Future of Life Institute (FLI)](https://futureoflife.org)
* [Nonlinear](https://www.nonlinear.org/)
I'd be happy to hear any comments or feedback. Email me at
[email protected]
[[About Ram Rachum|More about me]].
## 📜 Selected papers
* [[BXRL|BXRL: Behavior-Explainable Reinforcement Learning]] (2026)
* [LinuxArena: A Control Setting for AI Agents in Live Production Software Environments](https://r.rachum.com/linuxarena-pdf) (2026)
* [[Dominance hierarchies|Emergent Dominance Hierarchies in Reinforcement Learning Agents]] (2024)
## 💬 My posts
* [Can AI agents learn to be good?](https://futureoflife.org/ai-research/can-ai-agents-learn-to-be-good/) My guest post on FLI's blog.
* A Conservative Vision for AI Alignment, a LessWrong sequence with David Manheim:
    * Part 1: [A Conservative Vision for AI Alignment](https://r.rachum.com/conservative-alignment-1)
    * Part 2: [Messy on Purpose](https://r.rachum.com/conservative-alignment-2)
    * Part 3: [12 Angry Agents, or: A Plan for AI Empathy](https://r.rachum.com/conservative-alignment-3)
## ⭐ Monthly newsletter
Sign up for my [research newsletter](http://r.rachum.com/newsletter) to get monthly updates on my work.