![[Ram Rachum.jpg|right40]]

My mission statement:

> AGI will likely be developed soon. It could have an extremely good or extremely bad effect on the human race. We are in a race to figure out how to make that effect be good.
>
> Right now my main angle on AI Safety is explainability: Developing tools that answer the question "why did the AI system make this decision?"

I'm a research intern at [Center for Human-Compatible Artificial Intelligence (CHAI)](https://humancompatible.ai/) and an affiliate at [the GOLD lab](https://sites.google.com/site/dekelreuth/gold-lab) at Tufts University, led by [Professor Reuth Mirsky](https://sites.google.com/site/dekelreuth/welcome).

I've received funding / fellowships / internships from:

* [ALTER](https://alter.org.il/)
* [CHAI](https://humancompatible.ai/)
* [The Future of Life Institute (FLI)](https://futureoflife.org)
* [Nonlinear](https://www.nonlinear.org/)

I'll be happy to get any comments and feedback. Email me at [email protected]

[[About Ram Rachum|More about me]].

## 📜 Selected papers

* [[BXRL|BXRL: Behavior-Explainable Reinforcement Learning]] (2026)
* [LinuxArena: A Control Setting for AI Agents in Live Production Software Environments](https://r.rachum.com/linuxarena-pdf) (2026)
* [[Dominance hierarchies|Emergent Dominance Hierarchies in Reinforcement Learning Agents]] (2024)

## 💬 My posts

* [Can AI agents learn to be good?](https://futureoflife.org/ai-research/can-ai-agents-learn-to-be-good/) My guest post on FLI's blog.
* A Conservative Vision for AI Alignment, a LessWrong sequence with David Manheim:
    * Part 1: [A Conservative Vision for AI Alignment](https://r.rachum.com/conservative-alignment-1)
    * Part 2: [Messy on Purpose](https://r.rachum.com/conservative-alignment-2)
    * Part 3: [12 Angry Agents, or: A Plan for AI Empathy](https://r.rachum.com/conservative-alignment-3)

## ⭐ Monthly newsletter

Sign up to my [research newsletter](http://r.rachum.com/newsletter) to get monthly updates about my research.