Chris Lengerich
February 9, 2024
Personal respect for those working on California's SB 1047, which would affect frontier models (https://leginfo.legislature.ca.gov/faces/billNavClient.xhtml?bill_id=202320240SB1047), but it has some serious risks and misunderstandings:
It asks a developer of frontier models to make a "positive safety determination" of the long-term impact of their models:
"a determination, pursuant to subdivision (a) or (c) of Section 22603, with respect to a covered model that is not a derivative model that a developer can reasonably exclude the possibility that a covered model has a hazardous capability or may come close to possessing a hazardous capability when accounting for a reasonable margin for safety and the possibility of posttraining modifications."
and then proposes to apply **criminal law** (perjury) to the accuracy of this assessment. If you get your future prediction wrong, you might go to jail.
A positive safety determination presumably would be built on a probability model of the future socioeconomic impact of your ML model.
A few questions:
1. Is it actually possible to commit perjury about a future prediction? I've worked on these kinds of forecasting problems in investing, and future predictions can differ wildly from each other and from later reality, even with good prior evidence. Uncertain ML models x uncertain socioeconomic models = uncertainty^2 (a toy sketch of this compounding follows the list).
2. Would fear of perjury actually create perverse incentives (moral hazard) not to collect evidence on exactly these problems? Probably yes (we know this already happens with current products, where fear of liability makes companies hostile to whistleblower investigations).
3. Could a judgment this broad be misused by, say, an authoritarian state that wants to use a monopoly on frontier models to control speech? (likely yes)
4. Would a judgment this uncertain be expensive and time-consuming to litigate? (likely also yes)
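To make the compounding in (1) concrete, here is a minimal Monte Carlo sketch. Every number and distribution in it is invented purely for illustration; it is not a model of SB 1047, any particular system, or any developer's actual evidence.

```python
# Toy Monte Carlo sketch of the "uncertainty^2" point in question 1. All
# numbers and distributions here are invented for illustration only; they do
# not model SB 1047, any particular model, or any real eval evidence.
import random

random.seed(0)
N = 100_000
harm_probs = []
for _ in range(N):
    # Layer 1: uncertain estimate that the model has a hazardous capability.
    # Suppose evals only pin it down to somewhere between 1% and 15%.
    p_capability = random.uniform(0.01, 0.15)

    # Layer 2: uncertain socioeconomic model of whether that capability,
    # if present, actually leads to the harm in question: say 2% to 40%.
    p_harm_given_capability = random.uniform(0.02, 0.40)

    harm_probs.append(p_capability * p_harm_given_capability)

harm_probs.sort()
low, med, high = (harm_probs[int(N * q)] for q in (0.05, 0.50, 0.95))
print(f"P(harm) 5th pct: {low:.4f}, median: {med:.4f}, 95th pct: {high:.4f}")
# Even with made-up but generous bounds, the 5th-95th percentile range spans
# more than an order of magnitude: the kind of spread a developer would have
# to swear to under penalty of perjury.
```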
What we do need, though, is funding for risk eval sets and mandates for scoped risk mitigations in AI deployments:
1. **Funding** - scientists and public investors build eval sets every day, because they're curious and want to make the world a better place. It still costs money, though. Eval sets aren't profitable; products are. Guaranteed funding from industry, collected from both responsible and irresponsible companies, could fix this. The government is uniquely able to coordinate passing the hat for this, and has done so in other sectors, for example for systemic risk in banking post-SVB.
2. **Deployment** - the government can force deployers of models to mitigate specific, testable risks that are expressed in eval sets. For example, screening DNA sequences for known pathogens before synthesis is a no-brainer (a toy sketch of such a check follows below). Bans on AI robocalls and non-consensual explicit deepfakes are also helpful legal steps against known broad harms. But context matters in choosing which eval sets to turn into standards for behavior - entertainment may be treated differently from politics, for example.
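To show what a "specific, testable" deployment check could look like, here is a toy sketch of pre-synthesis screening. The signature list is hypothetical and the matching is deliberately naive; real biosecurity screening relies on curated databases and homology search, not exact substring matches.

```python
# Toy sketch of a specific, testable deployment check: screen a DNA synthesis
# order against a list of known-hazard signatures. The signatures below are
# made up; real screening uses curated databases and homology search rather
# than exact substring matching.
from typing import Iterable, List

# Hypothetical signature list; a real one would come from a curated,
# regularly updated database.
KNOWN_HAZARD_SIGNATURES = [
    "ATGCGTACGTTAGC",
    "GGCCTTAAGGCTAA",
]

def flag_sequence(order_seq: str,
                  signatures: Iterable[str] = KNOWN_HAZARD_SIGNATURES) -> List[str]:
    """Return the hazard signatures found in an ordered DNA sequence."""
    order_seq = order_seq.upper()
    return [sig for sig in signatures if sig in order_seq]

if __name__ == "__main__":
    hits = flag_sequence("ttttATGCGTACGTTAGCcccc")
    if hits:
        print(f"Order flagged for manual review; matched signatures: {hits}")
    else:
        print("No known-hazard signatures matched.")
```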
Steps towards either (1) or (2) would likely produce better real safety outcomes than the current California proposal.
Still, I'm glad people are asking these questions and look forward to working with the proposers.
Related analysis: https://hyperdimensional.substack.com/p/californias-effort-to-strangle-ai