
I Was (Relatively) Wrong About AGI

Previously I wrote various articles on the topics of AI and AI alignment. My central claims were that people overvalue (conceptual) intelligence, that “building up” intelligence without trial and error is hard, that unsupervised methods can’t get you very far, that alignment research as a separate magisterium makes little sense, that superintelligence isn’t very useful because understanding the world requires experiments bottlenecked by physical substrate, and that tool-like AI is going to be a lot more useful, and receive a lot more focus, than AGI.

I think the last couple of years, and especially the last few months, have proved me partially wrong on most of those counts.

People who failed to lift a finger to integrate better-than-doctors or work-with-doctors supervised medical models for half a century are stoked at a chatbot being as good as an average doctor and can’t wait to get it to triage patients.

The most hot-button topic in tech circles is inventing new ways to allow LLMs to execute code unsupervised over increasingly large, hard-to-monitor, and resource-rich environments.

LLMs have gotten surprisingly good, given that most of that improvement happened via supervised learning.

I’m especially ashamed I hadn’t predicted this, given that I was building & selling ML tools in a B2B setting, as well as building & working with LLMs starting in 2018. My model of this space was incredibly sub-optimal.

I don’t want to make a lot of predictions or speculations here, because the main purpose of this article/PSA is to state that the predictions of my worldview have obviously failed to materialize and you should probably update away from my models if you previously liked them.

I’ve updated my “chance of intelligence explosion” to “non-zero but not significantly high”, my “chance virtually all humans are killed within the next five decades” to “high single digits”, and my “chance an AI-driven catastrophic event kills billions of people in the next five decades, directly or through second-order effects” to “low double digits”.

I’m still mostly interested in, and betting on, the “AI as tools” angle, in which alignment is solved by aligning the tool users. But given the overall excitement about agents, combined with the lack thereof for tools, I’m more pessimistic that people will opt for “tool AI”, even if it’s objectively better for the use case. And, given the success of “agent-lite AI” and the relatively small amount of large-scale exploration done in this area so far, I can only expect it to improve by quite a large margin in the near future.

I can see a few more subtle shifts being sufficient to move me from the “tool” camp into the “agent” camp in terms of the “technology of the future” … at which point all of my risk estimates get a few times worse.

Which is to say I’m still borderline optimistic by LW/TPOT/SSC standards, but I can see myself moving to neutral.

Published on: 2023-03-28


