Dan Hendrycks is the Editor-in-Chief of AI Frontiers. He is also the founder and Director of the Center for AI Safety, which funds this publication.
Despite years of effort, mechanistic interpretability has failed to provide insight into AI behavior, the result of a flawed foundational assumption.
New research shows that frontier models outperform human scientists at troubleshooting virology procedures, lowering barriers to the development of biological weapons.