Marius Hobbhahn

CEO and Co-founder at Apollo Research, specializing in AI safety

London, England, United Kingdom
Joined March 2026

Network

7.5K connections
AI Safety Research
🏛️ AI Policy Governance
💰 AI Investors Founders
⚙️ AI Safety Operations
🚀 Apollo Research Core
🧠 ...

Summary

Marius Hobbhahn is a leading AI safety researcher and entrepreneur who co-founded Apollo Research to address the critical problem of AI deception. His work focuses on developing tools and methods for AI model evaluations ('evals') to detect and mitigate scenarios in which AI systems covertly pursue misaligned goals, a capability he identified as advancing significantly around 2024. He collaborates with major AI companies such as OpenAI and Anthropic, and with government bodies such as the U.K.'s AI Security Institute, to proactively counter the risks of autonomous AI deception.
With a strong academic background from the University of Tübingen, including Bachelor's degrees in Cognitive Science and Computer Science and Master's and PhD studies in Machine Learning, Hobbhahn brings a rigorous scientific approach to AI safety. His research contributions span Bayesian inference, predictive uncertainty in deep networks, and analyses of compute trends, informing the understanding of AI's trajectory and capabilities. His work is highly cited, reflecting its significant impact on the field.
Beyond his entrepreneurial and academic pursuits, Hobbhahn engages actively with the broader AI ethics and alignment communities through platforms such as LessWrong and the AI Alignment Forum. He contributes to discussions on topics such as AI scheming, the feasibility of automating AI safety work, and the challenge of ensuring faithful chain-of-thought reasoning in LLMs, demonstrating a commitment to open discourse and collaborative problem-solving in AI safety.

Writing

Large Language Models can Strategically Deceive their Users when Put Under Pressure

2024

Research paper investigating the capacity of large language models to strategically mislead their users, particularly in high-pressure scenarios, contributing empirical evidence of AI deception capabilities.

arxiv.org

Black-Box Access is Insufficient for Rigorous AI Audits

2024

Paper highlighting the limitations of black-box access for conducting thorough AI audits, emphasizing the need for more transparent methods to ensure accountability and safety.

dl.acm.org

Will we run out of data? Limits of LLM scaling based on human-generated data

2024

Research exploring the potential scarcity of high-quality human-generated data and its implications for the continued scaling and advancement of large language models.

arxiv.org

Frontier Models are Capable of In-context Scheming

2024

Research demonstrating that frontier AI models are capable of in-context scheming, providing empirical evidence of complex deceptive behaviors.

arxiv.org

Compute Trends Across Three Eras of Machine Learning

2022

A foundational paper analyzing the growth of computational resources used in machine learning across distinct historical eras, contributing to the understanding of AI scaling.

arxiv.org

Fast Predictive Uncertainty for Classification with Bayesian Deep Networks

2022

Paper introducing a method for fast predictive uncertainty estimation in classification with Bayesian deep networks, presented at UAI 2022.

arxiv.org

Laplace Matching for fast Approximate Inference in Generalized Linear Models

2021

Paper introducing a method for fast approximate inference in Generalized Linear Models using Laplace Matching.

arxiv.org