Ornis Research
AI ALIGNMENT · HUMAN–AI RELATIONS

A natural history of artificial minds

An independent AI alignment lab. Our work spans interpretability, capability evaluation, and the human side of the loop.

[Figure: Corvus mentis, Pl. I — specimen at rest. A line drawing of a perched raven, rendered in the manner of a 19th-century field plate.]
WHAT WE STUDY
All themes →

Interpretability and model internals

What large language models represent internally, and how those representations give rise to behavior in deployment-realistic settings.

Capability evaluation and elicitation

Paired benchmarks with directional controls for failure modes — deception, scheming, sandbagging — that resist naive measurement.

Human–AI interaction and trust

How expectations form between humans and AI systems, where those expectations break under load, and how evaluation can account for the human side of the loop.