Each report evaluates multiple models against the same workflow, contract, and expected outcome.
Can AI tell who broke into your cloud?
6 models, 3 required investigation steps. Each model broke the workflow differently, and the failures reveal how models reason about multi-step security investigations.