Case Studies

How teams use Composo to catch failures and ship with confidence.

Healthcare: The Notes That Looked Fine

A Series B clinical AI company's ambient scribe had been in production for months. Evals were passing. We analysed 847 notes and found 127 failures, 23 of them severity-critical.

B2B SaaS

B2B SaaS: The Support Agent That Was Giving Away Money

A Series B SaaS company's AI support agent was making unauthorised financial commitments, inventing product features, and ignoring escalation requests. We found 189 failures in 1,243 responses.

Financial Services

Financial Planning: The Agents That Drifted After a Model Update

A financial planning platform had 20+ AI agents across customer operations. A model update caused subtle drift that went undetected for three weeks, with hundreds of slightly off decisions compounding.

Enterprise

Enterprise SaaS Platform Achieves 99.7% Agent Reliability with Composo

How an enterprise SaaS platform achieved 99.7% AI agent reliability, reduced QA costs by $1.2M, and closed $4.5M in deals using Composo evaluation.

Legal Tech

Legal Tech Startup Ships MVP in 4 Weeks Using Composo

How a seed-stage legal tech startup shipped its AI lease review MVP in 4 weeks, cut evaluation costs by 96%, and passed enterprise pilots with Composo.