Hamza Mostafa

Essays & field notes

Writing

Agents, systems, experiments, and the knowledge that makes intelligence useful.

01 Featured / Agent Systems

The Model Is the Wrong Thing to Own

Enterprises should own their intelligence. That means accumulating domain knowledge every new model can inherit, not training a model of their own.

July 24, 2026 / 5 min read

Archive

All field notes

05 essays

02 Evaluation

SalesBench: The Long-Horizon Agent-to-Agent Eval

A long-horizon RL environment where a small model learns to manage an insurance sales pipeline against an LLM buyer, scored by revenue closed instead of by an LLM judge. The trained model vastly outperforms the untrained base, and the gap widens as the eval gets harder.

May 14, 2026 / 12 min read

03 Agent Systems

The Agent Research Loop

What Karpathy's autoresearch really means, where agent systems are headed, and an open-source harness that ran 550 experiments over a weekend.

March 16, 2026 / 8 min read

04 Personal Agents

I Run a Personal AI Agent 24/7 on a Mac Mini. Here's How It Actually Works.

A Mac Mini, some markdown files, and seven communication channels. Inside the setup that gives me a 24/7 AI assistant that monitors my email, iMessage, WhatsApp, and Twitter - and actually does useful things.

March 7, 2026 / 12 min read

05 Model Training

I Let AI Agents Train Their Own Models. Here's What Actually Happened.

Two frontier agents, a pile of bugs, and a reality check on the future of autonomous AI research.

February 8, 2026 / 7 min read