Good morning. The Musk v. Altman trial dominates today’s news as it grinds into week two, with Greg Brockman on the stand, ominous text messages surfacing, and Musk’s lone expert witness getting his existential-risk testimony curtailed. Away from the courtroom, Sierra picked up another billion dollars to chase enterprise AI agents, Cerebras filed for what could be the year’s biggest tech IPO, and llama.cpp quietly shipped something that may double inference speeds for a lot of local setups.
Brockman takes the stand, journals do the damage. OpenAI president Greg Brockman testified this week and proved a slippery witness — quibbling over word choices, correcting minor misreadings, and answering technically where he could have answered plainly, The Verge reports. The reporter's wry observation: Brockman's own journal entries have done more harm to OpenAI's case than anything he said on the stand, and they're now central to Musk's argument that the for-profit conversion betrayed the original deal. The Verge is also running live updates from the courtroom as Musk pursues Altman and Brockman's removal, up to $150B in damages, and an unwinding of OpenAI's PBC structure.
The "ominous texts" and the AGI expert. OpenAI's lawyers disclosed that Musk texted Brockman two days before trial proposing a settlement, then, after Brockman countered with mutual dismissal, warned that "by the end of this week, you and Sam will be the most hated men in America," per TechCrunch. The judge ruled the exchange inadmissible, but the public filing did its work. Meanwhile Musk's only AI expert, Berkeley's Stuart Russell, testified about AGI risk and arms-race dynamics, though the judge limited his existential-risk discussion and OpenAI's cross noted Russell hadn't actually evaluated OpenAI's structure or safety practices. The whole spectacle keeps bumping into the same contradiction: Musk signed the 2023 pause letter and then started xAI. MIT Tech Review's in-the-room writeup captures the mood, including protesters outside hoping both sides lose.
Sierra raises $950M at $15B. Bret Taylor's customer-experience AI company pulled in another $950M from Tiger Global and GV, pushing its valuation past $15B and giving it over $1B in cash, per the company and a separate TechCrunch piece. Sierra says it now serves over 40% of the Fortune 50 and grew ARR from $100M to $150M in roughly two months. HN reaction is skeptical: one commenter who supervised a Sierra rollout said performance and pricing were genuinely impressive but warned the implementation is bespoke and the "outcome-based pricing" probably won't survive contact with renewal cycles. Others doubt AI agents can handle the complex cases that make people pick up the phone in the first place.
Cerebras goes public. AI chipmaker Cerebras is pricing 28 million shares at $115–$125 per share to raise about $3.5B at a $26.6B market cap, TechCrunch reports — the largest tech IPO of 2026 if it lands. The company sells its Wafer-Scale Engine 3 against GPU competitors on inference speed and power. Notable on the cap table: Sam Altman, Greg Brockman, and Ilya Sutskever as angel investors, alongside Benchmark, Fidelity, Tiger, and G42.
White House floats pre-release vetting for AI models. The NYT reports the White House is considering a process to review AI models before public release. The r/LocalLLaMA reaction was uniformly hostile, with commenters reading the proposal as a regulatory moat for incumbents, a free-speech problem, or a content-control mechanism dressed up as safety. One European user shrugged that they were going to use Chinese or local models anyway. Worth watching how this is scoped — open-weights releases would be the obvious pressure point.
Llama.cpp ships MTP speculative decoding. A beta PR adding Multi-Token Prediction (MTP) support merged into llama.cpp, letting models with built-in MTP heads do speculative decoding without a separate draft model. Tests on Qwen3.6 27B show ~75% token acceptance and roughly 3x throughput, from a 7 tok/s baseline to 21.6 tok/s on a DGX Spark. The catches: it needs MTP-enabled GGUFs, and the speedup may invert on VRAM-constrained rigs that lean on CPU offloading. One commenter called it potentially the biggest performance change llama.cpp has ever shipped.
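Why a ~75% acceptance rate translates into roughly 3x throughput: each full-model forward pass verifies a multi-token draft and commits every token up to the first rejection, plus one token from the verifier itself. Here is a toy simulation of that arithmetic — the draft length `k` and the independence of per-token acceptances are assumptions for illustration, not details from the PR.

```python
# Toy model of speculative-decoding throughput. Assumes the MTP head
# drafts k tokens per step, each accepted independently with
# probability p, and one full forward pass verifies the whole draft
# (always committing at least the verifier's own token).
import random

def tokens_per_step(p: float, k: int, trials: int = 100_000) -> float:
    """Average tokens committed per full-model verification pass."""
    rng = random.Random(0)
    total = 0
    for _ in range(trials):
        accepted = 0
        for _ in range(k):
            if rng.random() < p:
                accepted += 1
            else:
                break  # first rejected draft token ends the step
        total += accepted + 1  # +1: verifier always emits one token
    return total / trials

# With p = 0.75 and a hypothetical 4-token draft, each verification
# pass commits about 3 tokens on average, which lines up with the
# ~3x speedup when the big model's forward pass dominates cost.
print(tokens_per_step(0.75, 4))
```

The closed form is a truncated geometric series, 1 + p + p² + … + pᵏ, which for p = 0.75 and k = 4 gives about 3.05 tokens per pass. It also shows why the speedup can invert on offloaded rigs: if drafting and verification are no longer nearly free relative to one decode step, each pass costs more than 1x while committing fewer than k+1 tokens.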
That’s the slate — a courtroom, a chip IPO, a billion for customer-service agents, and a runtime upgrade that costs nothing. More tomorrow.