Good morning. OpenAI is making a security play today — a new Cyber-tuned model and a cash-and-compute commitment to open source maintainers drowning in AI slop CVEs. Meanwhile, an MIT paper is making the uncomfortable argument that prompt injection isn’t really fixable as long as roles are just text, and small models keep punching above their weight in ways that are either impressive or overstated, depending who you ask.
OpenAI’s “Patch the Planet” and GPT-5.5-Cyber. OpenAI announced Daybreak, a sprawling security initiative anchored by a new GPT-5.5-Cyber model, partnerships with Trail of Bits and HackerOne to give open-source maintainers free security consulting, and 20 trillion subsidized tokens for its Codex Security scanner. The framing is partly a response to the flood of low-quality AI-generated vulnerability reports burying volunteer maintainers — and partly, as Wired notes, a direct shot at Anthropic’s Mythos, which has dominated the offensive-security narrative since the NSA breach claim.
Prompt injection is role confusion, and roles are just vibes. A new MIT paper accepted at ICML 2026 reframes prompt injection as a structural problem: role tags like <system> and <user> are formatting conventions, not security boundaries, and the model has no real way to tell which tokens it should trust. The team notes that frontier models score near-perfectly on standard injection benchmarks while human red-teamers still hit ~100% success rates against them. One HN commenter sketched a plausible fix — bake an unspoofable role embedding into each token rather than relying on text markers — though that requires architectural changes nobody’s shipped yet. Another commenter put it more bluntly: LLMs provide no security boundaries, full stop, and pretending otherwise leads to bad design.
Small models doing suspiciously big things. VibeThinker-3B claims to match Gemini 3 Pro and DeepSeek V3.2 on reasoning benchmarks (94.3 on AIME26, 80.2 on LiveCodeBench v6) through a curriculum SFT plus multi-domain RL pipeline, with the authors arguing verifiable reasoning compresses much better than general knowledge. Reception is mixed: one HN commenter reports practical success using it for code security review on an RTX 3090, while another tested it against a corpus of Mythos-discovered bugs and it found zero. Separately, Moebius, a 0.22B image inpainting model from HUST and VIVO, claims to match FLUX.1-Fill-Dev (11.9B) at 15x the speed; Simon Willison already has it running in the browser via ONNX, though testers say inpainted regions look visibly smoother than their surroundings on harder images.
GLM-5.2, both locally and against Opus. Z.ai’s 744B-parameter (40B active) MIT-licensed model is now runnable locally via Unsloth’s Dynamic GGUFs, with 2-bit quants fitting on a 256GB Mac at the cost of ~18% accuracy, and one HN user reporting 6 tok/s on a $4K-ish 512GB DDR4 + dual 3090 build. A separate head-to-head against Claude Opus 4.8 had Opus producing a cleaner WebGL platformer in half the time, but GLM-5.2 at roughly a fifth the cost — capability-per-dollar near Haiku for output near Opus. The HN thread tore apart the methodology (one-shot prompts, different harnesses, no documented thinking effort), but multiple commenters separately said GLM-5.2 is now their daily driver alongside Claude.
Groq raises $650M after Nvidia walks off with its CEO and IP. Six months after Nvidia licensed Groq’s LPU tech and hired away founder Jonathan Ross along with key executives — now shipping the IP as the “Groq 3 LPX” inference card — Groq has confirmed a $650M round led by Disruptive and Infinitum. The company is now leaning entirely on its neocloud, 13 data centers and 5M developers, while trying to outcompete the company that owns its core technology. Speaking of neoclouds, SpaceX signed a $150M/month compute deal with open-source lab Reflection AI through 2029 — small next to the Anthropic ($1.25B/mo) and Google ($920M/mo) contracts, but notable given Reflection’s open-weight bet and a reported US government ban on Anthropic’s closed models.
DeepMind buys into A24. Google DeepMind put $75M into A24 for a partnership on AI filmmaking tools, with the indie studio providing artist feedback in the loop. It joins Netflix’s recent acquisition of Affleck’s AI film outfit and Amazon MGM’s new AI production unit — Hollywood is clearly past the “if” phase.
That’s it for today. The Daybreak/role-confusion pairing is the one to sit with: OpenAI is selling AI as the fix for security at the same week researchers are arguing the security model underneath every LLM is mostly a polite fiction.