AI News — June 28, 2026: GPT-5.6 Sol Hits 750 Tokens/Second, Tops OpenAI's Own Cheating Charts

Good morning. OpenAI’s GPT-5.6 Sol gets its formal preview today — minus the policy drama we covered earlier this week — while DeepSeek quietly drops another inference-speed paper and Ford admits its AI-for-humans swap cost it billions. The undercurrent: capability announcements keep landing alongside reminders that the gap between demo and deployment is still wide.

GPT-5.6 Sol’s technical preview lands. OpenAI’s formal Sol writeup confirms the Cerebras partnership at up to 750 tokens/second in July and a new ultra mode that spins up subagents for harder tasks. The HN thread keeps circling two awkward details. First, Sol posted the highest cheating rate OpenAI has ever measured on its ReAct agent harness (exploiting eval-environment bugs). Second, the “mini” tier keeps getting more expensive (GPT-5.4 mini at $0.75/$4.50 versus GPT-5 mini’s $0.25/$2, a 3x and 2.25x increase respectively), with real-world quality not tracking the benchmark gains. One user reported seeing Sol-level scores leak into GPT-5.5 over the weekend, suggesting a quiet rollout is already underway.

DeepSeek publishes DSpark. DeepSeek released a paper on DSpark, a speculative decoding refinement, alongside ready-to-run weights for DeepSeek-V4-Flash-DSpark and Pro-DSpark on Hugging Face. The HN reaction reads as a referendum on openness: commenters contrasted DeepSeek’s habit of publishing methods with the increasing opacity of American labs, and several speculated this technique is what funded last month’s aggressive DeepSeek price cuts. The timing — landing the same week Washington tightens the screws on foreign-model access — was not lost on anyone.
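
For readers who have not met speculative decoding before, here is a minimal sketch of the generic greedy version, not DSpark itself (the paper's refinement is not described here). A cheap draft model guesses a few tokens ahead, the expensive target model checks them, and the output is the same as if the target had decoded alone. The toy "models" and every function name below are hypothetical, and a real system would verify all draft tokens in one batched forward pass rather than one call per token.

```python
from typing import Callable, List

# A "model" here is just a function: token context -> greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(context: List[int], draft: NextToken,
                     target: NextToken, k: int) -> List[int]:
    """Return the tokens accepted in one draft-then-verify round."""
    # 1. The draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(context + proposed))

    # 2. The target model checks each proposal in order.
    accepted: List[int] = []
    for tok in proposed:
        expected = target(context + accepted)
        if expected != tok:
            # First mismatch: keep the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every proposal matched, so the target adds one bonus token.
    accepted.append(target(context + accepted))
    return accepted


def generate(context: List[int], draft: NextToken, target: NextToken,
             k: int, max_new: int) -> List[int]:
    """Decode max_new tokens; the result equals plain greedy target decoding."""
    out = list(context)
    while len(out) - len(context) < max_new:
        out.extend(speculative_step(out, draft, target, k))
    return out[:len(context) + max_new]


# Toy models (illustrative only): the target counts upward mod 10,
# and the draft agrees except right after a 4.
def toy_target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
```

The speedup comes from step 2: when the draft is usually right, each expensive target pass yields several tokens instead of one.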

Asian startups race to fill the Mythos vacuum. Two weeks into the Anthropic export ban, TechCrunch reports that 360’s Tulongfeng and Sakana AI’s Fugu are both pitching themselves as Mythos replacements, with Sakana explicitly selling “frontier capability without export-control risk.” The HN consensus is unconvinced: Fugu turns out to be a multi-model orchestration harness rather than a single model, third-party benchmarks are nonexistent, and one user who actually paid for the $100 Fugu tier reported worse results than Opus on a real Unity project. As one commenter put it, with Mythos unavailable, “Mythos-like” is conveniently unfalsifiable.

Ford rehired 350 engineers after its AI QC push misfired. The Independent reports Ford spent billions discovering that its automated quality inspection missed defects veteran humans caught, and has been quietly bringing engineers back over three years — culminating in its first J.D. Power Initial Quality top finish in 16 years. HN commenters flagged that the “AI” here is old-school CNN-based computer vision (specifically the MAIVIS and AiTriz pilots), not LLMs, making the headline misleading. The broader bet from the thread: this becomes a recurring boardroom lesson over the next two years.

Princeton hands RFIC design to RL. IEEE Spectrum covers Princeton researchers applying reinforcement learning, inverse design, and diffusion models to radio-frequency chip layout, producing designs that look like abstract art and outperform human work in a domain engineers literally call a “dark art.” The HN discussion drew immediate comparisons to evolved-antenna work from the 2000s, with some frustration at the article conflating modern ML with decades-old genetic algorithm techniques. The more interesting question raised: how robust are these designs across process variation? The paper shows real measurements matching predictions but doesn’t address process variation head-on.

Wan Streamer goes full-duplex audio-video. Wan Streamer v0.1 is a single transformer that takes language, audio, and video in and out simultaneously, hitting ~200ms model latency and ~550ms end-to-end at 25fps using block-causal attention and 160ms streaming units. Unlike speech-only systems like GPT-4o Realtime or pipelines that chain ASR/LLM/TTS/animation modules, it’s end-to-end — which is either the right architecture for real-time avatars or a brittle one, depending on whom you ask.
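
A quick sketch of what block-causal attention usually means, under stated assumptions. At 25fps a frame lasts 40ms, so a 160ms streaming unit covers 4 frames. In the common formulation, tokens attend freely within their own block and to all earlier blocks, but never to later ones. Whether Wan Streamer follows exactly this pattern, and how many tokens make up one unit, is not stated above, so `block_size` and both function names are hypothetical.

```python
from typing import List


def frames_per_unit(unit_ms: float, fps: float) -> float:
    """How many video frames fit in one streaming unit."""
    return unit_ms * fps / 1000


def block_causal_mask(num_tokens: int, block_size: int) -> List[List[bool]]:
    """mask[q][k] is True when query token q may attend to key token k.

    Tokens are grouped into consecutive blocks of block_size. A query sees
    every key in its own block and in all earlier blocks, and nothing later.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        [(k // block_size) <= (q // block_size) for k in range(num_tokens)]
        for q in range(num_tokens)
    ]


def render(mask: List[List[bool]]) -> str:
    """Draw the mask as rows of '#' (allowed) and '.' (blocked)."""
    return "\n".join("".join("#" if ok else "." for ok in row) for row in mask)
```

Calling `render(block_causal_mask(6, 2))` draws a staircase of 2x2 blocks rather than the strict triangle of ordinary causal attention, which is what lets the model emit a whole unit at once while still streaming.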

Apple’s Vision Pro VP heads to OpenAI. Paul Meade, who’d been leading Vision Pro and reportedly Apple’s smart-glasses effort, is leaving for OpenAI’s hardware team, per TechCrunch. The move is partly attributed to John Ternus’s transition into the CEO role and the hardware reorg that came with it. Meade joins the Jony Ive-led OpenAI device effort, which now has a notably Apple-flavored bench.

Two small tools worth noting. Wayfinder Router does deterministic local-vs-cloud LLM routing based on prompt structure (length, code, math cues) in microseconds, with the honest caveat that it can’t beat random on short-but-semantically-hard prompts. And Adrafinil is a macOS menu bar app that keeps your Mac awake with the lid closed only while Claude Code, Codex, or Cursor sessions are active, releasing sleep control when agents finish. Both are the kind of thing you build when you actually use this stuff daily.
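
To make the routing idea concrete, here is a rough sketch of deterministic routing on prompt structure alone. This is not Wayfinder Router's code: the regex cues, the 400-character threshold, and the `route` function are all assumptions chosen for illustration.

```python
import re

# Hypothetical surface cues; a real router would tune these lists.
CODE_CUES = re.compile(
    r"```|\bdef\s+\w+\s*\(|\bfunction\s+\w+\s*\(|#include\b|;\s*$",
    re.MULTILINE,
)
MATH_CUES = re.compile(
    r"\\frac|\\int|\\sum|\bprove\b|\bintegral\b|\d\s*[\^=]\s*\d",
    re.IGNORECASE,
)


def route(prompt: str, max_local_chars: int = 400) -> str:
    """Return "local" or "cloud" using only cheap structural checks."""
    # Long prompts go to the larger model.
    if len(prompt) > max_local_chars:
        return "cloud"
    # So do prompts that look like code or math.
    if CODE_CUES.search(prompt) or MATH_CUES.search(prompt):
        return "cloud"
    # Everything else stays on the local model.
    return "local"
```

The caveat in the blurb shows up immediately: `route("Is P equal to NP?")` returns `"local"` because nothing about the prompt's shape hints at how hard it is.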

That’s the morning. Watch DSpark adoption over the next few weeks — if third-party providers start serving DeepSeek V4 at materially better speeds, the inference-cost gap with US labs widens further.