ai0.news

AI News — April 17, 2026


Today feels like a day of leapfrogging: two major model drops, a coding agent that hacked a smart TV, and a coding-agent war heating up. The labs are clearly watching each other's release calendars.

Anthropic shipped Claude Opus 4.7 yesterday, and the reception has been… complicated. On paper it's Anthropic's most powerful publicly available model, with improvements to coding, vision, and self-verification. In practice, The Verge reports the company openly admits it doesn't advance its capability frontier: Mythos Preview still outperforms it on every internal eval. Opus 4.7 is essentially a testbed for new cybersecurity safeguards, which Anthropic deliberately dialed down from Mythos levels before broader release. Community reaction on HN was rough. Multiple users report no meaningful difference from 4.6, and the new cybersecurity filters are drawing real frustration; one commenter described the model refusing legitimate bug bounty work even after it had fetched and acknowledged the program guidelines itself. Security professionals can apply to a new Cyber Verification Program for enhanced access, though the bar for acceptance remains unclear.

Meanwhile, Alibaba’s Qwen team dropped Qwen3.6-35B-A3B, a 35-billion-parameter mixture-of-experts model that activates only about 3 billion parameters at inference time. It reportedly outperforms the dense Qwen3.5-27B on coding benchmarks and — according to the Qwen blog — matches Claude Sonnet 4 on vision-language tasks, all while fitting in around 20GB quantized. The LocalLLaMA community is enthusiastic, with users noting it runs on consumer hardware and praising a new “Thinking Preservation” feature that retains reasoning context across messages to improve cache efficiency. There’s also quiet relief that the Qwen team is still publishing open weights following recent internal turbulence at Alibaba. The blog hints at more Qwen3.6 variants coming, and people are already calling for a 122B release.
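For readers unfamiliar with why a 35B model can run in ~20GB, the trick is sparse routing: a mixture-of-experts layer sends each token to only a few of its experts, so most weights sit idle per token. Here's a minimal toy sketch of that idea in NumPy — the dimensions, expert count, and top-k value are illustrative assumptions, not the actual Qwen3.6 architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions -- NOT the real Qwen3.6 configuration.
D, N_EXPERTS, TOP_K = 8, 16, 2

# Each expert is a small feed-forward weight matrix; the router scores
# which experts should handle a given token.
experts = rng.normal(size=(N_EXPERTS, D, D))
router = rng.normal(size=(D, N_EXPERTS))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token through only TOP_K of N_EXPERTS experts.

    Only the selected experts' weights are touched, which is the sense
    in which a 35B-parameter MoE "activates" only ~3B parameters per token.
    """
    logits = x @ router                    # score every expert: (N_EXPERTS,)
    top = np.argsort(logits)[-TOP_K:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over just the selected experts
    # Weighted sum of the chosen experts' outputs; the other 14 experts
    # contribute no compute at all for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=D)
out = moe_layer(token)
print(out.shape)  # (8,)
```

Real implementations batch this routing across thousands of tokens and load-balance the experts, but the per-token math is this simple, which is why active-parameter count, not total size, drives inference cost.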

OpenAI updated Codex in ways clearly aimed at Claude Code, adding background desktop agents for macOS that can open apps, click, and type independently while you work, plus in-app browser control, GitLab and Microsoft Suite integrations, and persistent memory across sessions. The Verge and TechCrunch both frame it as a direct shot at Anthropic's growing developer mindshare. On HN, several commenters noted Claude Desktop already does much of this, though others pushed back, saying the Codex terminal experience remains the best UX for file-editing workflows. One outstanding concern: a GitHub issue about Codex reading sensitive file-system data without prompting remains open.

On the subject of Codex and security, researchers at Calif gave Codex a shell on a Samsung Smart TV, asked it to escalate privileges to root, and it pulled it off — auditing the vendor’s kernel driver source, identifying a physical-memory vulnerability, and chaining an exploit that bypassed Samsung’s Unauthorized Execution Prevention. HN commenters were quick to note the significant assists involved: Codex had full firmware source code and an existing foothold. Still, watching an AI autonomously build ARM binaries and iterate through an exploit chain is a different kind of demo than a coding benchmark.

Physical Intelligence published results on π0.7, their new robotics model that can combine skills from different training contexts to handle tasks it was never explicitly taught — including operating an air fryer it had barely seen in training data. TechCrunch covered it as potential evidence that robot capabilities might start scaling with data the way language models did. The Qwen community and the Physical Intelligence researchers are both pointing in the same direction this week: compositional ability — doing new things from learned pieces — is where the interesting bets are being placed right now.
