
AI News — April 24, 2026

Good morning. It’s a heavy model-release day: OpenAI shipped GPT-5.5, DeepSeek dropped a 1.6T-parameter V4 Pro with open weights, Tencent added Hunyuan3 to the pile, and Qwen3.6-27B keeps picking up steam on the agency benchmarks. Anthropic also published a postmortem that finally explains why Claude Code felt off for the past several weeks. And from Washington, a memo on “adversarial distillation” is making the rounds — with predictable skepticism about whose interests it actually serves.

GPT-5.5 arrives, with an interesting infrastructure footnote. OpenAI rolled out GPT-5.5 gradually across ChatGPT and Codex, posting 82.7% on Terminal-bench-2.0 and 93.6% on GPQA Diamond — competitive with Anthropic’s gated Mythos model on some benchmarks, behind on others. The more interesting detail, flagged by several HN commenters, is that OpenAI used Codex to analyze weeks of production traffic and write custom GPU load-balancing heuristics, with outsized impact on token generation efficiency. The Verge has a straightforward writeup if you want the marketing angle, and the system card is up too. One line from the announcement is getting mocked: an NVIDIA engineer reportedly said losing access “feels like I’ve had a limb amputated” — a quote OpenAI apparently thought was a good idea to include.

DeepSeek V4 lands with weights on Hugging Face. DeepSeek released V4 in two variants, flash and pro, and pushed the 1.6T-parameter Pro base model to Hugging Face. Pricing via OpenRouter is $1.74/M input and $3.48/M output for Pro — one HN commenter pointed out that this makes the “frontier labs are subsidizing inference at a loss” narrative harder to sustain, given DeepSeek is serving a 1.6T MoE profitably at these rates. Early testers are calling it better than Opus 4.6 at a fraction of the cost, though the usual concerns about Chinese-origin models are present in the thread.
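For a sense of what those rates mean per request, here’s a minimal cost sketch at the quoted OpenRouter prices. The rates come from the announcement; the request sizes below are hypothetical, picked only to illustrate the arithmetic.

```python
# DeepSeek V4 Pro rates as quoted via OpenRouter (USD per 1M tokens).
INPUT_PER_M = 1.74
OUTPUT_PER_M = 3.48

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted per-million-token rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# A hypothetical coding-agent turn: 50K tokens of context in, 2K out.
cost = request_cost(50_000, 2_000)
print(f"${cost:.4f}")  # roughly $0.094 per turn
```

At these prices an agent loop that re-sends a large context every turn is still dominated by input cost, which is the number the “subsidized inference” debate usually turns on.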

Tencent’s Hunyuan3 rounds out the week. Tencent released Hunyuan3 (Hy3), a 295B MoE with 21B active parameters, sized to fit within AM5 consumer memory limits (256GB RAM plus a GPU). The license is restrictive enough that one r/LocalLLaMA commenter dryly called it “weights available” rather than open-weights, and the native 16-bit format is awkward at this scale. Still, a released base model at this size is noteworthy — Qwen hasn’t shipped base models for its 27B, 122B, or 397B releases.
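The “16-bit is awkward at this scale” point is just arithmetic, and a back-of-envelope sketch makes it concrete. This counts weight storage only (no KV cache or runtime overhead), and the 4.5 bits-per-weight figure is an assumption standing in for a typical 4-bit GGUF quant.

```python
# Approximate weight storage for a 295B-parameter model at several precisions.
PARAMS = 295e9

def weights_gb(bits_per_param: float) -> float:
    """Weight storage in GB, ignoring KV cache and framework overhead."""
    return PARAMS * bits_per_param / 8 / 1e9

for bits, label in [(16, "bf16"), (8, "q8"), (4.5, "~q4 quant")]:
    print(f"{label:>9}: {weights_gb(bits):.0f} GB")
# bf16 needs ~590 GB — more than double the 256 GB an AM5 board tops out
# at — while a 4-bit-ish quant lands around 166 GB, with headroom to spare.
```

So the model only “fits consumer memory limits” after quantization, which is exactly why the native 16-bit release drew grumbling in the thread.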

Qwen3.6-27B ties Sonnet 4.6 on Artificial Analysis agency. Building on yesterday’s release, Qwen3.6-27B has now tied Claude Sonnet 4.6 on Artificial Analysis’s agency benchmarks. Users are running it at 85 tok/s on dual 3090s with 180K context, and the anticipated 122B version has the local LLM crowd visibly excited. One r/LocalLLaMA commenter posed the obvious question: how does a 27B model outscore a 670B model released less than a year ago? The answer thread is a decent primer on why parameter count has become a poor proxy for usefulness: training methodology, task specificity, and post-training refinement now carry more weight, though larger models still win on world knowledge and long-context coherence.

Anthropic explains the Claude Code regression. Anthropic published a detailed postmortem covering three separate bugs that degraded Claude Code between March and April: a silent reasoning-effort downgrade from high to medium on March 4 (to mask a frozen-UI issue), a session memory bug that re-triggered on every turn instead of once per session, and an overly aggressive verbosity-reduction prompt from April. All three are fixed, and Anthropic is resetting usage limits for subscribers. HN reaction is split — some appreciate the transparency, others point out this is the second such postmortem in roughly six months with similar root causes (undocumented application-layer changes that saved compute at quality’s expense). One commenter said the reset brought their weekly usage from 25% to 7%, which didn’t feel like adequate compensation for weeks of frustration.
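The session-memory bug is a recognizable class of mistake: a setup step guarded by a flag that never gets set. A hypothetical sketch — names and structure are illustrative only, not Anthropic’s actual code — of a once-per-session step firing on every turn:

```python
# Illustrative only: the guard-flag bug class described in the postmortem.
class Session:
    def __init__(self):
        self.memory_initialized = False
        self.injections = 0  # how many times the expensive setup step ran

    def handle_turn_buggy(self, prompt: str) -> None:
        # Bug: the flag is checked but never set, so the memory-injection
        # step repeats on every turn instead of once per session.
        if not self.memory_initialized:
            self.injections += 1

    def handle_turn_fixed(self, prompt: str) -> None:
        if not self.memory_initialized:
            self.injections += 1
            self.memory_initialized = True  # fix: run once, then skip

s = Session()
for turn in ("hi", "continue", "refactor"):
    s.handle_turn_buggy(turn)
print(s.injections)  # 3 — one injection per turn under the buggy path
```

The insidious part, and plausibly why it took weeks to catch, is that nothing errors: output quality just degrades as the repeated injection eats context.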

A US memo on “adversarial distillation” lights up r/LocalLLaMA. A government memo frames distilling outputs from frontier models as a national security issue, potentially setting up restrictions on open-weight releases. The r/LocalLLaMA response is close to unanimous: this reads as groundwork for regulatory capture by OpenAI and Anthropic, and the hypocrisy of companies that scraped the entire internet now invoking IP concerns about model outputs is not going unnoticed. Whether the memo becomes policy is another question, but the framing is worth watching given how fast open-weight Chinese models are closing the gap.

That’s a lot of weights to download. If you’re picking one to try this weekend, DeepSeek V4 Pro and Qwen3.6-27B are the two worth the disk space — for very different reasons.
