The big story threading through today’s news is AI and cybersecurity — both who gets to use these tools and how dangerous they actually are. Anthropic is at the center of multiple headlines, and the competitive dynamics between the major labs are getting harder to ignore.
Anthropic’s Mythos draws government attention on two fronts. The UK’s AI Security Institute published an independent evaluation finding Mythos broadly comparable to other frontier models on individual tasks, but it became the first model to complete AISI’s most demanding test, a 32-step simulated corporate network infiltration, succeeding in 3 of 10 attempts. Meanwhile, Anthropic co-founder Jack Clark confirmed the company briefed the Trump administration on Mythos, framing it as a national security necessity despite Anthropic’s ongoing lawsuit against the DOD, which Clark called a “narrow contracting dispute.” On the employment front, Clark also walked back Dario Amodei’s more alarming unemployment predictions, saying the company currently sees only “some potential weakness in early graduate employment” in select industries.
OpenAI is taking a parallel approach to the offense/defense tension. The company is expanding its Trusted Access for Cyber program with GPT-5.4-Cyber, a fine-tuned model with reduced refusal thresholds and binary reverse-engineering capabilities, rolling out to thousands of verified security researchers. It’s a controlled bet that capability in the right hands beats capability withheld, though the tiered access model and zero-data-retention caveats suggest OpenAI is being cautious about how much rope it’s handing out.
Anthropic’s investor story is shifting too. A TechCrunch report details how Anthropic’s revenue jump from $9B to $30B annualized in a single quarter — mostly driven by coding tool demand — is making some OpenAI investors nervous about whether OpenAI’s $852B valuation holds up. One investor put it bluntly: an IPO would need to price at $1.2 trillion to justify current numbers. Sapphire Ventures’ Jai Das went further, comparing OpenAI to Netscape.
On the developer tools side, Anthropic launched Claude Code Routines, which lets users schedule Claude Code to run autonomously on cloud infrastructure via cron-style triggers, GitHub events, or API calls. Community reaction was dry — one commenter noted they should call this new discipline “software engineering.” The timing is also awkward given recently reported reductions to Claude Code usage limits, with some questioning whether the feature is, in practice, only available to Max plan subscribers.
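For readers unfamiliar with the “cron-style” part: a cron trigger is a five-field schedule (minute, hour, day-of-month, month, day-of-week) matched against the clock. The sketch below shows that matching logic in generic form — to be clear, Anthropic hasn’t published the Routines config format in detail, so nothing here is their actual API; it’s just an illustration of what cron-style scheduling means.

```python
# Generic cron-expression matching: five whitespace-separated fields,
# each either "*", a number, a comma list ("0,30"), or a step ("*/15").
# Note: this sketch uses Python's weekday() convention (0 = Monday),
# whereas classic cron uses 0 = Sunday.
from datetime import datetime

def field_matches(field: str, value: int) -> bool:
    """Match one cron field against a concrete time component."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return value in {int(part) for part in field.split(",")}

def cron_matches(expr: str, now: datetime) -> bool:
    """True if `now` satisfies the five-field cron expression `expr`."""
    minute, hour, dom, month, dow = expr.split()
    return (field_matches(minute, now.minute)
            and field_matches(hour, now.hour)
            and field_matches(dom, now.day)
            and field_matches(month, now.month)
            and field_matches(dow, now.weekday()))
```

A scheduler would evaluate this once a minute and, on a match, kick off the autonomous run — e.g. `"0 9 * * *"` fires at 09:00 daily, `"*/15 * * * *"` every 15 minutes.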
Google had two notable releases. Gemini Robotics-ER 1.6 landed with mixed community reception — the gauge-reading demos drew skepticism from people who noted that existing machine vision or a $50 digital gauge can do the same job, and the broader question of whether robots can meet consumers’ zero-tolerance standards for physical mistakes remains unresolved. Separately, Google’s Gemma 4 is now running fully offline on iPhones via the AI Edge Gallery app — decode speeds are modest at ~16 tokens/sec, running on Metal rather than Apple’s Neural Engine, and App Store policy (rule 2.5.2) is reportedly blocking third-party developers from bundling local LLMs in their own apps.
MiniMax updated the license on their M2.7 model, adding a “Permitted Free Uses” clause to clarify personal and academic use. The community response was largely unimpressed — the model is still non-commercial at its core, the license is widely seen as poorly drafted compared to established options like Polyform or Prosperity, and the “MIT-style” framing is drawing particular criticism as misleading. One commenter pointed out that where they live, making money legally requires registering a company, making the individual/commercial distinction legally incoherent.
For local inference enthusiasts, DFlash is generating real excitement on r/LocalLLaMA — the technique doubles token generation speed for Qwen3.5 27B on Apple M5 Max by trading extra VRAM for throughput via speculative decoding, with one user confirming it scales down to lower quantizations (from 14 to 28 tokens/sec). Combined with DDTree, the theoretical ceiling is a 3x speedup over stock decoding. And finally, the IRS’s Palantir-built SNAP audit selection tool got some attention — the $1.8M pilot aims to consolidate 100+ legacy audit systems, though community reaction ranged from skeptical to hostile, with freelancers in particular unimpressed that algorithmic pattern-matching is being applied to a system they say already disproportionately targets them.
It’s a day where the gap between “what AI can technically do” and “whether anyone should be doing it” keeps coming up in different shapes — robotics, cybersecurity, tax enforcement. That tension isn’t going anywhere.