
AI News — May 16, 2026: ChatGPT Taps 12,000 Banks via Plaid, arXiv Bans Year-Long Slop Offenders


Good morning. OpenAI wants into your bank account and is reshuffling its executive deck to get there faster, while arXiv has finally lost patience with researchers who can’t be bothered to delete “would you like me to make any changes?” from their PDFs. On the open-weights side, a curious paper claims 7.8× more tokens per forward pass on Qwen3-8B with the original output distribution intact — if you ignore the asterisks.

ChatGPT wants your bank login. OpenAI is rolling out personal finance tools that let ChatGPT Pro subscribers connect accounts from over 12,000 institutions via Plaid, with spending analysis, portfolio tracking, and planned Intuit integration for taxes and credit. The Verge notes this continues OpenAI’s push into sensitive personal data after January’s ChatGPT Health launch, and TechCrunch reports the feature draws on the recently acquired Hiro team and runs on GPT-5.5. Perplexity is reportedly building something similar.
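Neither report says how the integration is wired internally, but the Plaid side of the plumbing is standard: a server creates a Link token, the user connects their bank through the Link widget, and the resulting public token is exchanged for an access token. A minimal sketch of that first step against Plaid's sandbox, with a hypothetical app name and an illustrative product list rather than anything confirmed about OpenAI's setup:

```python
import os
import requests

# Create a Link token, the first step of letting a user connect a bank
# account through Plaid. Credentials come from a Plaid developer dashboard;
# the client_name and products list here are placeholders.
resp = requests.post(
    "https://sandbox.plaid.com/link/token/create",
    json={
        "client_id": os.environ["PLAID_CLIENT_ID"],
        "secret": os.environ["PLAID_SECRET"],
        "client_name": "Example Finance Assistant",  # hypothetical app name
        "user": {"client_user_id": "user-123"},
        "products": ["transactions"],
        "country_codes": ["US"],
        "language": "en",
    },
    timeout=10,
)
resp.raise_for_status()
link_token = resp.json()["link_token"]

# The client-side Link widget trades this token for a public_token, which
# the server then swaps for a long-lived access_token via
# /item/public_token/exchange before pulling transactions or balances.
print(link_token)
```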

Brockman gets the keys. OpenAI made Greg Brockman’s interim product role permanent and merged ChatGPT, Codex, and the developer API into a single team under him, Wired reports. The Verge frames it as four product pillars — enterprise, consumer, infrastructure, core platform — aimed at building a “single agentic platform” ahead of a possible IPO. The internal game of musical chairs continues, though one Redditor’s grudging note that “at least Brockman actually ships things” captures the prevailing mood.

arXiv loses patience with slop. Researchers caught submitting papers with obviously unedited LLM output — hallucinated citations, leftover meta-comments like “would you like me to make any changes?” — will be banned for a year and required to publish through peer review before posting to arXiv again, The Verge reports. The policy is narrow by design, targeting only incontrovertible negligence rather than AI use generally, and bans can be appealed. It’s a low bar, but apparently a necessary one.

7.8× tokens per forward pass, with conditions. Orthrus-Qwen3-8B claims up to 7.8× more tokens per forward pass on a frozen Qwen3-8B backbone, with a provably identical output distribution. Then come the caveats: greedy sampling only (temperature=0), and tested only out to 2048 tokens of context. One commenter went from “we will feast on tokens” to “never-fucking mind greedy sampling only???” within a single edit. The community is asking the obvious questions — does it work with MoE, what about llama.cpp, how does it scale to 400B+ — and someone is already proposing a group-fund effort to apply it to Qwen 3.6 27B.
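The announcement doesn't spell out Orthrus's mechanism, but the greedy-only caveat is easiest to see through the generic draft-and-verify pattern: extra tokens are proposed cheaply, then kept only while they match what the frozen backbone would have emitted anyway. At temperature=0, "what it would have emitted" is just the argmax, so exactness is trivial to show; extending the guarantee to sampling requires the stochastic acceptance rule of speculative decoding. A toy sketch of the greedy case, with stand-in functions rather than anything from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 50  # toy vocabulary size


def backbone_logits(prefix: tuple[int, ...]) -> np.ndarray:
    """Stand-in for the frozen backbone's next-token logits (a hash-seeded
    toy, not Qwen3-8B). A real implementation scores every drafted position
    in one batched forward pass, which is where the tokens-per-forward-pass
    multiplier comes from."""
    seed = hash(prefix) % (2**32)
    return np.random.default_rng(seed).standard_normal(VOCAB)


def draft_tokens(prefix: tuple[int, ...], k: int) -> list[int]:
    """Stand-in for the cheap multi-token proposer: usually right, sometimes
    noisy, so the verifier has something to reject."""
    out, p = [], prefix
    for _ in range(k):
        tok = int(np.argmax(backbone_logits(p)))
        if rng.random() < 0.3:  # simulate an imperfect draft head
            tok = int(rng.integers(VOCAB))
        out.append(tok)
        p += (tok,)
    return out


def greedy_verify(prefix: tuple[int, ...], draft: list[int]) -> list[int]:
    """Keep drafted tokens while they equal the backbone's argmax; on the
    first mismatch, substitute the backbone's token and stop. The result is
    token-for-token identical to plain greedy decoding, which is why an
    "identical output distribution" claim is easy at temperature=0 and does
    not carry over to sampling for free."""
    accepted, p = [], prefix
    for tok in draft:
        target = int(np.argmax(backbone_logits(p)))
        accepted.append(target)
        if tok != target:
            break
        p += (tok,)
    return accepted


prompt = (1, 2, 3)
print("accepted this step:", greedy_verify(prompt, draft_tokens(prompt, k=8)))
```

Under this reading, the 7.8× figure would mean that nearly eight drafted tokens survive verification per backbone pass on average, assuming the paper's scheme resembles this pattern at all.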

Runway bets against language models. The $5.3B video startup is pivoting toward “world models” trained on video and observational data rather than text, TechCrunch reports, arguing this will outperform LLMs by learning how reality works rather than how humans describe it. Runway claims $40M in ARR added in Q2 2026 and has partnerships with Lionsgate and AMC, but the bet pits it directly against Google with far less capital. The first world model shipped in December; the next few releases will show whether the thesis holds.

70% of Americans don’t want a data center nearby. A new Gallup poll has opposition to local AI data centers jumping from 47% in late 2025 to 70% now, PC Guide reports, making them less popular than nuclear plants. Concerns center on electricity, water, and utility costs. Reddit’s response was mostly to point out that data centers also run Netflix, Zoom, and games — which consume substantially more energy than AI inference — and that opposition to any large industrial building tends to poll around this number regardless.

Two open-weight releases worth a look. InternLM dropped Intern-S2-Preview, a 35B MoE built around “task scaling” — harder training problems rather than more parameters — with crystal structure generation and real-valued scientific prediction baked in. Reception on r/LocalLLaMA is warm, with people waiting on GGUFs. ByteDance also released Cola-DLM, a diffusion language model, though commenters quickly flagged an MMLU score of 19% — below the 25% random-guessing baseline — and an undisclosed parameter count.

That’s the morning. The Orthrus claim and the Intern-S2 release are both worth keeping an eye on as the community kicks the tires this week.



