AI Advances: SpaceX's Grok 4.5 Beta, Coinbase Cost Cuts, Etched AI Racks & Anthropic's Sonnet 5
July 5, 2026
This Week in All Things AI surveys developments across models, agents, tools, infrastructure, and policy from late June into early July 2026. The recurring themes are speed, cost, and agentic capability, with particular attention to open-weight ecosystems and the economics of deployment.
Grok 4.5 has entered private beta at SpaceX and Tesla. It is built on a 1.5-trillion-parameter V9 foundation and incorporates Cursor data, and early evaluations place its performance at or above Claude Opus levels.
Coinbase cut its AI spend in half by defaulting to cheaper open-weight models, improving routing, raising cache hit rates from 5% to 60%, and using an LLM gateway to optimize model selection.
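To see why the cache hit rate matters for spend, here is a minimal arithmetic sketch. The function name, the price of $1.00 per million input tokens, and the 90% discount on cached tokens are all hypothetical values chosen for illustration; they are not Coinbase's figures, and only the 5% and 60% hit rates come from the summary above.

```python
def blended_input_cost(tokens: int, price_per_mtok: float,
                       cache_hit_rate: float, cached_discount: float) -> float:
    """Dollar cost of `tokens` input tokens when a fraction of them hit the cache.

    All parameters are illustrative assumptions:
    price_per_mtok  -- list price per million uncached input tokens
    cache_hit_rate  -- fraction of tokens served from the prompt cache (0..1)
    cached_discount -- fractional discount applied to cached tokens (0..1)
    """
    cached = tokens * cache_hit_rate
    uncached = tokens - cached
    cached_price = price_per_mtok * (1.0 - cached_discount)
    return (uncached * price_per_mtok + cached * cached_price) / 1_000_000


# Hypothetical workload: one billion input tokens at $1.00 per million,
# with cached tokens billed at a 90% discount (assumed, not sourced).
TOKENS = 1_000_000_000
cost_at_5_percent = blended_input_cost(TOKENS, 1.00, 0.05, 0.90)
cost_at_60_percent = blended_input_cost(TOKENS, 1.00, 0.60, 0.90)
savings_fraction = 1.0 - cost_at_60_percent / cost_at_5_percent

print(f"5% hit rate:  ${cost_at_5_percent:,.2f}")
print(f"60% hit rate: ${cost_at_60_percent:,.2f}")
print(f"savings:      {savings_fraction:.1%}")
```

Under these assumed prices, the change in hit rate alone roughly halves input-token cost; with a smaller cache discount the effect shrinks, so the result depends heavily on the provider's actual pricing.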
Notable mentions include WeChat’s AI strategy via mini-programs, Arm’s AI CPU demand, LangChain’s OpenWiki uptake, and community commentary on Anthropic’s Sonnet 5 performance plots.
In speculative decoding and inference efficiency, DSpark for V4 Flash and Pro delivered per-user speedups ranging from 51% to 400%. DSpark and DeepSpec were both open-sourced to support faster inference and training for speculative decoding.
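For readers unfamiliar with the technique, the following is a toy sketch of generic greedy speculative decoding. It is not DSpark's or DeepSpec's implementation; `toy_target`, `toy_draft`, and `speculative_decode` are hypothetical stand-ins, and the "models" are plain functions over integer tokens. A cheap draft model proposes `k` tokens, the expensive target model checks them, and the longest agreeing prefix is kept along with one token from the target.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]  # maps a context to its greedy next token


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[int],
                       max_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns (tokens, number of verification rounds)."""
    out = list(prompt)
    rounds = 0
    while len(out) - len(prompt) < max_new:
        # 1. The draft model proposes k tokens autoregressively.
        proposal, ctx = [], list(out)
        for _ in range(k):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)
        # 2. The target verifies the proposal. A real system scores all k
        #    positions in one batched forward pass; here we count one round.
        rounds += 1
        accepted, ctx = [], list(out)
        for tok in proposal:
            expected = target(ctx)
            if expected != tok:
                accepted.append(expected)  # replace the first mismatch
                break
            accepted.append(tok)
            ctx.append(tok)
        else:
            accepted.append(target(ctx))  # all matched: keep one bonus token
        out.extend(accepted)
    return out[:len(prompt) + max_new], rounds


def toy_target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10  # counts upward modulo 10


def toy_draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 7 else (ctx[-1] + 1) % 10  # wrong after a 7


def greedy_decode(model: NextToken, prompt: List[int], max_new: int) -> List[int]:
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


spec_tokens, spec_rounds = speculative_decode(toy_target, toy_draft, [0], max_new=12)
baseline_tokens = greedy_decode(toy_target, [0], max_new=12)
print(spec_tokens == baseline_tokens, spec_rounds)
```

The output matches plain greedy decoding from the target, but needs fewer target rounds than tokens generated. The achievable speedup depends on how often the draft agrees with the target and on the relative cost of the two models.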
Etched announced custom AI inference racks backed by over $1 billion in contracts and $800 million raised, focusing on low-voltage inference and clustered memory, with the first racks due this summer.
Shifts in the open-source ecosystem point to modular interfaces, commoditized silicon, and shared recipes as the drivers of rapid progress in open-weight models, whose token costs are far lower than those of closed-model counterparts.
Agent frameworks and tooling also advanced. Highlights include Nous Hermes for fast web reading, Vercel eve for portable skills, OpenClaw’s phone gateway nodes, WeChat mini-app agents, a multi-skill coding harness from obra/superpowers, and LangChain’s OpenWiki for codebase memory.
Anthropic's Claude Sonnet 5 debuted as the most agentic mid-tier model, with a 1M context window, broad app and API integrations, and introductory pricing, though community notes highlighted inconsistencies in the published performance charts.
The piece blends concise explanations with sourced references, interviews, and links to provide a forward-looking snapshot of AI industry dynamics during the week.
Summary based on 1 source
Source

This Week in All Things AI • Jul 5, 2026
This Week in All Things AI - Week 27-2026