AI Advances: SpaceX's Grok 4.5 Beta, Coinbase Cost Cuts, Etched AI Racks & Anthropic's Sonnet 5

July 5, 2026
  • This Week in All Things AI surveys developments across models, agents, tools, infrastructure, and policy from late June into early July 2026, emphasizing speed, cost, and agentic improvements in open-weight ecosystems and deployment economics.

  • Grok 4.5 has moved into private beta at SpaceX and Tesla, built on a 1.5‑trillion-parameter V9 foundation with Cursor data, with early evaluations placing its performance at or above Claude Opus levels.

  • Coinbase cut its AI spend in half by defaulting to cheaper open-weight models, improving routing, raising cache hit rates from 5% to 60%, and using an LLM gateway to optimize model selection.

  • Notable mentions include WeChat’s AI strategy via mini-programs, Arm’s AI CPU demand, LangChain’s OpenWiki uptake, and community commentary on Anthropic’s Sonnet 5 performance plots.

  • In speculative decoding and inference efficiency, DSpark for V4 Flash & Pro delivered per-user speedups ranging from 51% to 400%, and both DSpark and DeepSpec were open-sourced to support faster inference and training for speculative decoding.

  • Etched announced custom AI inference racks backed by over $1 billion in contracts and $800 million raised, focusing on low-voltage inference and clustered memory, with the first racks due this summer.

  • Open-source and ecosystem shifts highlight modular interfaces, commoditized silicon, and shared recipes that drive rapid progress in open-weight ecosystems, with token costs far lower than closed-model counterparts.

  • Agent frameworks and tooling advanced, including Nous Hermes for fast web reading, Vercel eve for portable skills, OpenClaw’s phone gateway nodes, WeChat mini-app agents, the multi-skill coding harness from obra/superpowers, and LangChain’s OpenWiki for codebase memory.

  • Anthropic’s Claude Sonnet 5 debuted as the most agentic mid-tier model, with a 1M context window, broad app and API integrations, and introductory pricing, though community notes highlighted inconsistencies in the published performance charts.

  • The piece blends concise explanations with sourced references, interviews, and links to provide a forward-looking snapshot of AI industry dynamics during the week.
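The Coinbase item above credits part of the savings to a higher cache hit rate. The sketch below shows, as a rough illustration only, how a hit rate feeds into an input-token bill. The function name `blended_input_cost`, the token volume, the list price, and the cached-token discount are all hypothetical assumptions; none of them come from Coinbase or from any provider's price sheet.

```python
def blended_input_cost(tokens, hit_rate, price_per_mtok, cached_discount):
    """Dollar cost of `tokens` input tokens when a fraction `hit_rate`
    of them is served from a prompt cache at a discounted price.

    All parameters are illustrative assumptions, not real pricing.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    cached = tokens * hit_rate            # tokens billed at the discounted rate
    uncached = tokens - cached            # tokens billed at the full list price
    cached_price = price_per_mtok * (1.0 - cached_discount)
    return (uncached * price_per_mtok + cached * cached_price) / 1_000_000


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
TOKENS = 1_000_000_000   # input tokens per month (assumed)
PRICE = 2.00             # dollars per million input tokens (assumed)
DISCOUNT = 0.90          # cached tokens cost 10% of list price (assumed)

before = blended_input_cost(TOKENS, 0.05, PRICE, DISCOUNT)  # 5% hit rate
after = blended_input_cost(TOKENS, 0.60, PRICE, DISCOUNT)   # 60% hit rate

print(f"5% hit rate:  ${before:,.2f}")
print(f"60% hit rate: ${after:,.2f}")
print(f"input-cost reduction: {1 - after / before:.1%}")
```

The size of the reduction depends entirely on the assumed discount and covers input tokens only, so it should not be read as a reconstruction of Coinbase's reported figure, which also reflects model defaults and routing.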

Summary based on 1 source



Source

This Week in All Things AI - Week 27-2026

This Week in All Things AI • Jul 5, 2026
