OpenAI Dominates AI Speed Rankings, Outpacing Competitors with Record-Breaking Token Performance
June 9, 2026
OpenAI still dominates the speed chart, taking the top two spots: GPT-OSS 20B (HIGH) leads at 306 tokens per second, and a second entry, also listed as GPT-OSS 20B (HIGH), follows at 239 tokens per second, underscoring speed as a critical budgetary driver for AI deployments.
Google Gemini 3.1 Pro Preview and AWS Nova 2.0 Pro Preview show continued competition from Google and AWS, with models prioritizing different goals (speed for some, quality for others) and reinforcing enterprise ecosystem considerations.
The mid-pack is tightly clustered and Chinese models like Qwen3.7 Max show genuine competitiveness, indicating speed must be balanced with output quality to maximize enterprise ROI.
In the crowded middle tier, OpenAI GPT-5.4 Mini and NVIDIA Nemotron 3 Super illustrate a push for a balance of speed and efficiency to support scalable deployments.
Third through fifth places go to Google Gemini 3.5 Flash, Alibaba Qwen3.7 Max, and xAI Grok 4.3, highlighting strong performance across major labs and notable non-US entrants.
Overview: A June 2026 benchmark ranks the fastest AI models, with speed as a primary differentiator and OpenAI maintaining clear leadership at the top.
Summary based on 1 source
Source
OfficeChai • Jun 9, 2026
These Are The Fastest AI Models [June 2026]