Voice of India ASR Benchmark Reveals Global Systems' Struggles with Indian Languages
February 16, 2026
Voice of India is a national automatic speech recognition (ASR) benchmark covering 15 Indian languages. It uses real-world speech data from over 35,000 speakers to measure how well systems recognize speech across languages and dialects.
Developed by Josh Talks in collaboration with AI4Bharat at IIT Madras, the benchmark reveals a sizable performance gap between India-focused models and several global systems, especially for regional languages and dialects.
The project aims to assess Indian languages under real-world conditions and to push for better accuracy across linguistic diversity.
Meta’s 7B model is only about 4% more accurate than its 1B model across Indian languages, suggesting that scaling up model size alone yields limited gains in this domain.
Global players struggle with regional Indian languages; in some cases, Meta’s Tamil and Malayalam error rates are two to three times higher than those of rivals such as Sarvam and Google.
AI4Bharat argues that current word error rate (WER) metrics can unfairly penalize code-mixed and multilingual speech, so the dataset includes curated spelling variants so that scoring reflects linguistic correctness rather than orthography.
Dravidian languages such as Tamil, Telugu, Malayalam, and Kannada exhibit higher error rates than Indo-Aryan languages. Dialects are also harder than standard varieties: Bhojpuri shows word error rates of 20–30%, versus under 10% for standard Hindi.
Sarvam Audio models consistently rank near the top across languages, while OpenAI models show substantial accuracy gaps; in some comparisons the difference in average accuracy exceeds 50 percentage points.
The benchmark highlights voice as critical infrastructure for banking, healthcare, and government services, where high WER can misroute welfare applications, mis-transcribe medical symptoms, or misdirect user queries.
OpenAI’s transcription models perform poorly on Indian languages like Maithili and Tamil, with WERs over 55%, signaling real-world usability challenges.
Microsoft STT lacks support for six of the 15 tested languages, limiting its applicability in India.
The testing protocol covers about 2,000 speakers per language with district-level sampling and manual spelling variant curation to capture regional variation and code-switching.
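The effect of curated spelling variants on WER can be illustrated with a minimal Python sketch. The benchmark's actual scoring procedure is not described in the sources, so everything here is an assumption for illustration: the function names (`build_lookup`, `word_edit_distance`, `wer`), the variant table, and the example sentence are all hypothetical. The sketch computes WER as word-level edit distance divided by reference length, optionally mapping accepted spellings to one canonical form before comparing.

```python
from typing import Dict, List, Optional, Set


def build_lookup(variants: Dict[str, Set[str]]) -> Dict[str, str]:
    """Map every accepted spelling (and the canonical form itself) to the canonical form."""
    lookup: Dict[str, str] = {}
    for canonical, spellings in variants.items():
        lookup[canonical] = canonical
        for spelling in spellings:
            lookup[spelling] = canonical
    return lookup


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance over word tokens (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str,
        variants: Optional[Dict[str, Set[str]]] = None) -> float:
    """Word error rate; if variants are given, accepted spellings count as matches."""
    lookup = build_lookup(variants or {})
    ref = [lookup.get(word, word) for word in reference.split()]
    hyp = [lookup.get(word, word) for word in hypothesis.split()]
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref, hyp) / len(ref)


# Hypothetical code-mixed example: the English loanword "doctor"
# transcribed with a different but acceptable romanized spelling.
reference = "mujhe doctor se milna hai"
hypothesis = "mujhe daktar se milna hai"
accepted = {"doctor": {"daktar"}}

strict_score = wer(reference, hypothesis)             # spelling difference counted as an error
lenient_score = wer(reference, hypothesis, accepted)  # spelling difference forgiven
```

In this toy case the strict score counts one substitution in five words, while the variant-aware score counts none, which is the kind of orthography-only penalty the curated variants are meant to remove. A real pipeline would also need text normalization (punctuation, script, casing) that this sketch leaves out.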
Summary based on 3 sources
Sources

Economic Times • Feb 16, 2026
Global speech AI struggles to understand India: Report
Business Standard • Feb 16, 2026
Global AI models struggle with Indian languages and dialects: Report