Cohere Releases Open-Weight ASR Model 'Transcribe': 5.4% WER Beats Whisper and ElevenLabs
Cohere has released Transcribe, a 2-billion-parameter open-weight automatic speech recognition (ASR) model that achieves a 5.42% word error rate (WER), a lower error rate than established systems including OpenAI's Whisper and ElevenLabs' Scribe.
Performance Benchmarks
| Model | WER | License |
|---|---|---|
| Cohere Transcribe | 5.42% | Apache-2.0 |
| Qwen3-ASR-1.7B | 5.76% | Open-weight |
| ElevenLabs Scribe v2 | 5.83% | Proprietary |
| Whisper Large v3 | 7.44% | MIT |
At the time of writing, Transcribe tops the Hugging Face Open ASR Leaderboard.
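For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The following is a minimal sketch of that calculation, not the leaderboard's scoring code; the function names `word_error_rate` and `relative_reduction` are illustrative, and benchmark evaluations typically also normalize text (casing, punctuation) before scoring, which this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer
```

Applied to the table's figures, `relative_reduction(7.44, 5.42)` works out to roughly 0.27, meaning Transcribe makes about 27% fewer word errors than Whisper Large v3 on this benchmark.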
Key Features
- 14 languages supported: English, French, German, Italian, Spanish, Greek, Dutch, Polish, Portuguese, Chinese, Japanese, Korean, Vietnamese, Arabic
- Self-hostable: Can run on local GPU infrastructure
- Commercial-ready: Licensed under Apache-2.0 for immediate commercial use
- 2B parameters: More manageable inference footprint than larger models
- Available via API or in Cohere's Model Vault as cohere-transcribe-03-2026
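To put the 2B-parameter footprint in rough numbers, the sketch below estimates the memory needed just to hold the model weights at a few common precisions. The precision Transcribe actually ships in is not stated here, so the dtypes are assumptions, and the estimate excludes activations, audio buffers, and runtime overhead.

```python
# Bytes needed to store one parameter at each precision (assumed options).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str = "float16") -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        gib = weight_memory_gib(2_000_000_000, dtype)
        print(f"{dtype}: {gib:.2f} GiB")
```

At 16-bit precision this comes to just under 4 GiB of weights, which is within reach of a single commodity GPU under these assumptions.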
Why This Matters
Enterprise transcription has been a trade-off:
- Closed APIs (OpenAI's hosted Whisper, ElevenLabs) offered accuracy but required sending audio to a third party and paying ongoing usage costs
- Open models offered control but lagged on performance
Transcribe breaks this trade-off by pairing the lowest WER in the comparison above with the ability to run on-premises.
Use Cases
- Voice-powered workflow automation
- Transcription pipelines
- Audio search workflows
- RAG pipelines with audio inputs
- Agent workflows with voice interfaces
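As one illustration of the audio search use case, the sketch below builds a simple keyword index over transcripts. The transcriber is passed in as a plain callable so that any ASR backend could be plugged in; `Transcriber`, `AudioIndex`, and `fake_transcribe` are hypothetical names, and no actual Transcribe or Cohere API call is shown, since this article does not describe one.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# Hypothetical interface: takes an audio file path, returns transcript text.
Transcriber = Callable[[str], str]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


@dataclass
class AudioIndex:
    transcribe: Transcriber
    transcripts: Dict[str, str] = field(default_factory=dict)
    postings: Dict[str, Set[str]] = field(default_factory=dict)

    def add(self, audio_path: str) -> None:
        """Transcribe one file and index every word it contains."""
        text = self.transcribe(audio_path)
        self.transcripts[audio_path] = text
        for token in tokenize(text):
            self.postings.setdefault(token, set()).add(audio_path)

    def search(self, query: str) -> List[str]:
        """Return files whose transcripts contain every query word."""
        tokens = tokenize(query)
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(hits)


if __name__ == "__main__":
    # Stand-in for a real ASR call, used only to make the example runnable.
    canned = {
        "standup.wav": "We shipped the billing fix on Tuesday",
        "review.wav": "The billing dashboard needs a redesign",
    }

    def fake_transcribe(path: str) -> str:
        return canned[path]

    index = AudioIndex(transcribe=fake_transcribe)
    for path in canned:
        index.add(path)
    print(index.search("billing fix"))  # -> ['standup.wav']
```

Keeping the transcriber behind a callable means the same indexing code would work whether transcription runs on a local GPU or through a hosted API. A RAG pipeline would likely swap the keyword index for embeddings, but the ingestion step has the same shape.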
Analysis
Cohere's Transcribe is strategically significant for several reasons. First, it demonstrates that open-weight models can match or exceed proprietary alternatives, a trend we're seeing across AI modalities. Second, the Apache-2.0 license means any company can deploy it without vendor lock-in. Third, because it can run on local infrastructure, it addresses the data residency and latency concerns that prevent many enterprises from using cloud-based transcription APIs.
For teams building AI agent pipelines with voice interfaces, Transcribe provides a production-ready path to transcription without the compromises of closed APIs.