GitHub - QwenLM/Qwen3-ASR: Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.
Key Points
- Qwen3-ASR introduces a new family of all-in-one speech recognition models (0.6B and 1.7B versions) supporting language identification and ASR for 52 languages and dialects.
- The release also includes Qwen3-ForcedAligner-0.6B, a novel non-autoregressive model capable of precise text-speech alignment and timestamp prediction in 11 languages.
- Evaluations show Qwen3-ASR-1.7B achieves state-of-the-art performance among open-source ASR models and is competitive with proprietary commercial APIs, with comprehensive inference toolkits and vLLM integration for efficient deployment.
The Qwen3-ASR project introduces a family of powerful speech recognition models, including Qwen3-ASR-1.7B and Qwen3-ASR-0.6B, alongside a novel non-autoregressive speech forced-alignment model, Qwen3-ForcedAligner-0.6B. The primary objective is to provide an all-in-one solution for language identification and Automatic Speech Recognition (ASR) across 52 languages and dialects, and precise text-speech alignment.
The core methodology behind the Qwen3-ASR models is to leverage the strong audio understanding capabilities of their foundation model, Qwen3-Omni, trained on large-scale speech data. The models support both language identification and ASR for 30 distinct languages (e.g., Chinese, English, Arabic, German, French, Spanish, Japanese, Korean) and 22 Chinese dialects (e.g., Anhui, Dongbei, Cantonese, Wu, Minnan), as well as various English accents. They are designed for high-quality, robust recognition in complex acoustic environments and on challenging text patterns, support both offline and streaming inference modes, and can transcribe long audio. Qwen3-ASR-1.7B is positioned as state-of-the-art among open-source ASR models, while the 0.6B version offers an accuracy-efficiency trade-off, achieving up to 2000× throughput at a concurrency of 128. During evaluation, models were typically decoded with greedy search.
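The offline/streaming distinction above can be illustrated with a minimal chunking sketch. This is not the models' actual streaming pipeline; the `chunk_audio` helper, chunk length, and overlap are illustrative assumptions showing how long audio might be split into overlapping windows for incremental transcription:

```python
# Illustrative sketch only: splitting a long waveform into overlapping chunks
# for streaming-style inference. Chunk and overlap sizes are assumptions,
# not parameters of the Qwen3-ASR models.

def chunk_audio(samples, sample_rate=16000, chunk_s=30.0, overlap_s=2.0):
    """Yield (start_sample, chunk) pairs covering the full input."""
    chunk_len = int(chunk_s * sample_rate)
    hop = chunk_len - int(overlap_s * sample_rate)
    if hop <= 0:
        raise ValueError("overlap must be shorter than the chunk")
    for start in range(0, max(len(samples) - 1, 1), hop):
        yield start, samples[start:start + chunk_len]
        if start + chunk_len >= len(samples):
            break

# A 70-second "waveform" at 16 kHz is covered by three 30 s chunks
# whose neighbours overlap by 2 s.
dummy = [0.0] * (70 * 16000)
chunks = list(chunk_audio(dummy))
```

In a real streaming setup the overlapping regions would be merged (e.g., by keeping the transcript of the non-overlapped portion of each chunk), which is why an overlap is used at all.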
The Qwen3-ForcedAligner-0.6B is a non-autoregressive model specifically designed for text-speech alignment. It provides timestamp prediction for arbitrary units (e.g., words, characters) within speech segments up to 5 minutes long. This model supports 11 languages (Chinese, English, Cantonese, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish). Evaluations demonstrate its superior timestamp accuracy compared to End-to-End (E2E) based forced-alignment models, evidenced by significantly lower Average Alignment Score (AAS) values (e.g., an average AAS of 42.9 ms on MFA-Labeled Raw datasets compared to 129.8 ms for NFA and 133.2 ms for WhisperX).
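To make the millisecond-scale AAS numbers above concrete, here is a sketch of an alignment-error metric in the same spirit, taken here to be the mean absolute deviation between predicted and reference word boundaries in milliseconds. The exact AAS definition used by the project may differ, so treat this as an assumption:

```python
# Hedged sketch of an alignment-error metric in the spirit of AAS:
# mean absolute deviation (ms) between predicted and reference word
# boundaries. The project's exact AAS formula may differ.

def mean_boundary_error_ms(predicted, reference):
    """Each argument: list of (start_s, end_s) word intervals in seconds."""
    if len(predicted) != len(reference):
        raise ValueError("alignments must cover the same word sequence")
    deviations = []
    for (ps, pe), (rs, re) in zip(predicted, reference):
        deviations.append(abs(ps - rs))  # start-boundary error
        deviations.append(abs(pe - re))  # end-boundary error
    return 1000.0 * sum(deviations) / len(deviations)

pred = [(0.00, 0.42), (0.45, 0.90)]
ref = [(0.02, 0.40), (0.45, 0.95)]
err = mean_boundary_error_ms(pred, ref)  # 22.5 ms for this toy pair
```

Under a metric of this shape, the reported gap (42.9 ms vs. roughly 130 ms for NFA and WhisperX) means boundaries land about three times closer to the reference on average.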
The project provides a comprehensive inference toolkit. It supports a transformers backend for general usage and a vLLM backend for optimized, faster inference, including streaming capabilities and efficient batch processing. For enhanced performance, particularly with long inputs and large batch sizes, the use of FlashAttention 2 is recommended. This optimization reduces GPU memory usage and accelerates inference speed when models are loaded in torch.float16 or torch.bfloat16. The system is deployable via Python packages, Docker containers, vLLM servers (supporting OpenAI-compatible APIs), and official DashScope APIs.
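The vLLM deployment path described above might look like the following shell sketch. The model ID, image tag, and transcription endpoint are assumptions; consult the repository's own deployment instructions for the exact commands:

```shell
# Hedged deployment sketch -- model ID, image tag, and endpoint are assumptions.

# Option 1: serve with vLLM's OpenAI-compatible server
vllm serve Qwen/Qwen3-ASR-1.7B --port 8000

# Option 2: the same server inside the official vLLM Docker image
docker run --gpus all -p 8000:8000 vllm/vllm-openai:latest \
    --model Qwen/Qwen3-ASR-1.7B

# Sanity-check the OpenAI-compatible API (ASR models may additionally
# expose an audio endpoint such as /v1/audio/transcriptions)
curl http://localhost:8000/v1/models
```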
Performance is quantitatively measured using Word Error Rate (WER) for ASR and Average Alignment Score (AAS) for forced alignment. The evaluation results showcase the models' competitiveness across various public and internal benchmarks, including:
- ASR Benchmarks: WER scores on datasets like Librispeech, GigaSpeech, CommonVoice, MLS, Tedlium, WenetSpeech, AISHELL-2, SpeechIO, Fleurs, KeSpeech, and internal accented English and Chinese Mandarin/Dialect datasets. For instance, Qwen3-ASR-1.7B achieved a WER of 1.63% on Librispeech clean, 4.97% on WenetSpeech net (Chinese), and 5.10% on KeSpeech (Chinese dialect), often outperforming Whisper-large-v3 and Fun-ASR, and proving competitive with or superior to proprietary APIs such as GPT-4o and Gemini-2.5-Pro on various tasks.
- Multilingual ASR: Strong performance across diverse language sets on MLS, CommonVoice, MLC-SLM, and Fleurs benchmarks.
- Language Identification: High accuracy rates, averaging 97.9% for Qwen3-ASR-1.7B across multiple language sets.
- Singing Voice & Song Transcription: Demonstrated robustness in transcribing singing voices and songs with background music on datasets like M4Singer and EntireSongs-en/zh.
- Inference Mode Performance: Comparing offline and streaming inference shows only minor degradation in streaming mode, indicating that a single model serves both modes well.
- Forced Alignment Benchmarks: Qwen3-ForcedAligner-0.6B significantly outperforms other aligners like Monotonic-Aligner, NFA, and WhisperX in terms of AAS.
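The WER figures quoted throughout these benchmarks follow the standard edit-distance definition: substitutions, insertions, and deletions at the word level, divided by the number of reference words. A minimal reference computation looks like:

```python
# Word Error Rate: Levenshtein distance over word sequences, divided by
# the number of reference words (the standard definition behind the
# benchmark numbers above).

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution ("sat" -> "sit") and one deletion ("the"):
# 2 errors over 6 reference words.
score = wer("the cat sat on the mat", "the cat sit on mat")
```

In practice, published WER numbers also depend on text normalization (casing, punctuation, number formatting) applied before scoring, which is why toolkits ship their own normalizers.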