
Whisper-large-v3-turbo is a speech recognition model in OpenAI’s Whisper series that combines the accuracy of the large-scale models with faster inference. Based on Whisper-large-v3, it is designed to reduce response time and computational resource usage while retaining strong multilingual recognition and robustness. It supports tasks such as speech-to-text transcription, real-time captioning, and speech translation, and is suited to deployment on high-performance servers and cloud platforms that need stable, efficient speech processing.
The source model can be found here
Supported Languages |
---|
Chinese |
English |
Japanese |
Korean |
French |
Thai |
Note: The RTF (real-time factor) values for each language in the performance reference section on the right are measured at the listed audio input length. Because the model uses fixed input dimensions (non-dynamic input), shorter audio is still processed at the full input size, so the RTF may be slightly higher when the audio is shorter than the reference length.
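The effect described in the note can be illustrated with a minimal sketch. It assumes the usual definition of RTF as processing time divided by audio duration, and it uses hypothetical numbers: a 30-second reference length and a constant 3-second inference time per fixed-size input. Neither figure comes from this page's measurements.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return RTF: time spent processing divided by the duration of the audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Hypothetical values, for illustration only (not measured on any device).
REFERENCE_SECONDS = 30.0  # assumed reference audio length for the fixed input
INFERENCE_SECONDS = 3.0   # assumed constant cost of one fixed-size inference

# With fixed input dimensions the inference cost stays the same for shorter
# clips, so dividing by a smaller audio duration gives a larger RTF.
for audio_seconds in (REFERENCE_SECONDS, 15.0, 5.0):
    rtf = real_time_factor(INFERENCE_SECONDS, audio_seconds)
    print(f"{audio_seconds:5.1f} s audio: RTF = {rtf:.2f}")
```

Under these assumptions the RTF grows as the clip gets shorter, which matches the direction of the change the note describes. The actual size of the increase depends on the device and on the model's real reference length.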
To be released