
Whisper-distill-large-v3 is a distilled version of OpenAI’s Whisper-large-v3, produced through knowledge distillation. The model significantly reduces parameter count and computational demands while maintaining high recognition accuracy, yielding faster inference and lower latency. Whisper-distill-large-v3 is well suited to resource-constrained environments that still require strong recognition performance, such as mobile devices, edge computing, and real-time transcription services. It supports multilingual speech recognition and a range of speech processing tasks, balancing efficiency and effectiveness.
The source model can be found here.
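As a minimal usage sketch, the distilled model can be run through the Hugging Face `transformers` speech-recognition pipeline. The checkpoint ID (`distil-whisper/distil-large-v3`) and the audio file name below are illustrative assumptions, not values taken from this page:

```python
import torch
from transformers import pipeline

# Assumed checkpoint ID; substitute the actual source model path.
asr = pipeline(
    "automatic-speech-recognition",
    model="distil-whisper/distil-large-v3",
    torch_dtype=torch.float16,  # half precision for faster inference
    device="cuda:0" if torch.cuda.is_available() else "cpu",
)

# Transcribe a local audio file (hypothetical file name).
result = asr("sample.wav")
print(result["text"])
```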
| Supported Languages |
|---|
| English |
Note: The RTF (real-time factor) values shown for each language in the performance reference section are measured against the reference audio input length. Because the model uses fixed input dimensions (non-dynamic input), the RTF may increase slightly when the input audio is shorter than the reference length.
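To illustrate why this happens, here is a rough sketch of how RTF behaves under fixed input dimensions. The helper function, the constant compute cost, and the durations below are illustrative assumptions, not measurements of this model:

```python
def rtf(processing_time_s: float, audio_duration_s: float) -> float:
    """Real-time factor: processing time divided by audio duration.
    RTF < 1.0 means the model runs faster than real time."""
    return processing_time_s / audio_duration_s

# With a fixed (non-dynamic) input size, shorter clips are padded up to the
# reference length, so processing time stays roughly constant while the
# audio duration shrinks -- and the RTF rises.
fixed_processing_time = 2.0              # assumed constant compute cost, seconds
print(rtf(fixed_processing_time, 30.0))  # reference-length clip: RTF ~= 0.067
print(rtf(fixed_processing_time, 10.0))  # shorter clip: RTF ~= 0.2
```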
To be released