Whisper-base: ASR
Tags: ASR, W8A16, post

Whisper-base is the base-size model in OpenAI’s Whisper series. It is more accurate than the tiny variant while remaining lightweight. It supports speech recognition (speech-to-text transcription) and speech translation across multiple languages. Its relatively small size and fast inference make it suitable for mobile devices and edge platforms that need efficient inference with reasonable accuracy. It is widely used in real-time transcription, voice search, and voice-driven applications.

The source model can be found here

Performance Reference

Device | Language | Precision | Audio Duration | RTF | File Size
Supported Languages
Chinese
English
Japanese
Korean
French
Thai

Note: The RTF (real-time factor) values in the Performance Reference section are reported per language for the listed audio duration. Because the model uses fixed input dimensions (non-dynamic input), inference time does not shrink with shorter audio, so the RTF value may increase when the audio is shorter than the reference length.
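The effect described in the note can be illustrated with a minimal sketch. It assumes the usual definition of RTF as processing time divided by audio duration, and it assumes that audio is padded to a fixed window so that each window costs the same inference time. The function name `fixed_input_rtf`, the 30-second window, and the 0.6-second inference time are hypothetical values chosen for illustration, not measurements of this model.

```python
import math


def fixed_input_rtf(audio_seconds: float,
                    window_seconds: float,
                    seconds_per_window: float) -> float:
    """Estimate RTF for a model with a fixed (non-dynamic) input length.

    Assumption: RTF = processing time / audio duration, and every input
    window costs the same inference time regardless of how much real
    audio it contains (the remainder is padding).
    """
    if audio_seconds <= 0 or window_seconds <= 0:
        raise ValueError("durations must be positive")
    # Audio shorter than one window still requires one full inference pass.
    windows = max(1, math.ceil(audio_seconds / window_seconds))
    processing_seconds = windows * seconds_per_window
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical numbers: 30 s fixed window, 0.6 s per inference pass.
    for duration in (30.0, 20.0, 10.0):
        rtf = fixed_input_rtf(duration, window_seconds=30.0, seconds_per_window=0.6)
        print(f"audio={duration:>5.1f}s  RTF={rtf:.3f}")
```

Under these assumptions the processing time stays constant while the audio duration in the denominator shrinks, which is why a clip shorter than the reference length yields a higher RTF.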

Inference with AidASR SDK

To be released

License
Source Model: MIT
Deployable Model: APLUX-MODEL-FARM-LICENSE