Whisper-base

Whisper-base: ASR

Whisper-base is the base-size model in OpenAI’s Whisper series. It stays lightweight while improving on the accuracy of the tiny variant. It supports speech recognition (speech-to-text transcription) and speech translation across multiple languages. With its relatively small size and fast inference, Whisper-base is suitable for deployment on mobile devices and edge platforms that need efficiency with reasonable accuracy. It is widely used for real-time transcription, voice search, and voice-driven applications.

The source model can be found here

Performance Reference

Device | Language | Precision | Audio Duration | RTF | File Size

Supported Languages
Chinese
English
Japanese
Korean
French
Thai

Note: The RTF values in the Performance Reference section are reported per language for the listed audio duration. Because the model uses fixed input dimensions (non-dynamic input), inference time stays roughly constant regardless of the actual audio length, so the RTF (processing time divided by audio duration) may increase slightly when the audio is shorter than the reference length.
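
The effect described in the note can be illustrated with a short calculation. This is a minimal sketch that assumes the usual definition of RTF (real-time factor) as processing time divided by audio duration. The function name real_time_factor and all numbers are hypothetical and chosen only for illustration; they are not measurements of Whisper-base on any device.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return RTF = processing time / audio duration (below 1.0 is faster than real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Hypothetical values for illustration only, not benchmark results.
# With a fixed-size input, one inference is assumed to cost about the same
# time no matter how much of the input window holds real audio.
PROCESSING_SECONDS = 3.0

for audio_seconds in (30.0, 15.0, 5.0):
    rtf = real_time_factor(PROCESSING_SECONDS, audio_seconds)
    print(f"{audio_seconds:>5.1f} s audio -> RTF {rtf:.2f}")
```

Under these assumed numbers the RTF grows from 0.10 to 0.60 as the clip shrinks from 30 s to 5 s, even though the processing time itself does not change.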

Inference with AidASR SDK

To be released

License

Source Model: MIT

Deployable Model: APLUX-MODEL-FARM-LICENSE