
Whisper-tiny.en is the smallest English-only speech recognition model in the OpenAI Whisper series, designed for efficient speech-to-text processing in resource-constrained and low-power environments. Built on a Transformer encoder-decoder architecture, it retains the advantages of end-to-end recognition while drastically reducing parameter size and computational requirements.
With only ~39M parameters, Whisper-tiny.en runs efficiently on mobile devices, embedded systems, and edge platforms, delivering low-latency responses. Although its accuracy is lower than that of the medium or large models, it remains practical and reliable for everyday use cases such as conversational transcription, voice assistants, and subtitle generation.
The model is deployable with mainstream inference frameworks and can be integrated with streaming speech pipelines, making it ideal for applications where speed and lightweight design are top priorities.
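As an illustration only, the sketch below shows one way to run the model for offline transcription. It assumes the Hugging Face transformers implementation of the checkpoint (openai/whisper-tiny.en) and a local 16 kHz mono WAV file; the actual inference framework used for deployment may differ.

```python
# Minimal transcription sketch (assumes Hugging Face transformers and a 16 kHz mono WAV file).
import soundfile as sf
from transformers import WhisperProcessor, WhisperForConditionalGeneration

# Load the English-only tiny checkpoint (~39M parameters).
processor = WhisperProcessor.from_pretrained("openai/whisper-tiny.en")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")

# Read audio; Whisper expects 16 kHz input ("sample.wav" is a hypothetical file name).
audio, sample_rate = sf.read("sample.wav")

# Convert the waveform to log-mel features padded to the model's fixed input window.
inputs = processor(audio, sampling_rate=sample_rate, return_tensors="pt")

# Greedy decoding of the transcript.
predicted_ids = model.generate(inputs.input_features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```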
The source model can be found here.
| Supported Languages |
|---|
| English |
Note: In the performance reference section on the right, the RTF values for each language are reported for the current audio input length. Because the model uses fixed (non-dynamic) input dimensions, the RTF may increase slightly when the input audio is shorter than the reference length.
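For context, RTF (real-time factor) is typically computed as processing time divided by audio duration. The sketch below shows one way to measure it; the function and argument names are illustrative and assume a transcription callable like the one in the earlier example.

```python
import time

def real_time_factor(transcribe_fn, audio, sample_rate):
    """RTF = processing time / audio duration; values below 1.0 mean faster than real time."""
    audio_seconds = len(audio) / sample_rate
    start = time.perf_counter()
    transcribe_fn(audio)  # run one full inference pass over the (fixed-size) input window
    elapsed = time.perf_counter() - start
    # With fixed input dimensions, a short clip still incurs the full-window compute cost,
    # which is why RTF rises when the audio is shorter than the reference length.
    return elapsed / audio_seconds
```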
To be released