
MeloTTS-English is the English model of MeloTTS, a high-quality multilingual text-to-speech (TTS) library jointly developed by MIT and MyShell.ai. It supports several English accents (American, British, Indian, and Australian) in addition to a default voice. The model uses a Transformer-based architecture and builds on VITS, VITS2, and Bert-VITS2 to produce natural, fluent speech.
Source model
Source model repository: MeloTTS-English
Key Features
- Multi-accent Support: Includes American, British, Indian, Australian, and default accents.
- Real-time Inference: Optimized for real-time inference on CPUs without the need for GPU acceleration.
- High-Quality Speech Output: Generates natural and clear speech suitable for various applications.
- Easy Integration: Provides a Python API for seamless integration into applications.
- Open Source License: Licensed under MIT, supporting both commercial and non-commercial use.
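As an illustration of the Python API and multi-accent support listed above, here is a minimal sketch based on the upstream MeloTTS package (`melo.api.TTS`), not on the Model Farm optimized build. The speaker keys (`EN-US`, `EN-BR`, `EN_INDIA`, `EN-AU`, `EN-Default`) are taken from the upstream README as an assumption and should be checked against `model.hps.data.spk2id` in your installed version. The helper names `speaker_key` and `synthesize` are hypothetical.

```python
# Sketch only: speaker keys are assumed from the upstream MeloTTS README.
ACCENT_TO_SPEAKER = {
    "american": "EN-US",
    "british": "EN-BR",
    "indian": "EN_INDIA",
    "australian": "EN-AU",
    "default": "EN-Default",
}


def speaker_key(accent: str) -> str:
    """Map a human-readable accent name to an (assumed) MeloTTS speaker key."""
    try:
        return ACCENT_TO_SPEAKER[accent.strip().lower()]
    except KeyError:
        raise ValueError(f"unsupported accent: {accent!r}") from None


def synthesize(text, accent="default", output_path="out.wav", speed=1.0, device="cpu"):
    """Write `text` as speech to `output_path` using upstream MeloTTS."""
    # Imported lazily so the helpers above work without MeloTTS installed.
    from melo.api import TTS

    model = TTS(language="EN", device=device)
    speaker_ids = model.hps.data.spk2id  # mapping: speaker key -> numeric id
    model.tts_to_file(text, speaker_ids[speaker_key(accent)], output_path, speed=speed)
    return output_path
```

With MeloTTS installed, `synthesize("Hello from MeloTTS.", accent="british", output_path="en-br.wav")` would be a typical call. Passing `device="cpu"` reflects the CPU real-time inference feature listed above.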
Technical Architecture
MeloTTS-English uses a Transformer-based architecture and builds on VITS, VITS2, and Bert-VITS2 to generate high-quality speech.
Model Farm provides optimized model resources and test code, which can be obtained in either of two ways:
- Via the Model Farm page: Click Models & Test Code in the Performance Reference section on the right to download the model resources and code packages.
- Via the command line (recommended): Users with APLUX development boards can download the model resources and code packages with the built-in MMS tool.
# Search Models
mms list [model name]
# Get Models
mms get -m [model name] -p [precision] -c [soc] -b [backend] -d [file path]
For MMS usage, please refer to: MMS Usage & Access to Preview Models
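For scripted downloads, the `mms get` command shown above can be wrapped in a few lines of Python. This is a rough sketch: only the flags `-m`, `-p`, `-c`, `-b`, and `-d` come from the command above. The function names are hypothetical, and the argument values in the usage note are illustrative placeholders, not confirmed MMS options.

```python
import shlex
import subprocess


def build_mms_get(model, precision, soc, backend, dest):
    """Build the argument list for `mms get` using the flags shown above."""
    return [
        "mms", "get",
        "-m", model,      # model name
        "-p", precision,  # precision
        "-c", soc,        # target SoC
        "-b", backend,    # inference backend
        "-d", dest,       # download directory
    ]


def run_mms_get(model, precision, soc, backend, dest):
    """Run `mms get`; requires the MMS tool on an APLUX development board."""
    cmd = build_mms_get(model, precision, soc, backend, dest)
    print("Running:", shlex.join(cmd))
    # check=True raises CalledProcessError if mms exits with a non-zero status.
    return subprocess.run(cmd, check=True)
```

A call might look like `run_mms_get("MeloTTS-English", "<precision>", "<soc>", "<backend>", "./models")`, with the placeholders replaced by values reported by `mms list`.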
To be released