
Qwen2.5 is the latest series of Qwen large language models. Qwen2.5 releases a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:
- Significantly more knowledge and has greatly improved capabilities in coding and mathematics, thanks to our specialized expert models in these domains.
- Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g, tables), and generating structured outputs especially JSON. More resilient to the diversity of system prompts, enhancing role-play implementation and condition-setting for chatbots.
- Long-context Support up to 128K tokens and can generate up to 8K tokens.
- Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
Model Farm provides optimized model resources and test code, which can be obtained through the following two methods:
Obtain via Model Farm page: Click Models & Test Code in the Performance Reference section on the right to obtain model resources and code packages.
Obtain via command line (Recommand): Users with APLUX development boards can obtain model resources and code packages through the built-in MMS tool.
# Search Models
mms list [model name]
# Get Models
mms get -m [model name] -p [precision] -c [soc] -b [backend] -d [file path]
For MMS usage, please refer to: MMS Usage & Access to Preview Models
- Type: Causal Language Models
- Training Stage: Pretraining & Post-training
- Architecture: transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias and tied word embeddings
- Number of Parameters: 0.49B
- Number of Paramaters (Non-Embedding): 0.36B
- Number of Layers: 24
- Number of Attention Heads (GQA): 14 for Q and 2 for KV
- Context Length: Full 32,768 tokens and generation 8192 tokens
For more details, please refer to our blog, GitHub, and Documentation.
Note: This table showed source model instead of quantized model evaluation. Source Model Evaluation refer to Qwen2.5-0.5B-Instruct Evaluation Result
| Datasets | Qwen2-0.5B-Instruct | Qwen2.5-0.5B-Instruct | Qwen2-1.5B-Instruct | Qwen2.5-1.5B-Instruct |
|---|---|---|---|---|
| MMLU-Pro | 14.4 | 15.0 | 22.9 | 32.4 |
| MMLU-redux | 12.9 | 24.1 | 41.2 | 50.7 |
| GPQA | 23.7 | 29.8 | 21.2 | 29.8 |
| MATH | 13.9 | 34.4 | 25.3 | 55.2 |
| GSM8K | 40.1 | 49.6 | 61.6 | 73.2 |
| HumanEval | 31.1 | 35.4 | 42.1 | 61.6 |
| MBPP | 39.7 | 49.6 | 44.2 | 63.2 |
| MultiPL-E | 20.8 | 28.5 | 38.5 | 50.4 |
| LiveCodeBench 2305-2409 | 1.6 | 5.1 | 4.5 | 14.8 |
| LiveBench 0831 | 7.4 | 12.6 | 12.4 | 18.8 |
| IFeval strict-prompt | 14.6 | 27.9 | 29.0 | 42.5 |
Users can run large language models on Qualcomm chips using either of the following methods:
Run large models with APLUX AidGen: Please refer to the APLUX AidGen Developer Documentation
Run large models with Qualcomm Genie: Please refer to the Qualcomm Genie Documentation