Phi-2
Text Generation
W4A16

Phi-2 is a Transformer with 2.7 billion parameters. It was trained using the same data sources as Phi-1.5, augmented with a new data source consisting of various NLP synthetic texts and websites filtered for safety and educational value. When assessed against benchmarks testing common sense, language understanding, and logical reasoning, Phi-2 demonstrated nearly state-of-the-art performance among models with fewer than 13 billion parameters.

Phi-2 hasn't been fine-tuned through reinforcement learning from human feedback. The intention behind crafting this open-source model is to provide the research community with a non-restricted small model to explore vital safety challenges, such as reducing toxicity, understanding societal biases, enhancing controllability, and more.

Performance Reference

| Device | Backend | Precision | TTFT | Prefill | Decode | Context Size | File Size |
| ------ | ------- | --------- | ---- | ------- | ------ | ------------ | --------- |
Model Resource Acquisition

Model Farm provides optimized model resources and test code, which can be obtained through the following two methods:

  • Obtain via Model Farm page: Click Models & Test Code in the Performance Reference section on the right to obtain model resources and code packages.

  • Obtain via command line (Recommended): Users with APLUX development boards can obtain model resources and code packages through the built-in MMS tool.

# Search Models
mms list [model name]

# Get Models
mms get -m [model name] -p [precision] -c [soc] -b [backend] -d [file path]

For MMS usage, please refer to: MMS Usage & Access to Preview Models
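As a sketch, a filled-in invocation of the commands above might look like the following. The identifier values (`phi-2`, `w4a16`, `qcs8550`, `qnn`, `./models`) are illustrative assumptions, not confirmed identifiers; use `mms list` to discover the exact model names and options supported on your board.

```shell
# Search for available Phi-2 packages (model name is an assumed example)
mms list phi-2

# Fetch a W4A16 build for an assumed SoC/backend pair, saving it to ./models
# (precision, SoC, and backend values here are placeholders; confirm with `mms list`)
mms get -m phi-2 -p w4a16 -c qcs8550 -b qnn -d ./models
```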

Model Details
  • Architecture: a Transformer-based model with a next-word prediction objective

  • Context length: 2048 tokens

  • Dataset size: 250B tokens, a combination of NLP synthetic data created by AOAI GPT-3.5 and filtered web data from Falcon RefinedWeb and SlimPajama, assessed by AOAI GPT-4

  • Training tokens: 1.4T tokens

  • GPUs: 96xA100-80G

  • Training time: 14 days

Source Model Evaluation

Direct adoption for production tasks without evaluation is out of scope for this project. As a result, the Phi-2 model has not been tested to ensure that it performs adequately in any production-level application. Please refer to the limitations sections.

Model Inference

Users can run large language models on Qualcomm chips using either of the following methods:

License

  • Source Model: MIT

  • Deployable Model: APLUX-MODEL-FARM-LICENSE