Phi-2
Text Generation
W4A16

Phi-2 is a Transformer with 2.7 billion parameters. It was trained using the same data sources as Phi-1.5, augmented with a new data source consisting of various NLP synthetic texts and websites filtered for safety and educational value. When assessed against benchmarks testing common sense, language understanding, and logical reasoning, Phi-2 demonstrated nearly state-of-the-art performance among models with fewer than 13 billion parameters.

Phi-2 hasn't been fine-tuned through reinforcement learning from human feedback. The intention behind crafting this open-source model is to provide the research community with a non-restricted small model to explore vital safety challenges, such as reducing toxicity, understanding societal biases, enhancing controllability, and more.

Performance Reference

| Device | Backend | Precision | TTFT | Prefill | Decode | Context Size | File Size |
| ------ | ------- | --------- | ---- | ------- | ------ | ------------ | --------- |
Model Resource Acquisition

Model Farm provides optimized model resources and test code, which can be obtained through the following two methods:

  • Obtain via Model Farm page: Click Models & Test Code in the Performance Reference section on the right to obtain model resources and code packages.

  • Obtain via command line (Recommended): Users with APLUX development boards can obtain model resources and code packages through the built-in MMS tool.

# Search Models
mms list [model name]

# Get Models
mms get -m [model name] -p [precision] -c [soc] -b [backend] -d [file path]

For MMS usage, please refer to: MMS Usage & Access to Preview Models
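As a sketch, a filled-in invocation of the commands above might look like the following. The identifier values (`phi-2`, `w4a16`, `qcs8550`, `qnn`, `./models`) are illustrative assumptions, not confirmed identifiers; use `mms list` to discover the exact model names and options supported on your board.

```shell
# Search for available Phi-2 packages (model name is an assumed example)
mms list phi-2

# Fetch a W4A16 build for an assumed SoC/backend pair, saving it to ./models
# (precision, SoC, and backend values here are placeholders; confirm with `mms list`)
mms get -m phi-2 -p w4a16 -c qcs8550 -b qnn -d ./models
```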

Model Details
  • Architecture: a Transformer-based model with a next-word prediction objective

  • Context length: 2048 tokens

  • Dataset size: 250B tokens, a combination of NLP synthetic data created by AOAI GPT-3.5 and filtered web data from Falcon RefinedWeb and SlimPajama, assessed by AOAI GPT-4

  • Training tokens: 1.4T tokens

  • GPUs: 96xA100-80G

  • Training time: 14 days

Source Model Evaluation

Direct adoption for production tasks without evaluation is out of scope for this project. As a result, the Phi-2 model has not been tested to ensure that it performs adequately in any production-level application. Please refer to the limitations sections.

Model Inference

Users can run large language models on Qualcomm chips using either of the following methods:

License

  • Source Model: MIT

  • Deployable Model: APLUX-MODEL-FARM-LICENSE