AaryanK/Solar-Open-100B-GGUF · Hugging Face
Key Points
- This repository offers GGUF-format files for Upstage's Solar-Open-100B, a 102B-parameter Mixture-of-Experts (MoE) model.
- Trained on an extensive 19.7 trillion tokens, Solar Open provides significant knowledge capacity while efficiently using only 12B active parameters during inference.
- The model weights are licensed under the Solar-Apache License 2.0, with recommended `llama.cpp` sampling parameters of temperature 0.8, top-p 0.95, and top-k 50.
This repository, Solar-Open-100B-GGUF, contains GGUF-format model files for Upstage's Solar-Open-100B, a 102-billion-parameter Mixture-of-Experts (MoE) model. The MoE architecture gives the model a large knowledge capacity (102B total parameters) while keeping inference efficient: for each token, a router activates only a subset of the experts, so just 12 billion parameters are active at a time. This design combines the expressive power of a large model with computational cost closer to that of a much smaller dense model.
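The routing idea behind sparse MoE inference can be sketched as follows. This is an illustrative toy with random linear "experts" and a simple softmax gate (all names and shapes are hypothetical), not Solar-Open-100B's actual implementation: the point is only that the gate scores every expert but evaluates just the top-k, so the active parameter count stays far below the total.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy MoE layer: score all experts, evaluate only the top-k."""
    logits = x @ gate_w                       # one gate score per expert
    topk = np.argsort(logits)[-k:]            # indices of the k best experts
    weights = np.exp(logits[topk] - logits[topk].max())
    weights /= weights.sum()                  # softmax over the selected experts
    # Only k experts run; the rest contribute no compute this token.
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, num_experts = 8, 16
gate_w = rng.normal(size=(d, num_experts))
# Each "expert" is just a random linear map, purely for illustration.
mats = [rng.normal(size=(d, d)) for _ in range(num_experts)]
experts = [lambda x, m=m: x @ m for m in mats]

y = moe_forward(rng.normal(size=d), gate_w, experts, k=2)
print(y.shape)  # one output vector, produced by only 2 of the 16 experts
```

With 16 experts and k=2, only 2/16 of the expert weights are touched per token, which mirrors how Solar-Open-100B activates roughly 12B of its 102B parameters.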
The model was trained from scratch on an extensive dataset comprising 19.7 trillion tokens. The GGUF format facilitates deployment across various hardware setups, and the repository offers multiple quantization levels:

| Quantization | File size | Quantization | File size |
|---|---|---|---|
| Q2_K | 37.7 GB | Q4_K_M | 62.3 GB |
| Q3_K_S | 44.5 GB | Q5_0 | 70.8 GB |
| Q3_K_M | 49.3 GB | Q5_1 | 77.1 GB |
| Q3_K_L | 53.4 GB | Q5_K_S | 70.8 GB |
| Q4_0 | 58 GB | Q5_K_M | 72.9 GB |
| Q4_1 | 64.4 GB | Q6_K | 84.3 GB |
| Q4_K_S | 58.6 GB | Q8_0 | 109 GB |
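As a rough sanity check on these file sizes, dividing each file size by the total parameter count gives an approximate bits-per-weight figure. This assumes the file is almost entirely quantized weight data (metadata and any higher-precision tensors are ignored), so the numbers are estimates only:

```python
TOTAL_PARAMS = 102e9  # Solar-Open-100B total parameter count

# A few of the quantization levels listed above (sizes in GB).
sizes_gb = {"Q2_K": 37.7, "Q4_K_M": 62.3, "Q8_0": 109.0}

for name, gb in sizes_gb.items():
    bits_per_weight = gb * 1e9 * 8 / TOTAL_PARAMS
    print(f"{name}: ~{bits_per_weight:.1f} bits/weight")
```

The estimates come out near 3, 4.9, and 8.5 bits per weight respectively, consistent with what the Q2/Q4/Q8 naming suggests.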
For optimal inference performance with `llama.cpp`, Upstage recommends specific sampling parameters: a temperature of 0.8, a top-p value of 0.95, and a top-k value of 50. Example command-line interface (CLI) usage for inference: `./llama-cli -m Solar-Open-100B.Q4_K_M.gguf -c 8192 --temp 0.8 --top-p 0.95 --top-k 50 -p "User: Who are you?\nAssistant:" -cnv`. The model is licensed under the Solar-Apache License 2.0.