upstage/Solar-Open-100B · Hugging Face

2026.01.04
Hugging Face · by 이호민
#LLM#AI#Open Source

Key Points

  1. Solar Open 100B is Upstage AI's new 102B-parameter Mixture-of-Experts (MoE) large language model, featuring 12B active parameters for efficiency.
  2. Pre-trained on a massive 19.7 trillion tokens, the model excels at reasoning, instruction following, and agentic tasks, offering enterprise-grade performance.
  3. Benchmark results demonstrate strong performance across a range of Korean and English tasks, and the model is positioned as a transparent, customizable open-source solution licensed under the Solar-Apache License 2.0.

Solar Open is Upstage AI's flagship 102B-parameter large language model, developed entirely from scratch and released under the Solar-Apache License 2.0. It employs a Mixture-of-Experts (MoE) architecture, featuring a total of 102.6 billion parameters but maintaining an active parameter count of only 12 billion per token during inference. This design leverages the extensive knowledge capacity of a massive model while ensuring the inference speed and cost-efficiency typically associated with much smaller models.

The core of Solar Open's methodology is its MoE architecture, configured with 129 experts in total. For each token, the model routes input through a small subset of these experts, using a "top 8 among 128 routed + 1 shared" routing mechanism. That is, for every token, 8 experts are dynamically selected from a pool of 128 specialized routed experts, and these are augmented by a single shared expert that processes all tokens. This selective activation is the key to its efficiency, allowing the model to compute with a far smaller active parameter set (12B) than its total size (102.6B) suggests.
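The routing scheme described above can be sketched in a few lines. This is a toy illustration of "top 8 among 128 routed + 1 shared" gating, not Upstage's actual implementation: all weights are random placeholders, the experts are plain linear maps, and the gating uses a standard softmax over the selected experts' router scores (an assumption about the exact normalization).

```python
import numpy as np

def moe_route(token_hidden, router_w, shared_expert, routed_experts, k=8):
    """Toy sketch of 'top 8 among 128 routed + 1 shared' expert routing.

    A router scores all 128 routed experts, the top-8 outputs are combined
    by their normalized gate weights, and one shared expert always
    contributes. Weights here are random placeholders, not Solar Open's.
    """
    logits = router_w @ token_hidden                 # (128,) router scores
    top_idx = np.argsort(logits)[-k:]                # indices of the top-8 experts
    gates = np.exp(logits[top_idx] - logits[top_idx].max())
    gates /= gates.sum()                             # softmax over the selected experts
    routed_out = sum(g * routed_experts[i](token_hidden)
                     for g, i in zip(gates, top_idx))
    return routed_out + shared_expert(token_hidden)  # shared expert sees every token

# Toy instantiation: 128 routed experts + 1 shared, hidden size 16.
rng = np.random.default_rng(0)
hidden = 16
router_w = rng.standard_normal((128, hidden))
experts = [(lambda W: (lambda x: W @ x))(rng.standard_normal((hidden, hidden)))
           for _ in range(128)]
shared = (lambda W: (lambda x: W @ x))(rng.standard_normal((hidden, hidden)))
out = moe_route(rng.standard_normal(hidden), router_w, shared, experts)
print(out.shape)  # (16,)
```

Only 9 of the 129 experts run for any given token, which is why the active parameter count stays near 12B even though all 102.6B parameters must be held in memory.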

Solar Open was pre-trained on an extensive dataset of 19.7 trillion tokens, contributing to its broad knowledge coverage and robust reasoning abilities across diverse domains. It supports a context length of 128k tokens, enabling the processing of very long inputs. The model was trained on NVIDIA B200 GPUs. For inference, it minimally requires 4x NVIDIA A100 (80GB) GPUs.
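The 4x A100 (80GB) minimum follows from simple weight-storage arithmetic: even though only 12B parameters are active per token, all 102.6B must reside in GPU memory. A rough estimate (ignoring activations, KV cache, and framework overhead):

```python
# Back-of-the-envelope memory estimate for serving Solar Open in bf16.
total_params = 102.6e9          # total parameters (all MoE experts)
bytes_per_param = 2             # bf16 stores each weight in 2 bytes
weight_gb = total_params * bytes_per_param / 1024**3
print(f"{weight_gb:.0f} GiB of weights")

gpu_mem_gb = 4 * 80             # 4x NVIDIA A100 80GB
print(weight_gb < gpu_mem_gb)   # True: weights fit, leaving headroom for KV cache
```

At roughly 191 GiB of weights, the model does not fit on one or two 80GB GPUs, but comfortably fits across four with room left for the 128k-token context's KV cache.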

Performance benchmarks demonstrate Solar Open's competitive capabilities across both Korean and English language tasks. In Korean benchmarks, it shows strong performance in general understanding (KMMLU, CLIcK, HAE-RAE v1.1), finance (KBankMMLU), law (KBL), medical (KorMedMCQA), and instruction following (Ko-IFEval), often outperforming or closely matching other high-parameter models like gpt-oss-120b and GLM-4.5-Air. For English benchmarks, Solar Open achieves high scores on MMLU (88.2), MMLU-Pro (80.4), and IFEval (88.0). While it generally performs well, some benchmarks like GPQA-Diamond, HLE, and code-related tasks show varying results compared to its counterparts. Its agentic capabilities are assessed via Tau² benchmarks, showing competitive but not leading performance across Airline, Telecom, and Retail scenarios.

For inference, the recommended generation parameters are a temperature of 0.8, top_p of 0.95, and top_k of 50. Deployment options include transformers for direct model loading and vLLM for optimized serving. The transformers quickstart loads the model with torch_dtype=torch.bfloat16 and device_map="auto", and uses tokenizer.apply_chat_template for message formatting. vLLM deployment, recommended via Docker, uses arguments such as --trust-remote-code, --enable-auto-tool-choice, --tool-call-parser solar_open, --reasoning-parser solar_open, and custom logits processors (ParallelToolCallLogitsProcessor, SolarOpenTemplateLogitsProcessor) to support the model's architectural specifics, including tool calling and reasoning. For multi-GPU setups, --tensor-parallel-size sets the degree of tensor parallelism.
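To make the recommended sampling parameters concrete, here is an illustrative re-implementation of how temperature, top-k, and top-p (nucleus) filtering interact during decoding. This is standard sampling logic, not code from the Solar Open repository; frameworks like transformers and vLLM apply these same parameters internally when you pass them to the generation call.

```python
import numpy as np

def sample_filtered(logits, temperature=0.8, top_p=0.95, top_k=50, rng=None):
    """Sample a token id using temperature, top-k, then top-p filtering.

    Illustrative only: mirrors the recommended Solar Open settings
    (temperature=0.8, top_p=0.95, top_k=50) with standard sampling logic.
    """
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=np.float64) / temperature  # temperature scaling
    order = np.argsort(scaled)[::-1]                             # ids sorted by score, descending
    keep = order[:top_k]                                         # top-k cutoff
    probs = np.exp(scaled[keep] - scaled[keep].max())
    probs /= probs.sum()
    cum = np.cumsum(probs)
    n = int(np.searchsorted(cum, top_p) + 1)                     # smallest nucleus covering top_p mass
    probs = probs[:n] / probs[:n].sum()                          # renormalize over the nucleus
    return int(rng.choice(keep[:n], p=probs))

# Toy vocabulary of 1000 tokens with random scores.
rng = np.random.default_rng(0)
logits = rng.standard_normal(1000)
token = sample_filtered(logits, rng=rng)
print(0 <= token < 1000)  # sampled id lies within the vocabulary
```

Lower temperatures sharpen the distribution before the top-k/top-p cutoffs are applied, so the three parameters jointly control how adventurous the model's completions are.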