EXAONE Deep: Reasoning Enhanced Language Models
Paper


Sunkyoung Kim
2025.06.22
· arXiv · by Anonymous
#LLM #Reasoning #Chain-of-Thought #Fine-tuning #DeepLearning

Key Points

  • EXAONE Deep introduces a series of language models (2.4B, 7.8B, and 32B parameters) from LG AI Research, fine-tuned specifically for superior performance on reasoning tasks, including math and coding.
  • These models were developed by fine-tuning the EXAONE 3.5 Instruct models on a reasoning-specialized dataset of long chain-of-thought traces, using Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (Online RL).
  • Evaluations show that EXAONE Deep's smaller models outperform peers of comparable size, while the largest 32B model achieves competitive performance against leading open-weight reasoning models; all models are openly available for research.

EXAONE Deep is a series of large language models (LLMs) developed by LG AI Research, specifically fine-tuned for enhanced performance on reasoning tasks, including mathematics and coding. The series comprises three models: EXAONE Deep 2.4B, 7.8B, and 32B. These models are derived from the EXAONE 3.5 Instruct base models, which already possess instruction-following capabilities.

The core methodology for training EXAONE Deep models involves a multi-stage fine-tuning process utilizing three prominent techniques: Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (Online RL).

Data for Fine-Tuning:
To imbue the models with strong reasoning abilities, a specialized dataset was constructed:

  • SFT Dataset: Consists of 1.6 million instances and approximately 12 billion tokens. This dataset is designed to guide models through an extended Chain-of-Thought (CoT) process. Token distribution varies: code-related data points are notably longer on average, while "others" tend to be shorter. Each SFT instance follows a templated format: a user query, followed by a structured thought process encapsulated within <thought> and </thought> tags, and finally a concise, self-contained answer synthesizing the reasoning steps. This structure explicitly trains the models to perform step-by-step logical progression, including reflection, self-checking, and correction within the designated thought tags.
  • DPO Dataset: Comprises 20,000 instances of preference data.
  • Online RL Dataset: Includes an additional 10,000 instances.
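The templated SFT format described above can be sketched as a small formatting helper. The tag names follow the paper, but the exact delimiters between query, thought, and answer are an assumption, as the precise chat template is not given in this summary:

```python
def format_sft_instance(query: str, thought: str, answer: str) -> str:
    """Assemble one SFT instance: user query, a <thought>...</thought>
    reasoning trace (with room for self-checking and correction), then a
    concise, self-contained answer. Separators are illustrative only."""
    return (
        f"{query}\n"
        f"<thought>\n{thought}\n</thought>\n"
        f"{answer}"
    )

example = format_sft_instance(
    "What is 12 * 13?",
    "12 * 13 = 12 * 10 + 12 * 3 = 120 + 36 = 156. Check: 13 * 12 = 156. Correct.",
    "12 * 13 = 156.",
)
```

Training on instances of this shape is what teaches the model to emit its reasoning inside the thought tags before committing to a final answer.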

Training Process:
The EXAONE 3.5 Instruct base models are fine-tuned using the aforementioned datasets and techniques:

  • Supervised Fine-Tuning (SFT): This initial phase uses the 1.6 million CoT-structured instances to teach the models the desired reasoning patterns and response formats.
  • Direct Preference Optimization (DPO): Following SFT, DPO is applied using 20,000 preference instances. The paper specifies the use of SimPER [19] as the training algorithm for DPO.
  • Online Reinforcement Learning (Online RL): The final stage incorporates Online RL, utilizing 10,000 instances. For this, the authors employed their self-designed GRPO [15] variant.
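The paper uses a self-designed GRPO variant for the Online RL stage; its details are not reproduced here, but the core GRPO idea of group-relative advantages can be sketched as follows. This is only the vanilla normalization (reward minus group mean, divided by group standard deviation), not the authors' variant:

```python
from statistics import mean, pstdev

def grpo_advantages(rewards):
    """Group-relative advantages as in vanilla GRPO: sample a group of
    responses per prompt, then normalize each response's scalar reward
    by the group's mean and standard deviation. No value network needed."""
    mu = mean(rewards)
    sigma = pstdev(rewards) or 1.0  # guard against all-equal reward groups
    return [(r - mu) / sigma for r in rewards]

# e.g. 4 sampled responses, two judged correct (reward 1) and two incorrect
adv = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct responses within a group receive positive advantages and incorrect ones negative, which is what drives the policy update.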

The total computational expenditure (FLOPs) for training each model, combining pretraining and fine-tuning, is significant. For instance, the 32B model consumed $1.25 \times 10^{24}$ FLOPs for pretraining and $7.04 \times 10^{21}$ FLOPs for fine-tuning, totaling $1.26 \times 10^{24}$ FLOPs. Training was conducted on NVIDIA H100 GPU clusters provided by Google Cloud Platform, leveraging the NVIDIA NeMo Framework.
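A quick arithmetic check on the 32B figures confirms how the total is dominated by pretraining:

```python
pretrain_flops = 1.25e24   # 32B model pretraining, from the paper
finetune_flops = 7.04e21   # 32B model reasoning fine-tuning
total_flops = pretrain_flops + finetune_flops

# Fine-tuning adds well under 1% on top of pretraining compute.
finetune_share = finetune_flops / total_flops
```

Rounded to three significant figures, the sum reproduces the reported $1.26 \times 10^{24}$ FLOPs total.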

Evaluation and Results:
The models were rigorously evaluated across a diverse set of benchmarks covering mathematics, science, coding, and general knowledge:

  • Mathematics: MATH-500, American Invitational Mathematics Examination (AIME) 2024 and 2025, and South Korea’s College Scholastic Ability Test (CSAT) Math 2025.
  • Science: GPQA Diamond.
  • Coding: LiveCodeBench (24.08-25.02).
  • General Knowledge: MMLU and MMLU-Pro.

The evaluation setup followed the DeepSeek-R1 technical report, employing a maximum generation length of 32K tokens. To ensure reliability, the pass@1 metric averaged over $k$ sampled responses was used, defined as:

$$\text{pass@1} = \frac{1}{k} \sum_{i=1}^{k} p_i$$

where $k$ is the number of responses generated (e.g., 8 for MATH-500, 16 for CSAT 2025, 64 for AIME) and $p_i$ denotes the correctness of the $i$-th response. A sampling temperature of 0.6 and a top-p value of 0.95 were used. Additionally, cons@k was reported for AIME, where the most frequently generated answer among the $k$ responses is chosen. Evaluation prompts were tailored for short-answer, multiple-choice, and code-generation tasks, instructing the models to reason step by step and provide answers within specific delimiters (e.g., \boxed{} for math/MCQ answers).
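The two metrics are straightforward to compute; a minimal sketch of both, assuming binary per-response correctness labels:

```python
from collections import Counter

def pass_at_1(correct_flags):
    """pass@1 averaged over k samples: the fraction of the k generated
    responses that are correct (each flag is 1 if correct, else 0)."""
    return sum(correct_flags) / len(correct_flags)

def cons_at_k(answers, reference):
    """cons@k (consistency / majority vote): pick the most frequent
    final answer among the k responses and check it against the reference."""
    majority_answer, _count = Counter(answers).most_common(1)[0]
    return majority_answer == reference

p = pass_at_1([1, 0, 1, 1])              # 3 of 4 samples correct -> 0.75
c = cons_at_k(["42", "41", "42"], "42")  # majority answer "42" matches
```

cons@k can exceed pass@1 on hard benchmarks like AIME because a correct answer only needs to be the plurality, not appear in every sample.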

Experimental results demonstrate strong performance:

  • EXAONE Deep 32B: Achieves competitive performance against leading open-weight reasoning models like DeepSeek-R1 and QwQ-32B, and outperforms distilled versions such as DeepSeek-R1-Distill-Qwen-32B and DeepSeek-R1-Distill-Llama-70B.
  • EXAONE Deep 7.8B: Outperforms models of similar scale, including DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Llama-8B, and the proprietary OpenAI o1-mini.
  • EXAONE Deep 2.4B: Shows superior performance compared to DeepSeek-R1-Distill-Qwen-1.5B.

Limitations and Future Work:
While excelling at reasoning tasks, EXAONE Deep models are specifically fine-tuned for this purpose. For broader real-world applications requiring general instruction-following capabilities, the base EXAONE 3.5 Instruct models are recommended. Future work aims to extend the models' capabilities to domains with less clear or undiscovered answers.

Licensing:
The EXAONE Deep models are openly available for research purposes via Hugging Face. The license is a non-commercial (NC) agreement, explicitly prohibiting commercial use, development of revenue-generating products, and using the models or their outputs to develop or improve other models. It permits access, download, installation, use for research, public disclosure of research results, modification to create derivatives for research, and distribution with a copy of the agreement, with mandatory attribution. The license strictly forbids reverse engineering and mandates ethical use, preventing the generation of harmful, false, or discriminatory content. All intellectual property, including generated output, remains with LG Management Development Institute Co., Ltd. The models are provided "as-is," without warranties.