GitHub - deepseek-ai/DualPipe: A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

deepseek-ai
2025.03.08
#LLM #Pipeline Parallelism #Deep Learning #Distributed Training #Algorithm

Key Points

  • DualPipe is a bidirectional pipeline parallelism algorithm that achieves full overlap of forward and backward computation-communication phases, reducing pipeline bubbles in DeepSeek V3/R1 training.
  • DualPipeV is a concise, V-shaped schedule derived from DualPipe using a "cut-in-half" procedure, offering an efficient alternative.
  • Both methods aim to optimize large-scale model training by significantly reducing pipeline bubbles and improving resource utilization compared to prior pipeline parallelism techniques.

DualPipe is an innovative bidirectional pipeline parallelism algorithm, introduced in the DeepSeek-V3 Technical Report, designed to achieve full overlap of forward and backward computation-communication phases while simultaneously reducing pipeline bubbles in deep learning model training.

The core methodology of DualPipe revolves around a bidirectional execution schedule. Unlike traditional unidirectional pipeline parallelism (e.g., 1F1B), DualPipe processes micro-batches in both forward and reverse directions. This bidirectional approach enables a significant and novel computation-communication overlap: during execution, a forward computation chunk and a backward computation chunk, together with their respective communication for activation and gradient exchange, can mutually overlap. This is captured by the term F&B, denoting the execution time of two mutually overlapped forward and backward chunks, which are otherwise sequential in many common pipeline strategies. The visualization of DualPipe schedules demonstrates that "two cells enclosed by a shared black border have mutually overlapped computation and communication," indicating concurrent execution of these phases. This aggressive overlapping strategy is key to minimizing idle time (pipeline bubbles).
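As a toy illustration of why an overlapped F&B chunk is cheaper than running a forward and a backward chunk serially, consider a simple timing model (all numbers and function names below are made up for illustration, not part of the DualPipe implementation):

```python
# Toy timing model: when a forward chunk's communication can run
# concurrently with a backward chunk's computation (and vice versa),
# the pair costs roughly max(total_compute, total_comm) instead of
# the full serial sum.

def serial_time(f_comp, f_comm, b_comp, b_comm):
    """Forward and backward chunks run one after another, no overlap."""
    return f_comp + f_comm + b_comp + b_comm

def overlapped_time(f_comp, f_comm, b_comp, b_comm):
    """Idealized F&B chunk: all communication hides behind computation."""
    return max(f_comp + b_comp, f_comm + b_comm)

# Hypothetical chunk costs: 3 ms compute + 1 ms comm per direction.
print(serial_time(3.0, 1.0, 3.0, 1.0))      # 8.0 ms
print(overlapped_time(3.0, 1.0, 3.0, 1.0))  # 6.0 ms: comm fully hidden
```

In this idealized model, F&B = 6.0 ms is strictly less than F + B = 8.0 ms, which is exactly the savings the bubble formulas below exploit.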

DualPipeV is a more concise, V-shaped schedule derived from DualPipe. It is generated using a "cut-in-half" procedure, a technique that further optimizes the scheduling for efficiency. While maintaining the core overlap benefits of DualPipe, DualPipeV is particularly notable for its reduced activation memory footprint.

The performance of DualPipe and DualPipeV is rigorously compared against other pipeline parallelism methods like 1F1B and ZB1P (Zero-Bubble 1-Forward 1-Backward). The comparison is based on key metrics: pipeline bubble duration, parameter memory per device, activation memory per device, and the number of devices.
Let PP denote the number of pipeline stages (always an even number), F the execution time of a forward chunk, B the execution time of a full backward chunk, W the execution time of a "backward for weights" chunk, and F&B the execution time of two mutually overlapped forward and backward chunks.

The pipeline bubble durations are specified as follows:

  • 1F1B: (PP - 1)(F + B)
  • ZB1P: (PP - 1)(F + B - 2W)
  • DualPipe: (PP/2 - 1)(F&B + B - 3W)
  • DualPipeV: (PP/2 - 1)(F&B + B - 3W)
The significant reduction in bubbles for DualPipe and DualPipeV stems from the F&B term, representing the effective overlapping of forward and backward operations, and from the factor (PP/2 - 1) compared to (PP - 1) in 1F1B/ZB1P, reflecting the optimized bidirectional scheduling.
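The bubble formulas above can be checked numerically. The sketch below plugs in hypothetical chunk timings (the values for F, B, W, and F&B are made up for illustration; the symbols follow the definitions given earlier):

```python
def bubble_1f1b(pp, F, B):
    # 1F1B bubble: (PP - 1)(F + B)
    return (pp - 1) * (F + B)

def bubble_zb1p(pp, F, B, W):
    # ZB1P bubble: (PP - 1)(F + B - 2W)
    return (pp - 1) * (F + B - 2 * W)

def bubble_dualpipe(pp, FB, B, W):
    # DualPipe / DualPipeV bubble: (PP/2 - 1)(F&B + B - 3W)
    return (pp // 2 - 1) * (FB + B - 3 * W)

# Hypothetical timings: PP = 8 stages, F = 1.0, B = 2.0, W = 1.0,
# and F&B = 2.5 (less than F + B = 3.0 thanks to the overlap).
print(bubble_1f1b(8, 1.0, 2.0))           # 21.0
print(bubble_zb1p(8, 1.0, 2.0, 1.0))      # 7.0
print(bubble_dualpipe(8, 2.5, 2.0, 1.0))  # 4.5
```

Even with these toy numbers, both the halved (PP/2 - 1) factor and the overlapped F&B term visibly shrink the bubble relative to 1F1B and ZB1P.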

Regarding memory usage per device:

  • Parameter Per Device:
    • 1F1B and ZB1P: 1×
    • DualPipe and DualPipeV: 2×
  • Activation Per Device:
    • 1F1B and ZB1P: PP
    • DualPipe: PP + 1
    • DualPipeV: PP/2 + 1
Notably, DualPipe and DualPipeV hold two copies of the model parameters per device (hence 2×) to support the bidirectional schedule, while DualPipeV roughly halves activation memory per device (PP/2 + 1 versus PP + 1 for DualPipe), making it more memory-efficient for large models. All methods require PP devices.

For practical application, both DualPipe and DualPipeV necessitate the implementation of a custom overlapped_forward_backward method. This method is crucial for orchestrating the concurrent execution of forward and backward passes along with their associated communication, which is fundamental to the algorithms' efficiency gains.
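A minimal sketch of what such a hook might look like, assuming a simplified signature (the real interface is defined by the DualPipe repository; the chunk objects and argument names here are stand-ins). A naive fallback simply runs the two halves back-to-back, which is functionally correct but forfeits the F&B time savings an overlapped implementation would deliver:

```python
def overlapped_forward_backward(fwd_module, fwd_inputs, bwd_module, bwd_grads):
    """Run one chunk's forward pass and another chunk's backward pass.

    Naive fallback: sequential execution with no real overlap. A tuned
    implementation would instead interleave the forward chunk's kernels
    and communication with the backward chunk's, realizing the F&B term.
    """
    outputs = fwd_module.forward(fwd_inputs)      # forward chunk
    input_grads = bwd_module.backward(bwd_grads)  # backward chunk
    return outputs, input_grads

class ScaleChunk:
    """Toy pipeline stage: y = a * x, so dL/dx = a * dL/dy."""
    def __init__(self, a):
        self.a = a
    def forward(self, x):
        return self.a * x
    def backward(self, grad_out):
        return self.a * grad_out

# Forward micro-batch through one toy chunk, backward through another.
outs, grads = overlapped_forward_backward(ScaleChunk(2.0), 3.0,
                                          ScaleChunk(4.0), 1.0)
print(outs, grads)  # 6.0 4.0
```

The key design point is that the scheduler hands the user both a forward unit of work and a backward unit of work at once, leaving it to the hook to decide how aggressively to interleave their computation and communication.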