End-to-End Test-Time Training for Long Context

Marcel Rød
2026.01.31
arXiv · by 넀루
#LLM #Continual Learning #Test-Time Training #Transformer #Long Context

Key Points

  • This paper proposes TTT-E2E, an end-to-end Test-Time Training method that reframes long-context language modeling as a continual learning problem.
  • TTT-E2E operates by continually learning through next-token prediction at test time, and its initialization is meta-learned during training to optimize for this test-time adaptation.
  • This approach achieves performance scaling with context length comparable to full attention while maintaining constant inference latency, resulting in significant speed improvements for very long contexts.

The paper proposes "End-to-End Test-Time Training (TTT-E2E)" for long-context language modeling, framing it as a continual learning problem rather than an architectural one. The method utilizes a standard Transformer with sliding-window attention and continually updates its weights at test time via next-token prediction, effectively compressing context into its parameters. To enable this, the model's initialization is meta-learned during training.

Core Methodology: TTT-E2E

The core idea is to let the language model "learn on the job" by performing gradient updates on its weights using the incoming context during inference. This is achieved through a two-loop optimization process: an inner loop for test-time training and an outer loop for meta-learning the initial weights.

  1. Test-Time Training (Inner Loop):
At test time, given a sequence of tokens $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_T$, the model performs next-token prediction. Instead of simply predicting $\hat{p}_{t+1} = f(\mathbf{x}_t; W_0)$ with static weights $W_0$, the model iteratively updates its weights.

  • Loss Function: For each token $\mathbf{x}_t$ in the context, the model computes a standard next-token prediction loss:
$$\ell_t(W) = \mathrm{CE}(f(\mathbf{x}_{t-1}; W), \mathbf{x}_t)$$
where $f(\cdot; W)$ is the language model with weights $W$, and CE denotes cross-entropy.

  • Mini-Batch Gradient Descent: To improve efficiency and stability, instead of online gradient descent (where $b = 1$), the paper employs mini-batch gradient descent. Given a batch size $b$, weights are updated every $b$ tokens:
$$W_i = W_{i-1} - \eta \, \frac{1}{b} \sum_{k=(i-1)b+1}^{ib} \nabla \ell_k(W_{i-1})$$
where $W_i$ are the weights after processing the $i$-th mini-batch, and $\eta$ is the learning rate. The final prediction $\hat{p}_{T+1}$ for $\mathbf{x}_{T+1}$ is made using $W_{T/b}$.

  • Integration with Sliding-Window Attention: To keep the model from degenerating into a bigram model within each mini-batch (where only the first token would benefit from prior context), the base architecture uses sliding-window attention with window size $k$. It is crucial that $k \ge b$, i.e. the window is at least as large as the TTT mini-batch, so the model can attend to context within the mini-batch before the weights are updated.
  • Implementation Details for TTT:
    • Selective Parameter Update: Only the MLP (Multi-Layer Perceptron) layers are updated during TTT. Embedding layers, normalization layers, and attention layers are frozen, as updating them caused instability.
    • Partial Block Update: Only the MLP layers in the last 1/4 of the Transformer blocks are updated. This balances computational cost with the ability to compress context.
    • Static Second MLP: In the blocks that are updated, a second, static MLP layer is added to serve as a "safe" storage for pre-trained knowledge, preventing catastrophic forgetting. The total parameter count is kept consistent by reducing the hidden dimension of other MLPs.
    • Decoding Multiple Tokens: When generating multiple tokens, TTT updates are applied only after a full mini-batch of decoded tokens has been accumulated.
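The inner loop above, including the restriction to a subset of trainable parameters, can be sketched in plain Python. This is a toy illustration, not the paper's implementation: the scalar per-name weights and the `grad_loss` callback are hypothetical stand-ins for the Transformer's MLP parameters and for backprop through the cross-entropy loss.

```python
def ttt_inner_loop(params, trainable, tokens, grad_loss, b, eta):
    """Mini-batch test-time training over a token stream (toy sketch).

    params:    dict name -> weight (scalars here; tensors in reality)
    trainable: names updated at test time (e.g. only the MLPs in the
               last 1/4 of the blocks); all other parameters stay frozen
    grad_loss: grad_loss(params, x_prev, x_next) -> dict of per-name
               gradients for one next-token loss (illustrative stub)
    """
    w = dict(params)           # start from the meta-learned W_0
    T = len(tokens) - 1        # number of prediction pairs (x_{t-1}, x_t)
    assert T % b == 0, "assume T divisible by the mini-batch size b"
    for i in range(T // b):
        # average the per-token gradients over the i-th mini-batch
        avg = {n: 0.0 for n in trainable}
        for k in range(i * b, (i + 1) * b):
            g = grad_loss(w, tokens[k], tokens[k + 1])
            for n in trainable:
                avg[n] += g[n] / b
        for n in trainable:    # W_i = W_{i-1} - eta * mean gradient
            w[n] -= eta * avg[n]
    return w                   # W_{T/b}, used to predict x_{T+1}
```

With a toy squared-error gradient, only the names in `trainable` move; frozen entries (playing the role of embeddings, norms, and attention) are returned unchanged.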
  2. Meta-Learning (Outer Loop):
The crucial aspect of TTT-E2E is that the model's initial weights $W_0$ (from training time) are optimized for the *final performance after* test-time training, not just for static pre-training loss. This resolves the mismatch between training and test-time behavior found in prior dynamic evaluation approaches (TTT-naive).

  • End-to-End Training Objective: The training objective for $W_0$ is the average test loss over sequences after TTT has been applied. For mini-batch TTT, this loss is:
$$L(W_0; \mathbf{X}) = \frac{1}{T} \sum_{i=1}^{T/b} \sum_{k=(i-1)b+1}^{ib} \ell_k(W_{i-1})$$
  • Optimization: Optimizing $W_0$ with respect to $L(W_0; \mathbf{X})$ requires computing gradients of gradients, as the inner-loop updates themselves involve gradients. Modern automatic differentiation frameworks handle this efficiently. This outer-loop optimization trains the model to "learn to learn" effectively at test time.
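The "gradients of gradients" can be made concrete with a scalar toy problem: one inner TTT step on a squared-error token loss (a stand-in for the paper's cross-entropy), then the outer gradient computed by hand through that step. Everything here, the loss, the single weight, the one-step unroll, is illustrative, not the paper's actual setup.

```python
def meta_grad(w0, eta, x1, y1, x2, y2):
    """Outer-loop gradient dL/dw0 for a one-step inner loop on the
    toy token loss l(w) = 0.5 * (w*x - y)**2 (hypothetical stand-in)."""
    # inner TTT step on token 1:  w1 = w0 - eta * l1'(w0)
    g1 = (w0 * x1 - y1) * x1
    w1 = w0 - eta * g1
    # outer loss is the *post-update* loss on token 2:  L = l2(w1)
    g2 = (w1 * x2 - y2) * x2              # dL/dw1
    # chaining through the inner update needs the second derivative of
    # l1 (the "gradient of gradients"):  dw1/dw0 = 1 - eta * l1''(w0)
    return g2 * (1.0 - eta * x1 * x1)
```

An autodiff framework produces the same value by differentiating straight through the unrolled inner loop; the closed form above just makes the second-derivative term visible.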

Alternative Derivation (Connection to TTT-KVB)

The paper also presents an alternative derivation, starting from prior work on Key-Value Binding (KVB) in TTT (TTT-KVB). TTT-KVB updates implicit key-value associations using a layer-wise reconstruction loss
$$\ell^{(l)}(\mathbf{W}_{t-1}^{(l)}) = \left\| g(\theta_K^{(l)} \mathbf{x}_t^{(l)}; \mathbf{W}_{t-1}^{(l)}) - \theta_V^{(l)} \mathbf{x}_t^{(l)} \right\|^2.$$
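A minimal scalar sketch of one such inner update, assuming $g$ is linear ($g(\mathbf{k}; W) = W\mathbf{k}$) purely for clarity; TTT-KVB itself uses multi-head LoRA MLPs, and the scalars here are hypothetical stand-ins for per-layer tensors.

```python
def kvb_step(W, x, theta_K, theta_V, eta):
    """One TTT-KVB inner update (scalar sketch): the fast weight W is
    trained to map the key theta_K*x to the value theta_V*x under a
    squared reconstruction loss, with g taken to be linear, g(k) = W*k."""
    k, v = theta_K * x, theta_V * x
    grad = 2.0 * (W * k - v) * k   # d/dW of (W*k - v)**2
    return W - eta * grad
```

Note the contrast with the E2E loss above: this objective is defined per layer from $\theta_K, \theta_V$ projections, rather than from the network's final next-token prediction.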

The derivation proceeds as follows:

  1. Simplified Output Rule: The output generation rule is simplified by reusing the prediction of gg as the output embedding, rather than re-calling gg with updated weights.
  2. End-to-End at Test Time (Key Step): The critical transition is replacing the layer-wise KVB reconstruction loss with the global next-token prediction loss at the network's output. This removes the need for separate $\theta_K, \theta_V$ parameters per layer and makes the test-time training objective directly align with task performance. This intermediate step is called "TTT-E2E all layers MH".
  3. Larger State with Less Compute: The final step involves realizing that for the E2E loss, it's more cost-effective to update fewer blocks (e.g., last 1/4) each containing larger hidden states (regular MLPs instead of multi-head LoRA MLPs used in TTT-KVB). This improves computational efficiency and allows for a larger effective state, which is beneficial for scaling with context length.

Results

TTT-E2E demonstrates superior scaling with context length compared to methods like Mamba 2 and Gated DeltaNet, and it beats even the full-attention Transformer on latency. For 3B models, TTT-E2E maintains its performance advantage at context lengths up to 128K, mirroring the scaling of full attention in test loss. Crucially, TTT-E2E achieves constant inference latency regardless of context length (similar to RNNs and sliding-window attention), making it 2.7x faster than full attention at 128K context.