Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
Xingkai Yu
2026.01.14
arXiv · by 이호민
#LLM #Sparsity #ConditionalMemory #MoE #N-gram

Key Points

  1. This paper introduces Engram, a novel conditional memory module that modernizes N-gram embeddings for $O(1)$ knowledge lookup, aiming to provide a native primitive for static information retrieval in large language models.
  2. By formulating a Sparsity Allocation problem, the authors discover a U-shaped scaling law, demonstrating that Engram acts as an essential complement to Mixture-of-Experts (MoE) by optimizing the trade-off between neural computation and static memory.
  3. Large-scale experiments show Engram achieves superior performance across diverse benchmarks, including general reasoning and long-context tasks, while its deterministic addressing enables infrastructure-aware efficiency through runtime prefetching from host memory.

The paper introduces Conditional Memory via Scalable Lookup, a new sparsity axis for Large Language Models (LLMs), instantiated by a module called Engram. The core problem addressed is that traditional Transformers, especially Mixture-of-Experts (MoE) models, primarily scale capacity through conditional computation but lack a native primitive for knowledge lookup, forcing them to inefficiently simulate retrieval through computation. This leads to the backbone wasting computational depth on reconstructing static patterns.

Engram proposes conditional memory as a complementary sparsity axis to conditional computation. It modernizes classic N-gram embeddings for $O(1)$ lookup, aiming to offload static knowledge retrieval from the main computational path.

Core Methodology: Engram Architecture

The Engram module augments the Transformer backbone by structurally separating static pattern storage from dynamic computation. Given an input sequence $X = (x_1, \dots, x_T)$ and hidden states $H^{(\ell)} \in \mathbb{R}^{T \times d}$ at layer $\ell$, it processes each position $t$ in two phases: retrieval and fusion.

  1. Sparse Retrieval via Hashed N-grams:
    • Tokenizer Compression: To maximize semantic density and reduce the effective vocabulary size, a surjective function $P: V \to V'$ is pre-computed. It collapses raw token IDs $x_t$ into canonical IDs $x'_t = P(x_t)$ based on normalized textual equivalence (e.g., NFKC normalization, lowercasing). Suffix N-grams $g_{t,n} = (x'_{t-n+1}, \dots, x'_t)$ are formed from these compressed IDs.
    • Multi-Head Hashing: To parameterize the combinatorial space of N-grams without storing them directly, a hashing-based approach is used. For each N-gram order $n$, $K$ distinct hash heads map the compressed N-gram $g_{t,n}$ to an index within an embedding table $E_{n,k}$ (of prime size $M_{n,k}$) via a deterministic function $\varphi_{n,k}$:

$$z_{t,n,k} \triangleq \varphi_{n,k}(g_{t,n}), \qquad e_{t,n,k} = E_{n,k}[z_{t,n,k}]$$

The function $\varphi_{n,k}$ is implemented as a lightweight multiplicative-XOR hash. The final memory vector $e_t \in \mathbb{R}^{d_{\text{mem}}}$ is formed by concatenating all retrieved embeddings:

$$e_t \triangleq \big\Vert_{n=2}^{N} \big\Vert_{k=1}^{K} e_{t,n,k}$$

  2. Context-aware Gating and Fusion:
    • The retrieved embeddings $e_t$ are static priors. To enhance expressivity and mitigate noise from hash collisions or polysemy, a context-aware gating mechanism is employed. The current hidden state $h_t$ (which has aggregated global context) acts as a dynamic Query, while the retrieved memory $e_t$ serves as the source for both the Key and Value projections:

$$k_t = W_K e_t, \qquad v_t = W_V e_t$$

where $W_K, W_V$ are learnable projection matrices.
    • A scalar gate $\alpha_t \in (0, 1)$ is computed, applying RMSNorm to the Query and Key for gradient stability:

$$\alpha_t = \sigma \left( \frac{\text{RMSNorm}(h_t)^\top \text{RMSNorm}(k_t)}{\sqrt{d}} \right)$$

The gated output is $\tilde{v}_t = \alpha_t \cdot v_t$. This gate semantically aligns the retrieved memory with the current context, suppressing irrelevant or noisy retrievals.
    • To expand the receptive field and enhance non-linearity, a short depthwise causal convolution is applied. Let $\tilde{V} \in \mathbb{R}^{T \times d}$ be the sequence of gated values. The final output $Y$ is:

$$Y = \text{SiLU} \left( \text{Conv1D}(\text{RMSNorm}(\tilde{V})) \right) + \tilde{V}$$

    • The Engram module is integrated into the backbone via a residual connection: $H^{(\ell)} \leftarrow H^{(\ell)} + Y$. It is not applied at every layer; its placement is chosen to balance modeling performance against system latency constraints.
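The gating-and-fusion step can be sketched end to end in NumPy. Dimensions, weight initializations, and the explicit causal-padding loop are illustrative assumptions, not the paper's implementation:

```python
# Hedged sketch of the context-aware gate and convolutional fusion:
# alpha_t = sigma(<RMSNorm(h_t), RMSNorm(k_t)> / sqrt(d)), then a short
# depthwise causal Conv1D + SiLU with a residual back onto the gated values.
import numpy as np

rng = np.random.default_rng(1)
T, d, d_mem, KERNEL = 6, 16, 32, 4          # toy sizes (assumed)

def rmsnorm(x, eps=1e-6):
    return x / np.sqrt((x * x).mean(axis=-1, keepdims=True) + eps)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

H = rng.standard_normal((T, d))             # hidden states h_t (dynamic queries)
Et = rng.standard_normal((T, d_mem))        # retrieved memory vectors e_t
W_K = rng.standard_normal((d, d_mem)) * 0.1
W_V = rng.standard_normal((d, d_mem)) * 0.1
W_conv = rng.standard_normal((d, KERNEL)) * 0.1   # depthwise causal kernel

K = Et @ W_K.T                              # k_t = W_K e_t
V = Et @ W_V.T                              # v_t = W_V e_t

# Scalar gate per position, RMSNorm on Query and Key for stability.
alpha = sigmoid((rmsnorm(H) * rmsnorm(K)).sum(-1) / np.sqrt(d))
V_tilde = alpha[:, None] * V                # gated values, shape (T, d)

# Depthwise causal Conv1D: position t only sees positions t-KERNEL+1 .. t.
Xn = rmsnorm(V_tilde)
pad = np.concatenate([np.zeros((KERNEL - 1, d)), Xn])   # causal left-padding
conv = np.stack([(pad[t:t + KERNEL].T * W_conv).sum(-1) for t in range(T)])
Y = conv * sigmoid(conv) + V_tilde          # SiLU(x) = x * sigmoid(x); residual
```

In the full model, $Y$ would then be added residually onto $H^{(\ell)}$ at the layers where Engram is placed.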
  3. Integration with Multi-branch Architectures:
For multi-branch backbones (e.g., Manifold-Constrained Hyper-Connections, mHC), a parameter-sharing strategy is used. A single sparse embedding table and one Value projection matrix $W_V$ are shared across $M$ branches, while $M$ distinct Key projection matrices $\{W_K^{(m)}\}_{m=1}^M$ provide branch-specific gating. For the $m$-th branch with hidden state $h_t^{(m)}$, the branch-specific gating signal is:

$$\alpha_t^{(m)} = \sigma \left( \frac{\text{RMSNorm}(h_t^{(m)})^\top \text{RMSNorm}(W_K^{(m)} e_t)}{\sqrt{d}} \right)$$

The retrieved memory is modulated by these independent gates applied to the shared value vector: $u_t^{(m)} = \alpha_t^{(m)} \cdot (W_V e_t)$. This design allows the linear projections to be fused into a single dense FP8 matrix multiplication.

  4. System Efficiency: Decoupling Compute and Memory:
Engram's deterministic retrieval allows for efficient hardware-algorithm co-design.
    • Training: The massive embedding tables are sharded across GPUs using standard model parallelism; All-to-All communication gathers active rows and dispatches gradients.
    • Inference: Engram tables are offloaded to host memory. Since the indices are known before the forward pass, a prefetch-and-overlap strategy is used: the system asynchronously retrieves embeddings from host memory over PCIe, overlapping this communication with on-device computation of the preceding Transformer blocks. Because Engram is active only at specific layers, those preceding blocks provide a compute window that masks the transfer latency.
    • Multi-Level Cache Hierarchy: N-gram accesses follow a Zipfian distribution, so frequently accessed embeddings are cached in faster tiers (GPU HBM/host DRAM) while rare patterns reside in slower, high-capacity media (NVMe SSD), enabling scaling to massive capacities with minimal latency impact.
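The prefetch-and-overlap idea can be illustrated with a toy host-memory table and a background thread standing in for the asynchronous PCIe gather. Everything here (table, timings, names) is a conceptual stand-in, not the paper's runtime:

```python
# Conceptual sketch of prefetch-and-overlap: because Engram indices are
# deterministic and known before the forward pass, the embedding gather can
# start immediately and run concurrently with earlier Transformer blocks.
import threading
import queue
import time

# Stand-in for an embedding table resident in host memory.
host_table = {i: [float(i)] * 4 for i in range(1000)}

def prefetch(indices, out_q):
    """Simulates an asynchronous gather of the rows needed at the Engram layer."""
    rows = [host_table[i] for i in indices]
    out_q.put(rows)

def forward_pass(engram_indices):
    fetched = queue.Queue()
    t = threading.Thread(target=prefetch, args=(engram_indices, fetched))
    t.start()                       # kick off the gather before any compute
    # ... computation of the preceding Transformer blocks runs here,
    # overlapping with the host-memory transfer ...
    time.sleep(0.01)                # stand-in for that compute window
    rows = fetched.get()            # embeddings are ready at the Engram layer
    t.join()
    return rows

rows = forward_pass([3, 17, 42])
```

The same structure extends naturally to the cache hierarchy: a miss in the fast tier simply routes the same precomputed index to a slower tier.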

Scaling Laws and Sparsity Allocation

The paper investigates the synergy between MoE (conditional computation) and Engram (conditional memory) by formulating the Sparsity Allocation problem: find the optimal distribution of sparse capacity given a fixed total parameter budget $P_{\text{tot}}$ and a fixed number of activated parameters per token $P_{\text{act}}$. The inactive parameters $P_{\text{sparse}} = P_{\text{tot}} - P_{\text{act}}$ represent the "free" parameter budget. The allocation ratio $\rho \in [0, 1]$ is defined as the fraction of $P_{\text{sparse}}$ assigned to MoE experts:

$$P_{\text{MoE}}^{\text{(sparse)}} = \rho P_{\text{sparse}}, \qquad P_{\text{Engram}} = (1 - \rho) P_{\text{sparse}}$$

Experiments at two compute budgets ($2 \times 10^{20}$ and $6 \times 10^{20}$ FLOPs) reveal a consistent U-shaped relationship between validation loss and $\rho$. Pure MoE ($\rho = 1$) is suboptimal: reallocating approximately 20-25% of the sparse parameter budget to Engram yields the best performance (optimal $\rho \approx 75\%$-$80\%$). This U-shape confirms the structural complementarity: MoE-dominated models lack dedicated memory, while Engram-dominated models lose conditional computation capacity.
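As a back-of-the-envelope check of this split, the Engram-27B configuration reported below can be plugged into the allocation formulas (the paper's exact parameter accounting may differ slightly):

```python
# Sanity check of the sparsity-allocation arithmetic for Engram-27B:
# P_tot ~ 26.7B total, P_act ~ 3.8B activated, rho ~ 74.3% to MoE experts.
P_tot, P_act = 26.7e9, 3.8e9
P_sparse = P_tot - P_act                 # "free" sparse budget, ~22.9B
rho = 0.743                              # fraction allocated to MoE experts
P_moe = rho * P_sparse                   # sparse parameters kept in experts
P_engram = (1 - rho) * P_sparse          # ~5.9B, close to the reported 5.7B module
```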

In the "infinite memory regime," where the memory budget is relaxed, scaling the number of Engram memory slots yields consistent validation-loss improvements that follow a power law. This establishes Engram as a predictable and effective scaling knob, more efficient than direct N-gram integration methods such as OverEncoding.

Large Scale Pre-training

The paper validates Engram's efficacy in real-world pre-training. Four models are trained for 262B tokens with identical data and strictly matched activated parameters (3.8B):

  • Dense-4B: Baseline dense model (4.1B total params).
  • MoE-27B: MoE with 72 routed experts (top-6 activated), 26.7B total params.
  • Engram-27B: Derived from MoE-27B by reducing the number of routed experts from 72 to 55 and reallocating the freed parameters to a 5.7B-parameter Engram module, keeping 26.7B total parameters ($\rho \approx 74.3\%$). Engram modules are placed at layers 2 and 15, with a maximum N-gram order of 3, 8 hash heads, and a 1280-dimensional memory.
  • Engram-40B: Same backbone as Engram-27B but scales the Engram module to 18.5B params (39.5B total params).

Results (Table 1) show that the sparse models (MoE-27B, Engram-27B/40B) significantly outperform the iso-FLOPs Dense-4B. Crucially, Engram-27B achieves superior performance across diverse domains compared to the strictly iso-parameter and iso-FLOPs MoE-27B baseline, e.g., MMLU +3.4, CMMLU +4.0, BBH +5.0, ARC-Challenge +3.7, HumanEval +3.0, and MATH +2.4. Engram-40B improves further, showcasing the scaling potential of conditional memory.

Mechanistic analyses reveal that Engram relieves the backbone's early layers from static reconstruction, effectively deepening the network for complex reasoning. By delegating local dependencies to lookups, it frees up attention capacity for global context, substantially boosting long-context retrieval (e.g., Multi-Query NIAH: 84.2 → 97.0). Finally, Engram's deterministic addressing allows runtime prefetching from host memory, incurring negligible overhead (<3%), bypassing GPU memory constraints and facilitating aggressive parameter expansion.