MoM: Linear Sequence Modeling with Mixture-of-Memories

Feb 19, 2025 · Jusen Du, Weigao Sun, Disen Lan, Jiaxi Hu, Yu Cheng

Last updated on Feb 19, 2025