LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
Nov 24, 2024 · Xiaoye Qu, Daize Dong, Xuyang Hu, Tong Zhu, Weigao Sun, Yu Cheng
PDF · Cite · Code