I am a Young Scientist at Shanghai AI Laboratory. I am currently fortunate to collaborate with Prof. Yu Cheng on Efficient AI for LLMs/VLMs, including algorithm and system co-design for Efficient Architectures (e.g., Linear Attention and Mixture-of-Experts) and Large-Scale Distributed Training.
See our research projects Linear-MoE, LASP-2, Linearization (Liger), MoM, NHA, and Comba for technical details. My previous work, Lightning Attention and the LASP series, are the key techniques in the MiniMax-01 456B LLM and VLM.
🔥 I am looking for talented interns to work with me on the above projects and beyond. If you are interested, please feel free to hit me up with your CV or any questions.
From 2020 to 2022, I was an AI Researcher at Linx Lab, 2012 Lab, Huawei, supervised by Jiashu Lin and Dr. Heng Liao, where I worked on large-scale distributed training algorithms. I earned my PhD degree from Huazhong University of Science and Technology (HUST) in April 2020, co-supervised by Prof. Hai-Tao Zhang and Prof. Ye Yuan, and was jointly trained at the School of Artificial Intelligence and Automation (AIA) and the HUST Innovation Institute (with the First-class Grant).
(Updated in 2025.05)