ServiceNow AI Research

Apriel-SSM: Converting Pre-Trained Transformer LLMs Into Subquadratic Hybrid Models Through Iterative End-to-End Distillation

Abstract

Large language models owe much of their success to transformer architectures, whose attention mechanism computes each token representation as a weighted combination of all preceding tokens. However, the attention module has quadratic complexity in sequence length, and inference requires caching key-value representations for every previous token, which severely limits throughput. State-space models (SSMs) such as Mamba-2 offer linear complexity and a constant memory footprint by processing the sequence recurrently with a fixed-size hidden state. We propose converting pre-trained transformer LLMs into efficient hybrid architectures via end-to-end distillation. Applied to the recently released Apriel models (5B Apriel-Instruct and 15B Apriel-Nemotron-Thinker), our method demonstrates significant throughput improvements and increased maximum batch sizes with minimal performance degradation.
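The contrast between the two mechanisms can be illustrated with a toy sketch. It is not the Apriel-SSM or Mamba-2 implementation: it uses scalar tokens, and the function names and the `decay` and `gain` parameters are hypothetical choices made only for this illustration. The attention function keeps a cache that grows by one entry per token and re-reads all of it at every step, while the recurrence carries a single fixed-size state.

```python
import math


def causal_attention(queries, keys, values):
    """Toy scalar causal attention (illustrative assumption, not a real model).

    Output t is a softmax-weighted average of values[0..t], so step t reads
    t + 1 cached entries and total work grows quadratically with length.
    """
    k_cache, v_cache = [], []
    outputs, cache_sizes = [], []
    for q, k, v in zip(queries, keys, values):
        # The key-value cache grows by one entry per generated token.
        k_cache.append(k)
        v_cache.append(v)
        scores = [q * kc for kc in k_cache]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append(sum(w * vc for w, vc in zip(weights, v_cache)) / total)
        cache_sizes.append(len(k_cache))
    return outputs, cache_sizes


def linear_recurrence(inputs, decay=0.9, gain=0.1):
    """Toy fixed-state recurrence: h_t = decay * h_{t-1} + gain * x_t.

    Only the single number h is carried between steps, so memory is constant
    and total work grows linearly with length.
    """
    h = 0.0
    outputs = []
    for x in inputs:
        h = decay * h + gain * x
        outputs.append(h)
    return outputs


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    _, sizes = causal_attention(xs, xs, xs)
    print("attention cache entries per step:", sizes)
    print("recurrence outputs:", linear_recurrence(xs))
```

In this sketch the attention cache holds one entry per token seen so far, whereas the recurrence stores one value regardless of sequence length, which is the property that allows larger batch sizes at a given memory budget.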

Publication
NOW AI
Oleksiy Ostapenko
Research Scientist

Research Scientist at AI Foundation Model located at Montreal, QC, Canada.

Shambhavi Mishra
Visiting Researcher

Visiting Researcher at AI Foundation Model located at Montreal, QC, Canada.

Luke Kumar
Applied Research Scientist

Applied Research Scientist at AI Research Deployment located at Toronto, ON, Canada.

Denis Kocetkov
AI Developer

AI Developer at AI Foundation Model located at London, United Kingdom.

Raymond Li
AI Developer

AI Developer at AI Foundation Model located at Montreal, QC, Canada.

Joel Lamy Poirier
Applied Research Scientist

Applied Research Scientist at AI Foundation Model located at Montreal, QC, Canada.

Sébastien Paquet
Research Manager

Research Manager at AI Research Deployment located at Montreal, QC, Canada.

Torsten Scholak
Research Lead

Research Lead at AI Foundation Model located at Montreal, QC, Canada.