ServiceNow AI Research
Tags
Parallelism
Layered gradient accumulation and modular pipeline parallelism: fast and efficient training of large language models
The advent of the transformer has sparked a quick growth in the size of language models, far outpacing hardware improvements. (Dense) …
Joel Lamy Poirier
ArXiv, 2024.