Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Muralidhar Andoorveedu¹, Zhanda Zhu²,³, Bojian Zheng¹,³, Gennady Pekhimenko¹,³
¹ University of Toronto, Toronto, Canada
² Shanghai Jiao Tong University, Shanghai, China
³ Vector Institute, Toronto, Canada
{andoorve,zhanda,bojian,pekhimenko}@cs.tor...
2025-05-02