Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction

Muralidhar Andoorveedu¹, Zhanda Zhu²,³, Bojian Zheng¹,³, Gennady Pekhimenko¹,³

¹ University of Toronto, Toronto, Canada
² Shanghai Jiao Tong University, Shanghai, China
³ Vector Institute, Toronto, Canada

{andoorve,zhanda,bojian,pekhimenko}@cs.tor...
                                    
                                        2025-05-02