Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning
Yuchong Sun1, Hongwei Xue2, Ruihua Song1†, Bei Liu3†, Huan Yang3, Jianlong Fu3
1 Renmin University of China, Beijing, China; 2 University of Science and Technology of China, Hefei, China; 3 Microsoft Research, Beijing, China
1{ycsun,rsong}@ruc.edu.cn, 2gh0...
                                    