Video Summarization Overview
Suggested Citation:
Mayu Otani, Yale Song and Yang Wang (2022), “Video Summarization Overview”, Foundations and Trends in Computer Graphics and Vision: Vol. 13, No. 4, pp 284–335. DOI: 10.1561/0600000099.
Mayu Otani
CyberAgent, Inc.
otani_mayu@cyberagent.co.jp
Yale Song
Microsoft Research
yalesong@microsoft.com
Yang Wang
University of Manitoba
ywang@cs.umanitoba.ca
This article may be used only for the purpose of research, teaching, and/or private study. Commercial use or systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisher approval.
Boston — Delft
arXiv:2210.11707v1 [cs.CV] 21 Oct 2022
Contents
1 Introduction 285
2 Taxonomy of Video Summarization 288
2.1 Video Domains . . . . . . . . . . . . . . . . . . . . . . . 289
2.2 Purposes of Video Summaries . . . . . . . . . . . . . . . . 293
2.3 Output Format of Video Summarization . . . . . . . . . . 294
3 Video Summarization Approaches 297
3.1 Heuristic Approaches . . . . . . . . . . . . . . . . . . . . 300
3.2 Machine learning-based Approaches . . . . . . . . . . . . 304
3.3 Personalizing Video Summaries . . . . . . . . . . . . . . . 309
4 Benchmarks and Evaluation 317
4.1 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
4.2 Evaluation Measures . . . . . . . . . . . . . . . . . . . . . 319
4.3 Limitations of Evaluation . . . . . . . . . . . . . . . . . . 322
5 Open Challenges 324
6 Conclusion 326
References 327
Video Summarization Overview
Mayu Otani¹, Yale Song² and Yang Wang³
¹ CyberAgent, Inc.; otani_mayu@cyberagent.co.jp
² Microsoft Research; yalesong@microsoft.com
³ University of Manitoba; ywang@cs.umanitoba.ca
ABSTRACT
With the broad growth of video capturing devices and video applications on the web, there is an increasing demand to deliver desired video content to users efficiently. Video summarization facilitates quickly grasping video content by creating a compact summary of videos. Much effort has been devoted to automatic video summarization, and a variety of problem settings and approaches have been proposed. Our goal is to provide an overview of this field. This survey covers early studies as well as recent approaches that take advantage of deep learning techniques. We describe video summarization approaches and their underlying concepts. We also discuss benchmarks and evaluation: we review how prior work addressed evaluation and detail the pros and cons of the evaluation protocols. Last but not least, we discuss open challenges in this field.
Mayu Otani, Yale Song and Yang Wang (2022), “Video Summarization Overview”, Foundations and Trends in Computer Graphics and Vision: Vol. 13, No. 4, pp 284–335. DOI: 10.1561/0600000099.
1 Introduction
The widespread use of the internet and affordable video capturing devices has dramatically changed the landscape of video creation and consumption. In particular, user-created videos are more prevalent than ever with the evolution of video streaming services and social networks. The rapid growth of video creation necessitates advanced technologies that enable efficient consumption of desired video content. Scenarios include enhancing the experience of viewers on video streaming services, enabling quick browsing for video creators who need to go through massive amounts of video rushes, and supporting security teams who need to monitor surveillance videos.
Video summarization facilitates quickly grasping video content by creating a compact summary of a video. One naive way to achieve this would be to increase the playback speed or to sample short segments at uniform intervals. However, the former degrades the audio quality and distorts the motion (Benaim et al., 2020), while the latter may miss important content because the sampling is agnostic to what the video actually shows. Rather than relying on such naive solutions, video summarization aims to extract the information viewers desire for more effective video browsing.
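As a concrete illustration of the uniform-interval baseline mentioned above, consider the following minimal Python sketch (ours, not a method from the summarization literature): it keeps a short clip at the start of every fixed-length interval. It assumes OpenCV (cv2) for decoding and writing video, and segment_sec and clip_sec are illustrative parameters. The key point is that frame importance is never estimated, which is exactly why such a baseline can miss the content viewers care about.

import cv2

def uniform_clip_indices(n_frames, fps, segment_sec=60.0, clip_sec=2.0):
    """Frame indices kept by a uniform-interval summary."""
    seg_len = int(segment_sec * fps)   # frames per interval
    clip_len = int(clip_sec * fps)     # frames kept at the start of each interval
    keep = []
    for start in range(0, n_frames, seg_len):
        keep.extend(range(start, min(start + clip_len, n_frames)))
    return keep

def summarize_uniform(in_path, out_path, segment_sec=60.0, clip_sec=2.0):
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    keep = set(uniform_clip_indices(n_frames, fps, segment_sec, clip_sec))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx in keep:  # content-agnostic selection: importance is never estimated
            writer.write(frame)
        idx += 1
    cap.release()
    writer.release()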
The purpose of a video summary varies considerably depending on the application scenario. For sports, viewers want to see moments that are critical to the outcome of a game, whereas for surveillance, summaries need to contain scenes that are unusual and noteworthy. Application scenarios keep growing as more videos are created; for example, we are beginning to see new types of videos such as video game live streaming and video blogs (vlogs). This poses new problems for video summarization, as different types of videos have different characteristics and viewers have particular demands for their summaries. This variety of applications has stimulated heterogeneous research in the field.
Video summarization addresses two principal problems: “what makes a desirable video summary” and “how can we model video content.” The answers depend on the application scenario. While these are still open problems for most scenarios, many promising ideas have been proposed in the literature. Early work made various assumptions about the requirements for video summaries, e.g., uniqueness (low redundancy), diversity, and interestingness. Some works focused on creating video summaries that are relevant to a user’s intention and involve user interaction. Recent research focuses more on data-driven approaches that learn desired video summaries from annotated datasets.
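To make the uniqueness and diversity assumptions concrete, here is a toy Python sketch (ours, not any specific method covered in this survey) that greedily selects segments with high importance scores while penalizing similarity to segments already chosen. The segment features, importance scores, and redundancy_weight are assumed inputs used only for illustration.

import numpy as np

def greedy_diverse_summary(features, scores, budget, redundancy_weight=0.5):
    """features: (n, d) segment descriptors; scores: (n,) importance values.
    Returns indices of at most `budget` selected segments."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T  # cosine similarity between segments
    selected, candidates = [], set(range(len(scores)))
    while candidates and len(selected) < budget:
        best, best_gain = None, -np.inf
        for i in candidates:
            # Penalize candidates that resemble already-selected segments.
            redundancy = max(sim[i, j] for j in selected) if selected else 0.0
            gain = scores[i] - redundancy_weight * redundancy
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
        candidates.remove(best)
    return selected

The data-driven approaches discussed later can be seen as replacing the hand-set scores and weights in such heuristics with quantities learned from annotated data.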
Computational modeling of desirable video content is also an important challenge in video summarization. Starting with low-level features, various feature representations have been applied, such as face recognition and visual saliency. More recently, feature extraction using deep neural networks has become the dominant choice. Some applications further utilize auxiliary information, such as subtitles for documentary videos, game logs for sports videos, and brain waves for egocentric videos captured with wearable cameras.
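As an example of the deep-feature step described above, the sketch below embeds sampled frames of a segment with a pretrained image CNN and mean-pools them into a single descriptor. It assumes PyTorch and torchvision; the choice of ResNet-18 and mean pooling is illustrative rather than tied to any particular paper.

import torch
import torchvision.models as models
import torchvision.transforms as T

# Pretrained backbone with the classification head removed.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.ToPILImage(),
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def segment_feature(frames):
    """frames: list of HxWx3 uint8 RGB arrays sampled from one segment."""
    batch = torch.stack([preprocess(f) for f in frames])
    embeddings = backbone(batch)   # (n_frames, 512) frame-level features
    return embeddings.mean(dim=0)  # mean pooling into one segment descriptor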
The goal of this survey is to provide a comprehensive overview of the
video summarization literature. We review various video summarization
approaches and compare their underlying concepts and assumptions.
We start with early works that proposed seminal concepts for video
summarization, and also cover recent data-driven approaches that take
advantage of end-to-end deep learning. By categorizing the diverse
research in terms of application scenarios and techniques employed, we