Video Summarization Overview
Suggested Citation:
Mayu Otani, Yale Song and Yang Wang (2022), “Video Summarization Overview”, Foundations and Trends in Computer Graphics and Vision: Vol. 13, No. 4, pp 284–335. DOI: 10.1561/0600000099.
Mayu Otani
CyberAgent, Inc.
otani_mayu@cyberagent.co.jp
Yale Song
Microsoft Research
yalesong@microsoft.com
Yang Wang
University of Manitoba
ywang@cs.umanitoba.ca
This article may be used only for the purpose of research, teaching, and/or private study. Commercial use or systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisher approval.
Boston — Delft
arXiv:2210.11707v1 [cs.CV] 21 Oct 2022
Contents
1 Introduction 285
2 Taxonomy of Video Summarization 288
2.1 Video Domains . . . . . . . . . . . . . . . . . . . . . . . 289
2.2 Purposes of Video Summaries . . . . . . . . . . . . . . . . 293
2.3 Output Format of Video Summarization . . . . . . . . . . 294
3 Video Summarization Approaches 297
3.1 Heuristic Approaches . . . . . . . . . . . . . . . . . . . . 300
3.2 Machine learning-based Approaches . . . . . . . . . . . . 304
3.3 Personalizing Video Summaries . . . . . . . . . . . . . . . 309
4 Benchmarks and Evaluation 317
4.1 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
4.2 Evaluation Measures . . . . . . . . . . . . . . . . . . . . . 319
4.3 Limitations of Evaluation . . . . . . . . . . . . . . . . . . 322
5 Open Challenges 324
6 Conclusion 326
References 327
Video Summarization Overview
Mayu Otani¹, Yale Song² and Yang Wang³
¹ CyberAgent, Inc.; otani_mayu@cyberagent.co.jp
² Microsoft Research; yalesong@microsoft.com
³ University of Manitoba; ywang@cs.umanitoba.ca
ABSTRACT
With the broad growth of video capturing devices and video applications on the web, there is an increasing demand to deliver desired video content to users efficiently. Video summarization facilitates quickly grasping video content by creating a compact summary of videos. Much effort has been devoted to automatic video summarization, and a variety of problem settings and approaches have been proposed. Our goal is to provide an overview of this field. This survey covers early studies as well as recent approaches that take advantage of deep learning techniques. We describe video summarization approaches and their underlying concepts. We also discuss benchmarks and evaluation: we review how prior work addressed evaluation and detail the pros and cons of the evaluation protocols. Last but not least, we discuss open challenges in this field.
Mayu Otani, Yale Song and Yang Wang (2022), “Video Summarization Overview”, Foundations and Trends in Computer Graphics and Vision: Vol. 13, No. 4, pp 284–335. DOI: 10.1561/0600000099.
1 Introduction
The widespread use of the internet and affordable video capturing devices has dramatically changed the landscape of video creation and consumption. In particular, user-created videos are more prevalent than ever with the evolution of video streaming services and social networks. The rapid growth of video creation necessitates advanced technologies that enable efficient consumption of desired video content. Scenarios include enhancing the experience of viewers on video streaming services, enabling quick browsing for video creators who need to go through massive amounts of video rushes, and supporting security teams who need to monitor surveillance videos.
Video summarization facilitates quickly grasping video content by creating a compact summary of a video. One naive way to achieve this would be to increase the playback speed or to sample short segments at uniform intervals. However, the former degrades the audio quality and distorts the motion (Benaim et al., 2020), while the latter may miss important content because the sampling is agnostic to what the video actually shows. Rather than relying on such naive solutions, video summarization aims to extract the information viewers desire for more effective video browsing.
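As a concrete illustration of the uniform-interval baseline mentioned above, consider the following minimal Python sketch (ours, not a method from the summarization literature): it keeps a short clip at the start of every fixed-length interval. It assumes OpenCV (cv2) for decoding and writing video, and segment_sec and clip_sec are illustrative parameters. The key point is that frame importance is never estimated, which is exactly why such a baseline can miss the content viewers care about.

import cv2

def uniform_clip_indices(n_frames, fps, segment_sec=60.0, clip_sec=2.0):
    """Frame indices kept by a uniform-interval summary."""
    seg_len = int(segment_sec * fps)   # frames per interval
    clip_len = int(clip_sec * fps)     # frames kept at the start of each interval
    keep = []
    for start in range(0, n_frames, seg_len):
        keep.extend(range(start, min(start + clip_len, n_frames)))
    return keep

def summarize_uniform(in_path, out_path, segment_sec=60.0, clip_sec=2.0):
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    keep = set(uniform_clip_indices(n_frames, fps, segment_sec, clip_sec))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx in keep:  # content-agnostic selection: importance is never estimated
            writer.write(frame)
        idx += 1
    cap.release()
    writer.release()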
The purpose of a video summary varies considerably depending on the application scenario. For sports, viewers want to see moments that are critical to the outcome of a game, whereas for surveillance, summaries need to contain scenes that are unusual and noteworthy. Application scenarios keep growing as more videos are created; for example, we are beginning to see new types of videos such as video game live streaming and video blogs (vlogs). This poses new problems for video summarization, as different types of videos have different characteristics and viewers have particular demands for their summaries. This variety of applications has stimulated heterogeneous research in the field.
Video summarization addresses two principal problems: “what makes a desirable video summary” and “how can we model video content.” The answers depend on the application scenario. While these are still open problems for most scenarios, many promising ideas have been proposed in the literature. Early work made various assumptions about the requirements for video summaries, e.g., uniqueness (low redundancy), diversity, and interestingness. Some works focused on creating video summaries that are relevant to a user’s intention and involve user interaction. Recent research focuses more on data-driven approaches that learn desired video summaries from annotated datasets.
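To make the uniqueness and diversity assumptions concrete, here is a toy Python sketch (ours, not any specific method covered in this survey) that greedily selects segments with high importance scores while penalizing similarity to segments already chosen. The segment features, importance scores, and redundancy_weight are assumed inputs used only for illustration.

import numpy as np

def greedy_diverse_summary(features, scores, budget, redundancy_weight=0.5):
    """features: (n, d) segment descriptors; scores: (n,) importance values.
    Returns indices of at most `budget` selected segments."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T  # cosine similarity between segments
    selected, candidates = [], set(range(len(scores)))
    while candidates and len(selected) < budget:
        best, best_gain = None, -np.inf
        for i in candidates:
            # Penalize candidates that resemble already-selected segments.
            redundancy = max(sim[i, j] for j in selected) if selected else 0.0
            gain = scores[i] - redundancy_weight * redundancy
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
        candidates.remove(best)
    return selected

The data-driven approaches discussed later can be seen as replacing the hand-set scores and weights in such heuristics with quantities learned from annotated data.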
Computational modeling of desirable video content is also an important challenge in video summarization. Starting with low-level features, various feature representations have been applied, such as face recognition and visual saliency. More recently, feature extraction using deep neural networks has become the dominant choice. Some applications further utilize auxiliary information, such as subtitles for documentary videos, game logs for sports videos, and brain waves for egocentric videos captured with wearable cameras.
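As an example of the deep-feature step described above, the sketch below embeds sampled frames of a segment with a pretrained image CNN and mean-pools them into a single descriptor. It assumes PyTorch and torchvision; the choice of ResNet-18 and mean pooling is illustrative rather than tied to any particular paper.

import torch
import torchvision.models as models
import torchvision.transforms as T

# Pretrained backbone with the classification head removed.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.ToPILImage(),
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def segment_feature(frames):
    """frames: list of HxWx3 uint8 RGB arrays sampled from one segment."""
    batch = torch.stack([preprocess(f) for f in frames])
    embeddings = backbone(batch)   # (n_frames, 512) frame-level features
    return embeddings.mean(dim=0)  # mean pooling into one segment descriptor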
The goal of this survey is to provide a comprehensive overview of the
video summarization literature. We review various video summarization
approaches and compare their underlying concepts and assumptions.
We start with early works that proposed seminal concepts for video
summarization, and also cover recent data-driven approaches that take
advantage of end-to-end deep learning. By categorizing the diverse
research in terms of application scenarios and techniques employed, we