FAST MULTI-ENCODING TO REDUCE THE COST OF VIDEO STREAMING

Hadi Amirpour1, Vignesh V Menon1, Ekrem Çetinkaya1, Adithyan
Ilangovan2, Christian Feldmann2, Martin Smole2, and Christian Timmerer1,2
1Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität, Klagenfurt, Austria
{firstname.lastname}@aau.at
2Bitmovin, Klagenfurt, Austria
{firstname.lastname}@bitmovin.com
ABSTRACT
The growth in video Internet traffic and advancements in video attributes such as
framerate, resolution, and bit-depth boost the demand to devise a large-scale, highly
efficient video encoding environment. This is even more essential for Dynamic
Adaptive Streaming over HTTP (DASH)-based content provisioning as it requires
encoding numerous representations of the same video content. High Efficiency
Video Coding (HEVC) is one standard video codec that significantly improves
encoding efficiency over its predecessor Advanced Video Coding (AVC). This
improvement is achieved at the expense of significantly increased time complexity,
which is a challenge for content and service providers. As various representations
are the same video content encoded at different bitrates or resolutions, the encoding
analysis information from the already encoded representations can be shared to
accelerate the encoding of other representations. Several state-of-the-art schemes
first encode a single representation, called the reference representation. During this
encoding, the encoder creates analysis metadata with information such as the
slice-type decisions, the CU, PU, and TU partitioning, and the HEVC bitstream itself. The
remaining representations, called dependent representations, analyze this
metadata and reuse it to skip searching some partitionings, thus reducing the
computational complexity. With the emergence of cloud-based encoding services,
video encoding is accelerated by utilizing more resources: with multi-core CPUs,
multiple representations can be encoded in parallel. This
paper presents an overview of a wide range of multi-encoding schemes with and
without the support of machine learning approaches integrated into the HEVC Test
Model (HM) and x265, respectively. Seven multi-encoding schemes are presented,
and their encoding time and bitrate overhead are compared with those of
state-of-the-art approaches. Enabling fast multi-encoding for HAS
in modern Over-the-top (OTT) workflows will reduce time-to-market and costs
immensely.
INTRODUCTION
HTTP Adaptive Streaming (HAS) is the de facto standard for delivering video over the
Internet to a variety of devices. The main idea behind HAS is to divide the video content into
segments and encode each segment at various bitrates and resolutions, called
representations, which are stored on plain HTTP servers as shown in Figure 1. Storing
several representations allows the video delivery to adapt continuously to the network
conditions and device capabilities of the client. To meet the high demand for streaming
high-quality video content over the Internet and overcome the associated challenges in HAS, the
Moving Picture Experts Group (MPEG) has developed a standard called Dynamic Adaptive
Streaming over HTTP (MPEG-DASH) [1]. The increase in video traffic and improvements in
video characteristics such as resolution, framerate, and bit-depth raise the need to develop
a large-scale, highly efficient video encoding environment [2]. This is even more crucial for
DASH-based content provisioning as it requires encoding multiple representations of the
same video content [3].
High Efficiency Video Coding (HEVC) [4] is one standard video codec that is widely
used in content production today. According to Bitmovin’s 2021 video developer report
[5], HEVC was used in 49% of productions in 2021 and was expected to be added to more than
25% of additional productions in 2022. HEVC significantly improves coding efficiency over its
predecessor Advanced Video Coding (AVC) [6]. This improvement is achieved at the cost
of significantly increased runtime complexity, which is a challenge for content and service
providers. As various representations of the same video content are encoded at different
bitrates or resolutions, the encoding analysis information from the
already encoded representations can be shared to accelerate the encoding of other
representations.
Several state-of-the-art schemes [6-12] first encode a single representation, called the
reference representation. During this encoding, the encoder creates an analysis metadata
file with information such as the slice-type decisions [13], the CU, PU, and TU partitioning
[14], and the HEVC bitstream itself. The remaining representations, called dependent
representations, analyze this metadata and reuse it to skip searching some partitionings,
thus reducing the computational complexity. With the emergence of cloud-based encoding
services [15], and in live applications, video encoding is accelerated by utilizing more
resources: with multi-core CPUs, multiple representations can be encoded
in parallel.
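The reference/dependent workflow described above can be sketched as follows. This is a minimal illustration, not the implementation of any scheme cited here: `encode_reference`, `encode_dependent`, the metadata fields, and the bitrate values are all hypothetical stand-ins, and a thread pool merely stands in for the multi-core or cloud resources a real encoder would use.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_reference(bitrate_kbps):
    """Stand-in for a full encode; returns hypothetical analysis metadata."""
    return {"reference_bitrate_kbps": bitrate_kbps, "cu_depths": [0, 1, 3, 2]}


def encode_dependent(bitrate_kbps, metadata):
    """Stand-in for a faster encode that consults the reference metadata."""
    return {
        "bitrate_kbps": bitrate_kbps,
        "reused_from_kbps": metadata["reference_bitrate_kbps"],
    }


def multi_rate_encode(bitrates_kbps, reference_kbps, workers=4):
    """Encode the reference first, then all dependent representations in parallel."""
    metadata = encode_reference(reference_kbps)
    dependents = [b for b in bitrates_kbps if b != reference_kbps]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order of the dependent bitrates.
        results = list(pool.map(lambda b: encode_dependent(b, metadata), dependents))
    return metadata, results


if __name__ == "__main__":
    meta, deps = multi_rate_encode([1000, 2000, 4000], reference_kbps=4000)
    print(meta["reference_bitrate_kbps"], [d["bitrate_kbps"] for d in deps])
```

The point of the sketch is the dependency: the dependent encodes cannot start until the reference metadata exists, but they are independent of one another and can therefore run in parallel.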
In this paper, the schemes are analyzed for both serial and parallel encoding environments.
The term multi-rate is used when all representations are encoded at a single resolution but
at different bitrates. Multi-encoding is used when a single video is provided at various
resolutions, and each resolution is encoded at different bitrates.
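The distinction between the two terms can be made concrete with a small sketch. The function names and the resolution and bitrate values below are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def build_ladder(resolutions, bitrates_kbps):
    """Return one representation per (resolution, bitrate) pair."""
    return [
        {"width": w, "height": h, "bitrate_kbps": b}
        for (w, h), b in product(resolutions, bitrates_kbps)
    ]


def is_multi_rate(ladder):
    """True when every representation shares a single resolution."""
    return len({(r["width"], r["height"]) for r in ladder}) == 1


if __name__ == "__main__":
    # Multi-rate: one resolution, several bitrates (hypothetical values).
    multi_rate = build_ladder([(1920, 1080)], [1000, 2000, 4000])
    # Multi-encoding: several resolutions, each at several bitrates.
    multi_encoding = build_ladder([(960, 540), (1920, 1080)], [1000, 2000])
    print(len(multi_rate), is_multi_rate(multi_rate))
    print(len(multi_encoding), is_multi_rate(multi_encoding))
```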
BACKGROUND AND RELATED WORK
In HAS, the same video content is encoded at multiple bitrates to continuously adapt the
video delivery to the users’ needs, resulting in a significant increase in the encoding cost.
However, as the same content is encoded multiple times, the encoder analysis information
from the already encoded representation(s) can be reused to speed up the encoding
process of the remaining representations. In HEVC, frames are first divided into slices, and
then they are further divided into square regions called Coding Tree Units (CTUs) [4], which
are the main building blocks of HEVC. Each CTU is then recursively divided into
smaller square regions called Coding Units (CUs) (see Figure 2). Depth values from 0 to 3
are assigned to CU sizes from 64×64 to 8×8 pixels. Therefore, to find the best CTU
partitioning, 85 CUs including one 64×64 CU, four 32×32 CUs, sixteen 16×16 CUs, and
sixty-four 8×8 CUs are searched.
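The count of 85 CUs follows from the quadtree structure: depth d contains 4^d CUs, and 1 + 4 + 16 + 64 = 85. The short sketch below reproduces this arithmetic and also shows how many CUs remain if the search is stopped at a smaller depth. The depth bound is an assumed illustration of how reference metadata could shrink the search, not necessarily the rule used by any scheme discussed here; the constant and function names are hypothetical.

```python
CTU_SIZE = 64   # CTU edge length in pixels (depth 0)
MAX_DEPTH = 3   # deepest CU level, giving 8x8 CUs


def cu_size(depth):
    """Edge length of a CU at the given quadtree depth."""
    return CTU_SIZE >> depth


def cus_at_depth(depth):
    """Number of CUs of this depth inside one CTU."""
    return 4 ** depth


def cus_searched(max_depth=MAX_DEPTH):
    """CUs evaluated when depths 0..max_depth are searched exhaustively."""
    return sum(cus_at_depth(d) for d in range(max_depth + 1))


if __name__ == "__main__":
    for d in range(MAX_DEPTH + 1):
        print(d, cu_size(d), cus_at_depth(d))
    print(cus_searched())    # full search over depths 0..3
    print(cus_searched(1))   # search bounded at depth 1 (assumed heuristic)
```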
Figure 2. In HEVC, frames are divided into CTUs, and each CTU is then divided into CUs. Each CU is subdivided
into PUs, and the prediction residuals of CUs are partitioned into TUs. The optimal CTU partitioning is found after
an exhaustive search process through all CUs, PUs, and TUs.
The inter-prediction modes comprise Merge/Skip 2N×2N, Inter 2N×2N, Symmetric Motion
Partition (SMP, including Inter 2N×N and Inter N×2N), Asymmetric Motion Partition (AMP,
including Inter 2N×nU, Inter 2N×nD, Inter nL×2N, and Inter nR×2N), and Inter N×N. The
intra-prediction modes comprise Intra 2N×2N and Intra N×N. The PU mode with the
minimum rate-distortion cost (RD-cost) among all modes is selected as the best. Furthermore, for
transform coding of the prediction residuals, each CU can be partitioned into multiple
Figure 1. Encoded video representations as used in DASH. Several resolutions, from the lowest to the highest
frame width and height, are stored, and each resolution holds a set of bitrate representations whose target
bitrates are listed in ascending order.
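The PU mode decision described in the text above (choose the mode with the minimum RD-cost) can be sketched as follows. The sketch assumes the common Lagrangian form J = D + λ·R for the RD-cost, which the text does not spell out; the distortion, rate, and λ values are made up for illustration, and the function names are hypothetical.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * rate_bits


def best_pu_mode(candidates, lam):
    """Return the name of the candidate mode with the minimum RD-cost.

    `candidates` maps a mode name to a (distortion, rate_bits) pair.
    """
    return min(candidates, key=lambda m: rd_cost(*candidates[m], lam))


if __name__ == "__main__":
    # Hypothetical (distortion, rate) measurements for one CU.
    modes = {
        "Merge/Skip 2Nx2N": (1200.0, 4),
        "Inter 2Nx2N": (900.0, 40),
        "Inter 2NxN": (850.0, 70),
        "Intra 2Nx2N": (1000.0, 120),
    }
    print(best_pu_mode(modes, lam=5.0))
    print(best_pu_mode(modes, lam=20.0))
```

A larger λ penalizes rate more heavily, so the cheap Merge/Skip mode tends to win at lower target bitrates in this toy example.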