Improving The Reconstruction Quality by Overfitted
Decoder Bias in Neural Image Compression
Oussama Jourairi
InterDigital, Inc.
Rennes, France
Muhammet Balcilar
InterDigital, Inc.
Rennes, France
Anne Lambert
InterDigital, Inc.
Rennes, France
François Schnitzler
InterDigital, Inc.
Rennes, France
Abstract—End-to-end trainable models have reached the performance of traditional handcrafted compression techniques on videos and images. Since the parameters of these models are learned over large training sets, they are not optimal for any given image to be compressed. In this paper, we propose an instance-based fine-tuning of a subset of the decoder's biases to improve the reconstruction quality in exchange for extra encoding time and a minor additional signaling cost. The proposed method is applicable to any end-to-end compression method and improves the state-of-the-art neural image compression BD-rate by 3–5%.
Keywords—Learning-based image coding, overfitting, fine-tuning.
I. INTRODUCTION
Image and video compression are an important part of our everyday life. These technologies have been refined over decades by experts. Nowadays, compression algorithms, such as those developed by MPEG, consist of finely tuned handcrafted techniques. Recently, deep learning models have been used to develop end-to-end trainable compression algorithms. State-of-the-art neural architectures now compete with traditional compression methods (H.266/VVC [1]), even in terms of peak signal-to-noise ratio (PSNR), for single image compression [2].
One of the main research directions for end-to-end compression focuses on Rate-Distortion Autoencoders [3], a particular type of Variational Autoencoder (VAE) [4]. Optimizing such a model amounts to minimizing the mean square error (MSE) of the decompressed image and the bitlength of the encoded latent values, estimated by their entropy w.r.t. their priors [5]. In practice, these latents are first quantized and then typically encoded by an entropy encoder such as range or arithmetic coding [6]. These encoders exploit the prior distributions over the encoded values (here, the latents) to achieve close-to-optimal compression rates. The priors are also trainable and can themselves have hyperpriors [5], [7]–[12].
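As a rough illustration of this rate estimate, the sketch below computes the ideal entropy-coding cost of rounded latents: an ideal entropy coder spends about $-\log_2 p(v)$ bits per symbol $v$. The Gaussian prior and the tensor shape are our assumptions, standing in for a learned prior; this is not the paper's model.

```python
# A minimal sketch (our illustration, not from the paper) of estimating
# the bitlength of quantized latents under an assumed Gaussian prior.
import torch
from torch.distributions import Normal

def estimated_bits(y: torch.Tensor, mu: float = 0.0, sigma: float = 1.0) -> float:
    prior = Normal(torch.tensor(mu), torch.tensor(sigma))
    y_hat = torch.round(y)                               # quantized latents
    # pmf of integer bin v = prior mass on [v - 0.5, v + 0.5]
    pmf = prior.cdf(y_hat + 0.5) - prior.cdf(y_hat - 0.5)
    return float(-torch.log2(pmf.clamp_min(1e-9)).sum())  # ideal coding cost

y = torch.randn(16, 16, 192)            # toy latent tensor (assumed shape)
print(f"~{estimated_bits(y):.0f} bits")
```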
As usual with deep learning, these models are typically trained on large datasets and then fixed, whereas traditional encoders can adapt to a particular image by, for example, optimizing the quadtree decomposition. Any resulting neural model is therefore likely to be suboptimal for any single image, a problem called the amortization gap [13]. In the compression context, this gap can be leveraged to improve the rate-distortion trade-off, for example by fine-tuning the encoder or the latent codes [14]–[17]. These approaches improve distortion without degrading the rate. Another class of methods fine-tunes the decoder and the entropy model, improving distortion further but degrading the rate, as the modified parameters must be transmitted as well [18]–[20]. Because of this added cost, these approaches have not been applicable to single image compression but only to sets of images [20] or videos [21], where the rate increase is amortized over many images. Another solution is to select one set of parameter values out of predefined sets [18]. This decreases encoding time and signaling cost but again yields limited gains compared to strong baselines.
In this paper, we achieve decoder fine-tuning that improves the reconstruction quality for single image compression, which the literature had so far considered infeasible. This is made possible by our three contributions: 1) the selection of a subset of parameters to be fine-tuned, 2) jointly learning the quantization parameter of the updates, and 3) a new loss function based on interpolating the baseline model's performance. In our experiments, we show a 3–5% BD-rate gain for any given baseline end-to-end image compression model in exchange for extra encoding complexity. A sketch of the per-image fine-tuning loop is given below.
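The sketch below illustrates the core of contribution 1): overfitting only the decoder's bias terms on the single image being encoded. It is our illustration under assumed names (`decoder`, `y_hat`); the actual method additionally quantizes the bias updates with a jointly learned quantization parameter and uses the interpolation-based loss, both omitted here.

```python
# A minimal sketch of per-image fine-tuning of decoder biases only:
# freeze all parameters, unfreeze bias terms, then minimize distortion
# on the one image being compressed. Illustration only; the paper's
# method also quantizes and signals the bias updates.
import torch
import torch.nn.functional as F

def finetune_decoder_biases(decoder: torch.nn.Module,
                            y_hat: torch.Tensor,
                            x: torch.Tensor,
                            steps: int = 100,
                            lr: float = 1e-3) -> None:
    for p in decoder.parameters():            # freeze everything ...
        p.requires_grad_(False)
    biases = [p for n, p in decoder.named_parameters() if n.endswith("bias")]
    for b in biases:                          # ... except bias terms
        b.requires_grad_(True)

    opt = torch.optim.Adam(biases, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x_rec = decoder(y_hat)                # decode the fixed latents
        F.mse_loss(x_rec, x).backward()       # distortion-only objective
        opt.step()
```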
II. NEURAL IMAGE COMPRESSION
An input image to be compressed, $x \in \mathbb{R}^{n \times n \times 3}$, is first processed by a deep encoder $y = g_a(x; \phi)$. $y \in \mathbb{R}^{m \times m \times o}$ is called the latent and is smaller than $x$. This latent is converted into a bitstream by going through a quantizer, $\hat{y} = Q(y)$, and then through an entropy coder exploiting a prior $p_f(\hat{y}|\Psi)$ in [8]. $p_f$ can also depend on some side information $z = h_a(y) \in \mathbb{R}^{k \times k \times f}$ to better model spatial dependencies. $h_a$, another neural network, is also trained. We denote by $\hat{z} = Q(z)$ the quantization of $z$. Both $\hat{y}$ and $\hat{z}$ are encoded, and the encoders respectively use the hyperprior $p_h(\hat{y}|\hat{z}; \Theta)$ and the prior $p_f(\hat{z}|\Psi)$. The latent can be processed by a deep decoder $\hat{x} = g_s(\hat{y}; \theta)$ to obtain the decompressed image $\hat{x}$. The parameters $\phi, \theta, \Psi, \Theta$ are trained using the following rate-distortion loss:
$$\mathcal{L} = \mathbb{E}_{x \sim p_x,\, \epsilon \sim U}\left[-\log(p_h(\hat{y}|\hat{z}, \Theta)) - \log(p_f(\hat{z}|\Psi)) + \lambda d(x, \hat{x})\right], \quad (1)$$
where $d(\cdot, \cdot)$ denotes a distortion loss such as MSE and $\lambda$ controls the trade-off between compression ratio and quality. Note that during training, $Q(\cdot)$ is relaxed into $Q(x) = x + \epsilon$, $\epsilon \sim U(-0.5, 0.5)$.
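A minimal sketch of Eq. (1) at training time is given below. It assumes the entropy models expose per-element likelihood tensors, here named `p_y` and `p_z` (our names, not the paper's); the uniform-noise relaxation keeps the quantizer differentiable.

```python
# A minimal sketch of the training objective in Eq. (1). The likelihood
# tensors p_y and p_z are assumed outputs of the entropy models.
import torch
import torch.nn.functional as F

def relaxed_quantize(y: torch.Tensor) -> torch.Tensor:
    # Training-time relaxation Q(y) = y + eps, eps ~ U(-0.5, 0.5).
    return y + torch.empty_like(y).uniform_(-0.5, 0.5)

def rd_loss(x, x_rec, p_y, p_z, lam):
    # Rate: -log p terms of Eq. (1); the log base only rescales lambda.
    rate = -(torch.log(p_y).sum() + torch.log(p_z).sum())
    dist = F.mse_loss(x_rec, x)               # distortion d(x, x_hat)
    return rate + lam * dist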
Typically, $p_f(\hat{z}|\Psi)$ is factorized into $f$ independent slices of size $k \times k$. Each slice has its own trainable cumulative distribution function (cdf), $\bar{p}^{(c)}_{\Psi}(\cdot)$, $c = 1 \ldots f$. From the cdf and for any value of $x$, the probability mass function (pmf) is obtained as the cdf mass of the quantization bin, $\bar{p}^{(c)}_{\Psi}(x + 0.5) - \bar{p}^{(c)}_{\Psi}(x - 0.5)$.
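As a small illustration of this cdf-to-pmf conversion, the sketch below uses a logistic cdf as a stand-in for the learned $\bar{p}^{(c)}_{\Psi}$ of one slice; the stand-in is our assumption, not the paper's trained model.

```python
# A minimal sketch of recovering a pmf from a slice's cdf: the mass of
# integer bin v is cdf(v + 0.5) - cdf(v - 0.5). torch.sigmoid stands in
# for the trained cdf of one slice.
import torch

def pmf(v: torch.Tensor) -> torch.Tensor:
    return torch.sigmoid(v + 0.5) - torch.sigmoid(v - 0.5)

v = torch.arange(-5, 6, dtype=torch.float32)   # integer bins -5 ... 5
print(pmf(v), pmf(v).sum())                    # bin masses; sum ~ 1
```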