From Face to Natural Image: Learning Real
Degradation for Blind Image Super-Resolution
Xiaoming Li1,5, Chaofeng Chen2, Xianhui Lin3,
Wangmeng Zuo1,4 (✉), and Lei Zhang5
1Faculty of Computing, Harbin Institute of Technology, China
2S-Lab, Nanyang Technological University, Singapore
3DAMO Academy, Alibaba Group, Shenzhen, China
4Peng Cheng Lab, Shenzhen, China
5Department of Computing, The Hong Kong Polytechnic University
{csxmli, chaofenghust, xhlin129}@gmail.com, wmzuo@hit.edu.cn,
cslzhang@comp.polyu.edu.hk
Abstract
How to design proper training pairs is critical for super-resolving
real-world low-quality (LQ) images, a task that suffers from the difficulty of
either acquiring paired ground-truth high-quality (HQ) images or synthesizing
photo-realistic degraded LQ observations. Recent works mainly focus on
modeling the degradation with handcrafted or estimated degradation parameters,
which are, however, incapable of modeling complicated real-world degradation
types, resulting in limited quality improvement. Notably, LQ face images, which
may have the same degradation process as natural images, can be robustly
restored with photo-realistic textures by exploiting their strong structural
priors. This motivates us to use real-world LQ face images and their restored
HQ counterparts to model the complex real-world degradation (namely ReDegNet),
and then transfer it to HQ natural images to synthesize their realistic LQ
counterparts. By taking these paired HQ-LQ face images as inputs to explicitly
predict degradation-aware and content-independent representations, we can
control the degraded image generation, and subsequently transfer these
degradation representations from face to natural images to synthesize the
degraded LQ natural images. Experiments show that our ReDegNet can learn the
real degradation process well from face images. The restoration network trained
with our synthetic pairs performs favorably against SOTAs. More importantly,
our method provides a new way to handle complex real-world scenarios by
learning their degradation representations from the facial portions, which can
be used to significantly improve the quality of non-facial areas. The source
code is available at https://github.com/csxmli2016/ReDegNet.
Keywords: real world degradation, blind image super-resolution
1 Introduction
It is widely known that Convolutional Neural Networks (CNNs) are proficient
in handling the data they have seen, but perform poorly on data that deviate
from the training sets. This property makes it difficult for blind image
super-resolution networks to handle real-world LQ images, which are usually
corrupted with complex and unsynthesizable degradation. However, building
paired datasets of real-world LQ and HQ images is neither feasible nor
practical, because real-world degradation types are too diverse and some of
them are not introduced by the imaging system. Figure 1 (a) shows a real-world
LQ image that is degraded with halftone-related artifacts. One can see that the
synthetic LQ image (on the top-left of (b)) by the inverse halftoning method
[13] is hardly consistent with the complex real-world degradation, which makes
these types of restoration methods (e.g., [43]) fail to generate
photo-realistic results (see (b)).
Figure 1: (a): A real-world LQ image. (b)–(g): Restoration comparisons with
the inverse halftone method [43], RealSR [19], Real-ESRGAN [48], BSRGAN [56],
BSRGAN* fine-tuned with halftone degradation [13], and Ours_s that is
specifically trained with the synthetic pairs in (i). (h): Face restoration
result by GPEN [53]. (i): Our synthetic LQ sample with the degradation
representation from (h).
arXiv:2210.00752v2 [cs.CV] 14 Oct 2022
To alleviate the difficulties in restoring real-world LQ images, some works
attempt to predict the degradation parameters [16, 17, 19, 20, 34] and then
handle the LQ input with non-blind restoration methods. However, real
degradation usually combines various corruption types, each of which has lost
its intrinsic characteristics. This inevitably makes these methods sensitive to
prediction errors in the degradation parameters, and consequently makes them
fail on real-world LQ images (see (c) in Figure 1).
Recently, data-driven methods have been proposed to design a practical
degradation model by handcrafting complex combinations of blur, downsampling,
noise and JPEG compression in random [56] or high-order [48] manners. Although
these methods cover more diverse degradation types [10, 31, 57] and generalize
well to real-world LQ images in most cases, they still fail to cover some
complex real degradation which cannot be well synthesized (see (d) and (e) in
Figure 1). By incorporating the synthetic halftone degradation [13], BSRGAN*
improves slightly (see (f)), but its result still contains obvious linearity
artifacts.
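To make the idea of a handcrafted degradation model concrete, the following is a minimal sketch of one blur, downsample and noise chain in NumPy. It is not the pipeline of [56] or [48]: the function names (`gaussian_kernel`, `blur`, `degrade`), the kernel size, and all parameter values are illustrative assumptions, and JPEG compression is left out to keep the example dependency-free.

```python
import numpy as np

def gaussian_kernel(size=7, sigma=1.5):
    """Return a normalized 2-D isotropic Gaussian kernel (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def blur(img, kernel):
    """Filter a 2-D image with a square kernel, using reflect padding.

    This is a correlation; for the symmetric Gaussian kernel used here it is
    identical to a convolution.
    """
    pad = kernel.shape[0] // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def degrade(hq, scale=4, sigma=1.5, noise_std=0.02, rng=None):
    """Hypothetical handcrafted chain: blur -> downsample -> Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    lq = blur(hq.astype(np.float64), gaussian_kernel(7, sigma))
    lq = lq[::scale, ::scale]                      # naive strided downsampling
    lq = lq + rng.normal(0.0, noise_std, size=lq.shape)
    return np.clip(lq, 0.0, 1.0)

# Toy usage on a random grayscale "image" with values in [0, 1].
hq = np.random.default_rng(0).random((64, 64))
lq = degrade(hq, rng=np.random.default_rng(1))
print(lq.shape)
```

Every parameter in such a chain (kernel width, scale, noise level) has to be chosen or sampled by hand, which is why degradations outside the handcrafted family, such as the halftone artifacts in Figure 1, are not covered.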
In contrast, face images have a specific and strong structural prior, and can
be better restored while exhibiting great generalization ability on real-world
LQ images in most cases [27, 47, 53]. Although the image is corrupted by
intractable degradation, the face restoration result is very plausible and
photo-realistic (see Figure 1 (h)). Since the face and non-face (natural)
regions in an image share the same degradation, once we know the degradation
process on the face regions, transferring it to natural HQ images would bring
considerable benefits, e.g., we can apply this degradation process to HQ
natural images to synthesize these types of natural image pairs (see (i)) for
training a restoration network (see (g)).
In this paper, we make the first attempt to explore the real degradation with
ReDegNet, which consists of (i) learning the real degradation from pairs of
real-world LQ and pseudo HQ face images with DegNet, and (ii) transferring it
to HQ natural images to synthesize their realistic LQ counterparts with SynNet.
As for (i), instead of taking a single LQ image to predict its degradation
parameters [19], our DegNet takes the real-world LQ face image and its pseudo
HQ counterpart as input to generate the degradation representation, which
models the process by which the HQ image is degraded to the LQ one. To
disentangle the image content and degradation type, we adopt two strategies:
a) a carefully designed framework that predicts the degradation representation
through several fully connected layers to generate the convolution weights,
which can be regarded as the styles in StyleGANs [22, 23], and b) a contrastive
loss [46] that minimizes the representation distance between pairs with
different content but degraded with the same degradation parameters, while
maximizing the distance between those with the same content but different
degradation. This process is fully supervised by the paired LQ/HQ face images.
As for (ii), our SynNet synthesizes realistic LQ natural images with the
degradation representations extracted from face images, which helps us learn
the real-world restoration mapping. Note that our method may have limited
performance in scenarios without faces. By extending the degradation space with
face images that share similar degradation, our model could be further
improved. The main contributions are summarized as follows:
- We propose ReDegNet to explore the real degradation from face images by
  explicitly learning degradation-aware and content-independent
  representations which control the degraded image generation.
- We transfer these real-world degradation representations to HQ natural
  images to generate their realistic LQ ones for supervised real restoration.
- We provide a new way of handling intractable degraded images by learning
  their degradation from face regions within them, which can be used to
  synthesize these types of LQ natural images for specific fine-tuning.
- Experimental results demonstrate that our ReDegNet can learn the degradation
  representations well from face images and effectively transfer them to
  natural ones, yielding comparable performance on general restoration and
  superior performance in specific scenarios against the SOTAs.
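The contrastive objective described above (pull together representations of pairs with different content but the same degradation, push apart those with the same content but different degradation) can be sketched as follows. The exact loss form is not given in this section, so the InfoNCE-style formulation, the temperature value, and the names `l2_normalize` and `degradation_contrastive_loss` are assumptions made for illustration only.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def degradation_contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for a single anchor (an assumed formulation).

    anchor:    (D,)   representation of one HQ/LQ pair
    positive:  (D,)   different content, same degradation as the anchor
    negatives: (N, D) same content as the anchor, different degradations
    """
    a = l2_normalize(anchor)
    p = l2_normalize(positive)
    n = l2_normalize(negatives)
    pos_logit = np.dot(a, p) / temperature          # cosine similarity / T
    neg_logits = (n @ a) / temperature              # shape (N,)
    logits = np.concatenate(([pos_logit], neg_logits))
    # Numerically stable log-sum-exp over positive and negative logits.
    m = logits.max()
    log_sum = m + np.log(np.exp(logits - m).sum())
    return float(log_sum - pos_logit)               # = -log softmax(positive)

# Toy 2-D representations: the first case separates degradations well,
# the second confuses the positive with the negatives.
anchor = np.array([1.0, 0.0])
good = degradation_contrastive_loss(
    anchor, np.array([0.9, 0.1]), np.array([[0.0, 1.0], [-1.0, 0.0]]))
bad = degradation_contrastive_loss(
    anchor, np.array([0.0, 1.0]), np.array([[1.0, 0.0], [0.9, 0.1]]))
print(good < bad)
```

The loss is small when the anchor is closer to the same-degradation sample than to the same-content samples, which is the content-independence property the representation is meant to have.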
2 Related Work
2.1 Blind Face Restoration
Different from the complex textures in natural images, the specific structure
of face images makes it feasible to handle real-world LQ face images well
[5, 7, 8, 18, 24, 60]. To alleviate the sensitivity to unknown degradation,
reference images or component features have been suggested for guiding the
blind restoration process [27–29]. Most recently, methods [4, 47, 53] based on
generative face priors [22, 23] have been proposed to improve and stabilize the
restoration quality, and they can robustly restore real-world LQ face images in
most scenarios. Their great generalization on face images inspires us to
explore the possibility of extending the restoration performance from the local
region (i.e., face) to the whole image.
2.2 Degradation Estimation Based Blind Image Super-Resolution
Real-world LQ images are mainly corrupted with unknown degradation
parameters, so some works focus on estimating these degradation parameters
and then applying non-blind restoration methods to recover the images.
Bell-Kligler et al. [2] first propose the image-specific KernelGAN to predict
the blur kernels and feed them to ZSSR [41] for non-blind restoration. Gu et
al. [16] introduce an iterative kernel correction method to estimate the blur
kernel, which further benefits the restoration results. Luo et al. [34]
alternate between restoring HQ images with the predicted kernel and estimating
the blur kernel with the restored results, so that the two steps compensate
for each other. Wang et al. [46] suggest a degradation-aware super-resolution
network that learns degradation-related parameters to guide the restoration
process. However, real-world LQ images usually contain high-frequency noise or
compression artifacts, to which these methods are sensitive, and this adversely
affects parameter prediction.
2.3 Data-driven Based Blind Image Super-Resolution
The main challenge of the blind image super-resolution task can be ascribed to
the lack of suitable training pairs, so a straightforward way is to collect
real-world LQ and HQ pairs. Cai et al. [3] adjust the focal length of digital
cameras to capture paired LQ/HQ images of the same scene. Wei et al. [50]
build a larger and more diverse large-scale benchmark by zooming digital
cameras. Besides the cumbersome capturing process, the spatial and brightness
misalignment easily leads to uncontrollable errors. Moreover, although these
images are realistic, they are more suitable for specific super-resolution
tasks with similar capturing scenarios. Such collected data cover only a small
portion of the complex real-world degraded images, resulting in failure cases
when handling other real degradation, e.g., noise or compression.
To alleviate the difficulties in synthesizing real-world LQ images, recent works
tend to learn the restoration mapping with unpaired LQ and HQ images. Yuan et
al. [54] suggest a Cycle-in-Cycle network by first mapping the LQ input to