Single Image Super-Resolution via a Dual Interactive Implicit Neural Network
Quan H. Nguyen and William J. Beksi
The University of Texas at Arlington
Arlington, TX, USA
quan.nguyen4@mavs.uta.edu, william.beksi@uta.edu
Abstract
In this paper, we introduce a novel implicit neural net-
work for the task of single image super-resolution at arbi-
trary scale factors. To do this, we represent an image as a
decoding function that maps locations in the image along
with their associated features to their reciprocal pixel at-
tributes. Since the pixel locations are continuous in this
representation, our method can refer to any location in an
image of varying resolution. To retrieve an image of a par-
ticular resolution, we apply a decoding function to a grid of
locations each of which refers to the center of a pixel in the
output image. In contrast to other techniques, our dual in-
teractive neural network decouples content and positional
features. As a result, we obtain a fully implicit represen-
tation of the image that solves the super-resolution prob-
lem at (real-valued) elective scales using a single model.
We demonstrate the efficacy and flexibility of our approach
against the state of the art on publicly available benchmark
datasets.
1. Introduction
Single image super-resolution (SISR) is a fundamental
low-level computer vision problem that aims to recover a
high-resolution (HR) image from its low-resolution (LR)
counterpart. There are two main reasons for performing
SISR: (i) to enhance the visual quality of an image for hu-
man consumption, and (ii) to improve the representation of
an image for machine perception. SISR has many practi-
cal applications including robotics, remote sensing, satel-
lite imaging, thermal imaging, medical imaging, and much
more [40, 39]. Despite being a challenging and ill-posed
subject, SISR has remained a crucial area of study in the
research community.
Recent deep learning approaches have provided high-
quality SISR results [3, 37]. In perception systems, images
are represented as 2D arrays of pixels whose quality, sharp-
ness, and memory footprint are controlled by the resolution
of the image. Consequently, the scale of the generated HR
[Figure 1 graphic: a 40 × 40 input image is mapped by an implicit neural representation to 80 × 80, 100 × 100, and 120 × 120 outputs.]
Figure 1: Our proposed dual interactive implicit neural net-
work (DIINN) is capable of producing images of arbitrary
resolution, using a single trained model, by capturing the
underlying implicit representation of the input image.
image is fixed by the training data. For example, if a neural
network is trained to recover HR images at ×2 scale, then it
only performs well at that scale (i.e., performance will be
poor at ×3, ×4, or other scales).
Thus, instead of training multiple models for various reso-
lutions, it can be extremely useful in terms of practicality to
have a single SISR architecture that handles arbitrary scale
factors. This is especially true for embedded vision plat-
forms (e.g., unmanned ground/aerial vehicles) with multi-
ple on-board cameras that must execute difficult tasks using
limited computational resources.
The notion of an implicit neural representation, also
known as coordinate-based representation, is an active field
of research that has yielded substantial results in modeling
3D shapes [2, 5, 18, 30, 31]. Inspired by these successes,
learning implicit neural representations of 2D images is a
natural solution to the SISR problem since an implicit sys-
tem can produce output at arbitrary resolutions. While this
idea has been touched upon in several works [4, 33, 6, 26],
in this paper we propose a more expressive neural network
for SISR that significantly improves upon the existing
state of the art (Figure 1). Our contributions are summarized
as follows.
• We develop a novel dual interactive implicit neural network (DIINN) for SISR that handles image content features in a modulation branch and positional features in a synthesis branch, while allowing for interactions between the two.
• We learn an implicit neural network with a pixel-level representation, which allows for locally continuous super-resolution synthesis with respect to the nearest LR pixel.
• We demonstrate the effectiveness of our proposed network by setting new benchmarks on public datasets.
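The pixel-level representation above is queried at the center of each output pixel. As a minimal illustrative sketch (not the actual DIINN implementation), the grid of normalized pixel-center coordinates for a target resolution can be generated as follows, assuming coordinates normalized to [-1, 1]:

```python
def pixel_center_grid(height, width):
    """Normalized coordinates of every pixel center in [-1, 1].

    Each cell spans 2/height by 2/width, so the center of cell
    (i, j) sits at (-1 + (2i + 1)/height, -1 + (2j + 1)/width).
    """
    ys = [-1 + (2 * i + 1) / height for i in range(height)]
    xs = [-1 + (2 * j + 1) / width for j in range(width)]
    return [[(y, x) for x in xs] for y in ys]
```

Evaluating the decoding function at every coordinate in such a grid yields an image at the requested resolution; since the coordinates are continuous, any target height and width are admissible from a single trained model.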
Our source code is available at [10]. The remainder of
this paper is organized as follows. Related research is dis-
cussed in Section 2. In Section 3, we present a detailed de-
scription of our model for SISR at arbitrary scales using an
implicit representation. Experimental results are presented
in Section 4. The paper concludes in Section 5 and dis-
cusses future work.
2. Related Work
This section highlights pertinent literature on the task of
SISR. First, we discuss deep learning techniques for SISR.
Then, we provide an overview of implicit neural represen-
tations. Lastly, we cite the nascent domain of implicit rep-
resentations for images. The SISR problem is an ill-defined
problem in the sense that there are many possible HR im-
ages that can be downsampled to a single LR image. In this
work, we focus on learning deterministic mappings, rather
than stochastic mappings (i.e., generative models). In gen-
eral, the input to an SISR system is an LR image, and the
output is a super-resolved (SR) image that may or may not
have the same resolution as a target HR image.
2.1. Deep Learning for SISR
Existing work on SISR typically utilizes convolutional
neural networks (CNNs) coupled with upsampling opera-
tors to increase the resolution of the input image.
2.1.1 Upscaling + Refinement
SRCNN [11], VDSR [20], and DRCN [21] first interpolate
an LR image to a desired resolution using bicubic interpola-
tion, followed by a CNN-based neural network to enhance
the interpolated image and produce an SR image. The re-
fining network acts as a nonlinear mapping, which aims to
improve the quality of the interpolation. These methods can
produce SR images at arbitrary scales, but the performance
is severely affected by the noise introduced during the inter-
polation process. The refining CNNs also have to operate at
the desired resolution, thus leading to a longer runtime.
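The interpolate-then-refine pipeline can be sketched in a few lines. This is an illustrative toy rather than SRCNN itself: bilinear interpolation stands in for bicubic for brevity, the image is a 2D list of gray values, and `refine` is a placeholder for the trained refinement CNN:

```python
def upscale_then_refine(img, scale, refine):
    """Sketch of the upscaling + refinement pipeline.

    First interpolate the LR image to the target size (bilinear
    here, in place of bicubic), then apply the refinement model.
    """
    h, w = len(img), len(img[0])
    H, W = int(h * scale), int(w * scale)
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            # Map each output pixel back to (clamped) input coordinates.
            y = min(i / scale, h - 1)
            x = min(j / scale, w - 1)
            y0, x0 = int(y), int(x)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            dy, dx = y - y0, x - x0
            out[i][j] = (img[y0][x0] * (1 - dy) * (1 - dx)
                         + img[y0][x1] * (1 - dy) * dx
                         + img[y1][x0] * dy * (1 - dx)
                         + img[y1][x1] * dy * dx)
    return refine(out)
```

Note that `refine` must process the full target-resolution image, which is exactly the runtime drawback discussed above: its cost grows quadratically with the scale factor.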
2.1.2 Learning Features + Upscaling
Methods following this approach first feed an LR image
through a CNN to obtain a deep feature map at the same
resolution. In this way, the CNNs are computationally
cheaper since they are applied at LR, which allows for deeper
architectures. Next, an upscaling operator is used to produce an SR
image. The most common upscaling operators are decon-
volution (FSRCNN [12], DBPN [14]), and sub-pixel con-
volution (ESPCN [32], EDSR [23]). It is also possible to
perform many iterations of learning features + upscaling
and explicitly exploit the relationship between intermediate
representations [14]. These methods only work with integer
scale factors and produce fixed-sized outputs.
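Sub-pixel convolution ends with a deterministic rearrangement (`PixelShuffle` in modern frameworks): an r²·C-channel LR feature map becomes a C-channel image at r times the resolution. A single-output-channel sketch of that rearrangement, under the assumption that channel k fills sub-pixel offset (k // r, k % r) of each r × r block:

```python
def pixel_shuffle(feat, r):
    """Rearrange an (r*r, H, W) feature map into an (r*H, r*W) image,
    as in sub-pixel convolution. Channel k supplies the sub-pixel at
    offset (k // r, k % r) inside each r x r output block.
    """
    c = len(feat)
    h, w = len(feat[0]), len(feat[0][0])
    assert c == r * r, "expected exactly r*r input channels"
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for k in range(c):
        dy, dx = k // r, k % r
        for i in range(h):
            for j in range(w):
                out[i * r + dy][j * r + dx] = feat[k][i][j]
    return out
```

Because r is baked into the channel count of the final convolution, the output size is fixed at training time, which is why these methods only handle integer, predetermined scale factors.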
EDSR [23] attempts to mitigate these problems by train-
ing a separate upscaling head for each scale factor. On the
other hand, Meta-SR [15] is among the first attempts to
solve SISR at arbitrary real-valued scale factors via a soft
version of the sub-pixel convolution. To predict the sig-
nal at each pixel in the SR image, Meta-SR uses a meta-
network to determine the weights for features of a (3 × 3)
window around the nearest pixel in the LR image. Effec-
tively, each channel of the predicted pixel in the SR image
is a weighted sum over a (C × 3 × 3) volume, where C is the
number of channels in the deep feature map. While Meta-
SR has a limited generalization capability to scale factors
larger than its training scales, it can be viewed as a hybrid
implicit/explicit model.
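Concretely, Meta-SR's per-pixel readout described above can be sketched as follows. The `weights` tensor is hypothetical here (Meta-SR predicts it with a meta-network from the relative offset and scale factor), and borders are handled by clamping for simplicity:

```python
def metasr_predict(features, weights, ci, cj):
    """One output channel as a weighted sum over a C x 3 x 3 feature
    volume centered on the nearest LR pixel (ci, cj).

    `features` is the (C, h, w) deep feature map of the LR image;
    `weights[c][dy][dx]` would come from the meta-network.
    """
    C = len(features)
    h, w = len(features[0]), len(features[0][0])
    total = 0.0
    for c in range(C):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                # Clamp the 3 x 3 window at image borders.
                y = min(max(ci + dy, 0), h - 1)
                x = min(max(cj + dx, 0), w - 1)
                total += weights[c][dy + 1][dx + 1] * features[c][y][x]
    return total
```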
2.2. Implicit Neural Representations
Implicit neural representations are an elegant way to pa-
rameterize signals continuously in comparison to conven-
tional representations, which are usually discrete. Chen et
al. [7], Mescheder et al. [27], and Park et al. [28] are among
the first to show that implicit neural representations outper-
form 3D representations (e.g., meshes, voxels, and point
clouds) in 3D modeling. Many works that achieve state-
of-the-art results in 3D computer vision have followed. For
example, Chabra et al. [5] learned local shape priors for the
reconstruction of 3D surfaces coupled with a deep signed
distance function. A new implicit representation for 3D
shape learning called a neural distance field was proposed
by Chibane et al. [9]. Jiang et al. [18] leveraged voxel
representations to enable implicit functions to fit large 3D
scenes, and Peng et al. [30] increased the expressiveness of
3D scenes with various convolutional models. It is also possible
to condition the implicit neural representations on the
input signals [5, 8, 18, 30], which can be considered as a
hybrid implicit/explicit model.
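At its core, a conditioned implicit representation is a single function f(coordinate, latent code) → signal value shared across the whole domain. A bare-bones sketch with a plain MLP follows; the layer shapes and conditioning-by-concatenation are illustrative assumptions, not the architecture of any specific work cited above:

```python
def implicit_query(coord, latent, layers):
    """Evaluate a conditioned implicit function f(coord, latent).

    The query coordinate is concatenated with a latent code that
    describes the input signal, then passed through an MLP whose
    weights are given in `layers` as (matrix, bias) pairs, with
    ReLU activations between hidden layers.
    """
    x = list(coord) + list(latent)
    for idx, (W, b) in enumerate(layers):
        x = [sum(wij * xj for wij, xj in zip(row, x)) + bi
             for row, bi in zip(W, b)]
        if idx < len(layers) - 1:  # ReLU on hidden layers only
            x = [max(0.0, v) for v in x]
    return x
```

Because `coord` is continuous, the same trained weights can be queried at any sampling density, which is the property that makes such representations attractive for arbitrary-scale SISR.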