LMQFormer: A Laplace-Prior-Guided Mask Query
Transformer for Lightweight Snow Removal
Junhong Lin, Nanfeng Jiang, Zhentao Zhang, Weiling Chen, Member, IEEE and Tiesong Zhao, Senior
Member, IEEE
Abstract—Snow removal aims to locate snow areas and recover
clean images without leaving repair traces. Unlike the regularity
and semitransparency of rain, snow with various patterns and
degradations seriously occludes the background. As a result,
state-of-the-art snow removal methods usually retain a
large parameter size. In this paper, we propose a lightweight
but highly efficient snow removal network called Laplace Mask
Query Transformer (LMQFormer). First, we present a Laplace-VQVAE
to generate a coarse mask as prior knowledge of snow.
Instead of using the mask in the dataset, we aim to reduce both
the information entropy of snow and the computational cost of
recovery. Second, we design a Mask Query Transformer (MQFormer)
to remove snow with the coarse mask, where we use two
parallel encoders and a hybrid decoder to learn extensive snow
features under lightweight requirements. Third, we develop a
Duplicated Mask Query Attention (DMQA) that converts the
coarse mask into a specific number of queries, which constrain
the attention areas of MQFormer with reduced parameters.
Experimental results on popular datasets demonstrate the
efficiency of our proposed model, which achieves state-of-the-art
snow removal quality with significantly fewer parameters
and the lowest running time. Code and models are available at
https://github.com/StephenLinn/LMQFormer.
Index Terms—Lightweight snow removal, Laplace operator,
mask query transformer, image denoising, image enhancement.
I. INTRODUCTION
Snow seriously affects the visibility of scenes and objects.
It usually leads to poor visual quality and severe performance
degradation in high-level computer vision tasks such
as object detection and semantic understanding. However, it
is difficult to capture unified patterns in snowy scenes because
snowflakes vary in pattern and transparency. Unlike other types of
image noise [1]–[4], snow seriously obscures the background
and is thus difficult to remove. How to recover clean
images from snowy scenes remains a challenging issue.
We divide existing snow removal methods into two types:
traditional methods and deep-learning-based methods. Traditional
methods rely on handcrafted prior knowledge to
model snowy layers, such as HOG and MoG models [5],
dictionary learning [6], color assumptions [7] and Hamiltonian
quaternions [8]. Deep-learning-based methods take advantage
of deep neural networks to remove undesired snow from images,
such as DesnowNet [9], CGANs [10], JSTASR [11],
HDCWNet [12] and DDMSNet [13].

This work was supported in part by the National Natural Science Foundation
of China (Grant No. 62171134) and in part by Natural Science Foundation
of Fujian Province, China (Grants No. 2022J02015 and 2022J05117).
(Corresponding author: Tiesong Zhao.)
J. Lin, N. Jiang, Z. Zhang and W. Chen are with the Fujian Key Lab
for Intelligent Processing and Wireless Transmission of Media Information,
College of Physics and Information Engineering, Fuzhou University,
Fuzhou 350108, China (E-mails: jhlin study@163.com, jnfrock@gmail.com,
211120091@fzu.edu.cn, weiling.chen@fzu.edu.cn).
T. Zhao is with the Fujian Key Lab for Intelligent Processing and Wireless
Transmission of Media Information, College of Physics and Information
Engineering, Fuzhou University, Fuzhou 350108, China and also with the Peng
Cheng Laboratory, Shenzhen 518055, China (e-mail: t.zhao@fzu.edu.cn).

[Figure 1 panels: (a) Input, (b) DesnowNet, (c) TKL, (d) Ours, (e) SSIM vs. Parameters (log)]
Fig. 1. Our proposed method achieves the state-of-the-art snow removal
quality with the lowest computational complexity. (a) A typical real-world
snowy image. (b)-(d) The outputs of our method and its peers. (e) The average
performances under Snow100K.
Despite these great efforts, critical issues remain
to be addressed. Existing works usually retain a
large parameter size for better visual quality, but
their computational workloads inevitably increase remarkably
as well. This limits their applications in real-world
scenarios. Besides, repair traces still remain in their
results, as shown in the red traffic signs of Fig. 1(b)(c).
Therefore, it is essential to design a lightweight but highly
efficient network for this task.
Existing rain removal methods cannot
adequately address the snow problem because rain and snow differ visibly.
As shown in Fig. 2(a), raindrops and rain streaks are
densely distributed, while snowflakes vary in pattern. The
image backgrounds are also more visible in rainy images,
whereas snowy backgrounds are obscured to different
degrees even in the same scene. These differences make it
difficult to use existing rain removal methods (e.g. [14], [15])
or lightweight methods (e.g. [16], [17]) to process snowy
images. How to locate snow areas and recover clean images
remains an important challenge in snow removal.
To solve the above problems, we propose a lightweight
architecture called Laplace-prior-guided Mask Query Transformer
(LMQFormer). We observe and demonstrate that
the Laplace operator can remove redundant information while
preserving high-frequency snow edge information. Thus, we
take advantage of the lightweight Vector Quantised Variational
AutoEncoder (VQVAE) and combine it with the Laplace
operator to design a Laplace-VQVAE sub-network. This sub-network
generates a coarse mask of the snow area, which can
be treated as a prior, namely the Laplace prior, to guide snow
removal. Furthermore, we design a lightweight sub-network
called Mask Query Transformer (MQFormer), which uses
this coarse mask to obtain high-quality snow-free images
with fewer parameters. A Duplicated Mask Query Attention
(DMQA) is designed to integrate the coarse mask and focus
the MQFormer on snow areas.

arXiv:2210.04787v4 [cs.CV] 6 Apr 2023

[Figure 2 panels: (a) Scenes, (b) Mask]
Fig. 2. Examples of snowy and rainy images. (a) Typical snowy and rainy
scenes [9], [18]. (b) Typical snowy and rainy masks [9], [19].
With this lightweight design, our LMQFormer attains the
state-of-the-art visual performance with the smallest parameter
size, as demonstrated in Fig. 1(d)(e). The main contributions
are summarized as follows:
• We propose a Laplace-VQVAE sub-network to generate a
  coarse mask as the prior of snow areas. The effectiveness
  and low entropy of this Laplace prior inspire us to design
  a lightweight network without training on masks.
• We design an MQFormer sub-network with DMQA modules
  for snow removal. The DMQA module employs
  duplicated mask queries to effectively incorporate the coarse
  mask, so that the MQFormer can focus on snow areas and
  avoid over-enhancement of other regions.
• Extensive experiments on popular benchmarks demonstrate
  that the proposed LMQFormer network has superior
  performance and high robustness. Our method also
  outperforms most state-of-the-art methods in terms of processing
  speed.
II. RELATED WORK
A. Traditional Snow Removal Methods
Prior knowledge has been used to guide traditional methods
for snow removal. Bossu et al. [8] employed MoG to separate
the foreground and used HOG features to detect snow in the
foreground and recover clean images. Wang et al. [6] combined
image decomposition with dictionary learning to model
a three-layer hierarchical snow removal scheme. Zheng et al.
[20] exploited the difference between snow and background
and used a multi-guided filter to remove snow. Pei et al. [7]
extracted snow features on saturation and visibility in HSV
color space for snow area detection and removal. Rajderkar
et al. [21] applied a bilateral filter to decompose a snowy
image into low-frequency and high-frequency parts. They then
decomposed the high-frequency part into snowy foreground
and clean background by using dictionary learning and sparse
coding. Voronin et al. [5] employed anisotropic gradients of
Hamiltonian quaternions to detect snow areas and recover
clean images. Although these methods are effective to some extent,
they model only specific characteristics of snow, which leads
to poor generalization ability in real-world scenarios.
B. Deep-learning-based Snow Removal Methods
Recently, deep learning has been successfully applied to
the single-image snow removal task. Liu et al. [9] proposed
the first deep-learning-based method, called DesnowNet. They
applied a two-stage network to learn the mapping from snowy
images to masks and recover clean images. Li et al. [10]
used a Generative Adversarial Network (GAN) to recover clean
images. Chen et al. [11] proposed a JSTASR model to generate
three different snow masks with a differentiable dark channel
prior layer and guide image recovery with these masks. Li
et al. [22] employed architecture search and proposed an all-in-one
network for snow, rain and fog removal. Chen et al. [12]
proposed HDCWNet, which removes snow with a hierarchical
dual-tree complex wavelet representation and a contradict channel
loss. Zhang et al. [13] designed DDMSNet and introduced
semantic and geometric images as prior knowledge to guide
snow removal. Jaw et al. [23] presented an efficient and highly
modularized network called DesnowGAN. Quan et al. [24]
proposed an Invertible Neural Network (INN) to predict a
latent image and a snowflake layer for snow removal. Chen
et al. [25] used two-stage knowledge distillation learning
to recover all bad-weather images, which is called TKL in
this paper. In summary, these methods achieve promising
performance in snow removal but at the cost of a large parameter
size. Besides, their recovered images may retain repair
traces, as shown in Fig. 1(b)(c).
Different from these methods, we aim to explore a prior of
coarse mask to guide a lightweight but highly efficient network.
The proposed model also generates images of high visual
quality, which will be validated in the experimental results.
C. Other Image Denoising Tasks
There are some other similar denoising tasks, including
rain removal (e.g. [26]), haze removal (e.g. [2]), low-light
enhancement (e.g. [27]) and underwater image enhancement
(e.g. [28]). For example, Zhang et al. [29] proposed a novel
single image deraining method called image de-raining con-
ditional generative adversarial network (ID-CGAN). Zhang et
al. [30] proposed an Enhanced Spatio-Temporal Integration
Network (ESTINet) to exploit spatio-temporal information for
rain streak removal. Agrawal et al. [31] presented a single
image dehazing method based on a superpixel, nonlinear
transformation. Wang et al. [32] introduced a normalizing flow
model for low-light enhancement. Jiang et al. [33] exploited
the potential of a lightweight network that benefits both effectiveness
and efficiency for underwater image enhancement.
[Figure 3 panels: (a) Snow, (b) Clean, (c) Mask, (d) Residual]
Fig. 3. Snow images processed by Laplace operator. (a) a snowy image before
and after being processed by Laplace operator; (b) the clean background
images of (a); (c) the snow masks of (a); (d) the residuals of (a)-(b).
III. METHODOLOGY
A. Problem Statement
As discussed in Section II, deep-learning-based snow removal
methods take advantage of CNNs to recover clean
images. Their deep neural networks learn a mapping from
the input snowy image $I_{snow}$ to the output $I_{clean}$:
$$I_{clean} = F_m(I_{snow}; \theta), \tag{1}$$
where $F_m(\cdot)$ represents a deep neural network for mapping and $\theta$
represents the parameters of $F_m(\cdot)$. Considering the complicated
patterns of snowflakes, existing methods tend to employ
complicated networks with a large parameter size. Although
they succeed in recovering clean images under snow,
their parameter volumes limit their use in practical scenarios,
e.g. surveillance and video. The snow removal task still
calls for an effective model with a low parameter size and
a high processing speed.
To address this issue, we revisit the physical model of a snowy
image as shown in [9]:
$$I_{snow} = R \odot M + C \odot (1 - M), \tag{2}$$
where $R$, $C$ and $M$ are the chromatic aberration map, the
latent clean image and the mask image of $I_{snow}$, respectively,
and $\odot$ denotes element-wise multiplication.
This decomposition inspires us to design a mask-based snow
removal approach. However, the snow mask is not available in
practical use. Instead, we infer a coarse mask and further
utilize it to form a unified prior that benefits both snow location
and removal. This task is thus defined as:
$$\begin{aligned} I_{prior} &= G_n(I_{snow}; \theta_A), \\ I_{clean} &= I_{snow} - F_n(I_{snow}, I_{prior}; \theta_B), \end{aligned} \tag{3}$$
where $I_{prior}$ represents this unified prior. $G_n(\cdot)$ and $F_n(\cdot)$
represent the sub-networks that generate $I_{prior}$ and recover clean
images with $I_{snow}$ and $I_{prior}$, respectively.
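The snow formation model of Eq. (2) can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: it assumes the products in Eq. (2) are element-wise, and the function name `compose_snowy` and all toy values are illustrative choices.

```python
import numpy as np

def compose_snowy(clean, chroma, mask):
    """Eq. (2): I_snow = R * M + C * (1 - M), taken element-wise."""
    return chroma * mask + clean * (1.0 - mask)

rng = np.random.default_rng(0)
clean = rng.random((4, 4, 3))        # C: latent clean image, values in [0, 1)
chroma = np.full((4, 4, 3), 0.95)    # R: near-white snow colour (assumed value)
mask = np.zeros((4, 4, 1))           # M: per-pixel snow opacity
mask[1, 2] = 1.0                     # a fully opaque snowflake
mask[3, 0] = 0.5                     # a semi-transparent snowflake

snowy = compose_snowy(clean, chroma, mask)

# Rearranging Eq. (2) gives I_snow - C = M * (R - C): the residual that a
# recovery network has to estimate is non-zero only where the mask is.
residual = snowy - clean
```

Because the residual equals the mask times (R - C), it vanishes wherever the mask is zero, which is consistent with using a mask-like prior to guide residual estimation in Eq. (3).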
Based on the above analyses, we attempt to construct a
unified prior of snow via an inferred coarse mask and
further utilize it in recovering clean images. As shown in Eq.
(3), the functions of our prior are twofold. First, it is utilized to
coarsely locate snow areas. Second, it is combined with $I_{snow}$
to estimate the residuals between the input and output of our
model. With the guidance of this prior, we can achieve better
snow removal results under the lightweight requirement.
B. Laplace Prior
This paper calculates $I_{prior}$ with the sub-network $G_n(\cdot)$
and the Laplace operator. It is commonly known that lower
data entropy eases feature extraction by a neural
network; that is, the network can more easily learn the latent rules of
the data. Here we define an optimization of the relative entropy
between $I_{prior}$ and $I_{snow}$:
$$\begin{aligned} \min_{q_\theta} \quad & D_{KL}\left(q_\theta(I_{prior} \mid I_{snow}) \,\|\, p(I_{prior})\right) \\ \text{s.t.} \quad & I_{prior} \sim q_\theta(I_{prior} \mid I_{snow}), \quad I_{prior} = G_n(I_{snow}; \theta_A), \end{aligned} \tag{4}$$
where $p(I_{prior})$ and $q_\theta(I_{prior} \mid I_{snow})$ represent the prior and
posterior probabilities of $I_{prior}$, respectively. This minimization
process can help to design the network $G_n(\cdot)$ and its
learning parameters $\theta_A$.
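To make the relative entropy in Eq. (4) concrete, here is a toy sketch for discrete distributions. It only illustrates the quantity $D_{KL}(q \| p)$ itself; the three-bin distributions and the function name `kl_divergence` are assumptions for illustration and have no counterpart in the paper, where the distributions are defined over images.

```python
import math

def kl_divergence(q, p):
    """D_KL(q || p) = sum_i q_i * log(q_i / p_i) for discrete distributions.

    Terms with q_i == 0 contribute nothing, by the usual convention.
    """
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

prior = [0.7, 0.2, 0.1]              # stands in for p(I_prior)
close_posterior = [0.6, 0.3, 0.1]    # a posterior close to the prior
far_posterior = [0.1, 0.2, 0.7]      # a posterior far from the prior

kl_close = kl_divergence(close_posterior, prior)
kl_far = kl_divergence(far_posterior, prior)
```

The divergence is zero when the posterior matches the prior and grows as the two distributions move apart, which is the quantity that Eq. (4) minimizes.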
In designing the network $G_n(\cdot)$, we amplify the snow
features and eliminate irrelevant information to reduce
$D_{KL}$. This can be achieved with the Laplace operator:
$$I^{lap}_{snow} = \nabla^2 I_{snow}. \tag{5}$$
Correspondingly, applying the Laplace operator to the clean image
$I_{clean}$ yields $I^{lap}_{clean}$. In Fig. 3, it is easy to tell
$I_{snow} - I_{clean}$ (i.e. the fourth image in the first row) apart from
the snow mask (i.e. the third image in the first row). However,
after being processed by the Laplace operator, the corresponding
images are quite similar, as shown in the second row of Fig.
3(c)(d). Thus, we can use $I^{lap}_{snow} - I^{lap}_{clean}$ as a
coarse mask for snow removal.
With the Laplace operator, $G_n(I_{snow}; \theta_A)$ is replaced by
$G_L(I_{snow}; \theta_A)$. Its training process can be expressed as:
$$\min_{G_L(\cdot)} \left\| G_L(I_{snow}; \theta_A) - \left(I^{lap}_{snow} - I^{lap}_{clean}\right) \right\|_2, \tag{6}$$
where we attempt to train a Laplace prior to approximate the
coarse mask without explicit knowledge of Fig. 3(c).
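The coarse-mask target and the training objective of Eqs. (5) and (6) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it uses a 4-neighbour finite-difference Laplacian with edge-replicated borders (the paper does not specify the discretization in this section), a single-channel toy image, and hypothetical function names.

```python
import numpy as np

def laplacian(img):
    """4-neighbour discrete Laplacian with edge-replicated borders."""
    p = np.pad(img, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * p[1:-1, 1:-1])

def coarse_mask_target(snowy, clean):
    """Training target of Eq. (6): Laplacian of snowy minus Laplacian of clean."""
    return laplacian(snowy) - laplacian(clean)

def laplace_prior_loss(predicted, snowy, clean):
    """L2 distance between a predicted prior and the coarse-mask target."""
    return float(np.linalg.norm(predicted - coarse_mask_target(snowy, clean)))

# Toy data: a smooth horizontal ramp as background plus one bright snow pixel.
clean = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))
snowy = clean.copy()
snowy[3, 4] = 1.0

target = coarse_mask_target(snowy, clean)
loss_perfect = laplace_prior_loss(target, snowy, clean)
loss_blank = laplace_prior_loss(np.zeros_like(target), snowy, clean)
```

Since the discrete Laplacian is linear, the target equals the Laplacian of the residual `snowy - clean`. In this toy case it is non-zero only at the snow pixel and its four neighbours, so the smooth background cancels out and only snow edges remain.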
C. Lightweight Design
With the Laplace operator, we can approximate the coarse mask
as a prior, as shown in Eqs. (3) and (6), and further utilize
it to guide the recovery process, as shown in Eq. (3). This
operation allows us to compute more efficiently with low-entropy
data. The complete model is designed as a lightweight
network, which benefits from the following methods. First,
we introduce spatial attention and a Codebook [34] for $G_L(\cdot)$ to
effectively extract spatial features for the coarse mask. Second,
we introduce a Mask Query Transformer Module (MQTM) to
make $F_n(\cdot)$ concentrate on snow areas; thus, we denote our
recovery network by $F_{MQ}(\cdot)$. Third, we design an efficient
framework consisting of parallel transformer and convolutional
encoders as well as a hybrid decoder to obtain more representative
features of snow at low scales. This framework
greatly reduces computational workloads while maintaining high
performance. By considering all these issues, our complete
model is designed as:
$$I_{clean} = I_{snow} - F_{MQ}(I_{snow}, G_L(I_{snow}; \theta_A); \theta_B), \tag{7}$$
where $G_L(\cdot)$ and $F_{MQ}(\cdot)$ represent the coarse mask generator
and the recovery network, respectively.
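The data flow of Eq. (7) can be sketched with placeholder callables. The two stand-in functions below are deliberately trivial and are NOT the paper's Laplace-VQVAE or MQFormer; they, the name `desnow` and the assumed background value exist only to show how the prior feeds the residual estimator and how the residual is subtracted from the input.

```python
import numpy as np

def desnow(i_snow, g_l, f_mq):
    """Eq. (7): subtract the prior-guided residual estimate from the input."""
    i_prior = g_l(i_snow)                  # coarse mask (Laplace prior)
    return i_snow - f_mq(i_snow, i_prior)  # residual removal

BACKGROUND = 0.2  # assumed constant background level of the toy scene

def toy_g_l(i_snow):
    """Stand-in mask generator: thresholds bright pixels (illustrative only)."""
    return (i_snow > 0.5).astype(float)

def toy_f_mq(i_snow, i_prior):
    """Stand-in recovery network: an oracle that knows the background level."""
    return i_prior * (i_snow - BACKGROUND)

clean = np.full((4, 4), BACKGROUND)
snowy = clean.copy()
snowy[1, 1] = 0.9                          # one snow pixel

restored = desnow(snowy, toy_g_l, toy_f_mq)
```

In this sketch the residual estimate is gated by the prior, so pixels outside the coarse mask pass through unchanged.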