

LMQFormer: A Laplace-Prior-Guided Mask Query Transformer for Lightweight Snow Removal

Junhong Lin, Nanfeng Jiang, Zhentao Zhang, Weiling Chen, Member, IEEE and Tiesong Zhao, Senior Member, IEEE

Abstract: Snow removal aims to locate snow areas and recover clean images without leaving repair traces. Unlike the regularity and semitransparency of rain, snow with various patterns and degradations seriously occludes the background. As a result, state-of-the-art snow removal methods usually retain a large parameter size. In this paper, we propose a lightweight but highly efficient snow removal network called Laplace Mask Query Transformer (LMQFormer). Firstly, we present a Laplace-VQVAE to generate a coarse mask as prior knowledge of snow. Instead of using the mask in the dataset, we aim at reducing both the information entropy of snow and the computational cost of recovery. Secondly, we design a Mask Query Transformer (MQFormer) to remove snow with the coarse mask, where we use two parallel encoders and a hybrid decoder to learn extensive snow features under lightweight requirements. Thirdly, we develop a Duplicated Mask Query Attention (DMQA) that converts the coarse mask into a specific number of queries, which constrain the attention areas of MQFormer with reduced parameters. Experimental results on popular datasets demonstrate the efficiency of our proposed model, which achieves state-of-the-art snow removal quality with significantly reduced parameters and the lowest running time. Codes and models are available at https://github.com/StephenLinn/LMQFormer.

Index Terms: Lightweight snow removal, Laplace operator, mask query transformer, image denoising, image enhancement.

I. INTRODUCTION

Snow seriously affects the visibility of scenes and objects. It usually leads to poor visual quality and severe performance degradation in high-level computer vision tasks such as object detection and semantic understanding. However, it is difficult to capture unified patterns in snowy scenes because snow varies in pattern and transparency. Unlike other types of image noise [1]–[4], snow seriously obscures the background and is thus difficult to remove. How to recover clean images from snowy scenes is still a challenging issue.

We divide existing snow removal methods into two types: traditional methods and deep-learning-based methods. Traditional methods rely on hand-crafted prior knowledge to model snowy layers, such as the HOG and MoG models [5], dictionary learning [6], color assumptions [7] and Hamiltonian quaternions [8]. Deep-learning-based methods take advantage of deep neural networks to remove undesired snow from images, such as DesnowNet [9], CGANs [10], JSTASR [11], HDCWNet [12] and DDMSNet [13].

(a) Input (b) DesnowNet

Fig. 1. Our proposed method achieves state-of-the-art snow removal quality with the lowest computational complexity. (a) A typical real-world snowy image. (b)-(d) The outputs of our method and its peers. (e) The average performance on Snow100K.

This work was supported in part by the National Natural Science Foundation of China (Grant No. 62171134) and in part by Natural Science Foundation of Fujian Province, China (Grants No. 2022J02015 and 2022J05117). (Corresponding author: Tiesong Zhao.)

J. Lin, N. Jiang, Z. Zhang and W. Chen are with the Fujian Key Lab for Intelligent Processing and Wireless Transmission of Media Information, College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China (E-mails: jhlin study@163.com, jnfrock@gmail.com, 211120091@fzu.edu.cn, weiling.chen@fzu.edu.cn).

T. Zhao is with the Fujian Key Lab for Intelligent Processing and Wireless Transmission of Media Information, College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China and also with the Peng Cheng Laboratory, Shenzhen 518055, China (e-mail: t.zhao@fzu.edu.cn).

Despite these great efforts, critical issues remain to be addressed. Existing works usually retain a large parameter size for better visual quality, but inevitably their computational workloads are also remarkably increased. This limits their application in real-world scenarios. Besides, repair traces still remain in their results, as shown by the red traffic signs in Fig. 1(b)(c). Therefore, it is essential to design a lightweight but highly efficient network for this task.

It is noted that existing rain removal methods cannot adequately address the snow problem because rain and snow differ visibly. As shown in Fig. 2(a), raindrops and rain streaks are densely distributed, while snowflakes vary in pattern. The image backgrounds are also easier to discern in rainy images, whereas snowy backgrounds are seriously obscured to different degrees even within the same scene. These differences make it difficult to use existing rain removal methods (e.g. [14], [15]) or lightweight methods (e.g. [16], [17]) to process snowy images. How to locate snow areas and recover clean images remains an important challenge in snow removal.

To solve the above problems, we propose a lightweight architecture called Laplace-prior-guided Mask Query Transformer (LMQFormer). We observe and demonstrate that the Laplace operator can remove redundant information while preserving high-frequency snow edge information. Thus, we take advantage of the lightweight Vector Quantised Variational AutoEncoder (VQVAE) and combine it with the Laplace operator to design a Laplace-VQVAE sub-network. This sub-network generates a coarse mask of the snow area, which can be treated as a prior, namely the Laplace prior, to guide snow removal. Furthermore, we design a lightweight sub-network called Mask Query Transformer (MQFormer), which uses this coarse mask to obtain high-quality snow-free images with fewer parameters. A Duplicated Mask Query Attention (DMQA) is designed to integrate the coarse mask and focus the MQFormer on snow areas.

Fig. 2. Examples of snowy and rainy images. (a) Typical snowy and rainy scenes [9], [18]. (b) Typical snowy and rainy masks [9], [19].

arXiv:2210.04787v4 [cs.CV] 6 Apr 2023

With this lightweight design, our LMQFormer attains the

state-of-the-art visual performance with the smallest parameter

size, as demonstrated in Fig. 1(d)(e). The main contributions

are summarized as follows:

• We propose a Laplace-VQVAE sub-network to generate a coarse mask as the prior of snow areas. The effectiveness and low entropy of this Laplace prior inspire us to design a lightweight network without training on masks.

• We design an MQFormer sub-network with DMQA modules for snow removal. The DMQA module employs duplicated mask queries to effectively incorporate the coarse mask, so that the MQFormer can focus on snow areas and avoid over-enhancement of other regions.

• Extensive experiments on popular benchmarks demonstrate that the proposed LMQFormer network has superior performance and high robustness. Our method also outperforms most state-of-the-art methods in terms of processing speed.

II. RELATED WORK

A. Traditional Snow Removal Methods

Prior knowledge has been used to guide traditional methods for snow removal. Bossu et al. [8] employed MoG to separate the foreground and used HOG features to detect snow in the foreground and recover clean images. Wang et al. [6] combined image decomposition with dictionary learning to model a three-layer hierarchical snow removal scheme. Zheng et al. [20] exploited the difference between snow and background and used a multi-guided filter to remove snow. Pei et al. [7] extracted snow features on saturation and visibility in the HSV color space for snow area detection and removal. Rajderkar et al. [21] applied a bilateral filter to decompose a snowy image into low-frequency and high-frequency parts, and then decomposed the high-frequency part into snowy foreground and clean background using dictionary learning and sparse coding. Voronin et al. [5] employed anisotropic gradients of Hamiltonian quaternions to detect snow areas and recover clean images. Although these methods are effective to some extent, they model only specific characteristics of snow, which leads to poor generalization in real-world scenarios.

B. Deep-learning-based Snow Removal Methods

Recently, deep learning has been successfully applied to the single-image snow removal task. Liu et al. [9] proposed the first deep-learning-based method, called DesnowNet. They applied a two-stage network to learn the mapping from snowy images to masks and recover clean images. Li et al. [10] used a Generative Adversarial Network (GAN) to recover clean images. Chen et al. [11] proposed the JSTASR model, which generates three different snow masks with a differentiable dark channel prior layer and guides image recovery with these masks. Li et al. [22] used architectural search and proposed an all-in-one network for snow, rain and fog removal. Chen et al. [12] proposed HDCWNet, which removes snow with a hierarchical dual-tree complex wavelet representation and a contradict channel loss. Zhang et al. [13] designed DDMSNet and introduced semantic and geometric images as prior knowledge to guide snow removal. Jaw et al. [23] presented an efficient and highly modularized network called DesnowGAN. Quan et al. [24] proposed an Invertible Neural Network (INN) to predict a latent image and a snowflake layer for snow removal. Chen et al. [25] used two-stage knowledge distillation learning to recover all bad-weather images, which is called TKL in this paper. In summary, these methods achieve promising performance in snow removal, but at the cost of a large parameter size. Besides, their recovered images may retain repair traces, as shown in Fig. 1(b)(c).

Different from these methods, we aim to explore a coarse-mask prior to guide a lightweight but highly efficient network. The proposed model also generates images with high visual quality, which will be validated in the experimental results.

C. Other Image Denoising Tasks

There are some other similar denoising tasks, including

rain removal (e.g. [26]), haze removal (e.g. [2]), low-light

enhancement (e.g. [27]) and underwater image enhancement

(e.g. [28]). For example, Zhang et al. [29] proposed a novel

single image deraining method called image de-raining con-

ditional generative adversarial network (ID-CGAN). Zhang et

al. [30] proposed an Enhanced Spatio-Temporal Integration

Network (ESTINet) to exploit spatio-temporal information for

rain streak removal. Agrawal et al. [31] presented a single

image dehazing method based on a superpixel, nonlinear

transformation. Wang et al. [32] introduced a normalizing ﬂow

model for low-light enhancement. Jiang et al. [33] exploited the potential of a lightweight network that benefits both effectiveness and efficiency for underwater image enhancement.

Fig. 3. Snow images processed by the Laplace operator. (a) Snow: a snowy image before and after being processed by the Laplace operator; (b) Clean: the clean background images of (a); (c) Mask: the snow masks of (a); (d) Residual: the residuals of (a)-(b).

III. METHODOLOGY

A. Problem Statement

As discussed in Section II, deep-learning-based snow removal methods take advantage of CNNs to recover clean images. Their deep neural networks learn a mapping from the input snowy image $I_{snow}$ to the output $I_{clean}$:

$$I_{clean} = F_m(I_{snow}; \theta), \tag{1}$$

where $F_m(\cdot)$ represents a deep neural network for mapping and $\theta$ represents the parameters of $F_m(\cdot)$. Considering the complicated patterns of snowflakes, existing methods tend to employ complicated networks with a large parameter size. Although they succeed in recovering clean images under snow, their parameter volumes limit their use in practical scenarios, e.g. surveillance and video. The snow removal task still calls for an effective model with a low parameter size and a high processing speed.

To address this issue, we revisit the physical model of a snowy image as shown in [9]:

$$I_{snow} = RM + C(1 - M), \tag{2}$$

where $R$, $C$ and $M$ are the chromatic aberration map, the latent clean image and the mask image of $I_{snow}$, respectively. This decomposition inspires us to design a mask-based snow removal approach. However, the snow mask is not available in practical use. Instead, we interpret a coarse mask and further utilize it to form a unified prior that benefits both snow location and removal. This task is thus defined as:

$$\begin{aligned} I_{prior} &= G_n(I_{snow}; \theta_A), \\ I_{clean} &= I_{snow} - F_n(I_{snow}, I_{prior}; \theta_B), \end{aligned} \tag{3}$$

where $I_{prior}$ represents this unified prior. $G_n(\cdot)$ and $F_n(\cdot)$ represent the sub-networks that generate $I_{prior}$ and recover clean images from $I_{snow}$ and $I_{prior}$, respectively.
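The composition model of Eq. (2) and the residual form of Eq. (3) can be illustrated with a small NumPy sketch. It assumes the products in Eq. (2) are elementwise, uses a binary toy mask and a constant white aberration map, and replaces the learned network output with the ideal residual; the function names `compose_snowy` and `recover_clean` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def compose_snowy(clean, mask, aberration):
    """Eq. (2): I_snow = R*M + C*(1 - M), with elementwise products (assumed)."""
    return aberration * mask + clean * (1.0 - mask)

def recover_clean(snowy, residual):
    """Second line of Eq. (3): I_clean = I_snow - residual."""
    return snowy - residual

rng = np.random.default_rng(0)
clean = rng.random((4, 4))                        # toy latent clean image C
mask = (rng.random((4, 4)) > 0.7).astype(float)   # toy binary snow mask M
aberration = np.ones((4, 4))                      # toy map R: pure white snow

snowy = compose_snowy(clean, mask, aberration)

# The residual a perfect recovery network would have to output.
ideal_residual = snowy - clean
restored = recover_clean(snowy, ideal_residual)
```

Under this model the ideal residual equals $M(R - C)$, so it is nonzero only where the mask is nonzero. This is one way to see why a prior that roughly locates snow can help the recovery network.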

Based on the above analyses, we attempt to construct a unified prior of snow via an interpreted coarse mask and further utilize it in recovering clean images. As shown in Eq. (3), the functions of our prior are twofold. First, it is utilized to coarsely locate snow areas. Second, it is combined with $I_{snow}$ to estimate the residuals between the input and output of our model. With the guidance of this prior, we can achieve better snow removal results under the lightweight requirement.

B. Laplace Prior

This paper calculates $I_{prior}$ with the sub-network $G_n(\cdot)$ and the Laplace operator. It is commonly understood that lower-entropy data eases feature extraction by a neural network; that is, the network can more easily learn the latent rules of the data. Here we define an optimization of the relative entropy between $I_{prior}$ and $I_{snow}$:

$$\begin{aligned} &\min_{q_\theta} D_{KL}\big(q_\theta(I_{prior} \mid I_{snow}) \,\|\, p(I_{prior})\big) \\ &\text{s.t. } I_{prior} \sim q_\theta(I_{prior} \mid I_{snow}) \iff I_{prior} = G_n(I_{snow}; \theta_A), \end{aligned} \tag{4}$$

where $p(I_{prior})$ and $q_\theta(I_{prior} \mid I_{snow})$ represent the prior and posterior probabilities of $I_{prior}$, respectively. This minimization process can help to design the network $G_n(\cdot)$ and its learnable parameters $\theta_A$.
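For intuition about the relative entropy in Eq. (4), the sketch below evaluates the KL divergence between two discrete distributions. It is only an illustration of the quantity being minimized: the function name `kl_divergence`, the `eps` smoothing term and the toy two-bin distributions are assumptions introduced here, not details taken from the paper.

```python
import numpy as np

def kl_divergence(q, p, eps=1e-12):
    """D_KL(q || p) in nats for two discrete distributions over the same bins."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    q = q / q.sum()  # normalise so both inputs are valid distributions
    p = p / p.sum()
    # eps avoids log(0); bins where q is zero contribute nothing to the sum.
    return float(np.sum(q * np.log((q + eps) / (p + eps))))

posterior = [0.5, 0.5]   # toy stand-in for q(I_prior | I_snow)
prior = [0.9, 0.1]       # toy stand-in for p(I_prior)

matched = kl_divergence(posterior, posterior)
mismatched = kl_divergence(posterior, prior)
```

The divergence is zero when the two distributions coincide and grows as they differ, which is the sense in which Eq. (4) pulls the generated prior toward a target distribution.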

In designing the network $G_n(\cdot)$, we amplify the snow features and eliminate irrelevant information to reduce $D_{KL}$. This can be achieved with the Laplace operator:

$$I_{lapsnow} = \nabla^2 I_{snow}. \tag{5}$$

Correspondingly, applying the Laplace operator to the clean image $I_{clean}$ yields $I_{lapclean}$. From Fig. 3, it is extremely easy to distinguish $I_{snow} - I_{clean}$ (i.e. the fourth image in the first row) from the snow mask (i.e. the third image in the first row). However, after being processed by the Laplace operator, the corresponding images are quite similar, as shown in the second row of Fig. 3(c)(d). Thus, we can use $I_{lapsnow} - I_{lapclean}$ as a coarse mask for snow removal.

With the Laplace operator, $G_n(I_{snow}; \theta_A)$ is replaced by $G_L(I_{snow}; \theta_A)$. Its training process can be expressed as:

$$\min_{G_L(\cdot)} \big\| G_L(I_{snow}; \theta_A) - (I_{lapsnow} - I_{lapclean}) \big\|_2, \tag{6}$$

where we attempt to train a Laplace prior to approximate the coarse mask without explicit knowledge of Fig. 3(c).
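The Laplace target of Eq. (5) and the training objective of Eq. (6) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the 4-neighbour discrete Laplacian with edge-replicated borders is one common discretisation and may differ from the kernel the authors use, the toy images are synthetic, and `laplacian`, `coarse_mask_target` and `prior_loss` are hypothetical names. No network is trained here; the loss is only evaluated for two fixed predictions.

```python
import numpy as np

def laplacian(image):
    """4-neighbour discrete Laplacian of a 2-D image, edge-replicated borders."""
    p = np.pad(image, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * image)

def coarse_mask_target(snowy, clean):
    """Training target of Eq. (6): Laplacian of snowy minus Laplacian of clean."""
    return laplacian(snowy) - laplacian(clean)

def prior_loss(predicted, snowy, clean):
    """L2 (Frobenius) norm between a predicted prior and the Laplace target."""
    return float(np.linalg.norm(predicted - coarse_mask_target(snowy, clean)))

# Toy data: a smooth horizontal ramp as background plus one bright snow pixel.
clean = np.tile(np.linspace(0.2, 0.6, 8), (8, 1))
snowy = clean.copy()
snowy[3, 4] = 1.0

target = coarse_mask_target(snowy, clean)
loss_perfect = prior_loss(target, snowy, clean)              # exact prediction
loss_blank = prior_loss(np.zeros_like(snowy), snowy, clean)  # all-zero prediction
```

Because the Laplacian is linear, the smooth background cancels in the target and only the snow pixel and its four neighbours remain nonzero, which matches the text's observation that the Laplace-filtered residual resembles a snow mask.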

C. Lightweight Design

With the Laplace operator, we can approximate the coarse mask as a prior, as shown in Eqs. (3) and (6), and further utilize it to guide the recovery process, as shown in Eq. (3). This operation allows us to compute more efficiently with low-entropy data. The complete model is designed as a lightweight network, which benefits from the following designs. First, we introduce spatial attention and a Codebook [34] for $G_L(\cdot)$ to effectively extract spatial features for the coarse mask. Second, we introduce the Mask Query Transformer Module (MQTM) to make $F_n(\cdot)$ concentrate on snow areas; thus, we denote our recovery network as $F_{MQ}(\cdot)$. Third, we design an efficient framework consisting of parallel transformer and convolutional encoders as well as a hybrid decoder to obtain more representative features of snow at low scales. This framework greatly reduces computational workload while maintaining high performance. Considering all of the above, our complete model is designed as:

$$I_{clean} = I_{snow} - F_{MQ}(I_{snow}, G_L(I_{snow}; \theta_A); \theta_B), \tag{7}$$

where $G_L(\cdot)$ and $F_{MQ}(\cdot)$ represent the coarse mask generator and the recovery network, respectively.
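The data flow of Eq. (7) can be summarised as a two-stage pipeline. The sketch below shows only that composition: `desnow` is a hypothetical wrapper, and `toy_generator` and `toy_recovery` are hand-written stand-ins for the coarse mask generator and the recovery network, not the Laplace-VQVAE or MQFormer architectures themselves.

```python
from typing import Callable

import numpy as np

Array = np.ndarray

def desnow(snowy: Array,
           mask_generator: Callable[[Array], Array],
           recovery_net: Callable[[Array, Array], Array]) -> Array:
    """Eq. (7): subtract the predicted residual, guided by the coarse mask."""
    prior = mask_generator(snowy)          # coarse mask from the generator
    residual = recovery_net(snowy, prior)  # residual predicted with the prior
    return snowy - residual

def toy_generator(snowy: Array) -> Array:
    # Stand-in: flag very bright pixels as snow.
    return (snowy > 0.9).astype(float)

def toy_recovery(snowy: Array, prior: Array) -> Array:
    # Stand-in: inside the mask, pull pixels toward mid-grey; elsewhere do nothing.
    return prior * (snowy - 0.5)

snowy = np.array([[0.2, 1.0],
                  [0.4, 0.95]])
restored = desnow(snowy, toy_generator, toy_recovery)
```

Passing the two sub-networks as callables keeps the sketch independent of any deep learning framework. In this toy run, pixels outside the coarse mask pass through unchanged, mirroring the stated goal of avoiding over-enhancement outside snow areas.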
