Perceptual Multi-Exposure Fusion
Xiaoning Liu
Abstract—With the ever-increasing demand for shooting high
dynamic range (HDR) scenes, multi-exposure image fusion (MEF)
technology has abounded. In recent years, multi-scale exposure
fusion approaches based on detail enhancement have led the
way in recovering highlight and shadow details. Most such
methods, however, are too computationally expensive to be
deployed on mobile devices. This paper presents a perceptual
multi-exposure fusion method that not only ensures fine
shadow/highlight details but also has lower complexity than
detail-enhanced methods. In lieu of a detail-enhancement
component, we analyze the potential defects of three classical
exposure measures and improve two of them, namely adaptive
well-exposedness (AWE) and the gradient of color images (3-D
gradient). AWE, designed in the YCbCr color space, considers the
differences between images with varying exposures; the 3-D
gradient is employed to extract fine details. We also build a
large-scale multi-exposure benchmark dataset for static scenes,
which contains 167 image sequences in total. Experiments on the
constructed dataset demonstrate that the proposed method
outperforms eight existing state-of-the-art approaches both
visually and in terms of MEF-SSIM. Moreover, our approach can
further improve current image enhancement techniques, ensuring
fine detail in bright light.
Index Terms—High dynamic range, multi-scale fusion, multi-
exposure fusion, Laplacian pyramid, image enhancement.
I. INTRODUCTION
HIGH dynamic range (HDR) imaging techniques can render
images captured under extremely bright or dark conditions
crisp and faithful to the real-world scene. HDR imaging is
increasingly applied to mobile devices, videos, autonomous
vehicles, and so on [1], [2]. Unfortunately, its wide range of
applications is hindered by expensive equipment and by the
reliance on tone mapping operations [3]–[8] for visualization
on standard displays with a limited dynamic range. To mitigate
these limitations, multi-exposure image fusion (MEF)
technology, also termed exposure bracketing [9], merges
multiple images of the same scene captured with different
exposure times into a spectacular HDR image abounding with
desirable detail. Since MEF technology simplifies the HDR
imaging pipeline, it has recently been adopted by smart
cameras, particularly smartphones. MEF
techniques, however, inevitably lead to unwelcome artifacts
like ghosting and tearing when encountering moving objects
or camera shake [9]–[12]. To overcome this challenge, many
HDR deghosting algorithms [13]–[20] have been proposed.
Among them, Tursun et al. carried out an in-depth survey of
HDR deghosting [16] and proposed an objective deghosting
quality metric to avoid the bias of subjective evaluations [17].
Although ghost-free methods have made significant headway
over the past decade, removing ghosting artifacts is still the
greatest challenge for MEF and HDR imaging in dynamic
scenes [21]. This work assumes that all input images are
well-aligned for static scenes.

X. Liu is with the School of Information and Communication Engineering,
University of Electronic Science and Technology of China, Chengdu, 611731,
China. E-mail: liuxiaoning2016@sina.com

The past two decades have seen a significant amount of
work in the MEF community. Existing methods are
generally classified into four categories: multi-scale transform-
based methods, statistical model-based methods, patch-based
methods, and deep learning-based methods.
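All four categories ultimately rest on a common core: estimate per-pixel
weights for each exposure and blend the stack accordingly. The following is a
minimal sketch of that core, not the implementation of any cited method; the
quality measures loosely follow the contrast/saturation/well-exposedness trio
popularized by Mertens et al. [22], and the single-scale weighted average shown
here is precisely what multi-scale schemes exist to improve upon (it tends to
produce seams).

```python
import numpy as np

def quality_weights(img, sigma=0.2):
    """Per-pixel weights for one exposure (img: H x W x 3 floats in [0, 1]).

    Contrast: absolute response of a 4-neighbour Laplacian on the grayscale.
    Saturation: standard deviation across the RGB channels.
    Well-exposedness: Gaussian closeness of each channel to mid-gray (0.5).
    """
    gray = img.mean(axis=2)
    # 4-neighbour Laplacian via edge padding and shifts (no SciPy needed).
    p = np.pad(gray, 1, mode="edge")
    lap = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4 * gray
    contrast = np.abs(lap)
    saturation = img.std(axis=2)
    well_exposed = np.exp(-((img - 0.5) ** 2) / (2 * sigma ** 2)).prod(axis=2)
    # Small epsilon keeps the later normalization well-defined everywhere.
    return contrast * saturation * well_exposed + 1e-12

def fuse_naive(stack):
    """Single-scale weighted average of an exposure stack (N x H x W x 3)."""
    weights = np.stack([quality_weights(im) for im in stack])
    weights /= weights.sum(axis=0, keepdims=True)  # normalize across exposures
    return (weights[..., None] * stack).sum(axis=0)
```

The multi-scale transform-based methods reviewed next replace the naive
average with a pyramid blend so that weight-map discontinuities do not leave
visible seams.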
A. Multi-Scale Transform-Based Methods
Guided by three intrinsic image quality measures, namely
contrast, saturation, and well-exposedness, Mertens
et al. [22] constructed weight maps to blend multiple
exposure images in the framework of Laplacian pyramid (LP)
[23]. Since multi-scale technique can reduce unpleasant halo
artifacts around edges and alleviate the seam problem across
object boundaries to some extent, multi-scale transform-based
MEF approaches [22], [24]–[27], especially those based on
LP [22], [25]–[27], have gained in popularity. Two
novel exposure measures in [28], visibility and consistency,
are developed based on the careful observation that the
gradient magnitude gradually decreases in over-/under-exposed
areas and the gradient direction changes as objects
move. To reduce the loss of detail in
multi-scale fusion, Li et al. [29] introduced a quadratic
optimization scheme in the gradient field. A sharper image
is finally synthesized by combining the extracted details with
an intermediate image generated by the MEF method [22].
Unfortunately, the method in [29] cannot be deployed on
mobile devices because a quadratic optimization problem
must be solved by means of an iterative method. Shen
et al. [30] derived a novel boosting Laplacian pyramid that
boosts the structures of the detail and base layers, respectively,
and designed a hybrid exposure weight. As the optimal weights
in [30], computed by a global optimization, may over-smooth
the final weight map, Li et al. [31], [32] used edge-preserving
filters, namely the recursive filter [33] and the guided filter [34],
to refine the resulting weight map under a two-scale framework.
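The two-scale idea can be sketched as follows. This is an illustrative
simplification, not the implementation of [31] or [32]: a plain box filter
stands in for their edge-preserving recursive/guided filters, so the sketch
conveys the base/detail split and the differently smoothed weights but omits
the halo suppression that the edge-preserving filters provide.

```python
import numpy as np

def box_blur(x, r):
    """Mean filter with a (2r+1)^2 window via a zero-padded integral image."""
    k = 2 * r + 1
    p = np.pad(x, r, mode="edge")
    c = np.cumsum(np.cumsum(p, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))  # leading zeros so window sums index cleanly
    return (c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]) / (k * k)

def fuse_two_scale(stack, weights, r_base=8, eps=1e-12):
    """Two-scale fusion of an exposure stack (N x H x W x 3).

    Each image is split into a base layer (box blur) and a detail layer
    (residual). Base layers are blended with heavily smoothed weights,
    detail layers with lightly smoothed weights. `weights` is N x H x W,
    e.g. from per-pixel quality measures.
    """
    w_base = np.stack([box_blur(w, r_base) for w in weights])
    w_detail = np.stack([box_blur(w, 1) for w in weights])
    w_base /= w_base.sum(axis=0, keepdims=True) + eps
    w_detail /= w_detail.sum(axis=0, keepdims=True) + eps
    base = np.stack([np.dstack([box_blur(im[..., c], r_base)
                                for c in range(3)]) for im in stack])
    detail = stack - base
    return (w_base[..., None] * base + w_detail[..., None] * detail).sum(axis=0)
```

Smoothing the base-layer weights aggressively avoids seams in flat regions,
while the lightly smoothed detail weights keep fine structure crisp; the cited
methods obtain both properties at once by making the smoothing edge-aware.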
Furthermore, to ensure high levels of detail with well-
controlled artifacts even in low/high-light scenarios, the works
in [35]–[37] recently integrated the multi-scale technique with
a detail-enhanced smoothing pyramid relying on the weighted
guided image filter [38], the gradient domain guided image
filter [39], and the fast weighted least squares filter [40],
respectively. Note that the detail-enhancement technology
in [35] is also beneficial to low-light and back-light imaging.
Because the fast weighted least squares based optimization is
subject to a gradient constraint, the detail extraction component
in [37] is significantly faster than that of [29]. Experimental results
in [41] demonstrated that [36] ranks first according to quality
metric MEF-SSIM [42]. Even though [35]–[37] are capable of
arXiv:2210.09604v3 [cs.CV] 5 Mar 2025