
Overexposure Mask Fusion: Generalizable
Reverse ISP Multi-Step Refinement
Jinha Kim1,2, Jun Jiang2, and Jinwei Gu2
1MIT, Cambridge, MA
jinhakim@mit.edu
2SenseBrain Technology, San Jose, CA
{jinhakim,jiangjun,gujinwei}@sensebrain.site
Abstract. With the advent of deep learning methods that replace the
ISP in transforming sensor RAW readings into RGB images, numerous
methodologies have matured into real-life applications. Equally
important is the task of inverting this process, which has applications
in enhancing computational photography tasks conducted in the RAW
domain: it addresses the lack of available RAW data while reaping the
benefits of performing tasks directly on sensor readings. This paper's
proposed methodology is a state-of-the-art solution to the task of RAW
reconstruction, and the multi-step refinement process integrating an
overexposure mask is novel in three ways: instead of mapping from RGB
to bayer, the pipeline trains from RGB to demosaiced RAW, allowing the
use of perceptual loss functions; the multi-step process greatly
enhances the performance of the baseline U-Net from start to end; and
the pipeline is a generalizable process of refinement that can enhance
other high-performance methodologies that support end-to-end learning.
Keywords: ISP, Reversed ISP, Demosaiced RAW, Multi-Step Refinement,
Overexposure Mask
1 Introduction
The image signal processor (ISP) denotes a collection of operations
integrated in today's digital cameras that maps camera sensor readings
into visually pleasing RGB images. A popular area of research related to
the ISP is the task of mapping from RAW data to RGB images with deep
learning-based methodologies. Motivated by applications such as mobile
cameras, which have small sensors and other hardware limitations,
various methodologies [7,11,12] have been developed to address this task.
A related problem with equally important applications is the reversed
task of mapping from RGB images to RAW data, which is a novel problem in
low-level computer vision. Unlike RGB images, RAW data holds a linear
relationship with scene irradiance, which has led to improved
performance in various computer vision tasks. Numerous works have
addressed the task of RAW reconstruction [1,2,3,9,12,13], with solutions
ranging from canonical steps approximated by invertible functions [2],
to mapping sRGB images back to the CIE-XYZ space [1], to a novel modular
and differentiable ISP model with interpretable parameters that is
capable of end-to-end learning [3]. Given the inherent advantages of RAW
data, the task of reconstructing RAW data from RGB images has become
exceedingly relevant, especially since RAW data is often unavailable due
to factors such as memory-related concerns or data storage processes
that discard the RAW.
arXiv:2210.11511v1 [cs.CV] 20 Oct 2022
However, the task of RAW data reconstruction remains a novel area of
research with complexities and limitations that are yet to be fully
addressed. For instance, as noted by Conde et al. [3], approximations
using inverse functions for real-world ISPs show degradation in
performance when a large portion of the RGB image is close to
overexposure. The overexposure mask fusion in our pipeline is a novel
component that specifically addresses this issue by mapping overexposed
and non-overexposed pixels separately and fusing them together using an
overexposure mask.
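The fusion idea above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the Rec. 709 luma weights used to approximate per-pixel luminance, and the threshold value of 0.95 are all assumptions made for illustration, and the two RAW estimates stand in for the outputs of whatever networks handle the overexposed and non-overexposed pixels.

```python
import numpy as np

# Hypothetical threshold on luminance in [0, 1]; the value used in the
# paper is not stated in this section.
OVEREXPOSURE_THRESHOLD = 0.95


def overexposure_mask(rgb, threshold=OVEREXPOSURE_THRESHOLD):
    """Return an (H, W, 1) binary mask that is 1 where a pixel is overexposed.

    `rgb` is a float array of shape (H, W, 3) with values in [0, 1].
    Assumption: luminance is approximated with Rec. 709 luma weights.
    """
    weights = np.array([0.2126, 0.7152, 0.0722], dtype=rgb.dtype)
    luminance = rgb @ weights  # weighted sum over the channel axis -> (H, W)
    return (luminance >= threshold).astype(rgb.dtype)[..., None]


def fuse(raw_normal, raw_overexposed, mask):
    """Take the overexposed estimate where mask == 1, the normal one elsewhere."""
    return mask * raw_overexposed + (1.0 - mask) * raw_normal
```

Because the mask is binary, the fusion reduces to a per-pixel selection between the two estimates; writing it as a weighted sum keeps it differentiable with respect to both inputs if it is placed inside a training loop.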
Among the AIM challenges, which span different research problems [6],
the AIM Reversed ISP Challenge [4] tasked competing teams with
reconstructing RAW data from RGB images. Our methodology was a top
solution in this challenge and can therefore be regarded as a
state-of-the-art solution to this novel inverse problem. By generating a
demosaiced RAW from the groundtruth bayer using Demosaic Net [5] and
mapping from RGB to demosaiced RAW, we allow the use of perceptual
losses. With our novel overexposure mask fusion methodology, our
pipeline addresses the issue of overexposed pixels mentioned by Conde et
al. [3]. Most notably, the pipeline led to significant enhancement in
fidelity measures while keeping every neural network within the pipeline
a U-Net [10]. Furthermore, our methodology can incorporate other
proposed state-of-the-art solutions involving end-to-end learning after
slight modifications to map from RGB images to demosaiced RAW images.
For instance, the model proposed by Conde et al. [3] can be integrated
with our refinement pipeline by making small modifications such as
removing the final mosaic step and generating demosaiced RAW groundtruth
images for training in order to use perceptual loss. We propose, to the
best of our knowledge, the first generalizable, multi-step refinement
process for enhancing the performance of other reversed ISPs while
addressing the issue of overexposure.
2 Related Works
Works such as [7,11,12] have addressed the task of mapping from RAW data
to RGB images, modeling the camera ISP. Schwartz et al. [11] propose a
full end-to-end deep learning model of the ISP, which has been
demonstrated to be capable of generating visually compelling RGB images
from RAW data. Ignatov et al. [7] propose another end-to-end deep
learning solution with the use of a novel PyNET CNN architecture, and
Xing et al. [12] designed an invertible ISP that is capable of
generating visually pleasing RGB images from RAW data as well as RAW
reconstruction. Another work is CycleISP [13], which models the ISP in
both the forward and reverse directions.
There have also been various works addressing the task of RAW
reconstruction from RGB images [1,2,3,9,12,13]. Brooks et al. [2]
propose an unprocessing technique for RAW reconstruction that inverts
the ISP pipeline with five canonical steps approximated by invertible
functions, while CIE-XYZ Net [1] recovers RAW data in the CIE-XYZ space
from sRGB images. Conde et al. [3] proposed a novel modular and
differentiable ISP model with interpretable parameters and canonical
camera operations that is capable of end-to-end learning of parameter
representations. Punnappurath et al. [9] proposed modifications to the
loss used for training neural network-based compression architectures to
account for both sRGB image fidelity and RAW reconstruction errors,
while modeling the sRGB-RAW mapping with locally differentiable 3D
lookup tables. As mentioned above for the task of mapping from RAW to
RGB, CycleISP [13] and the invertible ISP model proposed by Xing et al.
[12] are also capable of RAW reconstruction. Several of these works
report improved RAW image denoising performance after integrating their
RAW reconstruction approaches [2,3,13], which suggests further
applications of RAW reconstruction.
To evaluate the performance of different solutions on the task of RAW
reconstruction, the AIM Reversed ISP Challenge [4] used two datasets for
training: the Samsung S7 dataset [11] and the ETH Huawei P20 Pro dataset
[7]. The Samsung S7 dataset [11] consists of 110 scenes captured at
3024 × 4032 resolution with a Samsung S7 rear camera, stored as JPEG
images together with the original RAW images. The ETH Huawei P20 Pro
dataset [7] is a large-scale dataset consisting of 20 thousand photos
collected using a Huawei P20 smartphone (12.3 MP Sony Exmor IMX380
sensor), pairing the captured RAW images with the RGB images obtained
with Huawei's built-in ISP. For both tracks (one per dataset),
participants were evaluated on the fidelity measures PSNR and SSIM, and
were also tested for the generalizability and robustness of their
proposed methods.
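For reference, PSNR follows the standard definition 10 log10(MAX^2 / MSE). The sketch below is a minimal NumPy version of that definition, not the challenge's official scoring script; the function name and the default `max_value=1.0` (images normalized to [0, 1]) are assumptions.

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape.

    Assumption: pixel values are normalized so that `max_value` is the peak.
    """
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

With this definition, a uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01 and therefore a PSNR of 20 dB.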
3 Methodology
3.1 Network Architecture
The schematic representation of the overall pipeline is outlined in
Fig. 1. The general structure of the pipeline consists of unprocessing
the input RGB image to its original demosaiced RAW, after which a simple
mosaic is performed to recover the bayer. For training, the pipeline
involves generating new demosaiced RAW groundtruth images by passing the
groundtruth bayer through a pretrained Demosaic Net [5]. Notably, unlike
methodologies that map directly from RGB to bayer, the proposed pipeline
maps initially from RGB to demosaiced RAW, which enables the use of
perceptual loss functions [14].
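The final mosaic step can be sketched as plain subsampling of the three-channel demosaiced RAW. This is an illustrative sketch under stated assumptions, not the authors' code: the function name is hypothetical, and an RGGB color filter layout with a single-channel (H, W) output is assumed. The actual sensor layout and the packing of the bayer data depend on the dataset.

```python
import numpy as np


def mosaic_rggb(demosaiced):
    """Sample an (H, W, 3) demosaiced RAW image into an (H, W) Bayer mosaic.

    Assumption: RGGB layout, i.e. each 2x2 tile is
        R G
        G B
    """
    h, w, _ = demosaiced.shape
    bayer = np.empty((h, w), dtype=demosaiced.dtype)
    bayer[0::2, 0::2] = demosaiced[0::2, 0::2, 0]  # red sites
    bayer[0::2, 1::2] = demosaiced[0::2, 1::2, 1]  # green sites on red rows
    bayer[1::2, 0::2] = demosaiced[1::2, 0::2, 1]  # green sites on blue rows
    bayer[1::2, 1::2] = demosaiced[1::2, 1::2, 2]  # blue sites
    return bayer
```

Because this step only selects existing values, it has no learnable parameters, which is consistent with the text describing it as a simple mosaic applied after the learned RGB-to-demosaiced-RAW mapping.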
A binary overexposure mask is constructed by computing the illuminance for
each pixel of the input RGB image. Based on a certain threshold, pixels are then