
Overexposure Mask Fusion: Generalizable Reverse ISP Multi-Step Refinement

Jinha Kim1,2, Jun Jiang2, and Jinwei Gu2

1MIT, Cambridge, MA

jinhakim@mit.edu

2SenseBrain Technology, San Jose, CA

{jinhakim,jiangjun,gujinwei}@sensebrain.site

Abstract. With the advent of deep learning methods replacing the ISP in
transforming sensor RAW readings into RGB images, numerous methodologies have
solidified into real-life applications. Equally potent is the task of inverting
this process, which has applications in enhancing computational photography
tasks conducted in the RAW domain: it addresses the lack of available RAW data
while reaping the benefits of performing tasks directly on sensor readings.
This paper's proposed methodology is a state-of-the-art solution to the task of
RAW reconstruction, and the multi-step refinement process integrating an
overexposure mask is novel in three ways: instead of mapping from RGB to bayer,
the pipeline trains from RGB to demosaiced RAW, allowing the use of perceptual
loss functions; the multi-step process greatly enhances the performance of the
baseline U-Net from start to end; and the pipeline is a generalizable
refinement process that can enhance other high-performance methodologies that
support end-to-end learning.

Keywords: ISP, Reversed ISP, Demosaiced RAW, Multi-Step Refinement, Overexposure Mask

1 Introduction

An image signal processor (ISP) denotes a collection of operations integrated
in today's digital cameras that maps camera sensor readings into visually
pleasing RGB images. A popular area of research related to the ISP is the task
of mapping from RAW data to RGB images with deep learning-based methodologies.
With applications such as mobile cameras, which have small sensors and other
hardware limitations, various methodologies [7,11,12] have been developed to
address this task.

A related problem, with applications as potent as those of mapping RAW data to
RGB images, is the reversed task of mapping from RGB images to RAW data, a
novel problem in low-level computer vision. Unlike RGB images, RAW data holds a
linear relationship with scene irradiance, which has led to improved
performance in various computer vision tasks. Numerous works have addressed the
task of RAW reconstruction [1,2,3,9,12,13], with solutions ranging from
canonical steps approximated by invertible functions [2], to recovering RAW
data in the CIE-XYZ space from sRGB images [1], to a novel modular and
differentiable ISP model with interpretable parameters that is capable of
end-to-end learning [3]. Given these inherent advantages of RAW data, the task
of reconstructing RAW data from RGB images has become exceedingly relevant,
especially given the lack of availability of RAW data due to factors such as
memory-related concerns or data storage processes that discard the RAW.

arXiv:2210.11511v1 [cs.CV] 20 Oct 2022

However, RAW data reconstruction remains a novel area of research with
complexities and limitations that are yet to be fully addressed. For instance,
as noted by Conde et al. [3], approximations using inverse functions for
real-world ISPs degrade in performance when a large portion of the RGB image is
close to overexposure. Overexposure mask fusion, a novel portion of our
pipeline, specifically addresses this issue by mapping overexposed and
non-overexposed pixels separately and fusing them together using an
overexposure mask.
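
The fusion idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than the paper's implementation: the function names, the Rec. 709 luma weights used as a stand-in for per-pixel illuminance, the threshold of 0.95, and the hard per-pixel selection between the two branch outputs. Images are plain nested lists so that the example stays self-contained.

```python
def luminance(rgb):
    """Approximate luminance of an (r, g, b) pixel in [0, 1] (Rec. 709 weights, assumed)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def overexposure_mask(image, threshold=0.95):
    """Binary mask: 1 where a pixel's luminance reaches the (assumed) threshold, else 0."""
    return [[1 if luminance(px) >= threshold else 0 for px in row] for row in image]


def fuse(raw_normal, raw_overexposed, mask):
    """Per pixel, take the overexposed-branch prediction where mask is 1,
    and the non-overexposed-branch prediction elsewhere."""
    return [
        [over if m else norm for norm, over, m in zip(row_n, row_o, row_m)]
        for row_n, row_o, row_m in zip(raw_normal, raw_overexposed, mask)
    ]
```

Here `raw_normal` and `raw_overexposed` stand for the outputs of two separately trained mappings evaluated on the same input; only the mask decides which prediction each pixel keeps.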

Among the various AIM challenges, each with a different research problem [6],
the AIM Reversed ISP Challenge [4] gave competing teams the task of
reconstructing RAW data from RGB images. Our methodology is a top solution in
this challenge and was therefore evaluated as a state-of-the-art solution to
the novel inverse problem. By generating a demosaiced RAW from the groundtruth
bayer using Demosaic Net [5] and mapping from RGB to this demosaiced RAW, we
allow the use of perceptual losses. With our novel overexposure mask fusion
methodology, our pipeline addresses the issue of overexposed pixels raised by
Conde et al. [3]. Most notably, the pipeline led to a significant enhancement
in fidelity measures while keeping every neural network within the pipeline a
U-Net [10]. Furthermore, our methodology can incorporate other proposed
state-of-the-art solutions involving end-to-end learning after slight
modifications to map from RGB images to demosaiced RAW images. For instance,
the model proposed by Conde et al. [3] can be integrated with our refinement
pipeline by making small modifications such as removing the final mosaic step
and generating demosaiced RAW groundtruth images for training in order to use
perceptual loss. We propose, to the best of our knowledge, the first
generalizable, multi-step refinement process for enhancing the performance of
other reversed ISPs while addressing the issue of overexposure.

2 Related Works

Works such as [7,11,12] have addressed the task of mapping from RAW data to RGB
images, modeling the camera ISP. Schwartz et al. [11] propose a full end-to-end
deep learning model of the ISP, which has been demonstrated to be capable of
generating visually compelling RGB images from RAW data. Ignatov et al. [7]
propose another end-to-end deep learning solution using a novel PyNET CNN
architecture, and Xing et al. [12] designed an invertible ISP that is capable
of both generating visually pleasing RGB images from RAW data and RAW
reconstruction. Another work is CycleISP [13], which models the ISP in both the
forward and reverse directions.

There have also been various works addressing the task of RAW reconstruction
from RGB images [1,2,3,9,12,13]. Brooks et al. [2] propose an unprocessing
technique for RAW reconstruction that inverts the ISP pipeline with five
canonical steps approximated by invertible functions, while CIE-XYZ Net [1]
recovers the RAW data to the CIE-XYZ space from sRGB images. Conde et al. [3]
proposed a novel modular and differentiable ISP model with interpretable
parameters and canonical camera operations that is capable of end-to-end
learning of parameter representations. Punnappurath et al. [9] proposed
modifications to the loss used for training neural network-based compression
architectures to account for both sRGB image fidelity and RAW reconstruction
errors, while modeling the sRGB-RAW mapping with locally differentiable 3D
lookup tables. Previously mentioned for the task of mapping from RAW to RGB,
CycleISP [13] and the invertible ISP model proposed by Xing et al. [12] are
also capable of RAW reconstruction. Several of these works have reported
improved RAW image denoising performance after integrating their RAW
reconstruction approaches [2,3,13], which suggests further applications of RAW
reconstruction.
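
To make "a canonical step approximated by an invertible function" concrete, the sketch below inverts one well-known ISP stage, the standard sRGB transfer curve. This is a generic illustration only: it is not claimed to be the gamma model used in [2] or in any of the cited works, and the function names are hypothetical.

```python
def linear_to_srgb(value):
    """Forward step: encode a linear intensity in [0, 1] with the sRGB curve."""
    if value <= 0.0031308:
        return 12.92 * value
    return 1.055 * value ** (1 / 2.4) - 0.055


def srgb_to_linear(value):
    """Inverse step: decode an sRGB-encoded value in [0, 1] back to linear intensity."""
    if value <= 0.04045:
        return value / 12.92
    return ((value + 0.055) / 1.055) ** 2.4
```

Applying `srgb_to_linear` after `linear_to_srgb` returns the original value up to floating-point error, which is the property an unprocessing pipeline relies on at each invertible stage. Real ISPs contain steps such as clipping and quantization that have no exact inverse, which is why such inversions are approximations.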

To evaluate the performance of different solutions on the task of RAW
reconstruction in the AIM Reversed ISP Challenge [4], two datasets were used
for training: the Samsung S7 dataset [11] and the ETH Huawei P20 Pro dataset
[7]. The Samsung S7 dataset [11] consists of 110 scenes of 3024 × 4032
resolution captured as JPEG images with a Samsung S7 rear camera, with the
original RAW images saved as well. The ETH Huawei P20 Pro dataset [7] is a
large-scale dataset consisting of 20 thousand photos collected using a Huawei
P20 smartphone, pairing the captured RAW images with the RGB images obtained
with Huawei's built-in ISP (12.3 MP Sony Exmor IMX380). For both tracks,
participants were evaluated on the fidelity measures PSNR and SSIM, and
proposed methods were also tested for generalizability and robustness.
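
For reference, PSNR follows directly from the mean squared error between a reconstruction and its groundtruth. The sketch below uses the standard definition; the function name, the flat-list input format, and the assumption that values are normalized to [0, 1] are illustrative choices, not the challenge's evaluation code.

```python
import math


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length flat pixel lists."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)
```

A uniform error of 0.1 on data normalized to [0, 1] gives an MSE of 0.01 and therefore a PSNR of about 20 dB. SSIM is not sketched here because it requires local windowed statistics.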

3 Methodology

3.1 Network Architecture

The schematic representation of the overall pipeline is outlined in Fig. 1. The
general structure of the pipeline consists of unprocessing the input RGB image
to its original demosaiced RAW, after which a simple mosaic is performed to
recover the bayer. For training, the pipeline involves generating new
groundtruth images by passing the groundtruth bayer through a pretrained
Demosaic Net [5], which reconstructs the demosaiced RAW. Notably, unlike
methodologies that map directly from RGB to bayer, the proposed pipeline maps
initially from RGB to demosaiced RAW, which enables the use of perceptual loss
functions [14].
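
The final mosaic step amounts to keeping, at each pixel location, only the channel that the sensor's color filter would have measured there. The sketch below assumes an RGGB filter layout and a single-channel H × W output; both are assumptions for illustration, since the layout depends on the sensor and the datasets may store the bayer in a different packing. The function name is hypothetical.

```python
def mosaic_rggb(demosaiced):
    """Sample an H x W x 3 image (nested lists of (r, g, b) tuples) down to an
    H x W Bayer mosaic, assuming an RGGB color filter array."""
    # (row parity, column parity) -> channel index kept at that location
    channel_at = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 2}
    return [
        [px[channel_at[(y % 2, x % 2)]] for x, px in enumerate(row)]
        for y, row in enumerate(demosaiced)
    ]
```

For a 2 × 2 input, the output keeps red from the top-left pixel, green from the top-right and bottom-left pixels, and blue from the bottom-right pixel. Because this step only discards values, it needs no learned parameters.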

A binary overexposure mask is constructed by computing the illuminance for

each pixel of the input RGB image. Based on a certain threshold, pixels are then
