On-the-go Reflectance Transformation Imaging
with Ordinary Smartphones
Mara Pistellato1[0000-0001-6273-290X] and Filippo Bergamasco1[0000-0001-6668-1556]
DAIS, Università Ca' Foscari Venezia
Via Torino 155, Venezia, Italy
{mara.pistellato,filippo.bergamasco}@unive.it
Abstract. Reflectance Transformation Imaging (RTI) is a popular tech-
nique that allows the recovery of per-pixel reflectance information by
capturing an object under different light conditions. This can later be used
to reveal surface details and interactively relight the subject. This process,
however, typically requires dedicated hardware setups to recover the light
direction from multiple locations, making it tedious to perform outside the lab.
We propose a novel RTI method that can be carried out by recording
videos with two ordinary smartphones. The flash LED light of one
device is used to illuminate the subject while the other captures the
reflectance. Since the LED is mounted close to the camera lens, we can
infer the light direction for thousands of images by freely moving the
illuminating device while observing a fiducial marker surrounding the
subject. To deal with such an amount of data, we propose a neural relighting
model that reconstructs the object appearance for arbitrary light directions
from extremely compact reflectance distribution data compressed
via Principal Component Analysis (PCA). Experiments show that the
proposed technique can be easily performed in the field, with a resulting
RTI model that can outperform state-of-the-art approaches involving
dedicated hardware setups.
Keywords: Reflectance Transformation Imaging; Neural Network; Cam-
era Pose Estimation; Interactive Relighting
1 Introduction
In Reflectance Transformation Imaging (RTI) an object is acquired under
different known light conditions to approximate the per-pixel Bi-directional
Reflectance Distribution Function (BRDF) from a static viewpoint. This process
is commonly used to produce relightable images for Cultural Heritage applications
[19,6] or to perform material quality analysis [4] and surface normal
reconstruction. The flexibility of this method makes it suitable for several
materials, and the resulting images can reveal new information about the object
under study, such as manufacturing techniques, surface conditions or conservation
treatments. Among the variety of practical applications in the Cultural Heritage
field, we can mention enhanced visualisation [6,21], documentation and preser-
vation [16,13,15], as well as surface analysis [3]. Moreover, RTI techniques can
be effectively paired with other tools, such as 3D reconstruction [36,23,24,25] or
multispectral imaging [8], to further improve the results.
In the majority of cases, the acquisition of RTI data is carried out with
specialised hardware involving a light dome and other custom devices that need
complex initial calibration. Since the amount of processed data is significant,
several compression methods have been proposed for RTI data representation
to obtain efficient storage and interactive rendering [27,9]. In addition, some
proposals focus on the need for low-cost, portable solutions [12,38,28],
including mobile devices [31], to perform the computation in the field.
In this paper we first propose a low-cost acquisition pipeline that requires
only a pair of ordinary smartphones and a simple marker printed on a flat surface.
During the process, the two smartphones simultaneously record two videos: one
device acts as a static camera observing the object from a fixed viewpoint,
while the other provides a trackable moving light source. The two videos are
synchronised, and the marker is then used to recover the light position with
respect to a common reference frame, yielding a sequence of intensity images
paired with light directions. The second contribution of our work is an efficient
and accurate neural-network model describing the per-pixel reflectance based on
PCA-compressed intensity data. We tested the proposed relighting approach
both on a synthetic RTI dataset, involving different surfaces and materials, and
on several real-world objects acquired in the field.
2 Related Work
The literature includes a large number of methods for both the acquisition
and the processing of RTI data for relighting. In [22] the authors give a compre-
hensive survey on Multi-Light Image Collections (MLICs) for surface analysis.
Many approaches employ the classical polynomial texture maps (PTM) [14] to (i) define
the per-pixel light function, (ii) store a representation of the acquired data, and
(iii) dynamically render the image under new lights (the biquadratic PTM form is
recalled at the end of this paragraph). Similar techniques are the
so-called Hemispherical Harmonics coefficients [17] and Discrete Modal Decom-
position [26]. In [9] the authors propose a new method based on Radial Basis
Function (RBF) interpolation, while in [27] a compact representation for web
visualisation employing PCA is presented. The authors in [18] present the High-
light Reflectance Transformation Imaging (H-RTI) framework, where the light
direction is estimated by detecting its specular reflection on one or more spherical
objects captured in the scene. However, such a setup involves several assumptions,
such as constant light intensity and an orthographic camera model, that in practice
make the model unstable. Other techniques estimating the light direction directly
from scene features are proposed in [1,2], while the authors in [9] propose
a novel framework expanding the H-RTI technique.
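For reference, the biquadratic per-pixel model fitted by PTM [14] expresses the observed luminance at pixel $(u,v)$ as a function of the projected light direction $(\ell_u, \ell_v)$:

$$L(u,v;\ell_u,\ell_v) = a_0\,\ell_u^2 + a_1\,\ell_v^2 + a_2\,\ell_u \ell_v + a_3\,\ell_u + a_4\,\ell_v + a_5,$$

where the six coefficients $a_0,\ldots,a_5$ are estimated per pixel by least squares from the acquired samples and stored in place of the full image stack.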
Recently, neural networks have been employed successfully in several Com-
puter Vision tasks, including RTI. In particular, the encoder-decoder architec-
[Figure 1: pipeline diagram — Video acquisition → Synchronisation & Marker detection → MLIC generation → Light vector compression (PCA) → Model training → Relighting.]
Fig. 1. Complete mobile-based RTI acquisition and relighting pipeline.
ture is used in several applications for effective data compression [33]. The work
in [30] presents an NN-based method that models light transport as a non-linear
function of light position and pixel coordinates to perform image relighting. Other
related works using neural networks are [39], in which a subset of optimal light
directions is selected, and [29], where a convolutional approach is adopted. The authors
in [5] propose an autoencoder architecture to perform relighting of RTI data: the
architecture is composed of an encoder part, where pixel-wise acquired values
are compressed, and a decoder part that uses the light information to output the
expected pixel value. They also propose two benchmark datasets for evaluation.
3 Proposed Method
Our method follows the classical procedure employed in the vast majority of ex-
isting RTI applications: the whole pipeline is presented in Figure 1. First, several
images of the object under study are acquired while varying the lighting conditions.
In our case, this is done using the on-board cameras and flash light of a pair
of ordinary smartphones recording two videos. The two videos are then
synchronised and the smartphone positions with respect to the scene are recovered
using a fiducial marker: in this way we obtain a light position and a reflectance
image for each frame. These data are processed to create a model that maps each
(pixel, light direction) pair to an observed reflectance value. Section 3.1 gives
a detailed description of this process. The result is a Multi-Light Image Collection
(MLIC), which is efficiently compressed by projecting light vectors to a
lower-dimensional space via PCA. Then, we design a neural model, defined as
a small Multi-Layer Perceptron (MLP), to decode the compressed light vectors
and extrapolate the expected intensity of a pixel given a light direction. In
Section 3.2 the neural reflectance model and the data compression are illustrated
in detail. Finally, the trained model is used to dynamically relight the object by
setting the light direction to any (possibly unseen) value; a compact sketch of
these two stages is given below.
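As a rough illustration, here is a minimal Python sketch of the compression and decoding stages; the number of PCA components, the layer sizes and the two-dimensional light parametrisation are placeholder assumptions, not the actual configuration detailed in Section 3.2:

import torch
import torch.nn as nn
from sklearn.decomposition import PCA

def compress_mlic(mlic, n_components=8):
    # mlic: NumPy array of shape (N_lights, H, W); every pixel contributes
    # a vector of N_lights observed intensities, projected onto the first
    # n_components principal components.
    n, h, w = mlic.shape
    pixels = mlic.reshape(n, h * w).T          # (H*W, N_lights)
    pca = PCA(n_components=n_components)
    coeffs = pca.fit_transform(pixels)         # (H*W, n_components)
    return coeffs, pca

class RelightingMLP(nn.Module):
    # Decode (per-pixel PCA coefficients, light direction) into the
    # expected pixel luminance.
    def __init__(self, n_components=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_components + 2, hidden), nn.ReLU(),  # +2 for (lx, ly)
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, coeffs, light_dir):
        # coeffs: (B, n_components), light_dir: (B, 2) -> (B, 1) intensity
        return self.net(torch.cat([coeffs, light_dir], dim=-1))

At inference time the network is evaluated for every pixel with its stored coefficients and the desired (possibly unseen) light direction, producing the relit image.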
3.1 Data Acquisition
Data acquisition is performed using two smartphones and a custom fiducial
marker, as shown in Figure 2 (left). The object to acquire is placed at the centre
of a marker composed of a thick black square with a white dot at one corner.
[Figure 2: left, the acquisition setup showing the static device, the moving device with its flash light on, the acquired object and the marker; right, example frames.]
Fig. 2. Left: Proposed RTI acquisition setup. Right: Example frames acquired by the
static and moving devices.
One device is located above the object, with the camera facing it frontally so
that it produces images as depicted in Figure 2 (top-right). This device, called
static, must not move throughout the acquisition, so we suggest attaching it to a
tripod. The second device, called moving, is manually moved around the object
along an orbiting trajectory. The flash LED located next to the rear-facing
camera must be kept on at all times to illuminate the object from different
locations. This allows the static device to observe how the reflectance of each
pixel changes while the light source moves.
Both devices record a video during the acquisition. For now, let us consider
those videos as just sequences of images perfectly synchronised in time. In
other words, the acquisition consists of a sequence of $M$ images
$(I^s_0, I^s_1, \ldots, I^s_M)$ acquired from the static device, paired with a
sequence $(I^m_0, I^m_1, \ldots, I^m_M)$ acquired from the moving device at
times $t_0, t_1, \ldots, t_M$.
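As a toy illustration of this pairing, assuming per-frame timestamps are available for both videos (the tolerance below is an arbitrary placeholder, and the synchronisation itself is outside the scope of this sketch):

import bisect

def pair_frames(static_times, moving_times, tol=0.02):
    # Pair every static frame with the moving frame closest in time.
    # Both lists hold per-frame timestamps in seconds, sorted ascending;
    # pairs further apart than `tol` seconds are discarded.
    pairs = []
    for i, t in enumerate(static_times):
        j = bisect.bisect_left(moving_times, t)
        candidates = [k for k in (j - 1, j) if 0 <= k < len(moving_times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(moving_times[k] - t))
        if abs(moving_times[best] - t) <= tol:
            pairs.append((i, best))
    return pairs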
After video acquisition, each image is processed to detect the fiducial marker.
For the static camera, this operation is needed to locate the four corners
$(c_0, c_1, c_2, c_3)$ of the inner white square (i.e. the internal part of the
marker inside the thick black border). This region is then cropped to create a
sequence of images $(I_0, \ldots, I_N)$, each composed of $W \times H$ pixels,
commonly referred to as a Multi-Light Image Collection (MLIC). Note that $N$ can
be lower than $M$ because the fiducial marker must be detected in both $I^s_i$
and $I^m_i$ for frame $i$ to be added to the MLIC.
Each $I_i$ is a single-channel grayscale image containing only the luminance
of the original $I^s_i$ image. We decided to model only the reflectance intensity
(and not the wavelength) as a function of the light's angle of incidence for two
reasons. First, we cannot change the colour of the light source and, second, it
is uncommon to have iridescent materials where the incident angle affects the
reflectance spectrum [11]. Therefore, we convert all the images to the YUV colour
space and keep only the luminance (Y) channel.
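For instance, each cropped frame can be reduced to its luminance as follows; a minimal sketch, recalling that OpenCV decodes video frames in BGR order:

import cv2

def to_luminance(bgr_frame):
    # Convert to the YUV colour space and keep only the Y (luminance) channel.
    return cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2YUV)[:, :, 0]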