On-the-go Reflectance Transformation Imaging
with Ordinary Smartphones
Mara Pistellato1[0000-0001-6273-290X] and Filippo Bergamasco1[0000-0001-6668-1556]
DAIS, Università Ca' Foscari Venezia
Via Torino 155, Venezia, Italy
{mara.pistellato,filippo.bergamasco}@unive.it
Abstract. Reflectance Transformation Imaging (RTI) is a popular tech-
nique that allows the recovery of per-pixel reflectance information by
capturing an object under different light conditions. This can later be used
to reveal surface details and interactively relight the subject. This process,
however, typically requires dedicated hardware setups to recover the light
direction from multiple locations, making it tedious to perform outside the lab.
We propose a novel RTI method that can be carried out by recording
videos with two ordinary smartphones. The flash LED light of one
device is used to illuminate the subject while the other captures the
reflectance. Since the LED is mounted close to the camera lens, we can
infer the light direction for thousands of images by freely moving the
illuminating device while observing a fiducial marker surrounding the
subject. To deal with such an amount of data, we propose a neural relighting
model that reconstructs the object appearance for arbitrary light directions
from extremely compact reflectance distribution data compressed
via Principal Component Analysis (PCA). Experiments show that the
proposed technique can be easily performed in the field, with a resulting
RTI model that can outperform state-of-the-art approaches involving
dedicated hardware setups.
Keywords: Reflectance Transformation Imaging; Neural Network; Cam-
era Pose Estimation; Interactive Relighting
1 Introduction
In Reflectance Transformation Imaging (RTI) an object is acquired under
different known light conditions to approximate the per-pixel Bi-directional
Reflectance Distribution Function (BRDF) from a static viewpoint. This process
is commonly used to produce relightable images for Cultural Heritage applications
[19,6] or to perform material quality analysis [4] and surface normal
reconstruction. The flexibility of this method makes it suitable for several
materials, and the resulting images can reveal new information about the object
under study, such as manufacturing techniques, surface conditions or conservation
treatments. Among the variety of practical applications in the Cultural Heritage
field, we can mention enhanced visualisation [6,21], documentation and preser-
vation [16,13,15], as well as surface analysis [3]. Moreover, RTI techniques can
be effectively paired with other tools, such as 3D reconstruction [36,23,24,25] or
multispectral imaging [8], to further improve the results.
In the majority of cases, the acquisition of RTI data is carried out with
specialised hardware involving a light dome and other custom devices that need
complex initial calibration. Since the amount of processed data is significant,
several compression methods have been proposed for RTI data representation
to obtain efficient storage and interactive rendering [27,9]. In addition, some
proposals focus on the need for low-cost, portable solutions [12,38,28],
including mobile devices [31], to perform the computation in the field.
In this paper we first propose a low-cost acquisition pipeline that requires
only a pair of ordinary smartphones and a simple marker printed on a flat surface.
During the process, the two smartphones simultaneously record two videos: one
device acts as a static camera observing the object from a fixed viewpoint,
while the other provides a trackable moving light source. The two videos are
synchronised, and the marker is then used to recover the light position with
respect to a common reference frame, yielding a sequence of intensity images
paired with light directions. The second contribution of our work is an efficient
and accurate neural-network model describing the per-pixel reflectance based on
PCA-compressed intensity data. We tested the proposed relighting approach
both on a synthetic RTI dataset, involving different surfaces and materials, and
on several real-world objects acquired in the field.
2 Related Work
The literature includes a large number of methods for both the acquisition
and the processing of RTI data for relighting. In [22] the authors give a compre-
hensive survey on Multi-Light Image Collections (MLICs) for surface analysis.
Many approaches employ the classical polynomial texture maps (PTM) [14] to (i) define
the per-pixel light function, (ii) store a representation of the acquired data, and
(iii) dynamically render the image under new lights (the biquadratic PTM form is
recalled at the end of this paragraph). Similar techniques are the
so-called Hemispherical Harmonics coefficients [17] and Discrete Modal Decom-
position [26]. In [9] the authors propose a new method based on Radial Basis
Function (RBF) interpolation, while in [27] a compact representation for web
visualisation employing PCA is presented. The authors in [18] present the High-
light Reflectance Transformation Imaging (H-RTI) framework, where the light
direction is estimated by detecting its specular reflection on one or more spherical
objects captured in the scene. However, such a setup involves several assumptions,
such as constant light intensity and an orthographic camera model, that in practice
make the model unstable. Other techniques estimating the light direction directly
from scene features are proposed in [1,2], while the authors in [9] propose
a novel framework expanding the H-RTI technique.
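For reference, the biquadratic per-pixel model fitted by PTM [14] expresses the observed luminance at pixel $(u,v)$ as a function of the projected light direction $(\ell_u, \ell_v)$:

$$L(u,v;\ell_u,\ell_v) = a_0\,\ell_u^2 + a_1\,\ell_v^2 + a_2\,\ell_u \ell_v + a_3\,\ell_u + a_4\,\ell_v + a_5,$$

where the six coefficients $a_0,\ldots,a_5$ are estimated per pixel by least squares from the acquired samples and stored in place of the full image stack.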
Recently, neural networks have been employed successfully in several Com-
puter Vision tasks, including RTI. In particular, the encoder-decoder architec-
[Figure 1: pipeline diagram — Video acquisition → Synchronisation & Marker detection → MLIC generation → Light vector compression (PCA) → Model training → Relighting.]
Fig. 1. Complete mobile-based RTI acquisition and relighting pipeline.
ture is used in several applications for effective data compression [33]. The work
in [30] presents an NN-based method that models light transport as a non-linear
function of light position and pixel coordinates to perform image relighting. Other
related works using neural networks are [39], in which a subset of optimal light
directions is selected, and [29], where a convolutional approach is adopted. The authors
in [5] propose an autoencoder architecture to perform relighting of RTI data: the
architecture is composed of an encoder part, where pixel-wise acquired values
are compressed, and a decoder part that uses the light information to output the
expected pixel value. They also propose two benchmark datasets for evaluation.
3 Proposed Method
Our method follows the classical procedure employed in the vast majority of ex-
isting RTI applications: the whole pipeline is presented in Figure 1. First, several
images of the object under study are acquired while varying the lighting conditions.
In our case, this is done using the on-board cameras and flash light of a pair
of ordinary smartphones recording two videos. The two videos are then
synchronised and the smartphone positions with respect to the scene are recovered
using a fiducial marker: in this way we obtain a light position and a reflectance
image for each frame. These data are processed to create a model that maps each
(pixel, light direction) pair to an observed reflectance value. Section 3.1 gives
a detailed description of this process. The result is a Multi-Light Image Collection
(MLIC), which is efficiently compressed by projecting light vectors to a
lower-dimensional space via PCA. Then, we design a neural model, defined as
a small Multi-Layer Perceptron (MLP), to decode the compressed light vectors
and extrapolate the expected intensity of a pixel given a light direction. In
Section 3.2 the neural reflectance model and the data compression are illustrated
in detail. Finally, the trained model is used to dynamically relight the object by
setting the light direction to any (possibly unseen) value; a compact sketch of
these two stages is given below.
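As a rough illustration, here is a minimal Python sketch of the compression and decoding stages; the number of PCA components, the layer sizes and the two-dimensional light parametrisation are placeholder assumptions, not the actual configuration detailed in Section 3.2:

import torch
import torch.nn as nn
from sklearn.decomposition import PCA

def compress_mlic(mlic, n_components=8):
    # mlic: NumPy array of shape (N_lights, H, W); every pixel contributes
    # a vector of N_lights observed intensities, projected onto the first
    # n_components principal components.
    n, h, w = mlic.shape
    pixels = mlic.reshape(n, h * w).T          # (H*W, N_lights)
    pca = PCA(n_components=n_components)
    coeffs = pca.fit_transform(pixels)         # (H*W, n_components)
    return coeffs, pca

class RelightingMLP(nn.Module):
    # Decode (per-pixel PCA coefficients, light direction) into the
    # expected pixel luminance.
    def __init__(self, n_components=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_components + 2, hidden), nn.ReLU(),  # +2 for (lx, ly)
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, coeffs, light_dir):
        # coeffs: (B, n_components), light_dir: (B, 2) -> (B, 1) intensity
        return self.net(torch.cat([coeffs, light_dir], dim=-1))

At inference time the network is evaluated for every pixel with its stored coefficients and the desired (possibly unseen) light direction, producing the relit image.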
3.1 Data Acquisition
Data acquisition is performed using two smartphones and a custom fiducial
marker, as shown in Figure 2 (left). The object to acquire is placed at the centre
of a marker composed of a thick black square with a white dot at one corner.
[Figure 2: left, the acquisition setup showing the static device, the moving device with its flash light on, the acquired object and the marker; right, example frames.]
Fig. 2. Left: Proposed RTI acquisition setup. Right: Example frames acquired by the
static and moving devices.
One device is located above the object, with the camera facing it frontally so
that it produces images as depicted in Figure 2 (top-right). This device, called
static, must not move throughout the acquisition, so we suggest attaching it to a
tripod. The second device, called moving, is manually moved around the object
along an orbiting trajectory. The flash LED located next to the rear-facing
camera must be kept on at all times to illuminate the object from different
locations. This allows the static device to observe how the reflectance of each
pixel changes while the light source moves.
Both devices record a video during the acquisition. For now, let us consider
those videos as just sequences of images perfectly synchronised in time. In
other words, the acquisition consists of a sequence of $M$ images
$(I^s_0, I^s_1, \ldots, I^s_M)$ acquired from the static device, paired with a
sequence $(I^m_0, I^m_1, \ldots, I^m_M)$ acquired from the moving device at
times $t_0, t_1, \ldots, t_M$.
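As a toy illustration of this pairing, assuming per-frame timestamps are available for both videos (the tolerance below is an arbitrary placeholder, and the synchronisation itself is outside the scope of this sketch):

import bisect

def pair_frames(static_times, moving_times, tol=0.02):
    # Pair every static frame with the moving frame closest in time.
    # Both lists hold per-frame timestamps in seconds, sorted ascending;
    # pairs further apart than `tol` seconds are discarded.
    pairs = []
    for i, t in enumerate(static_times):
        j = bisect.bisect_left(moving_times, t)
        candidates = [k for k in (j - 1, j) if 0 <= k < len(moving_times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(moving_times[k] - t))
        if abs(moving_times[best] - t) <= tol:
            pairs.append((i, best))
    return pairs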
After video acquisition, each image is processed to detect the fiducial marker.
For the static camera, this operation is needed to locate the four corners
$(c_0, c_1, c_2, c_3)$ of the inner white square (i.e. the internal part of the
marker inside the thick black border). This region is then cropped to create a
sequence of images $(I_0, \ldots, I_N)$, each composed of $W \times H$ pixels,
commonly referred to as a Multi-Light Image Collection (MLIC). Note that $N$ can
be lower than $M$ because the fiducial marker must be detected in both $I^s_i$
and $I^m_i$ for frame $i$ to be added to the MLIC.
Each $I_i$ is a single-channel grayscale image containing only the luminance
of the original $I^s_i$ image. We decided to model only the reflectance intensity
(and not the wavelength) as a function of the light's angle of incidence for two
reasons. First, we cannot change the colour of the light source and, second, it
is uncommon to have iridescent materials where the incident angle affects the
reflectance spectrum [11]. Therefore, we convert all the images to the YUV colour
space and keep only the luminance (Y) channel.
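For instance, each cropped frame can be reduced to its luminance as follows; a minimal sketch, recalling that OpenCV decodes video frames in BGR order:

import cv2

def to_luminance(bgr_frame):
    # Convert to the YUV colour space and keep only the Y (luminance) channel.
    return cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2YUV)[:, :, 0]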