VM-NeRF: Tackling Sparsity in NeRF with View Morphing

Matteo Bortolon¹,²,³, Alessio Del Bue², and Fabio Poiesi¹,²

¹ TeV, Fondazione Bruno Kessler

² PAVIS, Istituto Italiano di Tecnologia

³ DISI, University of Trento

{mbortolon,poiesi}@fbk.eu

{alessio.delbue}@iit.it

Abstract. NeRF aims to learn a continuous neural scene representation by using a finite set of input images taken from various viewpoints. A well-known limitation of NeRF methods is their reliance on data: the fewer the viewpoints, the higher the likelihood of overfitting. This paper addresses this issue by introducing a novel method to generate geometrically consistent image transitions between viewpoints using View Morphing. Our VM-NeRF approach requires no prior knowledge about the scene structure, as View Morphing is based on the fundamental principles of projective geometry. VM-NeRF tightly integrates this geometric view generation process during the training procedure of standard NeRF approaches. Notably, our method significantly improves novel view synthesis, particularly when only a few views are available. Experimental evaluation reveals consistent improvement over current methods that handle sparse viewpoints in NeRF models. We report an increase in PSNR of up to 1.8 dB and 1.0 dB when training uses eight and four views, respectively.

Source code: https://github.com/mbortolon97/VM-NeRF

1 Introduction

Novel View Synthesis (NVS) is the problem of synthesising unseen camera views from a set of known views⁴ [29,8]. NVS is a key technology that can enable compelling augmented or virtual reality experiences [10], new entertainment technology [6], and robotics applications [11]. NVS has improved significantly since the introduction of Neural Radiance Fields (NeRF) [17,2], a trainable implicit neural representation of a 3D scene that can photorealistically render unseen (novel) views. NeRF is a data-driven model that can synthesise high-quality novel views, but it generally requires many multi-view images, e.g. hundreds of images taken from different and uniformly distributed camera viewpoints around an object of interest [17]. If these viewpoints are few and/or not uniformly distributed, the resulting NeRF model may fail to produce satisfactory novel views [12,16]. This is a known drawback of NeRF-based approaches: the model is likely to overfit the known viewpoints, which reduces generalisation to novel views that are furthest from the given viewpoints. This is known as the few-shot view synthesis problem [12].

⁴ Throughout the paper, we use the term viewpoint to refer to the camera pose, view to refer to the scene seen through a certain viewpoint, and image to refer to the photometric content captured from a view.

arXiv:2210.04214v2 [cs.CV] 16 Aug 2023

[Figure 1: NeRF results without and with NeRF-based VM.]

Fig. 1: Given a set of known views (ground truth), View Morphing-NeRF (VM-NeRF) generates image transitions between views (morph) that can be effectively used to train a NeRF model in the case of few-shot view synthesis. Results are of a higher quality when VM-NeRF is used.

In this paper, we propose to tackle the problem of training a NeRF model on scenes captured with a sparse set of viewpoints by using a novel geometric strategy based on View Morphing [24] (Fig. 1). This purely geometric method can synthesise, or morph, a new viewpoint that lies in between two given camera views while ensuring realistic image transitions. Traditionally, view morphing requires a set of accurate point matches between known image pairs in order to perform the morph successfully. As this matching stage is hard to integrate into a NeRF-based learning pipeline, our intuition is to leverage the per-image depth information implicitly estimated by NeRF to obtain dense coordinate matches among views after an image rectification stage (Fig. 2). To this end, we relax and modify several steps of the view morphing strategy so that it can be integrated into the NeRF learning paradigm. This technique does not require any prior knowledge about the captured 3D scene, and it can synthesise 3D projective transformations (e.g. 3D rotations, translations, shears) of objects by operating entirely on the input images. We evaluate our approach on the dataset of the original NeRF paper [17] and show that PSNR improves by up to 1.8 dB and 1.0 dB when eight and four views are used for training, respectively. We compare our approach with DietNeRF [12], AugNeRF [5] and RegNeRF [19], and show that it can produce higher-quality renderings.
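As a rough illustration of how depth can replace explicit point matching, the sketch below covers only the textbook case of two already-rectified (parallel) cameras. A depth value gives a disparity, the disparity gives the matching pixel column in the second view, and linear interpolation of the two columns gives the column in an in-between view. The names `focal_px`, `baseline` and `s`, and the assumption of a principal point at zero, are introduced for this example only; this is not the VM-NeRF implementation, which relaxes and modifies several steps of view morphing.

```python
def disparity_from_depth(depth, focal_px, baseline):
    """Horizontal disparity (in pixels) of a point at `depth` for a rectified pair."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline / depth


def match_in_second_view(x0, depth, focal_px, baseline):
    """Column of the matching pixel in the second rectified image."""
    return x0 - disparity_from_depth(depth, focal_px, baseline)


def morph_column(x0, depth, focal_px, baseline, s):
    """Column of the same scene point in a view at fraction s in [0, 1]
    along the baseline, found by linearly interpolating the match."""
    x1 = match_in_second_view(x0, depth, focal_px, baseline)
    return (1.0 - s) * x0 + s * x1


def project_column(point_x, point_z, camera_x, focal_px):
    """Pinhole projection onto a camera shifted along x (principal point at 0)."""
    return focal_px * (point_x - camera_x) / point_z


if __name__ == "__main__":
    focal_px, baseline = 100.0, 0.2  # hypothetical camera parameters
    point_x, point_z = 1.0, 2.0      # hypothetical scene point
    x0 = project_column(point_x, point_z, 0.0, focal_px)
    halfway = morph_column(x0, point_z, focal_px, baseline, s=0.5)
    direct = project_column(point_x, point_z, 0.5 * baseline, focal_px)
    print(halfway, direct)  # the two values should agree
```

For parallel cameras the interpolated column coincides with the projection from a camera placed at the same fraction of the baseline, which is why the morph is geometrically consistent in this simplified setting.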

To summarise, our contributions are:

– We present a novel and effective method for NeRF to address the problem of few-shot view synthesis;

– We introduce a new view morphing technique based on the NeRF depth output, named VM-NeRF;

– VM-NeRF can achieve higher-quality rendered images than alternative methods in the literature.
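The quality gains quoted in this section are expressed in PSNR. For reference, the following is a minimal sketch of the standard PSNR definition for images with values in [0, 1]. The function names and the flat-list image representation are assumptions made for this example and are not taken from the paper's code.

```python
import math


def mse(image_a, image_b):
    """Mean squared error between two equally sized flat lists of pixel values."""
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and of equal size")
    return sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)


def psnr(image_a, image_b, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(image_a, image_b)
    if error == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of `gain_db`."""
    return 10.0 ** (gain_db / 10.0)
```

Under this definition, and for a single image, a gain of 1.8 dB corresponds to dividing the mean squared error by roughly 1.5, and a gain of 1.0 dB to dividing it by roughly 1.26.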

2 Related work

NVS can be solved either by using traditional 3D reconstruction techniques [23] or by adopting methods based on neural rendering [26]. Neural Radiance Fields (NeRF) is a recent neural rendering method that learns a volumetric representation of an unknown 3D scene, approximating its radiance and density fields from a set of known (ground truth) views by using a multilayer perceptron (MLP) [17]. NeRF optimises its parameters on a single scene based on a set of known views, so overfitting can occur when these views are few.
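To make the radiance and density fields more concrete, the sketch below shows the standard NeRF quadrature for a single ray: each sample contributes its colour, weighted by its own opacity and by the transmittance accumulated before it. The same weights also yield an expected depth along the ray, which is one common way to read a depth value out of a NeRF; whether VM-NeRF computes depth in exactly this way is an assumption here. In practice the densities and colours come from the MLP, whereas in this example they are plain lists with hypothetical values.

```python
import math


def ray_weights(sigmas, deltas):
    """Per-sample weights T_i * (1 - exp(-sigma_i * delta_i)) along one ray."""
    weights = []
    optical_depth = 0.0  # sum of sigma * delta over the samples already visited
    for sigma, delta in zip(sigmas, deltas):
        transmittance = math.exp(-optical_depth)
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        optical_depth += sigma * delta
    return weights


def render_ray(sigmas, deltas, colours, distances):
    """Return the composited RGB colour and the expected depth of one ray."""
    weights = ray_weights(sigmas, deltas)
    colour = [sum(w * c[k] for w, c in zip(weights, colours)) for k in range(3)]
    depth = sum(w * t for w, t in zip(weights, distances))
    return colour, depth


if __name__ == "__main__":
    # Empty space, then a nearly opaque green surface, then something behind it.
    colour, depth = render_ray(
        sigmas=[0.0, 50.0, 5.0],
        deltas=[1.0, 1.0, 1.0],
        colours=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
        distances=[1.0, 2.0, 3.0],
    )
    print(colour, depth)
```

With few training views, many different density fields can reproduce the known images equally well, which is one way to picture the overfitting mentioned above.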

Current approaches addressing few-shot novel view synthesis can be divided into two groups. The first group uses the same trained network to generate novel views of different scenes. Methods in this group train on datasets of similar scenes, such as DTU [1]. Multiple-scene training can introduce dataset biases and may produce low-quality results in contexts outside the training domain [27,18]. SparseNeuS [14] and ShaRF [22] train NVS on multiple scenes by conditioning the MLP with features that encode the appearance and geometry of the surface at a 3D location. This can be achieved by using an auxiliary deep network jointly trained with NeRF. The second group uses the original per-scene optimisation procedure of NeRF, so a single network is trained and tested on only one scene, which avoids dataset bias. These methods are more likely to overfit the known views, but they reduce this likelihood by adding either semantic or geometric constraints during training. DietNeRF belongs to this group: it computes feature representations of the known images with a pre-trained CLIP image encoder, renders random poses, and imposes semantic consistency on them through CLIP features [12]. RegNeRF [19] renders random viewpoints around the known ones, and introduces regularisation constraints between known viewpoints and randomly sampled ones.
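A semantic constraint of the kind described for DietNeRF can be pictured as a penalty that grows when the embedding of a rendered view drifts away from the embedding of a known view. The sketch below uses a cosine-similarity term on plain vectors. The vectors stand in for the output of a pre-trained image encoder such as CLIP, no real encoder is called, and the function names are hypothetical; the exact loss used in [12] may differ.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("vectors must be non-zero")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def semantic_consistency_loss(known_embedding, rendered_embedding):
    """Zero when the two embeddings point the same way, larger otherwise."""
    return 1.0 - cosine_similarity(known_embedding, rendered_embedding)
```

A term of this kind constrains what a rendered view depicts rather than its individual pixels, whereas the approach proposed in this paper supplies additional geometrically consistent views as training data.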

Single-scene methods working with few viewpoints may overfit the known images, producing artefacts when novel views are rendered. In general, overfitting can be mitigated via data augmentation [25]. To the best of our knowledge, the only methods that address data augmentation for NeRF are AugNeRF [5] and GeoAug [4]. AugNeRF aims to improve NeRF generalisation by using adversarial data augmentation to force each ray and its augmented version to produce the same result. GeoAug [4] perturbs the translation and rotation of the known viewpoints during training. Our proposed approach does not perturb the known input views and rays; instead, we create new views (novel 3D projective
