VM-NeRF: Tackling Sparsity in NeRF with View Morphing
Matteo Bortolon1,2,3, Alessio Del Bue2, and Fabio Poiesi1,2
1TeV, Fondazione Bruno Kessler
2PAVIS, Istituto Italiano di Tecnologia
3DISI, University of Trento
{mbortolon,poiesi}@fbk.eu
{alessio.delbue}@iit.it
Abstract. NeRF aims to learn a continuous neural scene representation
by using a finite set of input images taken from various viewpoints. A
well-known limitation of NeRF methods is their reliance on data: the
fewer the viewpoints, the higher the likelihood of overfitting. This paper
addresses this issue by introducing a novel method to generate
geometrically consistent image transitions between viewpoints using View
Morphing. Our VM-NeRF approach requires no prior knowledge about the
scene structure, as View Morphing is based on the fundamental principles
of projective geometry. VM-NeRF tightly integrates this geometric view
generation process during the training procedure of standard NeRF
approaches. Notably, our method significantly improves novel view
synthesis, particularly when only a few views are available. Experimental
evaluation reveals consistent improvement over current methods that handle
sparse viewpoints in NeRF models. We report an increase in PSNR of up
to 1.8dB and 1.0dB when training uses eight and four views, respectively.
Source code: https://github.com/mbortolon97/VM-NeRF
1 Introduction
Novel View Synthesis (NVS) is the problem of synthesising unseen camera views
from a set of known views⁴ [29,8]. NVS is a key technology that can enable
compelling augmented or virtual reality experiences [10], new entertainment
technology [6], and robotics applications [11]. NVS improved significantly
after the introduction of Neural Radiance Fields (NeRF) [17,2], a trainable
implicit neural representation of a 3D scene that can photorealistically
render unseen (novel) views. NeRF is a data-driven model that can synthesise
high-quality novel views, but it generally requires many multi-view images,
e.g. hundreds of images taken from different and uniformly distributed
camera viewpoints around an object of interest [17]. If these viewpoints are few
and/or not uniformly distributed, the resulting NeRF model may fail to produce
satisfactory novel views [12,16]. This detrimental effect is a known drawback of
NeRF-based approaches: the model tends to overfit the known viewpoints and
generalises poorly to novel views that are far from them. This is known as
the few-shot view synthesis problem [12].
⁴ Throughout the paper, we use the term viewpoint to refer to the camera pose,
view to refer to the scene seen through a certain viewpoint, and image to refer to
the photometric content captured from a view.
arXiv:2210.04214v2 [cs.CV] 16 Aug 2023
[Figure 1 panel labels: NeRF-based results without VM and with VM.]
Fig. 1: Given a set of known views (ground truth), View Morphing-NeRF
(VM-NeRF) generates image transitions between views (morph) that can be
effectively used to train a NeRF model in the case of few-shot view synthesis.
Results are of a higher quality when VM-NeRF is used.
In this paper, we propose to tackle the problem of training a NeRF model on
scenes captured with a sparse set of viewpoints by using a novel geometry-based
strategy based on View Morphing [24] (Fig. 1). This purely geometric method
can synthesise or morph a new viewpoint that lies in-between two given camera
views while ensuring realistic image transitions. Traditionally, view morphing
requires a set of accurate point matches between known image pairs in order to
successfully perform the morph. As this matching stage is hard to integrate into
a NeRF-based learning pipeline, our intuition is to leverage the per-image depth
information implicitly estimated by NeRF to obtain dense coordinate matches
among views after an image rectification stage (Fig. 2). To this end, we
relax and modify several steps of the view morphing strategy so that they
integrate properly into the NeRF learning paradigm. This technique does not
require any prior knowledge about the captured 3D scene, and it can synthesise
3D projective transformations (e.g. 3D rotations, translations, shears) of
objects by operating entirely on the input images. We evaluate our approach on
the dataset of the original NeRF paper [17] and we show that PSNR improves by
up to 1.8dB and 1.0dB when eight and four views are used for training,
respectively. We compare our approach with DietNeRF [12], AugNeRF [5] and
RegNeRF [19], and show that our approach can produce higher-quality renderings.
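The idea of replacing point matching with depth-derived correspondences can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera with intrinsics `K` shared by both views, a depth map measured along the camera z-axis, a known rigid transform `(R, t)` from the first camera frame to the second, and an image pair that is already rectified so that linear interpolation of matched pixel positions is geometrically meaningful. The function names `depth_correspondences` and `morph_positions` are hypothetical.

```python
import numpy as np

def depth_correspondences(depth0, K, R, t):
    """Map every pixel of view 0 into view 1 using the depth of view 0.

    depth0: (H, W) depth along the camera z-axis (assumed given, e.g. rendered).
    K:      (3, 3) pinhole intrinsics, assumed identical for both views.
    R, t:   rotation (3, 3) and translation (3,) taking view-0 camera
            coordinates to view-1 camera coordinates.
    Returns an (H, W, 2) array of (x, y) pixel coordinates in view 1.
    """
    H, W = depth0.shape
    ys, xs = np.mgrid[0:H, 0:W]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(float)
    rays = pix @ np.linalg.inv(K).T      # back-project pixels to rays with z = 1
    pts0 = rays * depth0[..., None]      # 3D points in the view-0 frame
    pts1 = pts0 @ R.T + t                # same points in the view-1 frame
    proj = pts1 @ K.T                    # project into view 1
    return proj[..., :2] / proj[..., 2:3]

def morph_positions(p0, p1, s):
    """Linearly interpolate matched pixel positions, s in [0, 1]."""
    return (1.0 - s) * p0 + s * p1

# Toy example: fronto-parallel plane at depth 2, pure horizontal baseline.
K = np.array([[100.0, 0.0, 2.0], [0.0, 100.0, 2.0], [0.0, 0.0, 1.0]])
depth0 = np.full((4, 5), 2.0)
p1 = depth_correspondences(depth0, K, np.eye(3), np.array([0.1, 0.0, 0.0]))
ys, xs = np.mgrid[0:4, 0:5]
p0 = np.stack([xs, ys], axis=-1).astype(float)
p_mid = morph_positions(p0, p1, 0.5)     # pixel positions halfway between views
```

In this toy configuration the disparity is the focal length times the baseline divided by the depth, so every pixel shifts by the same amount along x and the halfway morph shifts by half of it.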
To summarise, our contributions are:
- We present a novel and effective method for NeRF to address the problem
  of few-shot view synthesis;
- We introduce a new view morphing technique based on the NeRF depth
  output, named VM-NeRF;
- VM-NeRF can achieve higher-quality rendered images than alternative
  methods in the literature.
2 Related work
NVS can be solved either by using traditional 3D reconstruction
techniques [23] or by adopting methods based on neural rendering [26]. Neural
Radiance Fields (NeRF) is a recent neural rendering method that learns a
volumetric representation of an unknown 3D scene by approximating its radiance
and density fields from a set of known (ground truth) views with a multilayer
perceptron (MLP) [17]. NeRF optimises its parameters on a single scene based on
a set of known views, so overfitting can occur when these views are few.
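To make the link between the radiance and density fields and a rendered pixel concrete, the following is a minimal sketch of the standard NeRF volume rendering quadrature for a single ray. It assumes the densities and colours have already been predicted at sorted sample distances along the ray; the function name `composite_ray` is hypothetical. The same weights also give an expected depth per ray, which is the kind of per-image depth that can be read out of a NeRF model.

```python
import numpy as np

def composite_ray(sigma, rgb, t):
    """Composite samples along one ray (standard NeRF quadrature).

    sigma: (N,) non-negative densities at the samples.
    rgb:   (N, 3) colours at the samples.
    t:     (N,) sorted distances of the samples along the ray.
    Returns the rendered colour (3,), the expected depth, and the weights (N,).
    """
    # Interval lengths; the last interval is treated as effectively infinite.
    delta = np.append(np.diff(t), 1e10)
    # Opacity contributed by each interval.
    alpha = 1.0 - np.exp(-sigma * delta)
    # Transmittance: fraction of light reaching each sample unoccluded.
    trans = np.cumprod(np.append(1.0, 1.0 - alpha[:-1]))
    weights = trans * alpha
    colour = (weights[:, None] * rgb).sum(axis=0)
    depth = (weights * t).sum()
    return colour, depth, weights

# Toy ray: empty space except for an opaque red sample at distance 3.
t = np.array([1.0, 2.0, 3.0, 4.0])
sigma = np.array([0.0, 0.0, 1e3, 0.0])
rgb = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
colour, depth, weights = composite_ray(sigma, rgb, t)
```

With few training views, many different density fields can reproduce the known images equally well, which is one way to see why overfitting arises in the sparse setting.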
Current approaches addressing few-shot novel view synthesis can be divided
into two groups. The first group uses the same trained network to generate novel
views of different scenes. This category of methods trains on datasets
characterised by similar scenes, such as DTU [1]. Multiple-scene training can
introduce dataset biases and may produce low-quality results in contexts outside
the training domain [27,18]. SparseNeuS [14] and ShaRF [22] train NVS on multiple
scenes by conditioning the MLP with features that encode appearance and geometry
of the surface at a 3D location. This can be achieved by using an auxiliary deep
network jointly trained with NeRF. The second group uses the original per-scene
optimisation procedure of NeRF, so a single network is trained and tested on
only one scene, leading to methods without dataset bias. These methods are more
likely to overfit the known views; however, they reduce this likelihood by adding
either semantic or geometric constraints during training. DietNeRF belongs to
this category: it computes feature representations of the known images with a
pre-trained CLIP image encoder, renders random poses, and imposes semantic
consistency on them through CLIP features [12]. RegNeRF [19] renders random
viewpoints around the known ones, and introduces regularisation constraints
between known viewpoints and randomly sampled ones.
Single-scene methods working with few viewpoints may overfit the known
images, producing artefacts when novel views are rendered. In general, we can
mitigate overfitting via data augmentation [25], and to the best of our knowledge,
the only methods that address data augmentation for NeRF are AugNeRF [5]
and GeoAug [4]. AugNeRF aims to improve NeRF generalisation by using
adversarial data augmentation to force each ray and its augmented version to
produce the same result. GeoAug [4] perturbs the translation and rotation of the
known viewpoints during training. Our proposed approach does not perturb the
known input views and rays; instead, we create new views (novel 3D projective