Cut-and-Approximate 3D Shape Reconstruction from Planar Cross-sections with Deep Reinforcement Learning

Azimkhon A. Ostonov
Department of Computer, Electrical and Mathematical Science & Engineering
King Abdullah University of Science and Technology
Thuwal, SA 23955-6900
azimkhon.ostonov@kaust.edu.sa
Abstract
Current methods for 3D object reconstruction from a set of planar cross-sections
still struggle to capture detailed topology or require a considerable number of
cross-sections. In this paper, we present, to the best of our knowledge, the first
3D shape reconstruction network to solve this task which additionally uses
orthographic projections of the shape. Our method is based on applying a
Reinforcement Learning algorithm to learn how to effectively parse the shape using
a trial-and-error scheme relying on scalar rewards. In each step, the method cuts
a part of the 3D shape, which is then approximated as a polygon mesh. The agent
aims to maximize a reward that depends on the accuracy of surface reconstruction
for the approximated parts. We also consider pre-training the network with
demonstrations generated by a heuristic approach for faster learning. Experiments
show that our training algorithm, which benefits from both imitation learning and
self-exploration, learns efficient policies faster, so that the agent produces
visually compelling results.
1 Introduction
Surface reconstruction from planar cross-sections has many applications, including organ
reconstruction from signals stemming from MRI scanners and CT scanners [4], reconstruction
from terrain contour lines [43], orebody modeling, and many others. The problem in these
examples can be formulated as building a manifold surface interpolating (or approximating)
the given planar cross-sections. More concretely, the task is equivalent to classifying each
point $p \in \mathbb{R}^3$ as outside, inside or on the surface, taking into account the
given input information, so that the classification results match each individual cross-section.
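To make the classification view concrete, the following is a minimal sketch of the per-slice consistency check, restricted to a single planar cross-section given as a closed 2D polygon. The function name `classify_in_slice`, the polygon representation and the tolerance `eps` are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def classify_in_slice(p: Point, contour: List[Point], eps: float = 1e-9) -> str:
    """Classify a 2D point against one closed contour: 'inside', 'outside' or 'on'.

    Hypothetical helper: the contour is assumed to be a simple polygon
    given as an ordered list of vertices in the plane of the slice.
    """
    x, y = p
    inside = False
    n = len(contour)
    for i in range(n):
        (x1, y1), (x2, y2) = contour[i], contour[(i + 1) % n]
        # On-boundary test: p is collinear with the edge and lies within
        # the edge's axis-aligned bounding box.
        cross = (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)
        if (abs(cross) <= eps
                and min(x1, x2) - eps <= x <= max(x1, x2) + eps
                and min(y1, y2) - eps <= y <= max(y1, y2) + eps):
            return "on"
        # Ray casting: toggle for every edge crossed by a horizontal ray
        # shot from p towards +x.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return "inside" if inside else "outside"


# Usage: a unit square contour in the slice plane.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
labels = [classify_in_slice(q, square) for q in [(0.5, 0.5), (2.0, 2.0), (1.0, 0.5)]]
```

A reconstructed surface is consistent with the input only if its own inside/outside labeling, restricted to each slice plane, agrees with labels computed in this way.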
In many cases, several input planar cross-sections do not provide sufficient information about the
geometry and topology of the object. This makes it hard to detect connectivity and branching cases
between two consecutive cross sections. Increasing the number of cross sections is not an optimal
solution as it could result in slow performance of the reconstruction algorithm or additional data
might simply not be available. However, in many surface reconstruction domains, there exists some
prior information representing detailed geometry and topology of the objects that could be utilized.
This prior information in our case consists of 3 different orthographic projections (front view, top
view and end view) of the object. One particular advantage of this representation is that in most
cases it can be drawn from the initial slices, which makes the given data sufficient without having
additional knowledge.
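As an illustration of how such projections might be drawn from slices, the sketch below assumes the slices are parallel, axis-aligned binary masks stacked into a volume indexed as `volume[z][y][x]`. This layout, the function name `orthographic_projections` and the assignment of view names to axes are assumptions made for the example; the text does not specify the actual procedure.

```python
from typing import List

Mask = List[List[int]]


def orthographic_projections(volume: List[Mask]):
    """Return (front, top, end) silhouettes of a stack of binary slices.

    Assumed layout: volume[z][y][x] is 1 where the slice at height z
    is occupied. Which axis is called 'front' or 'end' is a naming
    assumption of this sketch.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    # Top view: collapse along z, result indexed [y][x].
    top = [[int(any(volume[z][y][x] for z in range(nz))) for x in range(nx)]
           for y in range(ny)]
    # Front view: collapse along y, result indexed [z][x].
    front = [[int(any(volume[z][y][x] for y in range(ny))) for x in range(nx)]
             for z in range(nz)]
    # End view: collapse along x, result indexed [z][y].
    end = [[int(any(volume[z][y][x] for x in range(nx))) for y in range(ny)]
           for z in range(nz)]
    return front, top, end


# Usage: a 2x2x2 volume with a single occupied voxel at z=0, y=1, x=0.
volume = [[[0, 0], [1, 0]],
          [[0, 0], [0, 0]]]
front, top, end = orthographic_projections(volume)
```

With only a few slices the silhouettes obtained this way are coarse along the slicing direction, which is one reason the projections are treated here as prior information rather than as something always derived from the slices.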
Orthographic projections of the shape carry valuable information about its topology
and geometry. Our approach is then to employ a classical divide-and-conquer algorithm to
effectively parse the initial shape into smaller regions (bounded by primitives) and then reconstruct each
36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.12509v1 [cs.CV] 22 Oct 2022
[Figure 1 panel labels: Ground truth mesh; Input slices; Corner points; Input to the network; Reinforcement; Cut a part; Separate parts; Reconstructed mesh]
Figure 1: The pipeline for shape reconstruction from slices. We assume that the initial slices and
orthographic projections of the shape onto 3 planes are given. Our method, based on Reinforcement
Learning, iteratively divides the initial shape into smaller parts to decrease the reconstruction error.
The separate parts are then reconstructed and combined to obtain the final result.
part separately. Shape parsing as a collection of smaller parts is an important aspect of shape
understanding and analysis [39, 46, 18, 33]. This is a first step for computer programs or robots to
understand a 3D shape. The result of parsing the input shape into smaller parts can then be used
in many applications to further analyze important features, for example the stability of the input
shape or geometrical features such as symmetry and connectivity. More detailed shape parsing
therefore constitutes a foundation for more accurate estimation of these features. As a result of
this step, in the ideal case, we assume that each primitive (a box in our case) bounds a part of the
object which has no branches or disconnections. For this type of object most surface reconstruction
algorithms work efficiently [6, 2]. The final result can then be obtained by combining all
reconstructed parts.
Our approach to shape parsing is neural-network based and works sequentially, cutting a part of
the shape until the whole area is covered. Taking the input data representation into account, we use
a corner point detection algorithm on the orthographic projection images to help the agent. Corner
points usually indicate complex geometry or branching points in the object topology. We learn
effective shape parsing, which leads to an increase in the accuracy of the reconstruction from planar
cross-sections of the shape. We employ Reinforcement Learning (RL) to solve this task, by taking
actions and collecting rewards. In each step we cut a part of the 3D object and reconstruct it from the
planar cross-sections in this region. The reward function is designed to take into account the accuracy
of the reconstruction. Objects are arbitrarily selected from the ShapeNet [7] dataset and the number of
planar cross sections is fixed. In the experiments section we show that our algorithm works efficiently
for both ShapeNet objects and objects downloaded from the Internet.
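The cut-then-approximate loop described above can be sketched on a toy 2D grid. Everything in this example is a simplifying assumption and not the paper's implementation: the shape is a set of occupied cells, an action is an axis-aligned box, the "reconstruction" of a cut part is simply its filled bounding box, the reward is the intersection-over-union between the part and that approximation, and `lowest_row_policy` is a hand-written placeholder standing in for a learned agent.

```python
from typing import Callable, Set, Tuple

Cell = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def bounding_box_fill(cells: Set[Cell]) -> Set[Cell]:
    """Toy 'reconstruction': fill the bounding box of the cut part."""
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return {(x, y) for x in range(min(xs), max(xs) + 1)
            for y in range(min(ys), max(ys) + 1)}


def step(remaining: Set[Cell], box: Box):
    """Cut the cells inside `box`; reward is the IoU of part vs. approximation."""
    x0, y0, x1, y1 = box
    part = {(x, y) for (x, y) in remaining if x0 <= x <= x1 and y0 <= y <= y1}
    if not part:
        return remaining, 0.0  # empty cut: nothing reconstructed, no reward
    approx = bounding_box_fill(part)
    reward = len(part & approx) / len(part | approx)
    return remaining - part, reward


def run_episode(shape: Set[Cell], policy: Callable[[Set[Cell]], Box],
                max_steps: int = 100):
    """Cut parts sequentially until the whole shape is covered."""
    remaining, total = set(shape), 0.0
    for _ in range(max_steps):
        if not remaining:
            break
        remaining, reward = step(remaining, policy(remaining))
        total += reward
    return total, remaining


def lowest_row_policy(remaining: Set[Cell]) -> Box:
    """Placeholder policy (not learned): cut the lowest remaining row."""
    y = min(cy for _, cy in remaining)
    xs = [cx for cx, cy in remaining if cy == y]
    return (min(xs), y, max(xs), y)


# Usage: an L-shaped toy object.
l_shape = {(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)}
total_reward, leftover = run_episode(l_shape, lowest_row_policy)
```

In this toy setting a single box around the whole L-shape is approximated poorly, while cutting it into branch-free rows gives a perfect per-part score, which mirrors the motivation for parsing the shape before reconstructing it.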
One of the problems with this approach is the computational complexity of working with
3D objects. Directly applying off-the-shelf RL methods does not produce satisfying results. Therefore,
following [48, 29], we also use Imitation Learning for quick-start pre-training of
the network. The difference in our approach is that we use a heuristic method to generate demonstration
data and effectively search for optimal policies. This boosts the performance of our method and
results in visually acceptable 3D reconstructions.
Our contributions are as follows:
• We propose topology-aware reconstruction of 3D models represented as planar cross-sections
using an RL algorithm. To the best of our knowledge, this is the first attempt to use RL to
sequentially cut parts of a 3D model in order to reconstruct a given shape.
• We propose an efficient training algorithm which benefits from both Imitation Learning (IL)
and RL. This is mainly because we update suboptimal demonstration data in the replay buffer
using both the agent's self-exploration data and the heuristic policy during training.
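One way to picture the replay-buffer update mentioned in the second contribution is sketched below. This is a hypothetical design, not the paper's algorithm: the class name `DemoReplayBuffer`, the per-shape keying and the rule "keep the trajectory with the highest return" are all assumptions chosen to illustrate how demonstrations from a heuristic could be overwritten by better self-exploration episodes.

```python
import random
from typing import Dict, List, Tuple

Trajectory = List[Tuple[int, float]]  # toy (action, reward) pairs


class DemoReplayBuffer:
    """Keeps, per shape, the best episode seen so far from any source."""

    def __init__(self) -> None:
        # shape_id -> (episode_return, trajectory, source)
        self._best: Dict[str, Tuple[float, Trajectory, str]] = {}

    def add(self, shape_id: str, episode_return: float,
            trajectory: Trajectory, source: str) -> bool:
        """Store the episode if it beats the stored one; report whether it did."""
        current = self._best.get(shape_id)
        if current is None or episode_return > current[0]:
            self._best[shape_id] = (episode_return, list(trajectory), source)
            return True
        return False

    def source_of(self, shape_id: str) -> str:
        """Return which policy produced the stored episode for this shape."""
        return self._best[shape_id][2]

    def sample(self, rng: random.Random, k: int):
        """Draw up to k stored episodes for a training batch."""
        items = list(self._best.values())
        return rng.sample(items, min(k, len(items)))


# Usage: a heuristic demonstration is later replaced by a better agent episode.
buffer = DemoReplayBuffer()
buffer.add("chair_01", 1.0, [(0, 1.0)], source="heuristic")
buffer.add("chair_01", 0.5, [(1, 0.5)], source="agent")   # worse, ignored
buffer.add("chair_01", 2.0, [(2, 2.0)], source="agent")   # better, stored
batch = buffer.sample(random.Random(0), k=4)
```

The identifier `chair_01` and the numeric returns are made up for the example; the point is only that suboptimal demonstrations need not stay in the buffer once the agent or the heuristic finds something better.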
We demonstrate the potential of our algorithm on the ShapeNet [7] dataset across various shape
classes, in terms of how well it captures the details of the 3D model, by comparing with
state-of-the-art methods. Additionally, we assess our method on different models downloaded from the
Internet to show the generalization capability of the method.
2 Related Work
Slice-based input shape reconstruction. 3D shape reconstruction from planar contours is a
well-explored problem, and it is especially popular in medical image reconstruction. Most of these
works focus on creating a closed surface from existing sections, usually provided by CT or MRI
radiologists. Earlier approaches aim to solve the connectivity problem between parallel slices
under different constraints on shape geometries [6, 49]. Bajaj et al. [2] considered arbitrary-topology
shape reconstruction from parallel planar cross sections. Liu et al. [19] proposed medial-axis-based
space partitioning using the given slices to reconstruct the shape from non-parallel contours. The
main advantage of these methods is fast reconstruction, but they rely on a large number of slices,
and the final result usually contains jagged areas that require additional post-processing
with mesh smoothing algorithms. Additionally, the above methods fail to find correspondences
between consecutive slices with difficult geometries, branches or self-intersections. Another common
approach, studied in many previous works, extracts the surface as the zero
set of an implicit function [47, 8, 5, 53]. For example, Bermano et al. [5] proposed an effective
approach to capture unknown regions based on the given data. However, problems related to
branching and connectivity between difficult geometries still remain open. A few other approaches
are based on simulated annealing [34, 30] and assume that an infinite source of slices from one or two
orthogonal views is given. Each slice is then modelled as a 2D Markov-Gibbs random field, and the
slices are concatenated to form the 3D shape. These works also focus on the reconstruction of medical
images. Zou et al. [53] considered topology-constrained surface reconstruction from cross sections.
One example of reconstructing a 3D shape from slices is proposed by Fang et al. [9], which uses
point-cloud input. Shen et al. [40] considered reconstruction of the input 3D shape by predicting slices
from an image in the frequency domain. The output of their method is a volumetric mesh. Our work
differs from these methods in that we solve this problem in the RL paradigm by obtaining rewards
for each action.
Using additional data for surface reconstruction. The main problem with surface reconstruction
from planar cross sections is the lack of awareness of topological information about
the shape, such as connectivity, branching and self-intersections. This type of information is hard
to convey with slices, as it would require a large number of them. Thus, some previous methods
[19, 1, 53, 13] rely on reconstructing the surface and then applying mesh fairing and smoothing
algorithms to reduce the topological errors. However, this usually results in overly smooth regions
that are not consistent with the initial information characterized by the slices. Some other works
additionally utilize topological information. Most notably, Zou et al. [53] proposed to use the genus
of the shape and then apply a dynamic programming approach to find the most suitable
topology out of several options based on a handcrafted score function. While this approach solves
some topological issues, connectivity and branching problems still remain unsolved. Some
methods addressed connectivity problems for closed surfaces [31, 15]. Others leave this problem to
be solved through user interaction [38, 50]. One common approach uses templates to fit the
resulting mesh to the ground-truth mesh [12]. Templates usually include different views or orthographic
projections of the shape. In this work we use orthographic projections, as in many cases they can be
derived from the slices and also contain topological information about the shape.
Shape generation and reconstruction with Reinforcement Learning. RL has been successfully used in
many 2D domains, including stroke-based painting [10, 14, 23], shape grammar parsing [44], sketch
drawing [52, 28], scene synthesis [32] and point-cloud registration [3]. In the 3D case, Sharma et al. [39]
proposed a network based on an encoder-decoder architecture and RL to produce a compact program
that generates a 2D or 3D input shape. The input shape in this work is obtained from primitive shapes
by applying Boolean operations formed as a grammar. However, real-life 3D shapes can be
far too complex to obtain with such a grammar, which limits the use of this method to
simple shapes. In contrast, we train our method on the ShapeNet [7] dataset. The work proposed by Lin et
al. [18] uses two networks to model a 3D shape like human modelers. The first network, Prim-Agent,
represents the 3D shape as a combination of cuboids, and the second network, Mesh-Agent, is used to
smooth the output of the first network to obtain a result similar to the input shape. This method also
uses RL with imitation learning from expert demonstrations generated by a heuristic approach using