Cut-and-Approximate: 3D Shape Reconstruction from Planar Cross-sections with Deep Reinforcement Learning

Azimkhon A. Ostonov

Department of Computer, Electrical and Mathematical Science & Engineering

King Abdullah University of Science and Technology

Thuwal, SA 23955-6900

azimkhon.ostonov@kaust.edu.sa

Abstract

Current methods for 3D object reconstruction from a set of planar cross-sections still struggle to capture detailed topology or require a considerable number of cross-sections. In this paper, we present, to the best of our knowledge, the first 3D shape reconstruction network to solve this task that additionally uses orthographic projections of the shape. Our method applies a Reinforcement Learning algorithm to learn how to effectively parse the shape using a trial-and-error scheme relying on scalar rewards. In each step, the method cuts a part of the 3D shape, which is then approximated as a polygon mesh. The agent aims to maximize a reward that depends on the accuracy of surface reconstruction for the approximated parts. We also consider pre-training the network for faster learning using demonstrations generated by a heuristic approach. Experiments show that our training algorithm, which benefits from both imitation learning and self-exploration, learns efficient policies faster, which enables the agent to produce visually compelling results.

1 Introduction

Surface reconstruction from planar cross-sections has many applications, including organ reconstruction from MRI and CT scanner signals [ ], terrain contour lines [ ], orebody modeling, and many others. The problem in these examples can be formulated as building a manifold surface that interpolates (or approximates) the given planar cross-sections. More concretely, the task is equivalent to classifying each point $p \in \mathbb{R}^3$ as outside, inside, or on the surface, taking into account the given input information, so that the classification results match each individual cross-section.

In many cases, a few input planar cross-sections do not provide sufficient information about the geometry and topology of the object. This makes it hard to detect connectivity and branching between two consecutive cross-sections. Increasing the number of cross-sections is not an optimal solution, as it could slow down the reconstruction algorithm, or the additional data might simply not be available. However, in many surface reconstruction domains there exists prior information representing the detailed geometry and topology of the objects that could be utilized. In our case, this prior information consists of different orthographic projections (front view, top view, and end view) of the object. One particular advantage of this representation is that in most cases it can be derived from the initial slices, which makes the given data sufficient without additional knowledge.
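As a rough illustration of how the three views could be derived from slices, the sketch below stacks parallel binary slice masks into a voxel grid and projects it along each axis. This is not the paper's implementation: the voxel representation, the function names `stack_slices` and `orthographic_views`, and the assignment of axes to "front", "top", and "end" are all assumptions made for this example.

```python
import numpy as np

def stack_slices(slices):
    """Stack equally sized 2D binary masks (one per parallel plane) along z.

    Assumption: slices are parallel, evenly spaced, and already rasterized.
    """
    return np.stack([np.asarray(s, dtype=bool) for s in slices], axis=2)

def orthographic_views(vox):
    """Return binary silhouettes of a boolean grid vox[x, y, z].

    The axis-to-view mapping below is an illustrative convention only.
    """
    return {
        "front": vox.any(axis=1),  # project along y -> image indexed by (x, z)
        "top": vox.any(axis=2),    # project along z -> image indexed by (x, y)
        "end": vox.any(axis=0),    # project along x -> image indexed by (y, z)
    }

# Usage: five 3x4 slices with a single occupied cell in the fourth slice.
slices = [np.zeros((3, 4), dtype=bool) for _ in range(5)]
slices[3][1, 2] = True
views = orthographic_views(stack_slices(slices))
```

With only a few sparse slices, silhouettes computed this way cover just the sampled planes, so they may under-approximate the true views of the object.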

Orthographic projections of the shape carry valuable information about its topology and geometry. Our approach then employs a classical divide-and-conquer strategy to parse the initial shape into smaller regions (bounded by primitives) and reconstruct each part separately. Parsing a shape into a collection of smaller parts is an important aspect of shape understanding and analysis [ ]. It is a first step for computer programs or robots to understand a 3D shape. The result of parsing the input shape into smaller parts can then be used in many applications to further analyze important features, for example the stability of the input shape or geometric features such as symmetry and connectivity. More detailed shape parsing therefore constitutes a foundation for more accurate estimation of these features. As a result of this step, in the ideal case, we assume that each primitive (a box in our case) bounds a part of the object that has no branches or disconnections. For this type of object, most surface reconstruction algorithms work efficiently [ ]. The final result can then be obtained by combining all reconstructed parts.

[Figure 1 panel labels: Ground truth mesh, Input slices, Corner points, Input to the network, Reinforcement, Cut a part, Separate parts, Reconstructed mesh]

Figure 1: The pipeline for shape reconstruction from slices. We assume that the initial slices and orthographic projections of the shape onto three planes are given. Our method, based on Reinforcement Learning, iteratively divides the initial shape into smaller parts to decrease the reconstruction error. The separate parts are then reconstructed and combined to obtain the final result.

36th Conference on Neural Information Processing Systems (NeurIPS 2022).

arXiv:2210.12509v1 [cs.CV] 22 Oct 2022

Our approach to shape parsing is neural network based and works sequentially, cutting a part of the shape until the whole area is covered. Given the input data representation, we use a corner point detection algorithm on the orthographic projection images to help the agent. Corner points usually indicate complex geometry or branching points in the object topology. We learn an effective shape parsing that increases the accuracy of reconstruction from planar cross-sections of the shape. We employ Reinforcement Learning (RL) to solve this task by taking actions and collecting rewards. In each step, we cut a part of the 3D object and reconstruct it from the planar cross-sections in this region. The reward function is designed to take into account the accuracy of the reconstruction. Objects are arbitrarily selected from the ShapeNet [ ] dataset, and the number of planar cross-sections is fixed. In the experiments section, we show that our algorithm works efficiently both for ShapeNet objects and for objects downloaded from the Internet.
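The cut-then-approximate loop described above can be sketched with a toy stand-in. Everything in this example is an assumption chosen for illustration: the shape is a boolean voxel grid, a cut is a slab boundary along the first axis, each part is "reconstructed" by its filled bounding box, and the reward is the intersection-over-union (IoU) between that approximation and the true part. The paper's actual action space, reconstruction step, and reward function are not specified in this section.

```python
import numpy as np

def bbox_fill(part):
    """Approximate a voxel part by its filled axis-aligned bounding box."""
    approx = np.zeros_like(part, dtype=bool)
    idx = np.argwhere(part)
    if idx.size == 0:
        return approx  # empty part: nothing to approximate
    lo, hi = idx.min(axis=0), idx.max(axis=0) + 1
    approx[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] = True
    return approx

def iou(a, b):
    """Intersection-over-union of two boolean grids (1.0 if both are empty)."""
    union = np.logical_or(a, b).sum()
    return 1.0 if union == 0 else float(np.logical_and(a, b).sum() / union)

def run_episode(shape, cut_positions):
    """Cut slabs along axis 0 at the given positions; return per-step rewards."""
    rewards, start = [], 0
    for end in list(cut_positions) + [shape.shape[0]]:
        part = np.zeros_like(shape, dtype=bool)
        part[start:end] = shape[start:end]  # the part cut off in this step
        rewards.append(iou(bbox_fill(part), part))
        start = end
    return rewards

# Usage: an L-shaped solid made of a full slab and a half-width slab.
shape = np.zeros((4, 4, 4), dtype=bool)
shape[0:2, :, :] = True
shape[2:4, 0:2, :] = True
no_cut = run_episode(shape, [])    # one box around the whole L shape
one_cut = run_episode(shape, [2])  # cut exactly at the branching position
```

In this toy setting, a cut placed where the cross-section changes lets each part be approximated exactly, while no cut leaves a single box that overestimates the shape. This mirrors, in a much simplified way, why cutting at geometrically complex locations can raise reconstruction accuracy.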

One problem with this approach is the computational complexity of working with 3D objects. Directly applying off-the-shelf RL methods does not produce satisfactory results. Therefore, following the works [ ], we also use Imitation Learning for quick-start pre-training of the network. The difference in our approach is that we use a heuristic method to generate demonstration data and effectively search for optimal policies. This boosts the performance of our method and results in visually acceptable 3D reconstructions.
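Pre-training from demonstrations is often done by behavior cloning, that is, fitting the policy to the demonstrated (state, action) pairs with a supervised loss. The sketch below shows this for a linear softmax policy in NumPy. It is a generic illustration, not the network or loss used in this work; the function name `behavior_cloning`, the feature vectors, and the hyperparameters are hypothetical.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def behavior_cloning(states, actions, n_actions, lr=0.5, epochs=200):
    """Fit a linear softmax policy to demonstration (state, action) pairs.

    Minimizes the mean cross-entropy with plain gradient descent and returns
    the weight matrix together with the loss recorded at each epoch.
    """
    n, d = states.shape
    W = np.zeros((d, n_actions))
    onehot = np.eye(n_actions)[actions]
    losses = []
    for _ in range(epochs):
        probs = softmax(states @ W)
        losses.append(float(-np.log(probs[np.arange(n), actions] + 1e-12).mean()))
        W -= lr * states.T @ (probs - onehot) / n  # cross-entropy gradient
    return W, losses

# Usage with synthetic "heuristic" demonstrations: the demonstrated action
# depends only on the sign of the first state feature.
rng = np.random.default_rng(0)
states = rng.normal(size=(200, 4))
actions = (states[:, 0] > 0).astype(int)
W, losses = behavior_cloning(states, actions, n_actions=2)
predicted = (states @ W).argmax(axis=1)
```

A policy initialized this way can then be refined with RL, which is the general pattern the paragraph above describes.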

Our contributions are as follows:

• We propose topology-aware reconstruction of 3D models represented as planar cross-sections using an RL algorithm. To the best of our knowledge, this is the first attempt to use RL to sequentially cut parts of a 3D model in order to reconstruct a given shape.

• We propose an efficient training algorithm that benefits from both Imitation Learning (IL) and RL. This is mainly because we update suboptimal demonstration data in the replay buffer during training, using both the agent's self-exploration data and the heuristic policy.
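One way to picture the second contribution is a replay buffer that stores a demonstration per shape and overwrites it whenever a better episode arrives, whether that episode comes from the heuristic policy or from the agent's own exploration. The sketch below is an assumption-laden illustration: the class `DemoReplayBuffer`, the `Episode` fields, and the "keep the higher-return episode per shape" rule are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass

@dataclass
class Episode:
    shape_id: str      # which training shape the episode belongs to
    transitions: list  # e.g. (state, action, reward) tuples
    ret: float         # total return of the episode
    source: str        # "heuristic" or "agent"

class DemoReplayBuffer:
    """Keeps, per shape, the best episode seen so far from any source."""

    def __init__(self):
        self.best = {}  # shape_id -> Episode

    def add(self, ep):
        """Store ep if it beats the current demonstration; report whether it did."""
        cur = self.best.get(ep.shape_id)
        if cur is None or ep.ret > cur.ret:
            self.best[ep.shape_id] = ep
            return True
        return False

    def sample(self, k, rng=random):
        """Sample up to k transitions uniformly from the stored episodes."""
        pool = [t for ep in self.best.values() for t in ep.transitions]
        return rng.sample(pool, min(k, len(pool)))

# Usage: a heuristic demonstration is replaced by a better agent episode,
# while a worse agent episode is ignored.
buf = DemoReplayBuffer()
buf.add(Episode("chair", [("s0", "a0", 0.6)], 0.6, "heuristic"))
buf.add(Episode("chair", [("s0", "a1", 0.8)], 0.8, "agent"))
buf.add(Episode("chair", [("s0", "a2", 0.5)], 0.5, "agent"))
```

Under this rule the demonstrations never get worse over training, so imitation targets improve as the agent discovers better cuts.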

We demonstrate the potential of our algorithm on various shape classes of the ShapeNet [ ] data set, in terms of how well it captures the details of the 3D model, by comparing it with state-of-the-art methods. Additionally, we assess our method on different models downloaded from the Internet to show its generalization capability.

2 Related Work

Slice based input shape reconstruction. 3D shape reconstruction from planar contours is a well-explored problem. It is especially popular in medical image reconstruction. Most of these works focus on creating a closed surface from existing sections, usually provided by CT or MRI radiologists. Earlier approaches aim to solve the connectivity problem between parallel slices under different constraints on shape geometries [ ]. Bajaj et al. [ ] considered arbitrary-topology shape reconstruction from parallel planar cross-sections. Liu et al. [ ] proposed medial-axis based space partitioning using the given slices to reconstruct the shape from non-parallel contours. The main advantage of these methods is fast reconstruction, but they rely on a large number of slices, and the final result usually contains jagged areas that require additional post-processing with mesh smoothing algorithms. Additionally, the above methods fail to find correspondences between consecutive slices with difficult geometries, branches, or self-intersections. Another common approach, studied in many previous works, extracts the surface as the zero set of an implicit function [ ]. For example, Bermano et al. [ ] proposed an effective approach to capture unknown regions based on the given data. However, problems related to branching and connectivity between difficult geometries still remain open. A few other approaches are based on simulated annealing [ ] and assume that an infinite source of slices from one or two orthogonal views is given. Each slice is then modelled as a 2D Markov-Gibbs random field, and the slices are concatenated to form the 3D shape. These works also focus on the reconstruction of medical images. Zou et al. [ ] considered topology-constrained surface reconstruction from cross-sections. One example of reconstructing a 3D shape from slices is proposed by Fang et al. [ ], which uses point-cloud input. Shen et al. [ ] considered reconstruction of the input 3D shape by predicting slices from an image in the frequency domain. The output of their method is a volumetric mesh. Our work differs from these methods: we solve the problem in the RL paradigm by obtaining rewards for each action.
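For reference, the implicit-function view mentioned above can be written down as follows. This is a generic formulation rather than one taken from a specific cited method; the symbols $f$, $P_i$, and $A_i$ and the sign convention (negative inside) are notation introduced here for illustration. Let $f : \mathbb{R}^3 \to \mathbb{R}$ be the implicit function, $P_i$ the $i$-th cutting plane, and $A_i \subset P_i$ the region enclosed by the given contours on that plane.

```latex
\begin{align*}
S &= \{\, p \in \mathbb{R}^3 : f(p) = 0 \,\}, \\
\operatorname{label}(p) &=
\begin{cases}
\text{inside},         & f(p) < 0, \\
\text{on the surface}, & f(p) = 0, \\
\text{outside},        & f(p) > 0,
\end{cases} \\
\text{for every slice } i \text{ and } p \in P_i : \quad
& f(p) < 0 \iff p \in \operatorname{int}(A_i), \qquad
  f(p) = 0 \iff p \in \partial A_i .
\end{align*}
```

The last line expresses the requirement from the Introduction that the classification of points must match each individual cross-section. Between the planes $f$ is unconstrained, which is where ambiguities in branching and connectivity arise.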

Using additional data for surface reconstruction. The main problem in surface reconstruction from planar cross-sections is the lack of topological information about the shape, such as connectivity, branching, and self-intersections. This type of information is hard to convey with slices, as it would require a large number of them. Thus, some previous methods [ ] rely on reconstructing the surface and then applying mesh fairing and smoothing algorithms to reduce the topological errors. However, this usually results in overly smooth regions that are not consistent with the initial information given by the slices. Some other works additionally utilize topological information. Most notably, Zou et al. [ ] proposed to use the genus of the shape and then apply a dynamic programming approach to find the most suitable topology out of several options, based on a handcrafted score function. While this approach solves some topological issues, problems related to connectivity and branching still remain unsolved. Some methods addressed connectivity problems for closed surfaces [ ]. Others leave this problem to be solved through user interaction [ ]. One common approach uses templates to fit the resulting mesh to the ground-truth mesh [ ]. Templates usually include different views or orthographic projections of the shape. In this work we use orthographic projections, as in many cases they can be derived from the slices and they also contain topological information about the shape.

Shape generation and reconstruction with Reinforcement Learning. RL has been successfully used in many 2D domains, including stroke-based painting [ ], shape grammar parsing [ ], sketch drawing [ ], scene synthesis [ ], and point-cloud registration [ ]. In the 3D case, Sharma et al. [ ] proposed a network based on an encoder-decoder architecture and RL that produces a compact program to generate a 2D or 3D input shape. The input shape in that work is obtained from primitive shapes by applying Boolean operations formed as a grammar. However, real-life 3D shapes can be too complex to obtain with such a grammar, which limits the use of this method to simple shapes. In contrast, we train our method on the ShapeNet [ ] data set. The work proposed by Lin et al. [ ] uses two networks to model a 3D shape like human modelers. The first network, Prim-Agent, represents the 3D shape as a combination of cuboids, and the second network, Mesh-Agent, smooths the output of the first network to obtain a result similar to the input shape. This method also uses RL with imitation learning from expert demonstrations generated by a heuristic approach using
