methods. Additionally, we assess our method on different models downloaded from the Internet to
show its generalization capability.
2 Related Work
Slice based input shape reconstruction.
3D shape reconstruction from planar contours is a well-explored problem, and it is especially
popular in medical image reconstruction. Most of these works focus on creating a closed surface
from existing sections, usually provided by CT or MRI radiologists. Earlier approaches aim to solve
the connectivity problem between parallel slices under different constraints on shape geometries
[6, 49]. Bajaj et al. [2] considered arbitrary-topology shape reconstruction from parallel planar
cross sections. Liu et al. [19] proposed medial-axis-based space partitioning of the given slices
to reconstruct the shape from non-parallel contours. The
main advantage of these methods is fast reconstruction, but they rely on a large number of slices,
and the final result usually contains jagged areas that require additional post-processing
with mesh smoothing algorithms. Additionally, the above methods fail to find correspondences
between consecutive slices with difficult geometries, branches, or self-intersections. Another common
approach, studied in many previous works, extracts the surface as the zero set of an implicit
function [47, 8, 5, 53]. For example, Bermano et al. [5] proposed an effective approach to capture
unknown regions based on the given data. However, problems related to branching and connectivity
between difficult geometries still remain open. A few other approaches
are based on simulated annealing [34, 30], which assumes that an infinite source of slices from one
or two orthogonal views is given. Each slice is then modelled as a 2D Markov-Gibbs random field, and
the slices are concatenated to form the 3D shape. These works also focus on the reconstruction of
medical images. Zou et al. [53] considered topology-constrained surface reconstruction from cross
sections. One example of reconstructing a 3D shape from slices is proposed by Fang et al. [9], which
uses point-cloud input. Shen et al. [40] considered reconstruction of the input 3D shape by
predicting slices from an image in the frequency domain. The output of their method is a volumetric
mesh. Our work differs from these methods: we solve the problem in the RL paradigm by obtaining a
reward for each action.
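As a toy illustration of the implicit-function idea mentioned above (the surface as the zero set of a function), the following sketch linearly blends the signed distance fields of two parallel slices and locates the zero set on an unseen intermediate plane. This is a generic, simplified interpolation scheme and not the method of any of the cited works; the circular contours, the grid resolution, and the helper names `circle_sdf` and `blend_slices` are assumptions made purely for illustration.

```python
import numpy as np

def circle_sdf(xs, ys, radius):
    """Signed distance to an origin-centred circle (negative inside)."""
    return np.hypot(xs, ys) - radius

def blend_slices(sdf_lo, sdf_hi, t):
    """Linearly interpolate two slice distance fields for t in [0, 1]."""
    return (1.0 - t) * sdf_lo + t * sdf_hi

# Hypothetical input: two parallel slices, each containing one circular contour.
grid = np.linspace(-3.0, 3.0, 121)
xs, ys = np.meshgrid(grid, grid)
sdf_z0 = circle_sdf(xs, ys, radius=1.0)  # contour on the slice at z = 0
sdf_z1 = circle_sdf(xs, ys, radius=2.0)  # contour on the slice at z = 1

# Implicit function evaluated on the unseen plane z = 0.5.
field_mid = blend_slices(sdf_z0, sdf_z1, t=0.5)

# The surface is the zero set; find where it crosses the positive x-axis.
mid = len(grid) // 2
row = field_mid[mid, mid:]  # values along y = 0 for x >= 0 (increasing)
radius_mid = float(np.interp(0.0, row, grid[mid:]))
print(f"interpolated contour radius at z = 0.5: {radius_mid:.3f}")
```

Under these assumptions the recovered contour lies midway between the two slice contours. With branching or topologically different contours on neighbouring slices, such a simple blend offers no guarantee of a correct topology, which is the kind of open problem noted above.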
Using additional data for surface reconstruction.
The main problems in surface reconstruction from planar cross sections stem from a lack of
awareness of the topological information of the shape, such as connectivity, branching and
self-intersections. This type of information is hard to convey with slices, as it would require a
large number of them. Thus, some previous methods [19, 1, 53, 13] first reconstruct the surface and
then apply mesh fairing and smoothing algorithms to reduce the topological errors. However, this
usually results in overly smooth regions that are not consistent with the initial information
characterized by the slices. Some other works proposed to
additionally utilize topological information. Most notably, Zou et al. [53] proposed to use the
genus of the shape and then apply a dynamic programming approach to find the most suitable
topology out of several options based on a handcrafted score function. While this approach solves
some topological issues, connectivity- and branching-related problems still remain unsolved. Some
methods addressed connectivity problems for closed surfaces [31, 15]. Others leave this problem to
be solved through user interaction [38, 50]. One common approach uses templates to fit the
resulting mesh to the ground-truth mesh [12]. Templates usually include different views and
orthographic projections of the shape. In this work we use orthographic projections, as in many
cases they can be derived from the slices and also contain topological information about the shape.
Shape generation and reconstruction with Reinforcement Learning.
RL has been successfully used in many 2D domains, including stroke-based painting [10, 14, 23],
shape grammar parsing [44], sketch drawing [52, 28], scene synthesis [32], and point-cloud
registration [3]. In the 3D case, Sharma et al. [39] proposed a network based on an encoder-decoder
architecture and RL to produce a compact program that generates a 2D or 3D input shape. The input
shape in this work is obtained from primitive shapes by applying Boolean operations formed as a
grammar. However, real-life 3D shapes can be too complex to obtain with such a grammar, which
limits the use of this method to simple shapes. In contrast, we train our method on the ShapeNet
[7] data set. The work proposed by Lin et
al. [18] uses two networks to model a 3D shape the way human modelers do. The first network,
Prim-Agent, represents the 3D shape as a combination of cuboids, and the second network,
Mesh-Agent, is used to smooth the output of the first network to obtain a result similar to the
input shape. This method also uses RL with imitation learning from expert demonstrations generated
by a heuristic approach using