Thus, strong priors encoding deformation regularity are necessary to tackle this problem. Physics and differential geometry provide solutions that use various analytical priors which define natural-looking mesh deformations, such as elasticity [62, 1], Laplacian smoothness [31, 55, 77], and rigidity [54, 57, 29] priors. They update mesh vertex coordinates by iteratively optimizing energy functions that satisfy constraints from both the pre-defined deformation priors and the given handle locations. Although these algorithms can preserve geometric details of the original source model, they still have limited capacity to model realistic deformations, since the deformation priors are region-independent, e.g., the head region of an animal is regularized in the same way as its tail, resulting in unrealistic deformation states.
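As a concrete illustration of this family of approaches, the following is a minimal NumPy sketch of handle-constrained Laplacian mesh editing with uniform weights: the prior asks the deformed surface to preserve the source differential coordinates, and the user-provided handles enter as soft constraints of a linear least-squares system. The function and variable names are illustrative, and the uniform weighting is a simplification rather than the formulation of any specific cited work.

```python
# Minimal sketch of handle-constrained Laplacian mesh editing (uniform weights).
# Names (verts, faces, handle_ids, handle_targets) are illustrative placeholders.
import numpy as np

def laplacian_deform(verts, faces, handle_ids, handle_targets, w_handle=100.0):
    """Solve argmin_V' ||L V' - L V||^2 + w ||V'[handles] - targets||^2."""
    n = verts.shape[0]
    # Uniform graph Laplacian L = D - A built from mesh connectivity.
    L = np.zeros((n, n))
    for tri in faces:
        for i in range(3):
            a, b = tri[i], tri[(i + 1) % 3]
            L[a, b] = L[b, a] = -1.0
    np.fill_diagonal(L, -L.sum(axis=1))
    delta = L @ verts                        # differential coordinates of the source
    # Soft handle constraints appended as extra rows of the least-squares system.
    S = np.zeros((len(handle_ids), n))
    S[np.arange(len(handle_ids)), handle_ids] = w_handle
    A = np.vstack([L, S])
    b = np.vstack([delta, w_handle * handle_targets])
    new_verts, *_ = np.linalg.lstsq(A, b, rcond=None)
    return new_verts
```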
Hence, motivated by the recent success of deep neural networks for 3D shape modeling [33, 42, 13, 68, 58, 14, 44, 26, 2, 18, 10, 63, 60, 12], we propose to learn shape deformation priors of a specific object class, e.g., quadruped animals, to complete surface deformations beyond the observed handles. We formulate the following properties of such a learned model: (1) it should be robust to varying mesh quality and vertex counts, (2) the source mesh should not be limited to a canonical pose (i.e., the input mesh can have an arbitrary pose), and (3) it should generalize well to new deformations.
Towards these goals, we represent deformations as a continuous deformation field defined in the near-surface region, which describes the space deformation induced by the corresponding surface deformation. The continuity property enables us to manipulate meshes with an arbitrary number of vertices and disconnected components. To handle source meshes in arbitrary poses, we learn shape deformations via canonicalization. Specifically, the overall deformation process consists of two stages: an arbitrary-to-canonical transformation followed by a canonical-to-arbitrary transformation.
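A minimal sketch of this two-stage composition, assuming hypothetical `backward_field` and `forward_field` networks that predict per-point displacements (placeholders, not the exact networks of this paper):

```python
# Sketch of deformation via canonicalization: near-surface points of the source
# mesh in an arbitrary pose are first mapped to the canonical pose and then
# deformed toward the target pose. `backward_field` / `forward_field` stand in
# for learned deformation networks.
import torch

def deform_via_canonicalization(points, backward_field, forward_field):
    # points: (N, 3) near-surface samples of the source mesh in an arbitrary pose
    canonical = points + backward_field(points)      # arbitrary-to-canonical stage
    deformed = canonical + forward_field(canonical)  # canonical-to-arbitrary stage
    return deformed
```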
To obtain more detailed surface deformations and better generalization to unseen deformations, we propose to learn local deformation fields conditioned on local latent codes encoding geometry-dependent deformation priors, instead of a global deformation field conditioned on a single latent code. To this end, we propose Transformer-based Deformation Networks (TD-Nets), which learn encoder-based local deformation fields on point cloud approximations of the input mesh. Concretely, TD-Nets encode an input point cloud carrying surface geometry information and the incomplete deformation flow into a sparse set of local latent codes and a global feature vector, using the vector attention blocks proposed in [74]. The deformation vectors of spatial points are estimated by an attentive decoder, which aggregates the neighboring local latent codes of a spatial point based on feature similarity. The aggregated feature vectors are finally passed to a multi-layer perceptron (MLP) to predict displacement vectors, which are applied to the source mesh to compute the final output mesh.
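To make the decoding step concrete, the PyTorch sketch below shows one way such an attentive decoder could aggregate the k nearest local latent codes of a query point by learned similarity and regress a displacement with an MLP. The module, its layer sizes, and the dot-product attention are illustrative assumptions, not the exact TD-Nets architecture.

```python
# Illustrative attentive decoding step: gather the k nearest anchor codes per
# query point, weight them by learned similarity, and regress a displacement.
import torch
import torch.nn as nn

class AttentiveDisplacementDecoder(nn.Module):
    def __init__(self, code_dim=128, k=8):
        super().__init__()
        self.k = k
        self.query_proj = nn.Linear(3, code_dim)        # embed query coordinates
        self.key_proj = nn.Linear(code_dim, code_dim)   # embed local latent codes
        self.mlp = nn.Sequential(                       # displacement regressor
            nn.Linear(code_dim, 128), nn.ReLU(), nn.Linear(128, 3))

    def forward(self, queries, anchor_xyz, anchor_codes):
        # queries: (Q, 3); anchor_xyz: (A, 3); anchor_codes: (A, C)
        dist = torch.cdist(queries, anchor_xyz)          # (Q, A) spatial distances
        knn = dist.topk(self.k, largest=False).indices   # (Q, k) nearest anchors
        codes = anchor_codes[knn]                        # (Q, k, C)
        q = self.query_proj(queries).unsqueeze(1)        # (Q, 1, C)
        keys = self.key_proj(codes)                      # (Q, k, C)
        attn = torch.softmax((q * keys).sum(-1), dim=-1) # (Q, k) similarity weights
        fused = (attn.unsqueeze(-1) * codes).sum(1)      # (Q, C) aggregated feature
        return self.mlp(fused)                           # (Q, 3) displacements
```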
To summarize, we introduce transformer-based local deformation field networks that are capable of learning shape deformation priors for the task of user-driven shape manipulation. The deformation networks learn a set of anchor features based on a vector attention mechanism, which enhances the global deformation context and selects the most informative local deformation descriptors for displacement vector estimation, leading to improved generalization to new deformations. In comparison to classical hand-crafted deformation priors as well as recent neural network-based deformation predictors, our method achieves more accurate and natural shape deformations.
2 Related Work
User-guided shape manipulation lies at the intersection of computer graphics and computer vision.
Our proposed method is related to polygonal mesh geometry processing, neural field representations,
as well as vision transformers.
Optimization-based Shape Manipulation.
Classical methods formulate shape manipulation as a mathematical optimization problem. They perform mesh deformations by either deforming the vertices [5, 53] or the 3D space [23, 3, 29, 37, 51]. Performing mesh deformation without any further information about the target shape, using only limited user-provided correspondences, is an under-constrained problem. To this end, the optimization methods require deformation priors to constrain the deformation regularity as well as the smoothness of the deformed surface. Various analytic priors have been proposed which encourage smooth surface deformations, such as elasticity [62, 1], Laplacian smoothness [31, 55, 77], and rigidity [54, 57, 29]. These methods use efficient linear solvers to iteratively optimize energy functions that satisfy constraints from both the pre-defined deformation prior and the provided handle movements.
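As an illustration of this iterative structure, the sketch below alternates the local and global steps of a simplified as-rigid-as-possible deformation with uniform weights and soft handle constraints; it is a schematic stand-in under these simplifying assumptions, not a faithful reimplementation of the cited rigidity papers.

```python
# Simplified as-rigid-as-possible style iteration: a local rotation-fitting step
# followed by a global linear least-squares solve, with soft handle constraints.
import numpy as np

def arap_deform(verts, neighbors, handle_ids, handle_targets,
                w_handle=100.0, iters=10):
    # verts: (n, 3) source vertices; neighbors: per-vertex arrays of neighbor indices
    n = verts.shape[0]
    L = np.zeros((n, n))                              # uniform graph Laplacian
    for i in range(n):
        L[i, neighbors[i]] = -1.0
        L[i, i] = len(neighbors[i])
    S = np.zeros((len(handle_ids), n))                # soft handle constraint rows
    S[np.arange(len(handle_ids)), handle_ids] = w_handle
    A = np.vstack([L, S])
    new_verts = verts.copy()
    for _ in range(iters):
        # Local step: best-fitting rotation per vertex via SVD of the edge covariance.
        R = np.zeros((n, 3, 3))
        for i in range(n):
            P = (verts[i] - verts[neighbors[i]]).T        # source edge vectors (3, deg)
            Q = (new_verts[i] - new_verts[neighbors[i]]).T
            U, _, Vt = np.linalg.svd(P @ Q.T)
            if np.linalg.det(Vt.T @ U.T) < 0:             # avoid reflections
                Vt[-1] *= -1
            R[i] = Vt.T @ U.T
        # Global step: solve the Laplacian system with handle rows attached.
        b = np.zeros((n, 3))
        for i in range(n):
            for j in neighbors[i]:
                b[i] += 0.5 * (R[i] + R[j]) @ (verts[i] - verts[j])
        rhs = np.vstack([b, w_handle * handle_targets])
        new_verts, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return new_verts
```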
Recently, NFGP [69] was proposed to optimize neural