Reaching Through Latent Space: From Joint Statistics to Path
Planning in Manipulation
Chia-Man Hung1,2, Shaohong Zhong1, Walter Goodwin1,2,
Oiwi Parker Jones1, Martin Engelcke1, Ioannis Havoutis2, Ingmar Posner1
Abstract—We present a novel approach to path planning for
robotic manipulators, in which paths are produced via iterative
optimisation in the latent space of a generative model of robot
poses. Constraints are incorporated through the use of constraint
satisfaction classifiers operating on the same space. Optimisation
leverages gradients through our learned models that provide a
simple way to combine goal reaching objectives with constraint
satisfaction, even in the presence of otherwise non-differentiable
constraints. Our models are trained in a task-agnostic manner on
randomly sampled robot poses. In baseline comparisons against a
number of widely used planners, we achieve commensurate per-
formance in terms of task success, planning time and path length,
performing successful path planning with obstacle avoidance on
a real 7-DoF robot arm.
Index Terms—Constrained Motion Planning, Representation
Learning, Deep Learning in Grasping and Manipulation, Opti-
mization and Optimal Control
I. INTRODUCTION
Path planning is a cornerstone of robotics. For a robotic
manipulator, this generally consists of producing a se-
quence of joint states the robot needs to follow in order to
move from a start to a goal configuration. This requires that
the poses along the sequence are kinematically feasible while
at the same time avoiding unwanted contact either by the
manipulator with itself or with potential objects in the robot’s
workspace. Due to its importance, path planning is a richly
explored area in robotics (e.g. [1]–[6]). However, traditional
approaches are often marred by a number of issues. As the
state-space dimensionality increases and constraints become
more constrictive, the decreasing efficiency of traditional plan-
ning methods makes reactive behaviour computationally chal-
lenging. While existing sampling and optimisation-based ap-
proaches to the planning problem can find solutions, they scale
super-linearly with a robot’s degrees of freedom, and those that
have optimality guarantees on resulting paths achieve them only asymptotically, in the limit of infinite planning time [3].
Increasing system and task complexity also requires considera-
tion of multiple objectives (e.g. performing a certain task while
adhering to pose constraints). Yet, enforcing constraints on the
planned motion can be difficult. Traditional optimisation-based
planners can struggle to incorporate constraints that cannot be
expressed directly in joint space. Sampling-based planners, on
the other hand, struggle to find solutions in scenarios where
constraints render only a small volume of configuration space
feasible or where narrow passages exist [7].
The advent of deep learning has shown that learning-
based approaches can offer some relief in overcoming robotic
planning and control challenges. While a considerable body
of work examines the direct learning of control policies,
attempts have been made to apply deep learning to robotic
path planning (e.g. [8]). Learnt heuristics and neural network
collision detectors have been used as drop-in replacements to
stages of traditional methods (e.g. [9]–[11]).
1Applied AI Lab (A2I), 2Dynamic Robot Systems (DRS)
Oxford Robotics Institute (ORI), University of Oxford
Correspondence to: chiaman@robots.ox.ac.uk
Fig. 1: A VAE is trained to produce a latent representation z of the joint states and corresponding end-effector positions, and an obstacle collision predictor learns the probability of collision. Once trained, gradients through the VAE decoder and collision predictor enable optimisation in the latent space to bring the decoded end-effector position closer to the target position. Performing this optimisation iteratively with a learning rate produces a series of latent values {z_i}_{i=1}^{T} that describe a joint-space path to the target that satisfies the collision constraint.
A number of works explore the use of structured latent spaces to effect
planning and control (e.g. [12]–[15]). However, existing works
typically require training for a particular task on carefully cu-
rated data. In contrast, applications of variational autoencoders
(VAEs) in the space of affordance-learning [16] and quadruped
locomotion [17] have highlighted the potential of viewing
planning as run-time optimisation in pre-trained statistical
models of state-space to achieve feasible spatial paths under
environmental constraints.
Inspired by [16] and [17], in this work we explore an alter-
native, entirely data-driven approach to both joint-space plan-
ning and constraint satisfaction in a robot manipulation setting.
In particular, our approach leverages iterative, gradient-based
optimisation to produce a sequence of joint configurations by
traversing the latent space of a VAE. Training data for this
model is trivially obtained as it need not be in any way task-
oriented but can come from random motor-babbling on a real
platform, or simply sampling valid states in simulation. In
addition, performance predictors operating on the latent space
and potentially other observational data (for example, the posi-
tions of obstacles) are trained in a supervised fashion to output
probabilities of certain performance-related metrics, such as
whether the manipulator is in collision with an obstacle.
These networks are frozen after training and are subsequently
used in this gradient-based optimisation approach to planning
through activation maximisation [18], which is the process
of using backpropagation through network weights to find a
perturbation to the network inputs that would act to bring about
a desired change in the network’s outputs.
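As a minimal sketch of this mechanism (not the paper's implementation), the following PyTorch snippet optimises the input of a small, frozen, hypothetical predictor so that its output moves towards a target value; the architecture, optimiser and learning rate are illustrative assumptions.

```python
import torch

# Hypothetical frozen predictor mapping an 8-D input to a scalar in [0, 1]
# (e.g. the probability that some performance criterion is satisfied).
predictor = torch.nn.Sequential(
    torch.nn.Linear(8, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1), torch.nn.Sigmoid(),
)
for p in predictor.parameters():
    p.requires_grad_(False)              # weights stay fixed after training

# Activation maximisation: the *input* is the optimisation variable, and
# gradients are backpropagated through the frozen network to push the
# output towards a desired target value.
x = torch.zeros(8, requires_grad=True)   # initial input guess
target = torch.tensor([1.0])             # desired network output
opt = torch.optim.Adam([x], lr=1e-2)

for _ in range(200):
    opt.zero_grad()
    loss = (predictor(x) - target).pow(2).sum()
    loss.backward()                      # gradient w.r.t. x, not the weights
    opt.step()
```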
Taking this view of path planning overcomes many of the
obstacles that make robotic path planning a non-trivial task: (a)
as our plans consist of states drawn from a deep generative
model fit to a large dataset of feasible robot poses, and are
thus approximately drawn from this data distribution, there is
a very high likelihood that every state in the planned path
is valid in terms of self-collisions and kinematic feasibil-
ity; (b) by modelling joint states and end-effector positions
jointly, we avoid the need to explicitly calculate inverse or
forward kinematics at any stage during planning, even when
the goal configuration is given in R^3 Cartesian space; (c)
by leveraging activation maximisation (AM) via gradients
through performance predictors, we can enforce arbitrarily
complex, potentially non-differentiable constraints that would
be hard to express in direct optimisation-based planners, and
might be intractably restrictive for sampling-based planners;
(d) by taking a pre-trained, data-driven approach to collision
avoidance, we do not need any geometric analysis or accurate
3D models at planning time, nor indeed do we need to perform
any kind of explicit collision checking, which is generally
the main computational bottleneck in sampling-based planners
[19].
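To make points (b)–(d) concrete, the sketch below shows one plausible form of the resulting planning loop: decode the latent code, penalise the distance of the decoded end-effector position to the target together with the predicted collision probability, and step the latent code along the combined gradient. The decoder and collision-predictor interfaces, the loss weighting and the step count are hypothetical placeholders rather than the models or hyperparameters used in the paper.

```python
import torch

def plan_path(decoder, collision_net, obstacle, e_target, z_init,
              steps=100, lr=1e-2, lam=1.0):
    """Hedged sketch of latent-space planning: `decoder` maps a latent code z
    to (joint angles q, end-effector position e); `collision_net` maps
    (z, obstacle) to a collision probability. Both are frozen, pre-trained
    networks with hypothetical interfaces."""
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    path = []
    for _ in range(steps):
        opt.zero_grad()
        q, e = decoder(z)
        goal_loss = (e - e_target).pow(2).sum()      # pull decoded e towards target
        collision_loss = collision_net(z, obstacle)  # penalise predicted collisions
        (goal_loss + lam * collision_loss).backward()
        opt.step()
        path.append(q.detach().clone())              # decoded joint state at this step
    return path                                      # joint-space path to execute
```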
In addition to the advantages in path planning that this
method offers above and beyond related works, we in-
troduce an additional loss on the run-time AM optimisation
process which encourages the planning process to remain in
areas of high likelihood according to our prior belief under
the generative model. In our experiments we find that this
contribution is critical in enabling successful planning that
stays in feasible state-space.
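Concretely, if the VAE prior p(z) is a standard Gaussian, keeping the optimisation in regions of high prior likelihood amounts to penalising the latent norm, so one plausible form of the per-step objective is the following (the weights and the squared-norm form of the prior term are our assumptions, not values reported here):

```latex
% Sketch of the per-step planning objective: a goal-reaching term on the
% decoded end-effector position, a collision penalty from the learned
% predictor f_col, and a prior-likelihood term that reduces to a squared
% norm under a standard Gaussian prior p(z).
\mathcal{L}(z) =
  \left\lVert \hat{e}(z) - e_{\mathrm{target}} \right\rVert^{2}
  + \lambda_{c}\, f_{\mathrm{col}}(z, o)
  - \lambda_{p} \log p(z),
\qquad
  -\log p(z) \propto \lVert z \rVert^{2}.
```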
II. RELATED WORK
Successful path planning for a robotic manipulator generally
consists of producing a kinematic sequence of joint states
through which the robot can actuate in order to move from
a start to a goal configuration, while moving only through
viable configurations. While goal positions may be specified
in the same joint space as the plan, in a manipulator context
it is more common for the goal position to be specified in R^3 Cartesian end-effector space, or R^6 if the SO(3) rotation group
is included as well. Viable configurations are the intersection
of feasible states for the robot (i.e. those that are within joint limits and do not result in self-collision) and collision-
free states with respect to obstacles in the environment. The
intersection of these defines the valid configuration space for the robot in a given environment.
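Expressed as a check over configurations, the definition above amounts to something like the following sketch, where the three helper predicates are hypothetical, robot- and environment-specific routines:

```python
def is_valid(q, robot, obstacles):
    """A configuration q is viable iff it is feasible for the robot (within
    joint limits and free of self-collision) and collision-free with respect
    to every obstacle. The three helpers below are hypothetical placeholders
    for robot- and environment-specific checks."""
    within_limits = robot.within_joint_limits(q)
    self_collision_free = not robot.in_self_collision(q)
    obstacle_free = all(not robot.in_collision(q, obs) for obs in obstacles)
    return within_limits and self_collision_free and obstacle_free
```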
As analytically describing the valid configuration space is
generally intractable, sampling-based methods for planning
provide the ability to quickly find connected paths through
valid space, by checking for the validity of individual sampled
nodes. Variants of the Probabilistic Roadmap (PRM) and
Rapidly-exploring Random Tree (RRT) sampling-based algo-
rithms are widely used [1], [2], and provably asymptotically
optimal variants exist in PRM* and RRT* [3]. These methods
suffer from a trade-off between runtime and optimality: while
often relatively quick to find a feasible collision-free path, they
tend to employ a second, slower, stage of path optimisation
to shorten the path through the application of heuristics. In
the presence of restrictive constraints, both sampling- and
optimisation-based planners can be very slow to find an initial
feasible path [7].
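For reference, a bare-bones RRT loop (omitting the goal-biasing schedules, step-size tuning and rewiring that practical variants add) can be sketched as below; the sampling, nearest-neighbour, steering, edge-validity and distance helpers are hypothetical, problem-specific functions, and configurations are assumed hashable (e.g. tuples of joint angles).

```python
import random

def rrt(start, goal, sample_fn, nearest_fn, steer_fn, valid_edge_fn, dist_fn,
        goal_tol=0.05, max_iters=5000):
    """Bare-bones RRT sketch: grow a tree of valid configurations from `start`
    until a node reaches `goal`, then recover the path via the parent map."""
    parents = {start: None}
    for _ in range(max_iters):
        q_rand = goal if random.random() < 0.05 else sample_fn()  # small goal bias
        q_near = nearest_fn(parents.keys(), q_rand)               # closest tree node
        q_new = steer_fn(q_near, q_rand)                          # short step towards q_rand
        if valid_edge_fn(q_near, q_new):                          # validity check only
            parents[q_new] = q_near
            if dist_fn(q_new, goal) < goal_tol:                   # close enough to goal
                path = [q_new]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return list(reversed(path))                       # start -> goal
    return None                                                   # no path found
```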
Optimisation-based planners start from an initial path or
trajectory guess and then refine it until certain costs, such
as path length, are minimised, and differentiable constraints
satisfied. Techniques such as CHOMP [4] bridge the gap
between planning and optimal control by enabling planning
over path and dynamics. TrajOpt [20] differs from CHOMP
in the numerical optimisation method used and the method of
collision checking. The Gaussian Process Motion Planner [21]
leverages Gaussian process models to represent trajectories
and updates them through interpolation. Stochastic Trajectory
Optimization for Motion Planning (STOMP) [5] is notable in
this context as it is able to produce plans while optimising for
non-differentiable constraints, a capability our approach achieves instead via gradients through trained performance predictors.
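The shared recipe behind these optimisation-based planners can be illustrated, in toy form only, by gradient descent on a discretised trajectory: start from a straight-line guess, minimise a smoothness cost plus a differentiable obstacle penalty, and keep the endpoints fixed. The penalty interface, weights and step size below are arbitrary illustrative choices, not taken from any of the methods cited above.

```python
import numpy as np

def optimise_trajectory(start, goal, obstacle_cost_grad, n_waypoints=20,
                        iters=500, step=1e-2, w_obs=1.0):
    """Toy optimisation-based planner: refine a straight-line initial guess
    (start and goal given as arrays) by gradient descent on a discrete
    smoothness cost plus an obstacle penalty. `obstacle_cost_grad(q)` returns
    the penalty gradient at waypoint q and is a hypothetical, differentiable,
    problem-specific term."""
    traj = np.linspace(start, goal, n_waypoints)   # (n_waypoints, dim) initial guess
    for _ in range(iters):
        # Gradient of the sum of squared segment lengths (smoothness/shortness):
        smooth_grad = 2.0 * traj - np.roll(traj, 1, axis=0) - np.roll(traj, -1, axis=0)
        obs_grad = np.apply_along_axis(obstacle_cost_grad, 1, traj)
        grad = smooth_grad + w_obs * obs_grad
        grad[0] = grad[-1] = 0.0                   # keep start and goal fixed
        traj -= step * grad
    return traj
```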
Planning with Deep Neural Networks. A recent line of
work has explored using deep networks to augment some or all
components of conventional planning methods. Qureshi et al.
[10], [11] train a pair of neural networks to embed point-cloud
environment representations and perform single timestep plan-
ning. Iterative application of the planning network produces a
path plan. Ichter and Pavone [9] learn an embedding space of
observations, and use RRT in this space with a learnt collision
checker to produce path plans, but need data to learn a forward
dynamics model in order to roll out the plan.
Another family of learning-based approaches to planning
learn embedding spaces from high dimensional data, and learn
forward dynamics models that operate on this learnt latent
space. Universal Planning Networks [12] learn deterministic
representations of high-dimensional data such that update
steps by gradient descent correspond to the unrolling of a
learned forward model. The Embed-to-Control works [13],
[22] employ variational inference in deep generative models
in which latent-space dynamics is locally linear, a property
that enables locally optimal control in these spaces. DVBFs
[23] improve on these models by relaxing the assumption
that the observation space is Markovian. PlaNet [24] uses
a latent-space dynamics model for planning in model-based
RL. However, planning in all these models tends to consist of
rolling out trajectories in time, finding a promising trajectory,
and executing the given actions. As such, these techniques tend
to become intractable for longer time horizons, and cannot be
thought of as path planning frameworks.
A different approach is that of encoding movement prim-
itives under the learning from demonstrations framework.
Conditional Neural Movement Primitives [25] extracts prior
knowledge from demonstrations and infers distributions over
trajectories conditioned on the current observation. Our ap-
proach differs in that we do not encode trajectories directly, but
rather learn a probabilistic model of robot states and generate
trajectories as we optimise in latent space.
Learning Inverse Kinematics. In this work, by learning a
joint embedding of joint angles q and end-effector positions e, we are able to optimise for achieving an end-effector target e_target, while planning state sequences in joint space. Note that
we do not care about the orientation in which the goal is
reached, therefore end-effector orientation is omitted in the
formulation of e. Learning the statistical model of kinematics
means we do not need to solve inverse kinematics (IK) at
any point. Prior work has sought to learn solutions to IK that
can cope with its ill-posed one-to-many nature for redundant
manipulators [26], [27], and to overcome the problems with
analytic and numerical approaches [28]–[30]. Ren et al. [27]
train a generative adversarial network to generate joint angles
from end-effector positions, with the discriminator acting on
the concatenation of both the input position and generated
joints. This method implicitly maximises p(q|e), but does not
address the multimodality of the true distribution p(q|e) over IK solutions.
Bocsi et al. [26] employed structured output learning to learn
a generative model for the joint distribution of joint angles
and end-effector positions. By modelling the joint instead of