model fit to a large dataset of feasible robot poses, and are
thus approximately drawn from this data distribution, there is
a very high likelihood that every state in the planned path
is valid in terms of self-collisions and kinematic feasibil-
ity; (b) by modelling joint states and end-effector positions
jointly, we avoid the need to explicitly calculate inverse or
forward kinematics at any stage during planning, even when
the goal configuration is given in R^3 Cartesian space; (c)
by leveraging activation maximisation (AM) via gradients
through performance predictors, we can enforce arbitrarily
complex, potentially non-differentiable constraints that would
be hard to express in direct optimisation-based planners, and
might be intractably restrictive for sampling-based planners;
(d) by taking a pre-trained, data-driven approach to collision
avoidance, we do not need any geometric analysis or accurate
3D models at planning time, nor indeed do we need to perform
any kind of explicit collision checking, which is generally
the main computational bottleneck in sampling-based planners
[19].
In addition to the advantages in path planning that this
method offers over related works, we introduce an
additional loss on the run-time AM optimisation
process which encourages the planning process to remain in
areas of high likelihood according to our prior belief under
the generative model. In our experiments we find that this
contribution is critical in enabling successful planning that
stays in feasible state-space.
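The idea above can be sketched as gradient-based optimisation of a latent code in which a task gradient is combined with the gradient of the negative log-likelihood under the prior. The sketch below is illustrative only and assumes a standard normal prior; `task_loss_grad` stands in for the gradients of the goal and predictor losses, and all names are hypothetical:

```python
import numpy as np

def prior_nll_grad(z):
    # For a standard normal prior, -log p(z) = 0.5 * ||z||^2 + const,
    # so the gradient of the negative log-likelihood is simply z.
    return z

def optimise_latent(z, task_loss_grad, lam=0.1, lr=0.05, steps=200):
    # Gradient descent on task loss + lam * prior negative log-likelihood;
    # the second term keeps z in high-likelihood regions of the prior.
    for _ in range(steps):
        g = task_loss_grad(z) + lam * prior_nll_grad(z)
        z = z - lr * g
    return z

# Toy usage: the task gradient pulls z towards a target, while the prior
# term shrinks z towards the origin; the two balance at target / (1 + lam).
target = np.array([2.0, -1.0])
z_star = optimise_latent(np.zeros(2), lambda z: z - target)
```

In the full method the latent code would be decoded by the generative model into robot states, so staying in high-likelihood regions of the prior corresponds to staying near the training distribution of feasible poses.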
II. RELATED WORK
Successful path planning for a robotic manipulator generally
consists of producing a kinematic sequence of joint states
through which the robot can actuate in order to move from
a start to a goal configuration, while moving only through
viable configurations. While goal positions may be specified
in the same joint space as the plan, in a manipulator context
it is more common for the goal position to be specified in R^3
Cartesian end-effector space, or R^6 if the SO(3) rotation group
is included as well. Viable configurations are the intersection
of feasible states for the robot, i.e. those that are within
joint limits and do not result in self-collision, and collision-
free states with respect to obstacles in the environment; this
intersection defines the configuration space for the
robot in a given environment.
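As a minimal illustration (not from the paper), a configuration is viable only when it passes every one of these checks; the collision predicates below are placeholders for the geometric tests a real planner would implement:

```python
def within_joint_limits(q, lo, hi):
    # True iff every joint angle lies inside its [lo, hi] interval.
    return all(l <= qi <= h for qi, l, h in zip(q, lo, hi))

def is_viable(q, lo, hi, self_collides, collides_with_env):
    # Viability = intersection of robot-feasible states (joint limits,
    # no self-collision) and environment collision-free states.
    return (within_joint_limits(q, lo, hi)
            and not self_collides(q)
            and not collides_with_env(q))

# Toy usage with trivial stand-in predicates:
ok = is_viable([0.1, -0.2], [-1, -1], [1, 1],
               self_collides=lambda q: False,
               collides_with_env=lambda q: abs(q[0]) > 0.5)
```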
As analytically describing the valid configuration space is
generally intractable, sampling-based methods for planning
provide the ability to quickly find connected paths through
valid space, by checking for the validity of individual sampled
nodes. Variants of the Probabilistic Roadmap (PRM) and
Rapidly-exploring Random Tree (RRT) sampling-based algorithms
are widely used [1], [2], and provably asymptotically
optimal variants exist in PRM* and RRT* [3]. These methods
suffer from a trade-off between runtime and optimality: while
often relatively quick to find a feasible collision-free path, they
tend to require a second, slower stage of path optimisation
to shorten the path through the application of heuristics. In
the presence of restrictive constraints, both sampling- and
optimisation-based planners can be very slow to find an initial
feasible path [7].
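The core mechanism of such planners, growing a tree of sampled nodes and retaining only those that pass a validity check, can be sketched in 2-D as below. This is an illustrative toy RRT, not any of the cited implementations; the per-node `is_valid` call is exactly the collision-checking step that dominates runtime in practice:

```python
import math
import random

def rrt(start, goal, is_valid, step=0.1, iters=5000, goal_tol=0.15, seed=0):
    # Grow a tree from start: sample a point (with 10% goal bias), steer
    # from the nearest tree node towards it, and keep the new node only
    # if it passes the validity check.
    rng = random.Random(seed)
    nodes, parent = [start], {0: None}
    for _ in range(iters):
        sample = goal if rng.random() < 0.1 else (rng.uniform(0, 1),
                                                  rng.uniform(0, 1))
        i = min(range(len(nodes)), key=lambda j: math.dist(nodes[j], sample))
        near = nodes[i]
        d = math.dist(near, sample)
        if d == 0:
            continue
        new = (near[0] + step * (sample[0] - near[0]) / d,
               near[1] + step * (sample[1] - near[1]) / d)
        if not is_valid(new):
            continue  # per-node collision check: the main cost in practice
        parent[len(nodes)] = i
        nodes.append(new)
        if math.dist(new, goal) < goal_tol:
            # Walk back up the tree to extract the path.
            path, k = [], len(nodes) - 1
            while k is not None:
                path.append(nodes[k])
                k = parent[k]
            return path[::-1]
    return None

# Toy usage: a vertical slab obstacle with a passage above y = 0.7.
path = rrt((0.05, 0.05), (0.9, 0.9),
           is_valid=lambda p: not (0.4 < p[0] < 0.6 and p[1] < 0.7))
```

Note that nothing here optimises path length, which is why a second shortening stage is typically applied afterwards.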
Optimisation-based planners start from an initial path or
trajectory guess and then refine it until certain costs, such
as path length, are minimised, and differentiable constraints
satisfied. Techniques such as CHOMP [4] bridge the gap
between planning and optimal control by enabling planning
over path and dynamics. TrajOpt [20] differs from CHOMP
in the numerical optimisation method used and the method of
collision checking. The Gaussian Process Motion Planner [21]
leverages Gaussian process models to represent trajectories
and updates them through interpolation. Stochastic Trajectory
Optimization for Motion Planning (STOMP) [5] is notable in
this context as it is able to produce plans while optimising for
non-differentiable constraints, a capability our work achieves
with gradients through trained performance predictors.
Planning with Deep Neural Networks. A recent line of
work has explored using deep networks to augment some or all
components of conventional planning methods. Qureshi et al.
[10], [11] train a pair of neural networks to embed point-cloud
environment representations and perform single timestep plan-
ning. Iterative application of the planning network produces a
path plan. Ichter and Pavone [9] learn an embedding space of
observations, and use RRT in this space with a learnt collision
checker to produce path plans, but need data to learn a forward
dynamics model in order to roll out the plan.
Another family of learning-based approaches to planning
learn embedding spaces from high dimensional data, and learn
forward dynamics models that operate on this learnt latent
space. Universal Planning Networks [12] learn deterministic
representations of high-dimensional data such that update
steps by gradient descent correspond to the unrolling of a
learned forward model. The Embed-to-Control works [13],
[22] employ variational inference in deep generative models
in which latent-space dynamics is locally linear, a property
that enables locally optimal control in these spaces. DVBFs
[23] improve on these models by relaxing the assumption
that the observation space is Markovian. PlaNet [24] uses
a latent-space dynamics model for planning in model-based
RL. However, planning in all these models tends to consist of
rolling out trajectories in time, finding a promising trajectory,
and executing the given actions. As such, these techniques tend
to become intractable for longer time horizons, and cannot be
thought of as path planning frameworks.
A different approach is that of encoding movement prim-
itives under the learning from demonstrations framework.
Conditional Neural Movement Primitives [25] extracts prior
knowledge from demonstrations and infers distributions over
trajectories conditioned on the current observation. Our ap-
proach differs in that we do not encode trajectories directly, but
rather learn a probabilistic model of robot states and generate
trajectories as we optimise in latent space.
Learning Inverse Kinematics. In this work, by learning a
joint embedding of joint angles q and end-effector positions
e, we are able to optimise for achieving an end-effector target
e_target while planning state sequences in joint space. Note that
we do not care about the orientation in which the goal is
reached; end-effector orientation is therefore omitted from the
formulation of e. Learning the statistical model of kinematics
means we do not need to solve inverse kinematics (IK) at
any point. Prior work has sought to learn solutions to IK that
can cope with its ill-posed one-to-many nature for redundant
manipulators [26], [27], and to overcome the problems with
analytic and numerical approaches [28]–[30]. Ren et al. [27]
train a generative adversarial network to generate joint angles
from end-effector positions, with the discriminator acting on
the concatenation of both the input position and generated
joints. This method implicitly maximises p(q|e), but does not
address the multimodality of the true distribution of IK solutions p(q|e).
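This one-to-many structure appears already for a planar 2-link arm, where a generic reachable target admits distinct elbow-up and elbow-down solutions. The sketch below is a standard textbook construction, purely for illustration (link lengths and function names are assumptions):

```python
import math

def two_link_ik(x, y, l1=1.0, l2=1.0):
    # Closed-form IK for a planar 2-link arm: the elbow angle's cosine is
    # fixed by the law of cosines, but its sine has two sign choices,
    # giving elbow-down and elbow-up joint configurations.
    r2 = x * x + y * y
    c2 = (r2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    s2 = math.sqrt(max(0.0, 1 - c2 * c2))
    sols = []
    for sign in (+1, -1):
        q2 = math.atan2(sign * s2, c2)
        q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2),
                                           l1 + l2 * math.cos(q2))
        sols.append((q1, q2))
    return sols

def fk(q1, q2, l1=1.0, l2=1.0):
    # Forward kinematics, used to verify both branches reach the target.
    return (l1 * math.cos(q1) + l2 * math.cos(q1 + q2),
            l1 * math.sin(q1) + l2 * math.sin(q1 + q2))

solutions = two_link_ik(1.0, 0.5)  # two distinct joint configurations
```

A model that maximises p(q|e) without capturing this multimodality risks averaging such branches into an invalid configuration.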
Bocsi et al. [26] employed structured output learning to learn
a generative model for the joint distribution of joint angles
and end-effector positions. By modelling the joint instead of