
A Learning-Based Estimation and Control Framework for Contact-Intensive Tight-Tolerance Tasks

Bukun Son, Hyelim Choi, Jaemin Yoon and Dongjun Lee

Abstract—We present a two-stage framework that integrates a learning-based estimator and a controller, designed to address contact-intensive tasks. The estimator leverages a Bayesian particle filter with a mixture density network (MDN) structure, which handles the multi-modal issues arising from contact information. The controller combines self-supervised learning and reinforcement learning (RL): the parameters of the low-level admittance controller are divided into labelable and non-labelable categories, and each category is trained accordingly. To further enhance accuracy and generalization performance, a transformer model is incorporated into the self-supervised learning component. The proposed framework is evaluated on the bolting task using an accurate real-time simulator and is successfully transferred to an experimental environment. More visualization results are available on our project website: https://sites.google.com/view/2stagecitt

Index Terms—Contact-intensive assembly, data-driven, force and tactile sensing, pose estimation, reinforcement learning.

I. INTRODUCTION

CONTACT-intensive and tight-tolerance tasks, such as nut tightening, are essential not only in factory automation but also in hazardous environments. Nevertheless, automating such tasks under uncertainty in the pose of the target object is highly challenging. Key insights for solving this problem can be drawn from observing human workflows. Humans first bring two objects into contact, then estimate their relative pose through some random motions, and finally manipulate complex assemblies by adapting to the contact wrench in real time. Consequently, both sequential, accurate pose estimation of the target object and a precise real-time control strategy are required.

For object pose estimation, vision-based methods are prevalent [1, 2, 3], but occlusion between the two objects is inherent, and only a limited view may be available depending on how the camera is installed. Environmental factors such as insufficient light or fog can also degrade performance, and sensor input can be limited by the object's transparency. As a result, practical application is limited outside of well-arranged laboratory or factory environments. Sensing modalities such as contact sensing are therefore extensively used for precise object pose estimation because they do not suffer from these issues. While contact sensing provides precise and accurate information, it is inherently only partially observable, which creates a multi-modality problem. Addressing this issue requires probabilistic modeling of the possible multi-modal poses and, based on this model, a sequential method that reduces estimation uncertainty through sustained contact.

Fig. 1: The overall structure of the two-stage framework.

This research was supported by the Industrial Strategic Technology Development Program (20001045, 20008957) of the Ministry of Trade, Industry & Institute of Information & Communications Technology Planning & Evaluation (2021-0-00896). (Corresponding author: Dongjun Lee.)

¹B. Son, H. Choi, and D. J. Lee are with the Department of Mechanical Engineering, IAMD, and IOER, Seoul National University, Seoul 08826, South Korea (e-mail: sonbukun@snu.ac.kr; helmchoi@snu.ac.kr; djlee@snu.ac.kr).

²J. Yoon was with Seoul National University and is now with the Robot Center, Samsung Research, Seoul, South Korea (e-mail: jaemin.yoon@samsung.com).

An online controller is necessary to handle the residual errors that remain after the estimation stage, because even small errors can cause serious problems such as jamming. This is extremely challenging because of complex multi-contacts that are difficult to model and discontinuities (e.g., contact point switching). For rotating assembly such as nut tightening, the difficulty increases greatly because the contact force pattern becomes much more complex. Therefore, we leverage a learning-based algorithm to optimize the control parameters. Specifically, we use supervised learning for the parameters that can be labeled and RL for the parameters that cannot. In addition, since the contact-based controller is partially observed, as is the estimator, a transformer is used as the network structure to mitigate this issue, because transformers perform well in sequential modeling and reasoning.

arXiv:2210.05524v2 [cs.RO] 1 Aug 2023

In this paper, we propose a two-stage framework, shown in Fig. 1, comprising a learning-based estimator and controller that applies to contact-intensive tight-tolerance assembly tasks with complex contact geometry. The components of the framework offer the following novelties:

1) Learning-based Bayesian particle filter is formulated as a Bayesian particle filter in which the pose likelihood is modeled by a mixture density network (MDN) [4], which resolves the multi-modal issue and provides estimation uncertainties. It estimates relatively large pose errors (in both position and orientation) of complex shapes based on the contact wrench.

2) Self-supervised and RL-based controller increases reliability and data efficiency compared to end-to-end RL by combining transformer-based [5] supervised learning with on-policy RL to predict and optimize the low-level controller for tightening. It completes the task by adapting to the residual errors in real time.

3) Real-world implementation of nut tightening is a key contribution of this work. To the best of our knowledge, this is the first real-world execution of such a contact-intensive tight-tolerance task (nut tightening) under large position and orientation errors. The robustness and effectiveness were validated through real-world experiments.
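The estimator in item 1) can be illustrated with a minimal sketch of a particle filter update whose likelihood is a Gaussian mixture. This is not the authors' implementation: the hand-written mixture tuples stand in for the output of a trained MDN, the 2-D planar pose error (in arbitrary units) replaces the full position-and-orientation error, and all names (`mixture_likelihood`, `particle_filter_update`, `run_demo`) and numeric values are hypothetical. The sketch only shows how two successive multi-modal contact measurements can collapse the estimate onto the single pose consistent with both.

```python
import numpy as np


def mixture_likelihood(particles, mix_weights, means, sigmas):
    """Isotropic Gaussian mixture density evaluated at each particle.

    particles: (N, D), mix_weights: (K,), means: (K, D), sigmas: (K,)
    """
    diff = particles[:, None, :] - means[None, :, :]         # (N, K, D)
    sq_dist = np.sum(diff ** 2, axis=-1)                     # (N, K)
    dim = particles.shape[1]
    norm = (2.0 * np.pi * sigmas ** 2) ** (-dim / 2.0)       # (K,)
    comp = norm * np.exp(-0.5 * sq_dist / sigmas ** 2)       # (N, K)
    return comp @ mix_weights                                # (N,)


def particle_filter_update(particles, weights, mixture, rng, jitter=0.05):
    """Reweight particles by the mixture likelihood, then resample."""
    lik = mixture_likelihood(particles, *mixture)
    w = weights * lik + 1e-300        # guard against an all-zero weight vector
    w /= w.sum()
    n = len(particles)
    idx = rng.choice(n, size=n, p=w)  # multinomial resampling
    # Small jitter keeps particle diversity after resampling (assumed value).
    resampled = particles[idx] + rng.normal(scale=jitter, size=particles.shape)
    return resampled, np.full(n, 1.0 / n)


def run_demo(seed=0, n_particles=2000):
    """Two ambiguous measurements that share only the mode at (2, 1)."""
    rng = np.random.default_rng(seed)
    particles = rng.uniform(-5.0, 5.0, size=(n_particles, 2))
    weights = np.full(n_particles, 1.0 / n_particles)
    sig = np.array([0.5, 0.5])
    half = np.array([0.5, 0.5])
    # Stand-ins for MDN outputs (mixture weights, means, std. deviations).
    mixtures = [
        (half, np.array([[2.0, 1.0], [-2.0, -1.0]]), sig),
        (half, np.array([[2.0, 1.0], [2.0, -1.0]]), sig),
    ]
    for mixture in mixtures:
        particles, weights = particle_filter_update(particles, weights, mixture, rng)
    return particles.mean(axis=0)


if __name__ == "__main__":
    print("estimated pose error:", run_demo())
```

After the first update the particles split between two candidate poses; the second measurement is ambiguous in a different direction, so only the common mode survives. The spread of the remaining particles gives a simple measure of the estimation uncertainty.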

II. RELATED WORKS

A. Contact information based pose estimation

In an earlier study [6], pose uncertainty in SE(2) was estimated by matching the contact configuration space (C-space) with a pre-acquired C-space, but computing the likelihood with this method is computationally demanding for objects with complex shapes. More recent studies, such as the memory unscented particle filter proposed in [7], aim to localize more complex-shaped objects in SE(3), but they require multiple tactile sensors, which is costly. While F/T sensors have been used instead of tactile sensors in [8, 9], these works focus on objects with simple shapes or require object-specific motions, which limits generalization to complex shapes. To overcome these issues, data-driven methods have been proposed to address complex contacts that are difficult to model while maintaining low computation costs. In [10], the contact pattern generated by a tilt-then-rotate motion is learned, and the misalignment direction is classified. However, this method classifies only discretized misalignment directions under position uncertainty alone. Recently, [11, 12, 13] update the estimation filter for complex shapes through several discontinuous poking or touching motions. These methods require full geometric information about the shape, involve high computational overhead, and are only applicable to objects with distinguishable keypoints in their shape.

B. RL-based assembly tasks

Reinforcement learning (RL) has been widely employed in contact-intensive tasks to handle complex contact behaviors. The most popular approach is end-to-end residual learning of a control input added to a position-based nominal trajectory (e.g., learning a model-free residual policy [14, 15] and optimizing the force-control parameters as the residual control input [16]). However, a fixed nominal trajectory limits the range of uncertainty that can be accommodated. In [17], an RL controller is trained to compute the desired force and orientation of a hybrid position/force low-level controller for the peg-in-hole task. Another approach proposes a distributed RL agent, RD2, which employs a long short-term memory (LSTM) structure to use only the force/torque as input [18]. The common limitation of these works is that they address only relatively simple insertion problems involving objects with simple shapes. [19] develops RL-based nut fastening with complex shapes using their simulator [20], but it has not been verified in real-world environments. Our recent work [21] proposes a high-level RL-based controller on top of a low-level linear quadratic tracking (LQT) controller for the bolting task, and in this paper we extend the uncertainty range with a novel approach. Furthermore, a limitation shared by all existing studies is that the policy is trained with end-to-end RL, which has low reliability and data efficiency.

Fig. 2: The experimental setup consists of a Franka Emika Panda robotic manipulator, an ATI Gamma FT sensor, a HEBI X-series actuator, a universal vise, a nut, and a bolt.

III. PRELIMINARIES

A. System Description

In this subsection, we describe the system setup of the task on which our proposed framework is implemented. We construct the simulation and experimental setup with a robotic manipulator (Franka Emika Panda), an FT sensor (ATI Gamma SI-65-6) to measure the 6-DOF contact wrench, and a HEBI X-series gripper capable of infinite rotation for rotational assembly tasks, as shown in Fig. 2. A manipulating object (e.g., nut) with position $p_t \in \mathbb{R}^3$ and orientation $R_t \in SO(3)$ is rigidly attached to the HEBI gripper, and a fixed target object (e.g., bolt) with position $p_t^{tar} \in \mathbb{R}^3$ and orientation $R_t^{tar} \in SO(3)$ is installed in the environment, where $\star_t$ represents a variable at time $t$. Motion planning and low-level control of the manipulating object are implemented in the 6-DOF Cartesian space. The low-level controller is an admittance controller with the reference manipulating object dynamics given as

$M_t \ddot{e}_t + B_t \dot{e}_t + K_t e_t = F_t^c$    (1)

where $e_t = [e_t^p, e_t^R]^T \in \mathbb{R}^6$ is the error vector, with the linear position error $e_t^p = p_t^{ref} - p_t \in \mathbb{R}^3$ and the geometric orientation error $e_t^R = \frac{1}{2}\left(R_t^T R_t^{ref} - (R_t^{ref})^T R_t\right)^\vee$. Here, $p_t^{ref} \in \mathbb{R}^3$ is the reference position, $R_t^{ref} \in SO(3)$ is the reference
