Bag All You Need: Learning a Generalizable Bagging Strategy for Heterogeneous Objects

Arpit Bahety∗¹, Shreeya Jain∗¹, Huy Ha¹, Nathalie Hager¹, Benjamin Burchfiel², Eric Cousineau², Siyuan Feng², Shuran Song¹

bag-all-you-need.cs.columbia.edu

Abstract: We introduce a practical robotics solution for the task of heterogeneous bagging, requiring the placement of multiple rigid and deformable objects into a deformable bag. This is a difficult task as it features complex interactions between multiple highly deformable objects under limited observability. To tackle these challenges, we propose a robotic system consisting of two learned policies: a rearrangement policy that learns to place multiple rigid objects and fold deformable objects in order to achieve desirable pre-bagging conditions, and a lifting policy to infer suitable grasp points for bi-manual bag lifting. We evaluate these learned policies on a real-world three-arm robot platform that achieves a 70% heterogeneous bagging success rate with novel objects. To facilitate future research and comparison, we also develop a novel heterogeneous bagging simulation benchmark that will be made publicly available.

I. INTRODUCTION

Imagine packing a bag for a picnic; we might first put several rigid objects (such as an apple and a water bottle) into the bag, fold deformable objects (such as a picnic mat and a T-shirt), and then place them on top of the bag opening. We must then lift the bag (another deformable object) in a way that these objects fall inside without spilling. Successful completion of this task requires both a comprehensive understanding of the objects’ physical properties and the capability to plan and integrate multiple manipulation skills. For instance, the robot’s actions must take into account:

• Object geometry: objects must be placed and oriented to fit into the bag opening.

• Object material: large deformable objects, such as blankets, must be folded or crumpled into a compact configuration prior to packing. This requires manipulation strategies that are conditioned on object material (i.e., rigid or deformable).

• Inter-object dynamics: the ultimate success of this task is determined jointly by object configurations and the robot’s grasp on the bag during lifting. Crucially, when objects are partially inside a bag (for example, a mat on top of the bag opening), different lifting positions will result in different outcomes. Therefore, a successful approach must decide when a desired pre-bagging condition is achieved and, if so, determine good grasp location(s) from which to lift the bag. Here, the pre-bagging condition refers to the state in which all objects are sufficiently inside the bag opening and will fall into the bag with a proper lift.
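To make the pre-bagging condition defined above concrete, the following is a minimal geometric sketch, not the paper's method: in the paper this condition is estimated by the learned policies. The sketch assumes binary top-down masks for each object and for the bag opening, and a hypothetical threshold `min_fraction`; the function names are illustrative only.

```python
from typing import List

Mask = List[List[int]]  # binary top-down mask, indexed [row][col]


def fraction_inside(object_mask: Mask, opening_mask: Mask) -> float:
    """Fraction of an object's pixels that overlap the bag-opening mask."""
    total = 0
    inside = 0
    for obj_row, open_row in zip(object_mask, opening_mask):
        for obj_px, open_px in zip(obj_row, open_row):
            if obj_px:
                total += 1
                if open_px:
                    inside += 1
    # An empty object mask is treated as "not inside" (assumption).
    return inside / total if total else 0.0


def pre_bagging_proxy(object_masks: List[Mask], opening_mask: Mask,
                      min_fraction: float = 0.8) -> bool:
    """Hypothetical proxy: every object is mostly over the bag opening.

    min_fraction is an assumed value, not one taken from the paper.
    """
    return all(fraction_inside(m, opening_mask) >= min_fraction
               for m in object_masks)
```

A hand-set overlap threshold like this ignores the lifting dynamics, which is one reason a learned estimate of the condition is attractive.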

∗ indicates equal contribution

¹Columbia University ²Toyota Research Institute

[Figure 1 panels: a) Rearrange, b) Lift, c) Final]

Fig. 1. The Heterogeneous Bagging Task requires packing multiple rigid (e.g., the apple) and deformable objects (e.g., the T-shirt) into a deformable bag. The system must learn to (a) strategically manipulate these objects to achieve a feasible pre-bagging configuration. It also needs to (b) infer suitable grasp points from which to lift up the bag such that (c) the objects fall inside the bag.

Due to these difficulties, prior work focused either on only the lifting step of the process [1] or considered a simplified scenario of packing only rigid items [2], [3].

We seek to address these limitations and propose a system that tackles the complete bagging process for a diverse set of rigid and deformable objects, a task we refer to as heterogeneous bagging. Our proposed approach consists of two learnable policies: a rearrangement policy that uses sequential pick-and-place actions to rearrange or fold items (Fig. 1a) in order to achieve a suitable pre-bagging configuration, and a lifting policy that determines where to grasp and lift up the bag once pre-bagging conditions are met (Fig. 1b,c). We show that estimating the satisfaction of these pre-bagging conditions (required to decide when to stop rearranging and begin lifting) can be jointly performed by the two policies.

To accomplish this task on real hardware, we develop a representative simulation environment and use it to train both policies. Then, to better bridge the inevitable sim2real gap, we train a self-supervised network that detects the bag opening from real-world depth images. These predictions are used as additional input to the rearrangement and lifting policies, allowing them to transfer more robustly from simulation, where they are trained, to the real world.

We evaluate the learned policies with a real-world three-arm robot system with novel objects. The system is equipped with two types of end-effectors: a suction gripper, responsible for object rearrangement, on one arm and a parallel-jaw gripper, used to perform the bag lifting portion of the task, on the other two arms. We find that our proposed approach achieves a 70% success rate for the heterogeneous bagging task.

arXiv:2210.09997v2 [cs.RO] 1 Oct 2023

[Figure 2 panels: a) Workspace, b) Bag Opening Prediction (Segmentation Network), c) Observation, d) Value Networks (Rearrangement Policy, Lifting Policy), e) Max Value Action, f) Execution. Example annotations at the nth step: P&P Value = 0.387; Lift Value = 0.881 > 0.5; decision node "Pre-bag Condition Satisfied?" with Yes/No branches.]

Fig. 2. Method Overview. Our system consists of (a) three robot arms and a top camera with a view of the workspace. A top-down depth image of the bag (before placing any other objects) is used to (b) predict the bag opening boundary. For each step, (c) a top-down RGB image and the predicted bag opening mask are input to (d) the rearrangement and lifting policies, which individually output (e) dense value maps and the action corresponding to the highest pick-and-place and lift score. If the pre-bagging condition is satisfied, the bag is (f) lifted from the lift points predicted by the lifting network. Otherwise, we (f) execute a rearrangement action and return to (c) for the next step.
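The per-step decision described in the Fig. 2 caption can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: each policy is reduced to a single dense value map with one pixel per action (the real policies also output pick/place pairs and lift parameters), and the 0.5 threshold on the lift value is taken from the example annotation in Fig. 2 ("Lift Value = 0.881 > 0.5"). The names `argmax_2d` and `choose_action` are hypothetical.

```python
from typing import List, Tuple

ValueMap = List[List[float]]  # dense per-pixel scores, indexed [row][col]


def argmax_2d(value_map: ValueMap) -> Tuple[float, Tuple[int, int]]:
    """Return the highest score in a value map and its (row, col) pixel."""
    best_value = float("-inf")
    best_pixel = (0, 0)
    for r, row in enumerate(value_map):
        for c, value in enumerate(row):
            if value > best_value:
                best_value, best_pixel = value, (r, c)
    return best_value, best_pixel


def choose_action(pnp_map: ValueMap, lift_map: ValueMap,
                  lift_threshold: float = 0.5):
    """Lift if the best lift score clears the threshold, else rearrange.

    lift_threshold = 0.5 is an assumption read off the Fig. 2 example.
    """
    lift_value, lift_pixel = argmax_2d(lift_map)
    if lift_value > lift_threshold:
        return ("lift", lift_pixel, lift_value)
    pnp_value, pnp_pixel = argmax_2d(pnp_map)
    return ("rearrange", pnp_pixel, pnp_value)


# Usage with toy 2x2 value maps using the scores shown in Fig. 2.
pnp = [[0.1, 0.387], [0.2, 0.0]]
lift = [[0.3, 0.2], [0.881, 0.1]]
print(choose_action(pnp, lift))
```

In a full loop, a "rearrange" result would be executed and followed by a new observation, and the step would repeat until a "lift" result is returned.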

The main contribution of this work is the development of the first real-world robot system for the task of heterogeneous bagging. To this end, we propose:

• A self-supervised bag opening detection algorithm from depth images, whose pixel-wise supervision is automatically obtained through color images. This detection result enables robust sim2real transfer for downstream policies.

• A learned rearrangement policy that strategically manipulates and reconfigures multiple rigid and deformable objects to satisfy required pre-bagging conditions.

• A learned lifting policy that determines valid pre-bagging configurations and infers suitable grasp points for a bi-manual bag lifting action.

• A novel simulation environment and benchmark for heterogeneous bagging. The benchmark will be publicly available to facilitate future research and enable a fair comparison between heterogeneous bagging approaches.
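The first contribution above derives pixel-wise labels for depth images from color images. This excerpt does not say how the color images yield those labels, so the sketch below is purely hypothetical: it assumes the bag opening region has a distinctive, known color (`target_rgb`) and that each depth image is pixel-aligned with its color image. The per-channel `tolerance` and both function names are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]
RGBImage = List[List[Pixel]]
Mask = List[List[int]]


def label_from_color(rgb_image: RGBImage, target_rgb: Pixel,
                     tolerance: int = 40) -> Mask:
    """Binary mask: 1 where every channel is within tolerance of target_rgb.

    Assumes (hypothetically) that the bag opening has a distinctive color.
    """
    return [
        [1 if all(abs(p - t) <= tolerance for p, t in zip(pixel, target_rgb))
         else 0
         for pixel in row]
        for row in rgb_image
    ]


def make_training_pairs(depth_images: Sequence, rgb_images: Sequence[RGBImage],
                        target_rgb: Pixel) -> list:
    """Pair each depth image (network input) with its color-derived label."""
    return [(depth, label_from_color(rgb, target_rgb))
            for depth, rgb in zip(depth_images, rgb_images)]
```

Once trained on such pairs, a segmentation network would need only the depth image at test time, which is what makes the supervision automatic rather than hand-annotated.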

II. RELATED WORK

Rigid object packing. Owing to numerous potential real-world applications, the problem of packing rigid objects has been extensively studied [4], [5]. In the offline setting, where the set of items and packing order are predetermined, prior works have primarily focused on exact algorithms [6], heuristics and metaheuristics [7], [8]. In the online setting, where arbitrary items arrive sequentially and must be packed in the order they are received, deep reinforcement learning strategies [2], [3], and the NDOP/QOP algorithm for the nondeterministic order setting [9] have been used. However, all these approaches are limited to packing rigid, generally cuboidal, objects into rigid containers and are not suitable for deformable objects or non-rigid containers such as bags.

Cloth and rope manipulation. Early attempts at deformable object manipulation focused on methods for manipulating one-dimensional deformable objects such as ropes and cables [10]–[15] and two-dimensional deformable objects such as fabrics [16]–[20]. Data-driven techniques such as Reinforcement Learning and Imitation Learning have also been developed for cloth smoothing [21], [22], folding [23]–[25], and unfolding [26], [27]. While our approach is inspired by some of these prior works in fabric folding, we address a significantly harder task that involves 3D deformable objects such as bags and complex interactions between multiple deformable objects.

Bag manipulation. The manipulation of 3D deformable objects, such as bags, is an under-studied research area in robotics due to the inherent complexity and difficulty of the task. Initial work involved calculating the deformation characteristics of an object and determining the minimum lifting force through iterative lifting [28]. Recent relevant studies involve grasping randomly or at maximum width to lift a bag using a physical robot [1] or opening a deformable bag and maintaining the opened state using air-based blowing actions [29].

The most relevant work to our task is perhaps Seita et al. [15], where the task is to insert a rigid object into a deformable bag. However, their approach is limited to handling a single rigid object placement and further simplifies the bag lifting task by attaching rigid beads around the bag opening. In contrast, our system can manipulate multiple objects (either rigid or deformable) and infer lift points for a fully deformable bag directly from real-world RGB-D images, resulting in a more practical solution for real-world applications.

III. METHOD

A. Task and System Setup

We formulate the bagging task as follows: First, a bag is placed on a flat surface with its mouth open and facing upward. From this configuration, the robot perceives the bag and infers the bag opening (Fig. 2b). Note that this predicted bag opening remains constant throughout the episode. We then position all the objects randomly across the workspace (Fig. 2a). The robot manipulates and iteratively rearranges these objects to obtain a desirable pre-bagging configuration, estimates a pair of bag-lifting grasp points, and attempts to lift
