Bag All You Need: Learning a Generalizable Bagging Strategy
for Heterogeneous Objects
Arpit Bahety*1, Shreeya Jain*1, Huy Ha1, Nathalie Hager1,
Benjamin Burchfiel2, Eric Cousineau2, Siyuan Feng2, Shuran Song1
bag-all-you-need.cs.columbia.edu
Abstract: We introduce a practical robotics solution for
the task of heterogeneous bagging, requiring the placement
of multiple rigid and deformable objects into a deformable
bag. This is a difficult task as it features complex interactions
between multiple highly deformable objects under limited
observability. To tackle these challenges, we propose a robotic
system consisting of two learned policies: a rearrangement policy
that learns to place multiple rigid objects and fold deformable
objects in order to achieve desirable pre-bagging conditions, and
a lifting policy to infer suitable grasp points for bi-manual bag
lifting. We evaluate these learned policies on a real-world
three-arm robot platform that achieves a 70% heterogeneous bagging
success rate with novel objects. To facilitate future research
and comparison, we also develop a novel heterogeneous bagging
simulation benchmark that will be made publicly available.
I. INTRODUCTION
Imagine packing a bag for a picnic; we might first put
several rigid objects (such as an apple and a water bottle)
into the bag, fold deformable objects (such as a picnic mat
and a T-shirt) and then place them on top of the bag opening.
We must then lift the bag (another deformable object) in a
way that these objects fall inside without spilling. Successful
completion of this task requires both a comprehensive
understanding of the objects’ physical properties and the
capability to plan and integrate multiple manipulation skills.
For instance, the robot’s actions must take into account:
• Object geometry: objects must be placed and oriented
to fit into the bag opening.
• Object material: large deformable objects, such
as blankets, must be folded or crumpled into a
compact configuration prior to packing. This requires
manipulation strategies that are conditioned on object
material (i.e., rigid or deformable).
• Inter-object dynamics: the ultimate success of this task
is determined jointly by object configurations and the
robot’s grasp on the bag during lifting. Crucially, when
objects are partially inside a bag (for example, a mat on
top of the bag opening), different lifting positions will
result in different outcomes. Therefore, a successful
approach must decide when a desired pre-bagging
condition is achieved and, if so, determine good grasp
location(s) to lift up the bag. Here, pre-bagging condition
refers to when all objects are sufficiently inside the bag
opening, and will fall into the bag with a proper lift.
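To make the notion of a pre-bagging condition concrete, the following is a minimal geometric sketch, not the method used in this work (where the condition is estimated by learned policies). It assumes top-down pixel masks for each object and for the bag opening, and a hypothetical threshold `min_fraction` on how much of each object must overlap the opening.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]  # (row, col) in a top-down image


def inside_fraction(object_pixels: Set[Pixel], opening_pixels: Set[Pixel]) -> float:
    """Fraction of an object's pixels that lie within the bag opening mask."""
    if not object_pixels:
        return 0.0
    return len(object_pixels & opening_pixels) / len(object_pixels)


def pre_bagging_proxy(
    objects: Dict[str, Set[Pixel]],
    opening: Set[Pixel],
    min_fraction: float = 0.8,  # hypothetical threshold, not from the paper
) -> bool:
    """True when every object overlaps the opening by at least min_fraction."""
    return all(inside_fraction(p, opening) >= min_fraction for p in objects.values())


# Toy scene: a 4x4 opening, an apple fully inside, and a mat hanging over the rim.
opening = {(r, c) for r in range(2, 6) for c in range(2, 6)}
apple = {(3, 3), (3, 4), (4, 3), (4, 4)}
mat_unfolded = {(4, c) for c in range(3, 9)}
mat_folded = {(r, c) for r in (4, 5) for c in (3, 4, 5)}

before = pre_bagging_proxy({"apple": apple, "mat": mat_unfolded}, opening)
after = pre_bagging_proxy({"apple": apple, "mat": mat_folded}, opening)
```

A purely geometric check like this ignores the inter-object dynamics described above, such as how the outcome depends on where the bag is grasped, which is one reason a learned estimate is attractive.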
* indicates equal contribution
1 Columbia University, 2 Toyota Research Institute
[Figure 1 panels: a) Rearrange, b) Lift, c) Final]
Fig. 1. The Heterogeneous Bagging Task requires packing multiple rigid
(e.g., the apple) and deformable objects (e.g., the T-shirt) into a deformable
bag. The system must learn to (a) strategically manipulate these objects
to achieve a feasible pre-bagging configuration. It also needs to (b) infer
suitable grasp points from which to lift up the bag such that (c) the objects
fall inside the bag.
Due to these difficulties, prior work focused either on only
the lifting step of the process [1] or considered a simplified
scenario of packing only rigid items [2], [3].
We seek to address these limitations and propose a system
that tackles the complete bagging process for a diverse set
of rigid and deformable objects — a task we refer to as
heterogeneous bagging. Our proposed approach consists of
two learnable policies: a rearrangement policy that uses
sequential pick-and-place actions to rearrange or fold items
(Fig. 1a) in order to achieve a suitable pre-bagging configuration,
and a lifting policy that determines where to grasp and
lift up the bag once pre-bagging conditions are met (Fig. 1b,c).
We show that estimating the satisfaction of these pre-bagging
conditions (required to decide when to stop rearranging and
begin lifting) can be jointly performed by the two policies.
To accomplish this task on real hardware, we develop a
representative simulation environment and use it to train both
policies. Then, to better bridge the inevitable
sim2real gap, we train a self-supervised network that detects
the bag opening from real-world depth images. These
predictions are used as additional input to the rearrangement
and lifting policies, allowing them to transfer more robustly
from simulation, where they are trained, to the real world.
We evaluate the learned policies with a real-world three-arm
robot system with novel objects. The system is equipped with
two types of end-effectors: a suction gripper, responsible for
object rearrangement, on one arm and a parallel-jaw gripper,
used to perform the bag lifting portion of the task, on the
other two arms. We find that our proposed approach achieves
a 70% success rate for the heterogeneous bagging task.
arXiv:2210.09997v2 [cs.RO] 1 Oct 2023
[Figure 2 panels: a) Workspace; b) Bag Opening Prediction (Segmentation Network); c) Observation; d) Value Networks (Rearrangement Policy, Lifting Policy); e) Max Value Action (nth step), labeled Ppick, Pplace, L1, L2, 𝜃, 𝑤; f) Execution, gated by "Pre-bag Condition Satisfied?" (Yes/No). Example values shown: Lift Value = 0.881 > 0.5, P&P Value = 0.387.]
Fig. 2. Method Overview. Our system consists of (a) three robot arms and a top camera with a view of the workspace. A top-down depth image
of the bag (before placing any other objects) is used to (b) predict the bag opening boundary. For each step, (c) a top-down RGB image and the predicted
bag opening mask are input to (d) the rearrangement and lifting policies, which individually output (e) dense value maps and the action corresponding
to the highest pick-and-place and lift score. If the pre-bagging condition is satisfied, the bag is (f) lifted from the lift points predicted by the lifting network.
Otherwise, we (f) execute a rearrangement action and return to (c) for the next step.
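The per-step decision described in the Fig. 2 caption can be sketched as follows. This is a simplified illustration under stated assumptions: each policy is reduced to a single dense value map with one best pixel (the actual system predicts pick-and-place actions and a pair of lift points), the names `argmax_2d` and `select_action` are hypothetical, and the 0.5 threshold is taken from the example value shown in the figure.

```python
from typing import List, Tuple

ValueMap = List[List[float]]  # dense per-pixel scores, row-major


def argmax_2d(value_map: ValueMap) -> Tuple[Tuple[int, int], float]:
    """Return ((row, col), value) of the highest-scoring pixel."""
    best_pos, best_val = (0, 0), float("-inf")
    for r, row in enumerate(value_map):
        for c, v in enumerate(row):
            if v > best_val:
                best_pos, best_val = (r, c), v
    return best_pos, best_val


def select_action(
    lift_map: ValueMap,
    pnp_map: ValueMap,
    lift_threshold: float = 0.5,  # assumption: read off the Fig. 2 example
) -> Tuple[str, Tuple[int, int], float]:
    """Lift if the best lift value clears the threshold, else rearrange."""
    lift_pos, lift_val = argmax_2d(lift_map)
    if lift_val > lift_threshold:
        return ("lift", lift_pos, lift_val)
    pnp_pos, pnp_val = argmax_2d(pnp_map)
    return ("rearrange", pnp_pos, pnp_val)


# Toy value maps reusing the two numbers shown in Fig. 2.
example_lift = [[0.10, 0.881], [0.30, 0.20]]
example_pnp = [[0.387, 0.05], [0.10, 0.20]]
decision = select_action(example_lift, example_pnp)
fallback = select_action([[0.2, 0.1]], example_pnp)
```

In a full episode this selection would be repeated after every rearrangement action, using a fresh observation each time, until the lift branch is taken.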
The main contribution of this work is the development of
the first real-world robot system for the task of heterogeneous
bagging. To this end, we propose:
• A self-supervised bag opening detection algorithm from
depth images, whose pixel-wise supervision is automatically
obtained through color images. This detection result
enables robust sim2real transfer for downstream policies.
• A learned rearrangement policy that strategically manipulates
and reconfigures multiple rigid and deformable
objects to satisfy required pre-bagging conditions.
• A learned lifting policy that determines valid pre-bagging
configurations and infers suitable grasp points
for a bi-manual bag lifting action.
• A novel simulation environment and benchmark for
heterogeneous bagging. The benchmark will be publicly
available to facilitate future research and enable a fair
comparison between heterogeneous bagging approaches.
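As an illustration of how pixel-wise labels for a depth image might be derived automatically from a color image, the sketch below thresholds a distinctive color. It is only one possible scheme under assumptions not stated in this section: the color and depth images are taken to be pixel-aligned, the bag opening is taken to be identifiable by a known color (the hypothetical `RIM` value), and `color_mask` and `tol` are illustrative names.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def color_mask(image: List[List[RGB]], target: RGB, tol: int = 40) -> List[List[int]]:
    """Label a pixel 1 when every channel is within `tol` of `target`, else 0."""
    return [
        [int(all(abs(ch - t) <= tol for ch, t in zip(px, target))) for px in row]
        for row in image
    ]


RIM = (255, 0, 0)        # hypothetical color marking the bag opening
GRAY = (120, 120, 120)   # hypothetical background color

rgb_image = [
    [GRAY, RIM, RIM, GRAY],
    [GRAY, RIM, (240, 20, 10), GRAY],
]

# The resulting mask would serve as the training target for the aligned depth image.
labels = color_mask(rgb_image, target=RIM)
```

Because the labels come from the color channel alone, a network trained on (depth, label) pairs needs no manual annotation, and at test time it would only require depth input.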
II. RELATED WORK
Rigid object packing. Owing to numerous potential
real-world applications, the problem of packing rigid objects
has been extensively studied [4], [5]. In the offline setting,
where the set of items and packing order are predetermined,
prior works have primarily focused on exact algorithms [6],
heuristics, and metaheuristics [7], [8]. In the online setting,
where arbitrary items arrive sequentially and must be packed
in the order they are received, deep reinforcement learning
strategies [2], [3] and the NDOP/QOP algorithm for the
nondeterministic order setting [9] have been used. However,
all these approaches are limited to packing rigid, generally
cuboidal, objects into rigid containers and are not suitable
for deformable objects or non-rigid containers such as bags.
Cloth and rope manipulation. Early attempts at deformable
object manipulation focused on methods for manipulating
one-dimensional deformable objects such as ropes and cables [10]–[15]
and two-dimensional deformable objects such as fabrics [16]–[20].
Data-driven techniques such as Reinforcement
Learning and Imitation Learning have also been developed
for cloth smoothing [21], [22], folding [23]–[25], and unfolding [26], [27].
While our approach is inspired by some of these
prior works in fabric folding, we address a significantly harder
task that involves 3D deformable objects such as bags and
complex interactions between multiple deformable objects.
Bag manipulation. The manipulation of 3D deformable
objects, such as bags, is an under-studied research area in
robotics due to the inherent complexity and difficulty of the
task. Initial work involved calculating the deformation characteristics
of an object and determining the minimum lifting
force through iterative lifting [28]. Recent relevant studies involve
grasping randomly or at maximum width to lift a bag using
a physical robot [1] or opening a deformable bag and maintaining
the opened state using air-based blowing actions [29].
The most relevant work to our task is perhaps Seita et
al. [15], where the task is to insert a rigid object into a
deformable bag. However, their approach is limited to handling
a single rigid object placement and further simplifies the bag
lifting task by attaching rigid beads around the bag opening.
In contrast, our system can manipulate multiple objects (either
rigid or deformable) and infer lift points for a fully-deformable
bag directly from real-world RGB-D images, resulting in
a more practical solution for real-world applications.
III. METHOD
A. Task and System Setup
We formulate the bagging task as follows: First, a bag
is placed on a flat surface with its mouth open and facing
upward. From this configuration, the robot perceives the bag
and infers the bag opening (Fig. 2b). Note that this predicted
bag opening remains constant throughout the episode. We
then position all the objects randomly across the workspace
(Fig. 2a). The robot manipulates and iteratively rearranges
these objects to obtain a desirable pre-bagging configuration,
estimates a pair of bag-lifting grasp points, and attempts to lift