Simulating Coverage Path Planning with Roomba
Robert Chuchro
chuchro3@stanford.edu
Abstract
Coverage Path Planning involves visiting every unoc-
cupied state in an environment with obstacles. In this
paper, we explore this problem in environments which
are initially unknown to the agent, for purposes of sim-
ulating the task of a vacuum cleaning robot. A survey
of prior work reveals sparse effort in applying learn-
ing to solve this problem. In this paper, we explore
modeling a Coverage Path Planning problem using Deep
Reinforcement Learning, and compare it with the per-
formance of the built-in algorithm of the Roomba, a
popular vacuum cleaning robot.
1 Introduction
A Roomba is a robot vacuum cleaner built with sen-
sors to automatically vacuum a living space while
avoiding obstacles and stair ledges. The Roomba con-
tains several sensors which allow it to receive informa-
tion from the outside world. In this paper, the relevant
sensors are the bumper sensors on the front of the
Roomba. These sensors are triggered when the Roomba
encounters an obstacle, such as a wall or a piece of
furniture. The Roomba can move by turning in place
or driving forward.
2 Problem Statement
The environment consists of a bounded two-
dimensional plane which may contain obstacles. Any
location which is not an obstacle is reachable from any
other location in the environment. The goal is for the
Roomba to traverse the entire open space in under an
hour, avoiding collisions with obstacles.
Figure 1: Grid world representation of Roomba’s en-
vironment. The green cell is the base / start location.
Initially, the environment is unknown. The agent
will need to adopt a policy to explore the space around
it.
2.1 Representation
To simplify the problem, we will discretize the en-
vironment as a grid world, with each grid cell be-
ing a square with side length equal to the diameter
of the Roomba, as shown in Figure 1.
The agent will initialize its world as only having
knowledge of the starting cell, with the remaining
neighboring states as hidden states, represented with
’?’. The agent has three actions available to it: drive
forward, turn left 90 degrees, and turn right 90 de-
grees. All three actions are assigned to take one sec-
ond, which is represented as one time step.
When the agent attempts to move onto a hidden
state, it observes either an empty space or an
obstacle, based on feedback from its bumper sensor.
Here, the goal is to explore the hidden cells, ensuring
that every empty space is visited.
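The grid world, hidden-state bookkeeping, and bumper feedback described above can be sketched as a small simulator. The class and method names, cell markers, and heading encoding below are illustrative assumptions, not the paper's actual implementation:

```python
# Minimal sketch of the discretized grid world described above.
# Cell markers, the action set, and all names are illustrative.
HIDDEN, EMPTY, OBSTACLE = "?", ".", "#"

# Agent headings and their forward offsets (row, col): N, E, S, W.
HEADINGS = [(-1, 0), (0, 1), (1, 0), (0, -1)]

class RoombaGrid:
    def __init__(self, true_map, start):
        self.true_map = true_map           # ground truth, unknown to the agent
        self.known = {start: EMPTY}        # agent's map; everything else is '?'
        self.pos, self.heading = start, 0  # start facing north

    def step(self, action):
        """Apply one action; each action costs one time step (one second)."""
        if action == "turn_left":
            self.heading = (self.heading - 1) % 4
        elif action == "turn_right":
            self.heading = (self.heading + 1) % 4
        elif action == "forward":
            dr, dc = HEADINGS[self.heading]
            nxt = (self.pos[0] + dr, self.pos[1] + dc)
            # The bumper sensor reveals whether the hidden cell is open.
            if self.true_map.get(nxt, OBSTACLE) == OBSTACLE:
                self.known[nxt] = OBSTACLE  # collision: stay in place
            else:
                self.known[nxt] = EMPTY
                self.pos = nxt
        return self.pos
```

Driving forward into an unexplored cell either moves the agent and marks the cell empty, or triggers the bumper and marks it as an obstacle while the agent stays put.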
arXiv:2210.04988v1 [cs.RO] 10 Oct 2022
2.2 Success Measurement
Success of a particular strategy can be measured by
time taken to fully explore a given environment. In
the context of simulating, this can be measured by the
number of actions taken before visiting all empty cells.
In order to avoid encoding a desired or complex be-
havior through the reward function, the reward func-
tion is kept simple:
R(s, a, s') = +1  if s' is an unvisited cell,
              -1  if s' is an obstacle,
               0  otherwise.                (1)
The environment also has a natural horizon, since we
are limited to one hour (3,600 time steps at one sec-
ond per action).
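Equation (1) translates directly into a small helper. The signature below is an assumption for illustration; it takes the destination cell, whether the bumper fired, and the set of cells visited so far:

```python
def reward(cell, is_obstacle, visited):
    """Reward from Equation (1): +1 for reaching an unvisited cell,
    -1 for bumping into an obstacle, 0 otherwise (illustrative names)."""
    if is_obstacle:
        return -1
    if cell not in visited:
        return 1
    return 0
```

Keeping the reward this sparse means the agent is not nudged toward any particular sweep pattern; it only learns to value new cells and avoid collisions.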
One of the goals is to come up with a strategy that
can outperform Roomba’s baseline algorithm, which
will be outlined in a subsequent section.
Performance of the agents can be compared
using the following metrics: percentage of open space
explored, number of collisions with obstacles, and
number of time steps to completion.
3 Related Work
3.1 Path Planning using Discretized States
The coverage algorithm described in [9] involves
discretizing a state space with obstacles, and plan-
ning safe paths between states. Based on simulation
results, it was shown that using a 4-connected grid
representation of the environment created paths that
were no more than 2% less efficient than if it were 8-
connected. This means that restricting the turn angles
of the agent to 90 degrees does not have a drastically
negative effect on simulation results, in exchange for
reducing the size of the state and action space.
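The 4- versus 8-connected distinction above amounts to which neighbor offsets a cell exposes. A minimal helper (an illustrative sketch, not code from [9]):

```python
def neighbors(cell, connectivity=4):
    """Grid neighbors under 4- or 8-connectivity (illustrative helper)."""
    r, c = cell
    steps4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]          # axis-aligned moves
    steps8 = steps4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # add diagonals
    steps = steps4 if connectivity == 4 else steps8
    return [(r + dr, c + dc) for dr, dc in steps]
```

Restricting the agent to the 4-connected set halves the branching factor per cell while, per the simulation results in [9], costing at most about 2% in path efficiency.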
3.2 Boustrophedon Cellular Decomposition
Boustrophedon, or S-shaped, pathing follows the same
pattern that farmers use on fields of crops, or lawn
mowers on a field of grass. The algorithm involves
finding a perimeter, then traversing back and forth
along the length of the perimeter across the entire
space [5].
This pathing concept was extended in [2] by at-
tempting to divide the region into segments, each of
which can be completed by executing Boustrophedon
pathing.
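On an obstacle-free rectangular region, the back-and-forth sweep described above reduces to alternating row directions. A minimal sketch (assuming a simple rows x cols grid, not the cell decomposition of [2]):

```python
def boustrophedon_path(rows, cols):
    """S-shaped sweep of a rows x cols obstacle-free grid:
    left-to-right on even rows, right-to-left on odd rows."""
    path = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in cs)
    return path
```

Each cell is visited exactly once, which is why Boustrophedon decomposition aims to carve an obstacle-filled environment into regions where this sweep applies.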
3.3 Combination of Basic Policies
A combination of policies was shown to outperform
any individual strategy in [5]. The basic pathing strate-
gies were:
- Outward Spiral
- Wall Walking
- Random Walk
- Boustrophedon Pathing
It was not stated how the combination policy was
determined; however, a possible approach would be to
learn a scoring function to evaluate which strategy to
use based on the current set of observations.
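The scoring-function idea suggested above could take the following shape. This is a hypothetical sketch, not the mechanism used in [5]: each basic policy gets a score from the current observations, and the best-scoring one is executed next; the scoring functions themselves would be learned.

```python
def select_strategy(observations, scorers):
    """Pick the strategy whose (learned) scoring function rates the
    current observations highest. `scorers` maps a strategy name to a
    scoring function; names and features here are illustrative."""
    return max(scorers, key=lambda name: scorers[name](observations))
```

For example, a spiral scorer might rate open space ahead highly, while a wall-walking scorer responds to a nearby detected wall.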
3.4 Deep Reinforcement Learning
One of the main challenges of this task is for the
agent to learn a policy that generalizes well so that
its policy is not specific to a particular environment.
A classic example of generalized reinforcement learn-
ing is presented in [8]. Mnih et al. introduced the
Deep Q-Network (DQN), a neural network used for
Q-Learning. They trained a convolutional neural net-
work on seven Atari 2600 games, outperforming all
previous approaches on six of the seven games and
surpassing a human expert on three, with no adjust-
ment of the architecture or hyperparameters between
games.
We will apply Deep Q-Learning to the simulated en-
vironment to produce an agent that can explore well in
unseen environments.
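The core update DQN approximates can be seen in its tabular form. Below is a simplified stand-in, not the paper's DQN: a dictionary plays the role of the Q-network, and the hyperparameters are illustrative.

```python
import random
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """One Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    DQN replaces the table Q with a neural network trained on this target."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def epsilon_greedy(Q, s, actions, eps=0.1):
    """Explore with probability eps, otherwise act greedily w.r.t. Q."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])
```

In the deep version, the max over next-state action values comes from a forward pass of the network, and the tabular increment becomes a gradient step on the squared difference between Q(s,a) and the bootstrapped target.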
4 Methods
4.1 Baseline Roomba Algorithm
The basic Roomba strategy consists of exploring out-
ward in a spiral fashion until the nearest obstacle is