2.2 Success Measurement
The success of a particular strategy can be measured by the time taken to fully explore a given environment. In simulation, this corresponds to the number of actions taken before all empty cells have been visited.
To avoid encoding a desired or complex behavior through the reward function, the reward function is kept simple:
R(s, a, s') =
\begin{cases}
1 & \text{if } s' \text{ is an unvisited cell,} \\
-1 & \text{if } s' \text{ is an obstacle,} \\
0 & \text{otherwise}
\end{cases}
\qquad (1)
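A minimal sketch of Equation (1) in code, assuming the grid is represented with a simple cell-state enumeration (the names CellState and reward below are illustrative, not part of the simulator described here):

from enum import Enum

class CellState(Enum):
    UNVISITED = 0   # empty cell not yet covered
    VISITED = 1     # empty cell already covered
    OBSTACLE = 2    # wall or furniture

def reward(next_cell: CellState) -> int:
    # Equation (1): +1 for reaching an unvisited cell,
    # -1 for hitting an obstacle, 0 otherwise.
    if next_cell is CellState.UNVISITED:
        return 1
    if next_cell is CellState.OBSTACLE:
        return -1
    return 0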
The environment also imposes a natural time limit, since each run is restricted to one hour of operation (1800 time steps).
One of the goals is to develop a strategy that can outperform Roomba's baseline algorithm, which is outlined in a subsequent section.
Agent performance can be compared using the following metrics: percentage of open space explored, number of collisions with obstacles, and number of time steps to completion.
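As a hedged illustration, these metrics could be computed from a per-episode summary along the following lines (the EpisodeLog fields are assumptions, not the actual logging format used here):

from dataclasses import dataclass

@dataclass
class EpisodeLog:
    # Hypothetical per-episode summary; field names are illustrative.
    visited_open_cells: int
    total_open_cells: int
    collisions: int
    steps_taken: int

def metrics(log: EpisodeLog) -> dict:
    # The three comparison metrics for a single run.
    return {
        "coverage_pct": 100.0 * log.visited_open_cells / log.total_open_cells,
        "collisions": log.collisions,
        "steps_to_completion": log.steps_taken,
    }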
3 Related Work
3.1 Path Planning using Discretized States
The coverage algorithm described in [9] involves discretizing a state space with obstacles and planning safe paths between states. Simulation results showed that a 4-connected grid representation of the environment produced paths that were no more than 2% less efficient than an 8-connected representation. This means that restricting the agent's turn angles to 90 degrees does not have a drastically negative effect on simulation results, in exchange for reducing the size of the state and action space.
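For concreteness, the two grid connectivities differ only in their neighbor offsets; the sketch below is illustrative and is not taken from [9]:

# Neighbor offsets: 4-connected restricts turns to 90 degrees,
# 8-connected also allows diagonal moves.
FOUR_CONNECTED = [(0, 1), (0, -1), (1, 0), (-1, 0)]
EIGHT_CONNECTED = FOUR_CONNECTED + [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def neighbors(cell, grid, offsets=FOUR_CONNECTED):
    # Yield adjacent cells that are inside the grid and not obstacles.
    # `grid[r][c]` is True where an obstacle sits.
    rows, cols = len(grid), len(grid[0])
    r, c = cell
    for dr, dc in offsets:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and not grid[nr][nc]:
            yield (nr, nc)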
3.2 Boustrophedon Cellular Decomposition
Boustrophedon, or S-shaped, pathing follows the same pattern that farmers use on fields of crops, or lawn mowers on a field of grass. The algorithm involves finding a perimeter, then traversing back and forth along the length of the perimeter across the entire space [5].
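A minimal sketch of this S-shaped sweep over an obstacle-free rectangular region (obstacle handling is omitted; the function name is illustrative):

def boustrophedon_order(rows, cols):
    # S-shaped sweep over an obstacle-free rows x cols region:
    # left-to-right on even rows, right-to-left on odd rows.
    path = []
    for r in range(rows):
        cols_in_order = range(cols) if r % 2 == 0 else reversed(range(cols))
        for c in cols_in_order:
            path.append((r, c))
    return path

# Example: boustrophedon_order(3, 4) visits
# (0,0)..(0,3), then (1,3)..(1,0), then (2,0)..(2,3).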
This pathing concept was extended in [2] by attempting to divide the region into segments, each of which can be completed by executing Boustrophedon pathing.
3.3 Combination of Basic Policies
A combination of policies was shown to outperform any individual strategy in [5]. The basic pathing strategies were:
- Outward Spiral
- Wall Walking
- Random Walk
- Boustrophedon Pathing
It was not stated how the combination policy was
determined; however, a possible approach would be to
learn a scoring function to evaluate which strategy to
use based on the current set of observations.
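One hedged sketch of that scoring-function idea, assuming a linear score over a feature vector of observations (the strategy names and the weight matrix are illustrative, not taken from [5]):

import numpy as np

STRATEGIES = ["outward_spiral", "wall_walking", "random_walk", "boustrophedon"]

def select_strategy(observation: np.ndarray, weights: np.ndarray) -> str:
    # Pick the strategy with the highest learned linear score for the
    # current observation. `weights` has shape (len(STRATEGIES), obs_dim).
    scores = weights @ observation
    return STRATEGIES[int(np.argmax(scores))]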
3.4 Deep Reinforcement Learning
One of the main challenges of this task is for the
agent to learn a policy that generalizes well so that
its policy is not specific to a particular environment.
A classic example of generalized reinforcement learning is presented in [8]. Mnih et al. introduced the Deep Q-Network (DQN), a neural network used for Q-Learning. They trained a convolutional neural network to play seven Atari 2600 games with no adjustment of the architecture or hyperparameters between games, outperforming previous approaches on six of the seven and surpassing a human expert on three.
We will apply Deep Q-Learning to the simulated en-
vironment to produce an agent that can explore well in
unseen environments.
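As a rough sketch, a DQN for this setting could map a stack of grid-shaped observation channels to one Q-value per movement action; the PyTorch architecture below is an assumption for illustration and is not the network from [8]:

import torch
import torch.nn as nn

class GridDQN(nn.Module):
    # Small convolutional Q-network: input is a stack of grid channels
    # (e.g. visited map, obstacle map, agent position), output is one
    # Q-value per movement action (up, down, left, right).
    def __init__(self, in_channels=3, num_actions=4, grid_size=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * grid_size * grid_size, 128), nn.ReLU(),
            nn.Linear(128, num_actions),
        )

    def forward(self, x):
        return self.net(x)

def greedy_action(net, obs):
    # Act greedily with respect to the learned Q-values.
    with torch.no_grad():
        return int(net(obs.unsqueeze(0)).argmax(dim=1).item())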
4 Methods
4.1 Baseline Roomba Algorithm
The basic Roomba strategy consists of exploring outward in a spiral fashion until the nearest obstacle is