Leveraging Structure from Motion to Localize Inaccessible Bus Stops

Indu Panigrahi1, Tom Bu2, and Christoph Mertz2

Abstract: The detection of hazardous conditions near public transit stations is necessary for ensuring the safety and accessibility of public transit. Smart city infrastructures aim to facilitate this task among many others through the use of computer vision. However, most state-of-the-art computer vision models require thousands of images in order to perform accurate detection, and there exist few images of hazardous conditions as they are generally rare.

In this paper, we examine the detection of snow-covered sidewalks along bus routes. Previous work has focused on detecting other vehicles in heavy snowfall or simply detecting the presence of snow. However, our application has an added complication of determining if the snow covers areas of importance and can cause falls or other accidents (e.g. snow covering a sidewalk) or simply covers some background area (e.g. snow on a neighboring field). This problem involves localizing the positions of the areas of importance when they are not necessarily visible.

We introduce a method that utilizes Structure from Motion (SfM) rather than additional annotated data to address this issue. Specifically, our method learns the locations of sidewalks in a given scene by applying a segmentation model and SfM to images from bus cameras during clear weather. Then, we use the learned locations to detect if and where the sidewalks become obscured with snow. After evaluating across various threshold parameters, we identify an optimal range at which our method consistently classifies different categories of sidewalk images correctly. Although we demonstrate an application for snow coverage along bus routes, this method can extend to other hazardous conditions as well. Code for this project is available at https://github.com/ind1010/SfM_for_BusEdge.

Index Terms: Computer Vision for Transportation, Intelligent Transportation Systems, Localization, Segmentation and Categorization

I. INTRODUCTION

Smart city infrastructures aim to use fields like computer vision to facilitate city management, part of which involves overseeing transportation systems. As transportation systems become more intelligent, an increasing number of public transit vehicles are equipped with cameras that capture thousands of images of the city per day along with geographic positioning information. City infrastructures can use this immense amount of raw data to monitor the conditions of public transit stations and the surrounding areas.

1 Indu Panigrahi is with Robotics Institute Summer Scholars Program at Carnegie Mellon University, Pittsburgh, PA 15213, USA and also with the Department of Computer Science at Princeton University, NJ 08544, USA indup@princeton.edu

2 Tom Bu and Christoph Mertz are with the Robotics Institute at Carnegie Mellon University, Pittsburgh, PA 15213, USA tomb, cmertz@andrew.cmu.edu

Our application focuses on detecting snow-covered sidewalks along bus routes; snow-covered sidewalks are one type of hazardous condition that can limit the safety and accessibility of public buses as pedestrians can lose access to bus stops and/or slip (Fig. 1). We use images that are captured on-board a public bus as data. However, instead of annotating this data, we leverage the fact that the bus travels around a set route and apply Structure from Motion and a segmentation model to learn the locations of the sidewalks in clear weather. Then, in future rounds, when the bus encounters snowfall, we compare the detected snow coverage to the learned locations of the sidewalks. If the coverage exceeds a set threshold, we generate an alert, and the bus company can contact the city to clear the sidewalk.
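The comparison between detected snow and learned sidewalk locations can be illustrated with a minimal sketch. It is not the implementation from the project repository: it assumes that both the learned sidewalk locations and the detected snow are available as binary masks of the same image size, and the function names `snow_coverage` and `should_alert` as well as the default threshold of 0.5 are hypothetical choices made for illustration.

```python
import numpy as np


def snow_coverage(sidewalk_mask: np.ndarray, snow_mask: np.ndarray) -> float:
    """Return the fraction of expected sidewalk pixels that are labeled as snow."""
    sidewalk = sidewalk_mask.astype(bool)
    snow = snow_mask.astype(bool)
    if sidewalk.shape != snow.shape:
        raise ValueError("masks must have the same shape")
    n_sidewalk = int(sidewalk.sum())
    if n_sidewalk == 0:
        # No expected sidewalk in this view, so there is nothing to report.
        return 0.0
    return float((sidewalk & snow).sum()) / n_sidewalk


def should_alert(sidewalk_mask: np.ndarray, snow_mask: np.ndarray,
                 threshold: float = 0.5) -> bool:
    """Alert when coverage strictly exceeds the (hypothetical) threshold."""
    return snow_coverage(sidewalk_mask, snow_mask) > threshold


# Toy 4x4 scene: the bottom two rows are the expected sidewalk (8 pixels).
expected_sidewalk = np.zeros((4, 4), dtype=bool)
expected_sidewalk[2:, :] = True

# Snow covers the bottom row (4 sidewalk pixels) and 2 background pixels.
detected_snow = np.zeros((4, 4), dtype=bool)
detected_snow[3, :] = True
detected_snow[0, :2] = True

coverage = snow_coverage(expected_sidewalk, detected_snow)
```

In this toy scene, snow on the background pixels does not affect the result because only pixels inside the expected sidewalk region are counted, which mirrors the distinction between snow on a sidewalk and snow on a neighboring field.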

Fig. 1: Snow-covered sidewalk leading to a bus stop.

When evaluating on a few categories of sidewalk images, we identify a set of thresholds at which our method performs well across all categories for this bus route. Though we demonstrate an application for detecting snow-covered sidewalks, our method can generalize to detecting other conditions such as snow on roads or bike lanes.

Our contributions are as follows:

• We present a method that combines Structure from Motion with a segmentation model to learn the expected locations of sidewalks and detect whether or not the learned sidewalk locations become covered by snow.

• Although we demonstrate the method on snow-covered sidewalks, it can easily generalize to other problems.

• We collect a small dataset of images depicting sidewalks in clear and snowy weather that we use for evaluation. Additionally, we compile other categories of images that may be relevant for other works.

II. RELATED WORK

A. Existing Municipal Infrastructures

Many American cities operate a 311 telephone line that allows anyone to report issues for the city to fix, such as snow-covered sidewalks. However, this process can be inefficient as it is decentralized and relies on people being motivated to report.

arXiv:2210.03646v1 [cs.CV] 7 Oct 2022

B. BusEdge

Since buses regularly travel around cities, and many are equipped with cameras, we can facilitate the detection of municipal problems by regularly analyzing bus camera images. We use a platform called BusEdge [1] that captures and packages images with GPS information from the client (bus) and sends the data to the server (cloudlet) to be analyzed (Fig. 2). Intensive on-board analysis is limited because the bus is equipped with only a CPU.

Fig. 2: Overview of BusEdge Platform. Figure adapted from Fig. 3.1 in [1].

C. Panoptic Segmentation

Panoptic segmentation combines semantic and instance segmentation by both categorizing pixels that represent uncountable areas (e.g. snow) and grouping pixels into instances if they belong to countable objects (e.g. cars) [2]. Although our application involves semantic segmentation categories, we employ a panoptic segmentation model so that our method can be extended more easily for applications where instances are needed.
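To make the distinction between uncountable areas and countable objects concrete, the following sketch shows one possible way to represent a panoptic result as a per-pixel class map plus a per-pixel instance map. The class ids (`SNOW`, `SIDEWALK`, `CAR`), the set `THING_CLASSES`, and the function `panoptic_segments` are hypothetical and are not taken from Mask2Former or from the project code.

```python
import numpy as np

# Hypothetical class ids chosen for illustration only.
SNOW, SIDEWALK, CAR = 0, 1, 2
THING_CLASSES = {CAR}  # countable classes; all others are treated as uncountable


def panoptic_segments(semantic: np.ndarray, instance: np.ndarray) -> dict:
    """Map (class_id, instance_id) to pixel count.

    Uncountable classes get a single segment with instance_id 0;
    countable classes get one segment per instance id.
    """
    segments = {}
    for cls in np.unique(semantic):
        cls_mask = semantic == cls
        if int(cls) in THING_CLASSES:
            for inst in np.unique(instance[cls_mask]):
                inst_mask = cls_mask & (instance == inst)
                segments[(int(cls), int(inst))] = int(inst_mask.sum())
        else:
            segments[(int(cls), 0)] = int(cls_mask.sum())
    return segments


# Toy 2x4 image: snow and sidewalk on the left, two separate cars on the right.
semantic = np.array([[SNOW, SNOW, CAR, CAR],
                     [SIDEWALK, SIDEWALK, CAR, CAR]])
instance = np.array([[0, 0, 1, 2],
                     [0, 0, 1, 2]])

segments = panoptic_segments(semantic, instance)
```

For the snow application only the uncountable segments are needed, but the same representation would also carry per-object instances if an extension required them.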

We apply an off-the-shelf segmentation model called Mask2Former [3]. This model incorporates a Transformer decoder. Transformers have recently become a popular choice for computer vision models because of their accuracy [4]. They are not necessarily more efficient; however, since our application is not significantly time-sensitive (i.e. the bus company can be informed of a snow-covered sidewalk within a few hours rather than within a few seconds), we prioritize accuracy over efficiency.

D. Snow Detection

Most work has focused on detecting the presence of snowfall [5]–[8] and localizing the presence of vehicles and other objects in adverse weather conditions such as snow [9]–[11]. However, in addition to detecting snow, our application has the added complication of localizing the positions of sidewalks that are occluded by snow.

To our knowledge, there exists no dataset that contains labeled snow-covered sidewalks. Synthetic images are commonly used to artificially enlarge datasets; however, they are difficult to render realistically [12]. Furthermore, training a deep learning model to classify an image as a “snow-covered sidewalk” would not be straightforward as any miscellaneous snow-covered area could look identical to a snow-covered sidewalk (Fig. 3).

Fig. 3: Classifying an image as a snow-covered sidewalk is difficult because the area under the snow is not visible.

E. Image Localization

LiDAR is often used to localize the positions of objects surrounding an autonomous vehicle, such as other vehicles [13]–[18]. However, LiDAR is expensive, and we already have thousands of images available from bus cameras [19]. Furthermore, weather conditions like snow can interfere with LiDAR measurements [19].

Some methods have been developed for an analogous problem of localizing roads in adverse weather conditions. Some applications depend on a previously generated map of the terrain [20]; we apply a similar idea of generating a preconception of where the sidewalks should be. A few methods use the geometry of the road, such as the vanishing point of the road and the horizon in the image, to generate an expected target area for where the road could be [21], [22]. Another method uses self-supervision to generate a pseudo-mask of where the road is expected to be [23]. However, these approaches are more effective in weather conditions under which the road is partially occluded, such as fog or rain. They are generally unable to localize roads that are fully occluded by snow. Furthermore, these methods target the autonomous driving domain where they must anticipate completely novel surroundings on any given drive. On the other hand, we leverage the fact that we work with images from a mostly repetitive bus route.

Structure from Motion (SfM) [24], [25] is a classic computer vision algorithm that uses several two-dimensional images taken at different angles of a scene to construct a three-dimensional point cloud representation of the scene. Furthermore, SfM can deduce the pose of the camera for each image and for new images of the same scene [26]. We use a pipeline for SfM called COLMAP [27], [28]. More specifically, COLMAP implements incremental SfM which gradually adds images when reconstructing a scene (Fig. 4); this is as opposed to global SfM [29].
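Once a point cloud and a camera pose are available, three-dimensional points can be related back to pixels through a camera projection. The sketch below illustrates this with a standard pinhole model; it is an illustration under stated assumptions and not the pipeline used in this work. It assumes a world-to-camera pose (a point maps to R x + t in the camera frame), an ideal pinhole intrinsic matrix K without lens distortion, and the hypothetical function name `project_points`.

```python
import numpy as np


def project_points(points_world: np.ndarray, R: np.ndarray,
                   t: np.ndarray, K: np.ndarray):
    """Project N x 3 world points into pixel coordinates.

    Assumes a world-to-camera pose (x_cam = R @ x_world + t) and a pinhole
    intrinsic matrix K. Returns an N x 2 array of pixels (NaN for points
    behind the camera) and a boolean mask of points in front of the camera.
    """
    points_cam = points_world @ R.T + t
    in_front = points_cam[:, 2] > 0
    pixels = np.full((len(points_world), 2), np.nan)
    uvw = points_cam[in_front] @ K.T            # homogeneous image coordinates
    pixels[in_front] = uvw[:, :2] / uvw[:, 2:3]  # perspective division
    return pixels, in_front


# Toy camera: focal length 100 px, principal point (50, 50), identity pose.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)

# Two points in front of the camera and one behind it.
points = np.array([[0.0, 0.0, 2.0],
                   [1.0, 0.0, 2.0],
                   [0.0, 0.0, -1.0]])

pixels, visible = project_points(points, R, t, K)
```

With this toy camera, the point on the optical axis lands on the principal point, and the point behind the camera is flagged as not visible rather than being projected to a misleading pixel.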

Visual odometry (VO) methods can also localize images [30] and tend to run faster than COLMAP; however, they are not as accurate. Furthermore, VO methods that involve deep learning [31] are inherently data-hungry, and our method aims to reduce the amount of annotated data needed. Since our application is not significantly time-sensitive, and we need to accurately classify a sidewalk as snow-covered or clear, we require a robust pipeline like COLMAP. Furthermore, the COLMAP software is well-documented and often referenced as a baseline by newer methods.
