Leveraging Structure from Motion to Localize Inaccessible Bus Stops
Indu Panigrahi1, Tom Bu2, and Christoph Mertz2
Abstract: The detection of hazardous conditions near public transit stations is necessary for ensuring the safety and accessibility of public transit. Smart city infrastructures aim to facilitate this task, among many others, through the use of computer vision. However, most state-of-the-art computer vision models require thousands of images in order to perform accurate detection, and there exist few images of hazardous conditions as they are generally rare.
In this paper, we examine the detection of snow-covered sidewalks along bus routes. Previous work has focused on detecting other vehicles in heavy snowfall or simply detecting the presence of snow. However, our application has the added complication of determining whether the snow covers areas of importance and can cause falls or other accidents (e.g., snow covering a sidewalk) or simply covers some background area (e.g., snow on a neighboring field). This problem involves localizing the positions of the areas of importance when they are not necessarily visible.
We introduce a method that utilizes Structure from Motion (SfM) rather than additional annotated data to address this issue. Specifically, our method learns the locations of sidewalks in a given scene by applying a segmentation model and SfM to images from bus cameras during clear weather. Then, we use the learned locations to detect if and where the sidewalks become obscured with snow. After evaluating across various threshold parameters, we identify an optimal range at which our method consistently classifies different categories of sidewalk images correctly. Although we demonstrate an application for snow coverage along bus routes, this method can extend to other hazardous conditions as well. Code for this project is available at https://github.com/ind1010/SfM_for_BusEdge.
Index Terms: Computer Vision for Transportation, Intelligent Transportation Systems, Localization, Segmentation and Categorization
I. INTRODUCTION
Smart city infrastructures aim to use fields like computer vision to facilitate city management, part of which involves overseeing transportation systems. As transportation systems become more intelligent, an increasing number of public transit vehicles are equipped with cameras that capture thousands of images of the city per day along with geographic positioning information. City infrastructures can use this immense amount of raw data to monitor the conditions of public transit stations and the surrounding areas.
Our application focuses on detecting snow-covered sidewalks along bus routes; snow-covered sidewalks are one type of hazardous condition that can limit the safety and accessibility of public buses, as pedestrians can lose access to bus stops and/or slip (Fig. 1). We use images that are captured on board a public bus as data. However, instead of annotating this data, we leverage the fact that the bus travels around a set route and apply Structure from Motion and a segmentation model to learn the locations of the sidewalks in clear weather. Then, in future rounds, when the bus encounters snowfall, we compare the detected snow coverage to the learned locations of the sidewalks. If the coverage exceeds a set threshold, we generate an alert, and the bus company can contact the city to clear the sidewalk.
1 Indu Panigrahi is with the Robotics Institute Summer Scholars Program at Carnegie Mellon University, Pittsburgh, PA 15213, USA and also with the Department of Computer Science at Princeton University, NJ 08544, USA indup@princeton.edu
2 Tom Bu and Christoph Mertz are with the Robotics Institute at Carnegie Mellon University, Pittsburgh, PA 15213, USA {tomb, cmertz}@andrew.cmu.edu
Fig. 1: Snow-covered sidewalk leading to a bus stop.
When evaluating on a few categories of sidewalk images, we identify a set of thresholds at which our method performs well across all categories for this bus route. Though we demonstrate an application for detecting snow-covered sidewalks, our method can generalize to detecting other conditions such as snow on roads or bike lanes.
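The alert rule described above (compare the detected snow coverage with the learned sidewalk locations, then apply a threshold) can be illustrated with a minimal sketch. It assumes that both the learned sidewalk region and the detected snow are available as boolean masks of the same image size; the function names and the default threshold of 0.5 are hypothetical placeholders, not values taken from the paper.

```python
import numpy as np


def sidewalk_snow_coverage(sidewalk_mask: np.ndarray, snow_mask: np.ndarray) -> float:
    """Return the fraction of learned sidewalk pixels that are labeled as snow."""
    if sidewalk_mask.shape != snow_mask.shape:
        raise ValueError("masks must have the same shape")
    sidewalk = sidewalk_mask.astype(bool)
    sidewalk_pixels = int(sidewalk.sum())
    if sidewalk_pixels == 0:
        # No learned sidewalk falls inside this view, so nothing can be covered.
        return 0.0
    covered = int(np.logical_and(sidewalk, snow_mask.astype(bool)).sum())
    return covered / sidewalk_pixels


def should_alert(sidewalk_mask: np.ndarray, snow_mask: np.ndarray,
                 threshold: float = 0.5) -> bool:
    """Raise an alert when coverage exceeds the threshold (placeholder value)."""
    return sidewalk_snow_coverage(sidewalk_mask, snow_mask) > threshold


if __name__ == "__main__":
    # Toy 4x4 scene: the bottom two rows are sidewalk (8 pixels).
    sidewalk = np.zeros((4, 4), dtype=bool)
    sidewalk[2:, :] = True
    # Snow covers 6 sidewalk pixels plus one background row that is ignored.
    snow = np.zeros((4, 4), dtype=bool)
    snow[0, :] = True
    snow[3, :] = True
    snow[2, :2] = True
    print(sidewalk_snow_coverage(sidewalk, snow), should_alert(sidewalk, snow))
```

Because the ratio is normalized by the sidewalk area only, snow on background regions such as a neighboring field does not affect the decision.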
Our contributions are as follows:
• We present a method that combines Structure from Motion with a segmentation model to learn the expected locations of sidewalks and detect whether or not the learned sidewalk locations become covered by snow. Although we demonstrate by detecting snow-covered sidewalks, our method can easily generalize to other problems.
• We collect a small dataset of images depicting sidewalks in clear and snowy weather that we use for evaluation. Additionally, we compile other categories of images that may be relevant for other works.
II. RELATED WORK
A. Existing Municipal Infrastructures
Many American cities provide the telephone number 311, which allows anyone to report issues for the city to fix, such as snow-covered sidewalks. However, this process can be inefficient as it is decentralized and relies on the motivation of individual residents.
arXiv:2210.03646v1 [cs.CV] 7 Oct 2022
B. BusEdge
Since buses regularly travel around cities, and many are equipped with cameras, we can facilitate the detection of municipal problems by regularly analyzing bus camera images. We use a platform called BusEdge [1] that captures and packages images with GPS information on the client (bus) and sends the data to the server (cloudlet) to be analyzed (Fig. 2). Intensive on-board analysis is limited because the bus is equipped with only a CPU.
Fig. 2: Overview of BusEdge Platform. Figure adapted from
Fig. 3.1 in [1].
C. Panoptic Segmentation
Panoptic segmentation combines semantic and instance segmentation by both categorizing pixels that represent uncountable areas (e.g., snow) and grouping pixels into instances if they belong to countable objects (e.g., cars) [2]. Although our application involves semantic segmentation categories, we employ a panoptic segmentation model so that our method can be extended more easily for applications where instances are needed.
We apply an off-the-shelf segmentation model called Mask2Former [3]. This model incorporates a Transformer decoder. Transformers have recently become a popular choice for computer vision models because of their accuracy [4]. They are not necessarily more efficient; however, since our application is not significantly time-sensitive (i.e., the bus company can be informed of a snow-covered sidewalk within a few hours rather than within a few seconds), we prioritize accuracy over efficiency.
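To make the distinction between semantic and instance information concrete, the following sketch represents a panoptic result as a map of segment ids plus one record per segment, and then derives a purely semantic mask from it. This layout, the field names (`id`, `category`, `is_thing`), and the helper functions are illustrative assumptions and are not the output format of Mask2Former.

```python
import numpy as np


def semantic_mask(segment_ids: np.ndarray, segments: list, category: str) -> np.ndarray:
    """Merge every segment of the given category into one boolean mask."""
    ids = [s["id"] for s in segments if s["category"] == category]
    return np.isin(segment_ids, ids)


def count_instances(segments: list, category: str) -> int:
    """Count separate instances; only countable ("thing") categories have them."""
    return sum(1 for s in segments if s["category"] == category and s["is_thing"])


if __name__ == "__main__":
    # Toy 3x4 panoptic map: each pixel stores the id of the segment it belongs to.
    segment_ids = np.array([
        [4, 4, 2, 2],
        [1, 1, 2, 3],
        [1, 1, 3, 3],
    ])
    segments = [
        {"id": 1, "category": "sidewalk", "is_thing": False},  # uncountable area
        {"id": 2, "category": "car", "is_thing": True},        # first car
        {"id": 3, "category": "car", "is_thing": True},        # second car
        {"id": 4, "category": "snow", "is_thing": False},      # uncountable area
    ]
    print(semantic_mask(segment_ids, segments, "sidewalk").sum())
    print(count_instances(segments, "car"))
```

An application such as sidewalk detection only needs `semantic_mask`, while an extension that must tell individual objects apart would also use the per-instance ids.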
D. Snow Detection
Most work has focused on detecting the presence of snowfall [5]–[8] and localizing the presence of vehicles and other objects in adverse weather conditions such as snow [9]–[11]. However, in addition to detecting snow, our application has the added complication of localizing the positions of sidewalks that are occluded by snow.
To our knowledge, there exists no dataset that contains labeled snow-covered sidewalks. Synthetic images are commonly used to artificially enlarge datasets; however, realistic-looking synthetic images are difficult to render [12]. Furthermore, training a deep learning model to classify an image as a “snow-covered sidewalk” would not be straightforward, as any miscellaneous snow-covered area could look identical to a snow-covered sidewalk (Fig. 3).
Fig. 3: Classifying an image as a snow-covered sidewalk is
difficult because the area under the snow is not visible.
E. Image Localization
LiDAR is often used to localize the positions of objects
surrounding an autonomous vehicle, such as other vehicles
[13]–[18]. However, LiDAR is expensive, and we already
have thousands of images available from bus cameras [19].
Furthermore, weather conditions like snow can interfere with
LiDAR measurements [19].
Some methods have been developed for the analogous problem of localizing roads in adverse weather conditions. Some applications depend on a previously generated map of the terrain [20]; we apply a similar idea of generating a prior expectation of where the sidewalks should be. A few methods use the geometry of the road, such as the vanishing point of the road and the horizon in the image, to generate an expected target area for where the road could be [21], [22]. Another method uses self-supervision to generate a pseudo-mask of where the road is expected to be [23]. However, these approaches are more effective in weather conditions under which the road is partially occluded, such as fog or rain. They are generally unable to localize roads that are fully occluded by snow. Furthermore, these methods target the autonomous driving domain, where they must anticipate completely novel surroundings on any given drive. In contrast, we leverage the fact that we work with images from a mostly repetitive bus route.
Structure from Motion (SfM) [24], [25] is a classic computer vision algorithm that uses several two-dimensional images of a scene, taken from different angles, to construct a three-dimensional point cloud representation of the scene. Furthermore, SfM can deduce the pose of the camera for each image and for new images of the same scene [26]. We use a pipeline for SfM called COLMAP [27], [28]. More specifically, COLMAP implements incremental SfM, which gradually adds images when reconstructing a scene (Fig. 4), in contrast to global SfM [29].
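Once a camera pose has been recovered for a new image, three-dimensional points from the reconstruction can be mapped back into that image. The sketch below shows this step with a standard pinhole camera model. It assumes a world-to-camera pose (rotation `R`, translation `t`) and an intrinsic matrix `K` with no lens distortion; the function name and the toy values are hypothetical and do not come from the paper or from any particular SfM software.

```python
import numpy as np


def project_points(points_world: np.ndarray, R: np.ndarray, t: np.ndarray,
                   K: np.ndarray, image_size: tuple) -> np.ndarray:
    """Project 3D world points into pixel coordinates with a pinhole camera.

    Assumes X_cam = R @ X_world + t (world-to-camera pose) and no distortion.
    Returns an (M, 2) array with the points that land inside the image.
    """
    width, height = image_size
    cam = points_world @ R.T + t           # world frame -> camera frame
    cam = cam[cam[:, 2] > 0]               # keep points in front of the camera
    uvw = cam @ K.T                        # apply the intrinsics
    uv = uvw[:, :2] / uvw[:, 2:3]          # perspective division
    inside = ((uv[:, 0] >= 0) & (uv[:, 0] < width) &
              (uv[:, 1] >= 0) & (uv[:, 1] < height))
    return uv[inside]


if __name__ == "__main__":
    K = np.array([[100.0, 0.0, 50.0],
                  [0.0, 100.0, 40.0],
                  [0.0, 0.0, 1.0]])
    R = np.eye(3)                          # camera aligned with the world axes
    t = np.zeros(3)
    points = np.array([
        [0.0, 0.0, 2.0],                   # straight ahead -> principal point
        [1.0, 0.0, 2.0],                   # projects to u = 100, outside a 100 px wide image
        [0.0, 0.0, -1.0],                  # behind the camera
        [-0.5, 0.2, 1.0],                  # projects to the left image border
    ])
    print(project_points(points, R, t, K, image_size=(100, 80)))
```

Projected points that were labeled as sidewalk in clear weather indicate where the sidewalk should appear in the new image, even when it is not visible.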
Visual odometry (VO) methods can also localize images [30] and tend to run faster than COLMAP; however, they are not as accurate. Furthermore, VO methods that involve deep learning [31] are inherently data-hungry, and our method aims to reduce the amount of annotated data needed. Since our application is not significantly time-sensitive, and we need to accurately classify a sidewalk as snow-covered or clear, we require a robust pipeline like COLMAP. Furthermore, the COLMAP software is well-documented and is often used as a baseline by these newer methods.