ASAP: Accurate semantic segmentation for real
time performance
Jae Hyun Park
AI tech team
Lotte Data Communication Company
Seoul, South Korea
jaehyun-park@lotte.net
Su Bin Lee
AI tech team
Lotte Data Communication Company
Seoul, South Korea
leesubin@lotte.net
Eon Kim
AI tech team
Lotte Data Communication Company
Seoul, South Korea
eon.kim@lotte.net
Byeong Jun Moon
AI tech team
Lotte Data Communication Company
Seoul, South Korea
bj_moon@lotte.net
Da Been Yu
AI tech team
Lotte Data Communication Company
Seoul, South Korea
db.yu@lotte.net
Yeon Seung Yu
AI tech team
Lotte Data Communication Company
Seoul, South Korea
yys4000@lotte.net
Jung Hwan Kim
AI tech team
Lotte Data Communication Company
Seoul, South Korea
jhwan_kim@lotte.net
Abstract—Feature fusion modules from encoder and self-
attention module have been adopted in semantic segmentation.
However, the computation of these modules is costly, which limits
their use in real-time environments. In addition, segmentation
performance is limited in autonomous driving environments, which
contain a lot of contextual information perpendicular to the road
surface, such as people, buildings, and general objects. In this
paper, we propose an efficient feature fusion method, Feature Fusion
with Different Norms (FFDN), that utilizes the rich global context of
multi-level scales, together with a vertical pooling module placed
before self-attention that preserves most contextual information
while reducing the complexity of global context encoding in the
vertical direction. In this way, we can handle the properties of
representations in global space while reducing the additional
computational cost. In addition, we analyze the low performance in
challenging cases, including small and vertically featured objects.
We achieve a mean Intersection-over-Union (mIoU) of 73.1 at 191
frames per second (FPS), which is comparable to state-of-the-art
results on the Cityscapes test set.
Index Terms—semantic segmentation, deep learning
I. INTRODUCTION
Semantic segmentation is a per-pixel classification task that
assigns a class label to every pixel of an image. It has been widely
researched in fields including biomedical applications and
human-machine interaction [1], [2].
In particular, segmentation used in autonomous driving, for tasks
such as depth estimation and free-space detection, operates in real
time and requires both fast inference and high performance. To
improve inference speed, aligned feature maps at adjacent levels
have been used to balance performance and inference speed in
segmentation tasks [3], and a ladder-style lightweight decoder has
been designed for upsampling features of low spatial resolution [4].
To achieve high accuracy, segmentation models require global
contextual information and the ability to capture multi-level
semantics. Some studies include a self-attention module, which helps
to concentrate contextual features [5]. Other studies propose a
feature fusion module, which combines multi-level features [6], [7].
However, these modules, which rely on convolution-based operations
to fuse multi-level features, require substantial computation and
memory.
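The cost argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: the 32 x 64 feature-map size is a hypothetical choice, and `vertical_pool` uses plain average pooling over the height axis as a stand-in for the vertical pooling module mentioned in the abstract, whose exact form is not given in this section.

```python
from typing import List


def attention_pairs(height: int, width: int) -> int:
    """Number of entries in a full self-attention map over height*width tokens."""
    tokens = height * width
    return tokens * tokens


def vertical_pool(feature: List[List[float]]) -> List[float]:
    """Average each column over the height axis, giving one value per column.

    Assumption: plain average pooling; the paper's module may differ.
    """
    height = len(feature)
    width = len(feature[0])
    return [sum(feature[r][c] for r in range(height)) / height
            for c in range(width)]


# Hypothetical 32 x 64 feature map (sizes chosen for illustration only).
H, W = 32, 64
full = attention_pairs(H, W)    # attention over all H*W positions
pooled = attention_pairs(1, W)  # attention after collapsing the height axis
print(full, pooled, full // pooled)  # 4194304 4096 1024
```

Under these assumptions the attention map shrinks by a factor of H squared, which is why pooling before self-attention is attractive when inference speed matters.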
To reduce the amount of computation without sacrificing accuracy,
we exploit normalization techniques in the feature fusion stage of
semantic segmentation. In U-GAT-IT [1], spatial and semantic
contents are handled adequately by using adaptive normalizations
[12], [13] that reflect image content such as style and geometry
information. Inspired by these approaches, we propose an efficient
Feature Fusion with Different Norms (FFDN), in which layer
normalization and instance normalization are used when aggregating
features of different layers, as shown in Fig. 2. These
normalization methods allow the segmentation model to obtain exact
object locations from spatial information and detailed object parts
from content information, with low computational complexity. FFDN
receives multi-level features obtained from a slightly modified FPN
(*FPN) as input and combines them to capture global properties of
representations. The *FPN is shown in Fig. 1.
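To show how the two normalizations differ, the following minimal sketch applies instance normalization (statistics per channel) and layer normalization (statistics over the whole sample) to small channel x height x width tensors in pure Python. It is not the authors' FFDN. The element-wise sum in `fuse`, the choice of which input receives which norm, and all function names are hypothetical, since this section does not specify how the normalized features are combined.

```python
import math
from typing import List

Tensor = List[List[List[float]]]  # channels x height x width


def _normalize(values: List[float], eps: float = 1e-5) -> List[float]:
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def instance_norm(x: Tensor) -> Tensor:
    """Normalize each channel separately over its spatial positions."""
    out = []
    for channel in x:
        width = len(channel[0])
        flat = _normalize([v for row in channel for v in row])
        out.append([flat[i:i + width] for i in range(0, len(flat), width)])
    return out


def layer_norm(x: Tensor) -> Tensor:
    """Normalize all channels and positions of one sample jointly."""
    height, width = len(x[0]), len(x[0][0])
    flat = _normalize([v for channel in x for row in channel for v in row])
    out = []
    for c in range(len(x)):
        start = c * height * width
        chan = flat[start:start + height * width]
        out.append([chan[i:i + width] for i in range(0, len(chan), width)])
    return out


def fuse(a: Tensor, b: Tensor) -> Tensor:
    """Hypothetical fusion: element-wise sum of two same-shaped maps."""
    return [[[p + q for p, q in zip(ra, rb)]
             for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]


# Toy inputs standing in for two feature levels (values are made up).
low = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 20.0], [30.0, 40.0]]]
high = [[[0.0, 1.0], [1.0, 0.0]], [[5.0, 5.0], [6.0, 4.0]]]
fused = fuse(instance_norm(low), layer_norm(high))
```

In this toy input the second channel of `low` is ten times the first, yet both come out of `instance_norm` nearly identical, because per-channel statistics remove channel-specific scale. `layer_norm` keeps the relative differences between channels, since all channels share one mean and variance.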
One of the challenging problems in semantic segmentation is
handling objects that extend in specific directions, such as vertical
or diagonal (e.g., people, poles). Since a general convolution or
pooling operation uses uniform kernels with the same height
arXiv:2210.01323v1 [cs.CV] 4 Oct 2022