
VideoPipe 2022 Challenge: Real-World Video
Understanding for Urban Pipe Inspection
Yi Liu1∗, Xuan Zhang1∗, Ying Li1, Guixin Liang2, Yabing Jiang2, Lixia Qiu2, Haiping Tang2,
Fei Xie2, Wei Yao3, Yi Dai2†, Yu Qiao1,4†, Yali Wang1,5†
1ShenZhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institute of Advanced Technology,
Chinese Academy of Sciences, China
2Shenzhen Bwell Technology Co., Ltd, China
3Shenzhen Longhua Drainage Co., Ltd, China
4Shanghai AI Laboratory, Shanghai, China
5SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society
Abstract—Video understanding is an important problem in
computer vision. Currently, the most well-studied task in this area
is human action recognition, where clips are manually
trimmed from long videos and a single class of human action
is assumed for each clip. However, industrial applications may
involve more complicated scenarios. For example, in real-world
urban pipe systems, anomaly defects are fine-grained,
multi-labeled, and domain-relevant. To recognize them correctly, we need
to understand the detailed video content. For this reason, we
propose to advance research areas of video understanding, with
a shift from traditional action recognition to industrial anomaly
analysis. In particular, we introduce two high-quality video
benchmarks, namely QV-Pipe and CCTV-Pipe, for anomaly
inspection in real-world urban pipe systems. Based on these new
datasets, we will host two competitions: (1) Video Defect
Classification on QV-Pipe and (2) Temporal Defect Localization
on CCTV-Pipe. In this report, we describe the details of these
benchmarks, the problem definitions of the competition tracks, the
evaluation metric, and the result summary. We expect that this
competition will bring new opportunities and challenges for
video understanding in smart cities and beyond. The details of our
VideoPipe challenge can be found at https://videopipe.github.io.
I. INTRODUCTION
In recent decades, the sewer pipe system has been one of the most
crucial infrastructures in modern cities. To ensure
its normal operation, we need to inspect pipe defects in an
effective and efficient manner. Several technologies have been
applied in the traditional pipe inspection procedure. A thorough
survey [1] categorizes them into
visual methods, electromagnetic methods, acoustic methods,
and ultrasound methods. In particular, Quick-View (QV)
Inspection and Closed-Circuit Television (CCTV) Inspection are
the most popular methods, as shown in Figure 1. QV
Inspection is used for rapid anomaly assessment
of sewer pipes, since the camera can only record videos
at the pipe orifice. The CCTV Inspection system involves
a remote-controlled robot that travels along the sewer pipe
with a camera for video recording [2]. Hence, it can provide more
detailed anomaly analysis for the whole pipe. Based on these
QV and CCTV videos, standardized protocols for manual
inspection have been established and adopted in recent
years [3]. However, it is often labor-intensive to find anomalies
in hundreds of hours of videos of complex urban pipes.
Fig. 1: Two widely-used pipe inspection methods.
∗Yi Liu (yi.liu1@siat.ac.cn) and Xuan Zhang (xuan.zhang1@siat.ac.cn)
contributed equally as first authors.
†Yi Dai (daiyi@bominwell.com), Yu Qiao (yu.qiao@siat.ac.cn) and Yali
Wang (yl.wang@siat.ac.cn) contributed equally as corresponding authors.
To tackle this problem, it is essential to develop automatic
inspection methods to discover sewer anomalies from
large-scale pipe videos. Early works use hand-crafted visual features
with traditional classifiers [4], [5], [6]. These approaches are
often limited when inspecting defects in complex scenarios.
arXiv:2210.11158v1 [cs.CV] 20 Oct 2022