tinuous vector space and then assist knowledge inference
tasks such as link prediction or triple classification in a KG.
However, these techniques often suffer from knowledge
inconsistency, i.e., the same real-world entity or noun
may have different surface forms, such as “Alm” vs. “Alarm”.
Moreover, the textual knowledge and semantic information in
entity surface forms are typically discarded during training, limiting
a model's intra-domain scalability and cross-domain portability.
Textual product documents are valuable resources in the
tele-domain. Instead of simply using them as handbooks, one
approach is to pre-train a domain-specific language model
(LM). LM pre-training [9]–[12] is a good recipe for learning
implicit semantic knowledge, with self-supervised text reconstruction
as the training objective over a vast amount of language
data. However, such models struggle to exploit structured
knowledge for explicit reasoning. Additionally, our
machine data is semi-structured and multi-directional: time
extends vertically, while multiple indicators at a single
moment extend the data horizontally, as shown in Fig. 2(a).
This differs from typical log-based anomaly detection methods
[13]–[15], which target unidirectional, serial log data.
In this work, we propose to pre-train on all data that contains
tele-knowledge, including machine data, the Tele-Corpus built from
the product documents, and triples from the Tele-KG. We
expect this pre-trained model to aid downstream fault
analysis tasks in a convenient and effective manner and to boost
their performance, especially for tasks with limited data (also
known as low-resource tasks).
To achieve this, we first address the heterogeneity of multi-source,
multi-modal data (e.g., multi-directional machine
data, textual documents, and the semi-structured KG), which can
distract the model from efficient learning. To remedy this,
we draw on prompt engineering techniques [16]–[18] and provide
relevant template hints to the model to unify the modalities,
as sketched below.
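To make this concrete, the following minimal Python sketch shows one way template hints could serialize the three modalities into a unified textual form. The modality markers, template wording, and function names are illustrative assumptions, not the exact prompts used in this work.

```python
# Illustrative sketch: unifying heterogeneous tele-data into text via
# prompt templates. Markers ([KG], [LOG], [DOC]) and template wording
# are hypothetical assumptions, not the exact templates of KTeleBERT.

def serialize_triple(head: str, relation: str, tail: str) -> str:
    # A KG triple becomes a short natural-language statement.
    return f"[KG] {head} {relation} {tail}."

def serialize_machine_row(timestamp: str, indicators: dict) -> str:
    # One horizontal slice of machine data: several indicators at one moment.
    fields = ", ".join(f"{name} is {value}" for name, value in indicators.items())
    return f"[LOG] at {timestamp}, {fields}."

def serialize_document_sentence(sentence: str) -> str:
    # Plain corpus text passes through with a modality marker.
    return f"[DOC] {sentence}"

print(serialize_triple("NF destination service is unreachable",
                       "causes",
                       "initial registration requests increase abnormally"))
print(serialize_machine_row("2022-08-01 10:00",
                            {"KPI.RegSuccessRate": 0.31,
                             "KPI.InitRegCount": 5214}))
```

With such serialization, all three sources can share one tokenizer and one encoder, which is the point of the modality unification step.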
Secondly, we address the challenge of handling numerical
data, an essential component of tele-domain data that
frequently appears in machine data (e.g., KPI scores).
This format resembles tabular data, sharing three
characteristics: (i) the text portion is short; (ii) numerical
values carry different meanings and ranges under different
circumstances; (iii) the data stretches both vertically
and horizontally, forming a hierarchy. However, existing table
pre-training methods [19]–[24] mainly study the hierarchical
structure of tabular data, and numerical information is
rarely studied in depth. Furthermore, methods that target
numerical features [13]–[15] focus on learning a field
embedding for each numerical field. They suit tasks
with a limited number of fields (e.g., user attributes such as height and
weight) but fail when migrated to our tele-scenario, where the
fields (e.g., KPI names) are numerous (≥1000) and new
fields are continually generated as the enterprise develops.
Thus, we propose an adaptive numeric encoder (ANEnc)
for type-aware numeric encoding in the tele-domain.
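The following PyTorch sketch illustrates one way a type-aware numeric encoder could avoid a per-field embedding table: the scalar value is modulated by a text embedding of the field name, so unseen KPI fields can still be encoded. The FiLM-style gating and the dimensions are our assumptions for illustration, not the exact ANEnc architecture.

```python
# Minimal sketch of a type-aware numeric encoder. The field "type" is
# represented by a text embedding of its name (e.g., a [CLS] vector of
# "initial registration count"), so new KPI fields need no dedicated
# field-embedding table. Illustrative design only, not the paper's ANEnc.
import torch
import torch.nn as nn

class TypeAwareNumericEncoder(nn.Module):
    def __init__(self, type_dim: int = 768, hidden: int = 768):
        super().__init__()
        self.value_proj = nn.Linear(1, hidden)    # lift the scalar value
        self.scale = nn.Linear(type_dim, hidden)  # type-conditioned scaling
        self.shift = nn.Linear(type_dim, hidden)  # type-conditioned shift
        self.out = nn.Sequential(nn.GELU(), nn.Linear(hidden, hidden))

    def forward(self, value: torch.Tensor, type_emb: torch.Tensor) -> torch.Tensor:
        # value: (batch, 1) raw numeric values, e.g., KPI scores
        # type_emb: (batch, type_dim) text embedding of the field name
        h = self.value_proj(value)
        h = h * self.scale(type_emb) + self.shift(type_emb)  # FiLM-style modulation
        return self.out(h)

enc = TypeAwareNumericEncoder()
value = torch.tensor([[5214.0]])
type_emb = torch.randn(1, 768)  # stand-in for a field-name embedding
print(enc(value, type_emb).shape)  # torch.Size([1, 768])
```

Because the conditioning signal comes from the field name's text rather than a fixed field index, the same encoder scales to a growing, open vocabulary of KPI names.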
Thirdly, we recognize that the tele-corpus, the machine data,
and the knowledge triples call for different training targets.
We therefore adopt a multi-stage training scheme for multi-level
knowledge acquisition: (i) TeleBERT: in stage one, we follow the
ELECTRA [25] pre-training paradigm and the data augmentation
method SimCSE [26] to pre-train on the large-scale (about 20 million)
textual tele-corpus; (ii) KTeleBERT: in stage two, we
extract causal sentences that contain relevant causal
keywords and re-train TeleBERT on them together with the numeric-related
machine data, introducing a knowledge embedding training
objective and a multi-task learning method for
explicit knowledge integration.
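As a schematic of how stage two could combine its objectives, the sketch below mixes a masked-LM loss on causal sentences with a knowledge embedding loss on triples. The TransE-style scoring and the fixed loss weight are assumptions chosen for illustration; the paper's exact objective may differ.

```python
# Schematic multi-task step: text reconstruction loss plus a knowledge
# embedding (KE) loss over triples sharing the same encoder. The
# TransE-style score and the 0.5 weight are illustrative assumptions.
import torch

def ke_loss(h, r, t, margin: float = 1.0) -> torch.Tensor:
    # TransE-style score: ||h + r - t|| should be small for true triples.
    # A full setup would also contrast against corrupted (negative) triples.
    score = torch.norm(h + r - t, p=2, dim=-1)
    return torch.relu(score - margin).mean()

def multi_task_loss(mlm_loss: torch.Tensor,
                    h: torch.Tensor, r: torch.Tensor, t: torch.Tensor,
                    ke_weight: float = 0.5) -> torch.Tensor:
    # Jointly optimize implicit text knowledge and explicit KG knowledge.
    return mlm_loss + ke_weight * ke_loss(h, r, t)

h, r, t = (torch.randn(4, 768, requires_grad=True) for _ in range(3))
loss = multi_task_loss(torch.tensor(2.3), h, r, t)
loss.backward()
```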
With our pre-trained model, we apply the model-generated
service vectors to enhance three fault analysis tasks: root-cause
analysis (RCA), event association prediction (EAP),
and fault chain tracing (FCT). The experimental results show
that TeleBERT and KTeleBERT successfully improve
performance on all three tasks.
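For intuition, the snippet below sketches how a downstream task model might consume such service vectors from a frozen encoder. The checkpoint name and mean-pooling choice are placeholders, not our released model or its exact pooling strategy.

```python
# Illustrative use of a pre-trained encoder as a frozen feature extractor
# for a downstream fault analysis head. "bert-base-chinese" is a stand-in
# checkpoint (it would be KTeleBERT); mean pooling is an assumption.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")
encoder.eval()

@torch.no_grad()
def service_vector(text: str) -> torch.Tensor:
    batch = tok(text, return_tensors="pt", truncation=True)
    hidden = encoder(**batch).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)         # mean-pooled service vector

vec = service_vector("NF destination service is unreachable")
print(vec.shape)  # torch.Size([768])
```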
In summary, the contributions of this work are as follows:
• We emphasize the importance of uniformly encoding
knowledge in tele-domain applications, and share our
encoding experience in real-world scenarios.
• We propose a tele-domain pre-training model, TeleBERT,
and its knowledge-enhanced version, KTeleBERT, to fuse
and encode diverse tele-knowledge in different forms.
• We demonstrate that our proposed models can serve multiple
fault analysis task models and boost their performance.
II. BACKGROUND
A. Corpus in Telecommunication
1) Machine Log Data: Machine (log) data, such as
abnormal events or normal indicator logs, is continuously
generated in both real-world tele-environments and simulation
scenes. Typically, as shown in Fig. 2(a), abnormal events
such as service interruptions have varying levels of importance
and are usually accompanied by anomalies in the relevant network
elements (NEs). Normal indicators such as numerical
KPI data, on the other hand, are cyclical and persistent in
nature and make up the majority of automatically generated
machine data. Most abnormal events self-recover after
a period of time (e.g., network congestion), and there
may be correlations or causal relationships across abnormal
events and indicators; e.g., the alarm “NF destination service
is unreachable” often leads to the abnormal KPI “the
number of initial registration requests increases abnormally”.
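A hypothetical record below illustrates the two directions of this data: timestamps stack vertically, while each timestamp carries multiple indicators horizontally, with abnormal events attached to the affected NE. All field names and values are invented for illustration.

```python
# Hypothetical layout of semi-structured machine data: a vertical time
# axis, a horizontal spread of indicators per moment, and events that
# reference the affected network element (NE). Values are invented.
machine_data = [
    {"time": "2022-08-01 10:00",
     "indicators": {"init_reg_requests": 5214, "reg_success_rate": 0.97}},
    {"time": "2022-08-01 10:05",
     "indicators": {"init_reg_requests": 19870, "reg_success_rate": 0.31},
     "events": [{"alarm": "NF destination service is unreachable",
                 "ne": "AMF-01", "severity": "critical"}]},
]
print(machine_data[1]["events"][0]["alarm"])
```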
2) Product Document: Domain engineers and experts
constantly record and update the product documentation.
In particular, each scenario may contain one or
more product documents, which are maintained by different
departments and may include nearly all relevant information in
the field, such as fault cases, solutions for occurred
or potential cases, and the event descriptions shown in Fig. 2(b).
3) Tele-product Knowledge Graph (Tele-KG): We construct
the Tele-KG to integrate massive information about events
and resources on our platform. Our goal is intuitive: we hope
that such a fine-grained Tele-KG can refine and purify the
knowledge of the tele-domain, as a semi-structured knowledge