SEQ2SEQ-SC: END-TO-END SEMANTIC COMMUNICATION SYSTEMS WITH
PRE-TRAINED LANGUAGE MODEL
Ju-Hyung Lee∗, Dong-Ho Lee∗, Eunsoo Sheen, Thomas Choi, Jay Pujara
University of Southern California
ABSTRACT
In this work, we propose a realistic semantic network called
seq2seq-SC, designed to be compatible with 5G NR and
capable of working with generalized text datasets using a
pre-trained language model. The goal is to achieve unprece-
dented communication efficiency by focusing on the mean-
ing of messages in semantic communication. We employ
a performance metric called semantic similarity, which
combines BLEU for lexical similarity with SBERT for
semantic-level similarity. Our findings demonstrate that
seq2seq-SC outperforms previous models in extracting semantically meaningful
information while maintaining superior performance. This
study paves the way for continued advancements in semantic
communication and its prospective incorporation with future
wireless systems in 6G networks.
Index Terms—Semantic communication, natural lan-
guage processing (NLP), link-level simulation.
I. INTRODUCTION
The recent rise of deep learning-based techniques to infer
semantics (i.e., the meaning of the message rather than the
message itself) from text, speech, and video, as well
as the ever-increasing quality of service requirements for
extremely data-hungry applications such as extended reality
(XR), have motivated the use of semantic communication [1]
for a new generation of wireless systems (6G). Focusing on
semantics allows forgoing unnecessary data (e.g., articles in
a sentence or background in a portrait photo), which can
increase communication efficiency.
While semantic communication may bring unprecedented
benefits, many challenges remain before it can be realized
in practice. First, it must be compatible with existing commu-
nication infrastructure; a “link-level” simulation is hence
required to verify its realistic end-to-end (E2E) performance.
Second, the semantic network has to be generalized to work
with any dataset rather than being tied to one in particular. Third,
since there is no universal performance metric for semantic
communication yet, metrics such as semantic similarity
must be refined to evaluate the performance of the semantic
network from the semantic point of view. Lastly, since classic
communication cannot be completely replaced by semantic
communication, the network must be able to deliver in-
formation either as-is or modified with high semantic
similarity, depending on the communication scenario.
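The first requirement, link-level verification, can be illustrated with a toy end-to-end loop (a minimal sketch only: a repetition code stands in for the 5G NR Polar codes, BPSK over AWGN replaces the full physical layer, and every parameter value is illustrative):

```python
import random

def simulate_link(num_bits=10_000, snr_db=4.0, rep=3, seed=0):
    """Toy E2E link: repetition coding -> BPSK -> AWGN -> soft majority vote.
    Returns the measured bit error rate (BER)."""
    rng = random.Random(seed)
    noise_var = 10 ** (-snr_db / 10)   # unit-energy symbols
    noise_std = noise_var ** 0.5
    errors = 0
    for _ in range(num_bits):
        bit = rng.randint(0, 1)
        # Encode (repetition) and modulate (BPSK: 0 -> +1, 1 -> -1),
        # then pass each symbol through the AWGN channel.
        rx = [(1 - 2 * bit) + rng.gauss(0.0, noise_std) for _ in range(rep)]
        # Demodulate by summing the soft values and taking the sign.
        decoded = 1 if sum(rx) < 0 else 0
        errors += decoded != bit
    return errors / num_bits

ber = simulate_link()
print(f"BER at 4 dB with rate-1/3 repetition: {ber:.4f}")
```

A full-stack simulator such as Sionna replaces each of these stand-ins with standard-compliant components (LDPC/Polar coding, OFDM, realistic channel models), but the measurement loop has the same shape.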
Contributions. We revisit questions raised by DeepSC [2]
regarding semantic communication:
Q 1: How do we design the semantic and channel coding
jointly?
Q 2: How do we measure semantic error (similarity) be-
tween transmitted and received sentences?
Our main contributions, which address these questions, are
summarized as follows:
•We employ an E2E link-level simulation compliant with
5G NR (NVIDIA Sionna [3]), which includes features
such as Polar codes. Through this method, we validate
semantic network performance in real-world settings
and answer Q 1.
•We integrate pre-trained encoder-decoder transformers
with the E2E semantic communication system, dubbed
seq2seq-SC, which efficiently extracts semantic (mean-
ingful) information with reduced computational effort.
This network is “generalized”, meaning it works with
any text corpus, in contrast to DeepSC, which is limited
to a particular dataset.
•To answer Q 2 and evaluate the performance of a
semantic network in a semantic way, we introduce a
metric called semantic similarity. To make the network
flexible with respect to the communication scenario, it
may either prioritize delivering a message as accurately
as possible or focus on semantic similarity alone.
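To make the two notions of similarity concrete, the sketch below scores a transmitted/received sentence pair lexically (a simplified BLEU-1-style unigram precision with brevity penalty) and semantically (cosine similarity between sentence embeddings). The helper names are hypothetical, and the embedding function is a placeholder for the SBERT vectors the paper actually uses:

```python
import math
from collections import Counter

def lexical_score(reference: str, candidate: str) -> float:
    """Unigram precision with a brevity penalty (BLEU-1-style sketch)."""
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    overlap = sum((Counter(ref) & Counter(cand)).values())
    precision = overlap / len(cand)
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

def embedding_score(vec_a, vec_b) -> float:
    """Cosine similarity between two sentence embeddings
    (stand-in for SBERT vectors)."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm = math.sqrt(sum(a * a for a in vec_a)) * math.sqrt(sum(b * b for b in vec_b))
    return dot / norm if norm else 0.0

tx = "the cat sat on the mat"
rx = "the cat sat on a mat"
print(round(lexical_score(tx, rx), 3))  # 5/6 unigram overlap -> 0.833
```

A paraphrase such as "a cat is sitting on the mat" would score poorly on the lexical metric yet highly on the embedding metric, which is exactly the gap a semantic similarity measure is meant to capture.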
II. PRE-TRAINED MODEL FOR LANGUAGE
Contextualized embeddings from pre-training in a self-
supervised manner (e.g., masked language modeling) with
transformers [4] are extremely effective in providing ini-
tial representations that can be refined to attain acceptable
performance on numerous downstream tasks. Recent studies
on text semantic communication exploit transformer archi-
tectures [4] to extract semantics at the transmitter and recover
the original information at the receiver [2], [5]. However,
such frameworks face the following challenges: (1) training
an E2E semantic communication pipeline requires huge
computational effort, since the many randomly initialized
parameters of the semantic encoder/decoder must be trained
from scratch; (2) they struggle with out-of-vocabulary (OOV)
words, since they rely only on the set of whitespace-separated
tokens seen in the training data. In this work, we use a
pre-trained encoder-decoder
arXiv:2210.15237v2 [eess.SP] 18 Oct 2023
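The OOV limitation above can be seen in a small sketch contrasting whitespace tokenization with a greedy longest-match subword fallback of the kind pre-trained tokenizers use (a WordPiece-style illustration only; the vocabularies and splitting rules here are toy examples, not those of any actual tokenizer):

```python
def whitespace_tokenize(text, vocab):
    """Words outside the training vocabulary collapse to <unk>,
    losing their meaning entirely."""
    return [tok if tok in vocab else "<unk>" for tok in text.split()]

def subword_tokenize(text, subword_vocab):
    """Greedy longest-match subword splitting: an unseen word is
    decomposed into known pieces instead of becoming <unk>."""
    tokens = []
    for word in text.split():
        start = 0
        while start < len(word):
            end = len(word)
            while end > start and word[start:end] not in subword_vocab:
                end -= 1
            if end == start:            # no known piece at all
                tokens.append("<unk>")
                break
            tokens.append(word[start:end])
            start = end
    return tokens

vocab = {"semantic", "coding"}
subwords = {"semantic", "coding", "de", "cod", "ing", "er"}
print(whitespace_tokenize("semantic decoder", vocab))    # ['semantic', '<unk>']
print(subword_tokenize("semantic decoder", subwords))    # ['semantic', 'de', 'cod', 'er']
```

Because a pre-trained tokenizer decomposes "decoder" into reusable pieces, the model can still represent words it never saw during training, which is one reason pre-trained encoder-decoders sidestep challenge (2).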