An Analysis of RF Transfer Learning Behavior Using Synthetic Data


Citation: Wong, L.J.; McPherson, S.; Michaels, A.J. An Analysis of RF Transfer Learning Behavior Using Synthetic Data. Preprints 2022, 1, 0. https://doi.org/
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
An Analysis of RF Transfer Learning Behavior
Using Synthetic Data
Lauren J. Wong 1,2,3, Sean McPherson 3, and Alan J. Michaels 1,2
1 Hume Center for National Security and Technology, Virginia Tech
2 Bradley Department of Electrical and Computer Engineering, Virginia Tech
3 Intel AI Lab, Santa Clara, CA
* Correspondence: ljwong@vt.edu
Abstract:
Transfer learning (TL) techniques, which leverage prior knowledge gained from data with
different distributions to achieve higher performance and reduced training time, are often used in
computer vision (CV) and natural language processing (NLP), but have yet to be fully utilized in the field
of radio frequency machine learning (RFML). This work systematically evaluates radio frequency (RF)
TL behavior by examining how the training domain and task, characterized by the transmitter/receiver
hardware and channel environment, impact RF TL performance for an example automatic modulation
classification (AMC) use-case. Through exhaustive experimentation using carefully curated synthetic
datasets with varying signal types, signal-to-noise ratios (SNRs), and frequency offsets (FOs), generalized
conclusions are drawn regarding how best to use RF TL techniques for domain adaptation and sequential
learning. Consistent with trends identified in other modalities, results show that RF TL performance is
highly dependent on the similarity between the source and target domains/tasks. Results further illustrate
the impacts of channel environment, hardware variations, and domain/task difficulty on RF TL performance,
and compare RF TL performance using head re-training and model fine-tuning methods.
Keywords: machine learning, deep learning, transfer learning, radio frequency machine learning
1. Introduction
Radio frequency machine learning (RFML) is loosely defined as the application of deep
learning (DL) to raw RF data, and has yielded state-of-the-art algorithms for spectrum awareness,
cognitive radio, and networking tasks. Existing RFML works have delivered increased
performance and flexibility, and reduced the need for pre-processing and expert-defined
feature extraction techniques. As a result, RFML is expected to enable greater efficiency, lower
latency, and better spectrum efficiency in 6G systems [1]. However, to date, little research has
considered and evaluated the performance of these algorithms in the presence of changing
hardware platforms and channel environments, adversarial contexts, or resource constraints
that are likely to be encountered in real-world systems [2].
Current state-of-the-art RFML techniques rely upon supervised learning techniques
trained from random initialization, and thereby assume the availability of a large corpus
of labeled training data (synthetic, captured, or augmented [3]), which is representative of
the anticipated deployed environment. Over time, this assumption inevitably breaks down
as a result of changing hardware and channel conditions, and as a consequence, performance
degrades significantly [4,5]. TL techniques can be used to mitigate these performance degradations
by using prior knowledge obtained from a source domain and task, in the form of learned
representations, to improve performance on a “similar” target domain and task using less data,
as depicted in Fig. 1.
Though TL techniques have demonstrated significant benefits in fields such as CV and
NLP [6], including higher performing models, significantly less training time, and far fewer
training samples [7], [8] showed that the use of TL in RFML is currently lacking through the
arXiv:2210.01158v1 [eess.SP] 3 Oct 2022
(a) Traditional Machine Learning (b) Transfer Learning
Figure 1. In traditional machine learning (ML) (Fig. 1a), a new model is trained from random initialization for each domain/task pairing. TL (Fig. 1b) utilizes prior knowledge learned on one domain/task, in the form of a pre-trained model, to improve performance on a second domain and/or task. A concrete example for environmental adaptation to SNR is given in blue.
Figure 2. A system overview of the RF hardware and channel environment simulated in this work with the parameters/variables (α_Δ[t], ω_Δ[t], θ_Δ[t], ν[t], ω[t]) that each component of the system has the most significant impact on.
construction of an RFML-specific TL taxonomy. This work begins to address current limitations
in understanding how the training domain and task impact learned behavior and therefore
facilitate or prevent successful transfer, where the training domain is characterized by the
RF hardware and the channel environment [8], depicted in Fig. 2, and the training task is
the application being addressed, including the range of possible outputs (i.e., the modulation
schemes classified). More specifically, this work systematically evaluates RF TL performance,
as measured by post-transfer top-1 accuracy, as a function of several parameters of interest
for an AMC use-case [4] using synthetic datasets. First, RF domain adaptation performance is
examined as a function of:
- SNR, which represents an environment adaptation problem characterized by a change in the RF channel environment (i.e., an increase/decrease in the additive interference, ν[t], of the channel) and/or transmitting devices (i.e., an increase/decrease in the magnitude, α[t], of the transmitted signal),
- FO, which represents a platform adaptation problem characterized by a change in the transmitting and/or receiving devices (i.e., an increase/decrease in ω_Δ[t] due to hardware imperfections or a lack of synchronization), and
- Both SNR and FO, representing an environment platform co-adaptation problem characterized by a change in both the RF channel environment and the transmitting/receiving devices.
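The three domain adaptation scenarios above amount to parameter sweeps over source/target (SNR, FO) pairs. A minimal Python sketch of how such sweeps enumerate, using illustrative grid values rather than the paper's actual ones:

```python
from itertools import product

# Illustrative SNR/FO grids -- placeholders, not the paper's actual values.
snr_db_values = [0, 5, 10]        # channel environment parameter
fo_values = [0.0, 0.01, 0.02]     # normalized frequency offset

# Environment adaptation: only the SNR differs between source and target.
env_pairs = [(src, tgt) for src, tgt in product(snr_db_values, repeat=2)
             if src != tgt]

# Platform adaptation: only the FO differs between source and target.
plat_pairs = [(src, tgt) for src, tgt in product(fo_values, repeat=2)
              if src != tgt]

# Environment platform co-adaptation: both SNR and FO differ.
co_pairs = [(src, tgt)
            for src in product(snr_db_values, fo_values)
            for tgt in product(snr_db_values, fo_values)
            if src[0] != tgt[0] and src[1] != tgt[1]]
```

Exhaustive sweeps of this form, over all source/target combinations of interest, are what drive the dataset and model counts reported in the text.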
Parameter sweeps over these three scenarios address each type of RF domain adaptation
discussed in the RFML TL taxonomy [8], and resulted in the construction of 81 training sets, 81
validation sets, and 81 test sets and the training and evaluation of 4360 models. Additionally,
RF sequential learning performance is evaluated across broad categories of modulation types,
namely linear, frequency-shifted, and analog modulation schemes, as well as in a successive
model refinement scenario, where a single modulation type is added/removed from the source
dataset. These experiments resulted in an additional 17 training sets, 17 validation sets, and 17
test sets, and the training and evaluation of 304 models. From these experiments, we identify a
number of practical takeaways for how best to utilize TL in an RFML setting including how
changes in SNR and FO impact the difficulty of AMC and a comparison of head re-training
versus fine-tuning for RF TL. These takeaways serve as initial guidelines for RF TL, subject
to further experimentation using additional signal types, channel models, use-cases, model
architectures, and augmented or captured datasets.
This paper is organized as follows: Section 2 provides requisite background knowledge of
TL and RFML. In Section 3, each of the key methods and systems used and developed for this
work is described in detail, including the simulation environment and dataset creation, as well
as the model architecture and training. Section 4 presents experimental results and analysis,
addressing the key research questions described above. Finally, Section 5 offers conclusions
about the effectiveness of TL for RFML and next steps for incorporating and extending TL
techniques in RFML-based research. A list of the acronyms used in this work is provided in the
appendix for reference.
2. Background
The following subsections provide an overview of RFML, TL, and TL for RFML to provide
context for the work performed herein.
2.1. Radio Frequency Machine Learning (RFML)
The term RFML is often used in the literature to describe any application of machine
learning (ML) or DL to the RF domain. However, RFML was coined by the Defense Advanced
Research Projects Agency (DARPA) and defined as systems that:
- Autonomously learn features from raw data to detect, characterize, and identify signals-of-interest,
- Can autonomously configure RF sensors or communications platforms for changing communications environments, and
- Can synthesize “any possible waveform” [9].
Therefore, RFML algorithms typically utilize raw RF data as input to ML/DL techniques, most
often deep neural networks (DNNs).
To date, most RFML research has focused on delivering state-of-the-art performance on
spectrum awareness and cognitive radio tasks, whether through increased accuracy, increased
adaptability, or the use of less expert knowledge. Such spectrum awareness and cognitive radio
tasks include signal detection, signal classification or AMC, specific emitter identification (SEI),
channel modeling/emulation, positioning/localization, and spectrum anomaly detection [2].
One of the most common and arguably the most mature spectrum awareness or cognitive radio
applications explored in the literature is AMC, and as such, AMC is the example use-case in
this work. AMC is the task of identifying the type or format of a detected signal, and is a
key step in receiving RF signals. Traditional AMC techniques have typically consisted of an
expert-defined feature extraction stage and a pattern recognition stage using techniques such as
decision trees, support vector machines, and multi-layer perceptrons (MLPs) [10]. RFML-based
approaches aim to both automatically learn and identify key features within signals-of-interest,
as well as utilize those features to classify the signal, using only minimally pre-processed
raw RF as input to DNN architectures including convolutional neural networks (CNNs) and
recurrent neural networks (RNNs) [11].
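For concreteness, one classic expert-defined AMC feature of the kind such traditional pipelines extract is the fourth-order cumulant C42. A minimal NumPy sketch (the specific feature and constellations here are illustrative, not drawn from the paper):

```python
import numpy as np

def c42_feature(x):
    """Estimate the classic fourth-order cumulant C42 = M42 - |M20|^2 - 2*M21^2
    from complex baseband samples x (an expert-defined AMC feature)."""
    x = x - x.mean()                      # remove any residual DC offset
    m20 = np.mean(x * x)                  # second-order moment (no conjugate)
    m21 = np.mean(x * np.conj(x)).real    # second-order moment (signal power)
    m42 = np.mean(np.abs(x) ** 4)         # fourth-order moment M42 = E[|x|^4]
    return float(m42 - np.abs(m20) ** 2 - 2 * m21 ** 2)

rng = np.random.default_rng(0)
bpsk = rng.choice([-1.0, 1.0], size=4096).astype(complex)
qpsk = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=4096) / np.sqrt(2)
# C42 separates constellations: roughly -2 for BPSK, -1 for QPSK (unit power).
```

A pattern recognition stage (e.g., a decision tree or MLP) would then classify over a vector of such statistics, whereas the RFML-based approaches described above learn their features directly from the raw IQ samples.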
Figure 3. The RF TL taxonomy proposed in [8].
2.2. Transfer Learning (TL) for RFML
As previously mentioned, TL aims to utilize prior knowledge gained from a source
domain/task to improve performance on a “similar” target domain/task, where training data
may be limited. The domain, D = {X, P(X)}, consists of the input data X and the marginal
probability distribution over the data P(X). Meanwhile, the task, T = {Y, P(Y|X)}, consists
of the label space Y, and the conditional probability distribution P(Y|X) learned from the
training data pairs {x_i, y_i} such that x_i ∈ X and y_i ∈ Y. In the context of RFML, the domain is
characterized by the RF hardware and channel environments (i.e., In-phase/Quadrature (IQ)
imbalance, non-linear distortion, SNR, multi-path effects), and the task is the application being
addressed, including the range of possible outputs (i.e., n-class AMC, SEI, SNR estimation).
Recent work presented the RF-specific TL taxonomy shown in Fig. 3 [8], adapted from
the general TL taxonomy of [12] and the NLP-specific taxonomy of [6]. Per this taxonomy, RF
TL is categorized by training data availability and whether or not the source and target tasks
differ:
- Domain adaptation is the setting in which source and target tasks are the same, but the source and target domains differ, and can be further categorized as
  - Environment adaptation, where the channel environment is changing, but the transmitter/receiver pair(s) are constant,
  - Platform adaptation, where the transmitter/receiver hardware is changing, but the channel environment is consistent, and
  - Environment platform co-adaptation, where changes in both the channel environment and transmitter/receiver hardware must be overcome.
- Multi-task learning is the setting in which different source and target tasks are learned simultaneously.
- Sequential learning is the setting in which a source task is learned first, and the target task, different from the source task, is learned during a second training phase.
Typically, the same training techniques are used to perform both domain adaptation and
sequential learning, most commonly head re-training and model fine-tuning, which are the
focus of this work. Existing works have successfully utilized such techniques to overcome
changes in channel environment [13,14] and wireless protocol [15,16], to transfer from synthetic
data to captured data [17–20], and to add or remove output classes [21], for a variety of RFML
use-cases. Meanwhile, multi-task learning approaches tend to utilize more than one loss term
during a single training phase, and have been more commonly used in the context of ML-enabled
wireless communications systems that use expert-defined features rather than raw RF data
as input. However, multi-task learning techniques have been used to facilitate end-to-end
communications systems [22], as well as to improve the explainability and accuracy of RFML
models [23,24]. A systematic examination and evaluation of multi-task learning performance is
left for future work.
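The two techniques emphasized above, head re-training and model fine-tuning, differ only in whether the pre-trained layers are frozen during the second training phase. A toy NumPy sketch of that distinction, using a linear stand-in rather than the CNNs used in this work:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for illustration only -- a "pre-trained" backbone matrix
# maps inputs to features, and a small labeled target-domain set is given.
W_backbone = rng.normal(size=(16, 8))          # learned on the source domain
X = rng.normal(size=(64, 16))                  # target-domain inputs
y = X @ W_backbone @ rng.normal(size=(8, 1))   # target-domain labels

def train(finetune, steps=400, lr=5e-3):
    """Train a fresh linear head on backbone features; if finetune is True,
    also update the backbone, otherwise keep it frozen (head re-training)."""
    Wb, Wh = W_backbone.copy(), np.zeros((8, 1))
    for _ in range(steps):
        feats = X @ Wb
        err = feats @ Wh - y                          # prediction error
        if finetune:
            Wb -= lr * (X.T @ (err @ Wh.T) / len(X))  # backbone also updates
        Wh -= lr * (feats.T @ err / len(X))           # head always updates
    return float(np.mean((X @ Wb @ Wh - y) ** 2))     # target-domain MSE

head_only_loss = train(finetune=False)   # head re-training
finetune_loss = train(finetune=True)     # model fine-tuning
```

Head re-training touches far fewer parameters and so needs less target data, while fine-tuning can adapt the learned representations themselves; the experiments in Section 4 compare exactly this trade-off for RF TL.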
Outside of observing the inability of pre-trained RFML models to generalize to new
domains/tasks [4,25,26], little-to-no work has examined what characteristics within RF data
facilitate or restrict transfer [8]. Without such knowledge, TL algorithms for RFML are generally
restricted to those borrowed from other modalities, such as CV and NLP. While correlations can
be drawn between the vision or language spaces and the RF space, these parallels do not always
align, and therefore algorithms designed for CV and NLP may not always be appropriate for
use in RFML. For example, while CV algorithm performance is not significantly impacted
by a change in the camera(s) used to collect data, so long as the image resolution remains
consistent [27], work in [4] showed that a change in transmitter/receiver pairs negatively
impacted performance by as much as 7%, despite the collection parameters and even the
brand/models of transmitters/receivers remaining consistent. Therefore, platform adaptation
techniques that transfer knowledge gleaned from one hardware platform (or set of platforms)
to a second hardware platform (or set of platforms) are a necessity in RFML, but not in CV.
3. Methodology
This section presents the experimental setup used in this work, shown in Fig. 4, which
includes the data and dataset creation process and the model architecture, training, and
evaluation, each described in detail in the following subsections.
3.1. Dataset Creation
This work used a custom synthetic dataset generation tool based on the open-source signal
processing library liquid-dsp [28], which allowed for full control over the chosen parameters-of-interest
(SNR, FO, and modulation type) and ensured accurate labelling of the training,
validation, and test data. The dataset creation process, shown in Fig. 4a, begins with the
construction of a large “master” dataset containing all modulation schemes and combinations
of SNR and FO needed for the experiments performed (Section 3.1.2). Then, for each experiment
performed herein, subsets of the data were selected from the master dataset using configuration
files containing the desired metadata parameters (Sections 3.1.3–3.1.6). The master dataset is
publicly available on IEEE DataPort [29].
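The subset-selection step described above can be illustrated as a simple metadata filter. A hypothetical sketch, assuming a list-of-dicts metadata layout and example field names (the actual tool's file formats are not specified here):

```python
# Hypothetical metadata records for examples in a "master" dataset.
master = [
    {"id": 0, "mod": "bpsk", "snr_db": 10, "fo": 0.00},
    {"id": 1, "mod": "qpsk", "snr_db": 10, "fo": 0.01},
    {"id": 2, "mod": "bpsk", "snr_db": 0,  "fo": 0.00},
]

# A per-experiment "configuration file" stating the desired metadata values.
config = {"mod": {"bpsk", "qpsk"}, "snr_db": {10}}

# Select the subset whose metadata matches every constraint in the config.
subset = [ex for ex in master
          if all(ex[key] in allowed for key, allowed in config.items())]
```

Because every example is drawn from one master dataset, the training, validation, and test subsets for different experiments stay mutually consistent in their generation parameters and labels.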
3.1.1. Simulation Environment
All data used in this work was generated using the same noise generation, signal parameters,
and signal types as in [23]. More specifically, in this work, the signal space has been
restricted to the 23 signal types shown in Table 1, observed at complex baseband in the form of
discrete time-series signals, s[t], where

s[t] = α_Δ[t] · α[t] · e^(j(ω[t] + θ[t])) · e^(j(ω_Δ[t] + θ_Δ[t])) + ν[t]    (1)

α[t], ω[t], and θ[t] are the magnitude, frequency, and phase of the signal at time t, and ν[t]
is the additive interference from the channel. Any values subscripted with a Δ represent
imperfections/offsets caused by the transmitter/receiver and/or synchronization. Without
loss of generality, all offsets caused by hardware imperfections or lack of synchronization have
been consolidated onto the transmitter during simulation.
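The signal model above can be sketched directly in NumPy. A minimal illustration, assuming a single complex tone for the underlying signal and complex white Gaussian noise for ν[t] (the actual tool generates 23 distinct signal types):

```python
import numpy as np

def simulate(n, snr_db, fo, rng):
    """Generate n samples of s[t] = a_d[t]*a[t]*exp(j(w[t]+th[t]))
    * exp(j(w_d[t]+th_d[t])) + v[t], with all hardware offsets (the
    Delta-subscripted terms) consolidated onto the transmitter."""
    t = np.arange(n)
    alpha = np.ones(n)               # signal magnitude, a[t]
    phase = 2 * np.pi * 0.05 * t     # w[t]*t + th[t]: a simple tone
    alpha_d = 1.0                    # tx gain imperfection, a_d[t]
    phase_d = 2 * np.pi * fo * t     # w_d[t]*t: frequency offset
    clean = alpha_d * alpha * np.exp(1j * phase) * np.exp(1j * phase_d)
    # Additive channel interference v[t], scaled to the requested SNR.
    noise_power = np.mean(np.abs(clean) ** 2) / 10 ** (snr_db / 10)
    noise = rng.normal(scale=np.sqrt(noise_power / 2), size=(n, 2))
    return clean + noise[:, 0] + 1j * noise[:, 1]

s = simulate(4096, snr_db=10, fo=0.01, rng=np.random.default_rng(0))
```

Sweeping snr_db scales ν[t] (environment adaptation), while sweeping fo perturbs ω_Δ[t] (platform adaptation), mirroring the two axes of the domain adaptation experiments.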