An Analysis of RF Transfer Learning Behavior Using Synthetic Data


Citation: Wong, L.J.; McPherson, S.; Michaels, A.J. An Analysis of RF Transfer Learning Behavior Using Synthetic Data. Preprints 2022, 1, 0. https://doi.org/
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
An Analysis of RF Transfer Learning Behavior
Using Synthetic Data
Lauren J. Wong 1,2,3, Sean McPherson 3, and Alan J. Michaels 1,2
1 Hume Center for National Security and Technology, Virginia Tech
2 Bradley Department of Electrical and Computer Engineering, Virginia Tech
3 Intel AI Lab, Santa Clara, CA
* Correspondence: ljwong@vt.edu
Abstract:
Transfer learning (TL) techniques, which leverage prior knowledge gained from data with
different distributions to achieve higher performance and reduced training time, are often used in
computer vision (CV) and natural language processing (NLP), but have yet to be fully utilized in the field
of radio frequency machine learning (RFML). This work systematically evaluates radio frequency (RF)
TL behavior by examining how the training domain and task, characterized by the transmitter/receiver
hardware and channel environment, impact RF TL performance for an example automatic modulation
classification (AMC) use-case. Through exhaustive experimentation using carefully curated synthetic
datasets with varying signal types, signal-to-noise ratios (SNRs), and frequency offsets (FOs), generalized
conclusions are drawn regarding how best to use RF TL techniques for domain adaptation and sequential
learning. Consistent with trends identified in other modalities, results show that RF TL performance is
highly dependent on the similarity between the source and target domains/tasks. Results further illustrate
the impacts of channel environment, hardware variations, and domain/task difficulty on RF TL performance,
and compare RF TL performance using head re-training and model fine-tuning methods.
Keywords: machine learning, deep learning, transfer learning, radio frequency machine learning
1. Introduction
Radio frequency machine learning (RFML) is loosely defined as the application of deep
learning (DL) to raw RF data, and has yielded state-of-the-art algorithms for spectrum awareness,
cognitive radio, and networking tasks. Existing RFML works have delivered increased
performance and flexibility, and reduced the need for pre-processing and expert-defined
feature extraction techniques. As a result, RFML is expected to enable greater efficiency, lower
latency, and better spectrum efficiency in 6G systems [1]. However, to date, little research has
considered and evaluated the performance of these algorithms in the presence of changing
hardware platforms and channel environments, adversarial contexts, or resource constraints
that are likely to be encountered in real-world systems [2].
Current state-of-the-art RFML techniques rely upon supervised learning techniques
trained from random initialization, and thereby assume the availability of a large corpus
of labeled training data (synthetic, captured, or augmented [3]), which is representative of
the anticipated deployed environment. Over time, this assumption inevitably breaks down
as a result of changing hardware and channel conditions, and as a consequence, performance
degrades significantly [4,5]. TL techniques can be used to mitigate these performance degradations
by using prior knowledge obtained from a source domain and task, in the form of learned
representations, to improve performance on a “similar” target domain and task using less data,
as depicted in Fig. 1.
Though TL techniques have demonstrated significant benefits in fields such as CV and
NLP [6], including higher performing models, significantly less training time, and far fewer
training samples [7], [8] showed that the use of TL in RFML is currently lacking through the
arXiv:2210.01158v1 [eess.SP] 3 Oct 2022
(a) Traditional Machine Learning (b) Transfer Learning
Figure 1. In traditional machine learning (ML) (Fig. 1a), a new model is trained from random initialization for each domain/task pairing. TL (Fig. 1b) utilizes prior knowledge learned on one domain/task, in the form of a pre-trained model, to improve performance on a second domain and/or task. A concrete example for environmental adaptation to SNR is given in blue.
Figure 2. A system overview of the RF hardware and channel environment simulated in this work with the parameters/variables (α_Δ[t], ω_Δ[t], θ_Δ[t], ν[t], ω[t]) that each component of the system has the most significant impact on.
construction of an RFML-specific TL taxonomy. This work begins to address current limitations
in understanding how the training domain and task impact learned behavior and therefore
facilitate or prevent successful transfer, where the training domain is characterized by the
RF hardware and the channel environment [8], depicted in Fig. 2, and the training task is
the application being addressed, including the range of possible outputs (i.e., the modulation
schemes classified). More specifically, this work systematically evaluates RF TL performance,
as measured by post-transfer top-1 accuracy, as a function of several parameters of interest
for an AMC use-case [4] using synthetic datasets. First, RF domain adaptation performance is
examined as a function of:
- SNR, which represents an environment adaptation problem characterized by a change in the RF channel environment (i.e., an increase/decrease in the additive interference, ν[t], of the channel) and/or transmitting devices (i.e., an increase/decrease in the magnitude, α[t], of the transmitted signal),
- FO, which represents a platform adaptation problem characterized by a change in the transmitting and/or receiving devices (i.e., an increase/decrease in ω_Δ[t] due to hardware imperfections or a lack of synchronization), and
- Both SNR and FO, representing an environment platform co-adaptation problem characterized by a change in both the RF channel environment and the transmitting/receiving devices.
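The three domain adaptation scenarios above amount to parameter sweeps over source/target (SNR, FO) pairs. A minimal Python sketch of how such sweeps enumerate, using illustrative grid values rather than the paper's actual ones:

```python
from itertools import product

# Illustrative SNR/FO grids -- placeholders, not the paper's actual values.
snr_db_values = [0, 5, 10]        # channel environment parameter
fo_values = [0.0, 0.01, 0.02]     # normalized frequency offset

# Environment adaptation: only the SNR differs between source and target.
env_pairs = [(src, tgt) for src, tgt in product(snr_db_values, repeat=2)
             if src != tgt]

# Platform adaptation: only the FO differs between source and target.
plat_pairs = [(src, tgt) for src, tgt in product(fo_values, repeat=2)
              if src != tgt]

# Environment platform co-adaptation: both SNR and FO differ.
co_pairs = [(src, tgt)
            for src in product(snr_db_values, fo_values)
            for tgt in product(snr_db_values, fo_values)
            if src[0] != tgt[0] and src[1] != tgt[1]]
```

Exhaustive sweeps of this form, over all source/target combinations of interest, are what drive the dataset and model counts reported in the text.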
Parameter sweeps over these three scenarios address each type of RF domain adaptation
discussed in the RFML TL taxonomy [8], and resulted in the construction of 81 training sets, 81
validation sets, and 81 test sets and the training and evaluation of 4360 models. Additionally,
RF sequential learning performance is evaluated across broad categories of modulation types,
namely linear, frequency-shifted, and analog modulation schemes, as well as in a successive
model refinement scenario, where a single modulation type is added/removed from the source
dataset. These experiments resulted in an additional 17 training sets, 17 validation sets, and 17
test sets, and the training and evaluation of 304 models. From these experiments, we identify a
number of practical takeaways for how best to utilize TL in an RFML setting including how
changes in SNR and FO impact the difficulty of AMC and a comparison of head re-training
versus fine-tuning for RF TL. These takeaways serve as initial guidelines for RF TL, subject
to further experimentation using additional signal types, channel models, use-cases, model
architectures, and augmented or captured datasets.
This paper is organized as follows: Section 2 provides requisite background knowledge of
TL and RFML. In Section 3, each of the key methods and systems used and developed for this
work is described in detail, including the simulation environment and dataset creation, as well
as the model architecture and training. Section 4 presents experimental results and analysis,
addressing the key research questions described above. Finally, Section 5 offers conclusions
about the effectiveness of TL for RFML and next steps for incorporating and extending TL
techniques in RFML-based research. A list of the acronyms used in this work is provided in the
appendix for reference.
2. Background
The following subsections provide an overview of RFML, TL, and TL for RFML to provide
context for the work performed herein.
2.1. Radio Frequency Machine Learning (RFML)
The term RFML is often used in the literature to describe any application of machine
learning (ML) or DL to the RF domain. However, RFML was coined by the Defense Advanced
Research Projects Agency (DARPA) and defined as systems that:
- Autonomously learn features from raw data to detect, characterize, and identify signals-of-interest,
- Can autonomously configure RF sensors or communications platforms for changing communications environments, and
- Can synthesize “any possible waveform” [9].
Therefore, RFML algorithms typically utilize raw RF data as input to ML/DL techniques, most
often deep neural networks (DNNs).
To date, most RFML research has focused on delivering state-of-the-art performance on
spectrum awareness and cognitive radio tasks, whether through increased accuracy, increased
adaptability, or the use of less expert knowledge. Such spectrum awareness and cognitive radio
tasks include signal detection, signal classification or AMC, specific emitter identification (SEI),
channel modeling/emulation, positioning/localization, and spectrum anomaly detection [2].
One of the most common and arguably the most mature spectrum awareness or cognitive radio
applications explored in the literature is AMC, and as such, AMC is the example use-case in
this work. AMC is the task of identifying the type or format of a detected signal, and is a
key step in receiving RF signals. Traditional AMC techniques have typically consisted of an
expert-defined feature extraction stage and a pattern recognition stage using techniques such as
decision trees, support vector machines, and multi-layer perceptrons (MLPs) [10]. RFML-based
approaches aim to both automatically learn and identify key features within signals-of-interest,
as well as utilize those features to classify the signal, using only minimally pre-processed
raw RF as input to DNN architectures including convolutional neural networks (CNNs) and
recurrent neural networks (RNNs) [11].
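For concreteness, one classic expert-defined AMC feature of the kind such traditional pipelines extract is the fourth-order cumulant C42. A minimal NumPy sketch (the specific feature and constellations here are illustrative, not drawn from the paper):

```python
import numpy as np

def c42_feature(x):
    """Estimate the classic fourth-order cumulant C42 = M42 - |M20|^2 - 2*M21^2
    from complex baseband samples x (an expert-defined AMC feature)."""
    x = x - x.mean()                      # remove any residual DC offset
    m20 = np.mean(x * x)                  # second-order moment (no conjugate)
    m21 = np.mean(x * np.conj(x)).real    # second-order moment (signal power)
    m42 = np.mean(np.abs(x) ** 4)         # fourth-order moment M42 = E[|x|^4]
    return float(m42 - np.abs(m20) ** 2 - 2 * m21 ** 2)

rng = np.random.default_rng(0)
bpsk = rng.choice([-1.0, 1.0], size=4096).astype(complex)
qpsk = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=4096) / np.sqrt(2)
# C42 separates constellations: roughly -2 for BPSK, -1 for QPSK (unit power).
```

A pattern recognition stage (e.g., a decision tree or MLP) would then classify over a vector of such statistics, whereas the RFML-based approaches described above learn their features directly from the raw IQ samples.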
Figure 3. The RF TL taxonomy proposed in [8].
2.2. Transfer Learning (TL) for RFML
As previously mentioned, TL aims to utilize prior knowledge gained from a source
domain/task to improve performance on a “similar” target domain/task, where training data
may be limited. The domain, D = {X, P(X)}, consists of the input data X and the marginal
probability distribution over the data P(X). Meanwhile, the task, T = {Y, P(Y|X)}, consists
of the label space Y, and the conditional probability distribution P(Y|X) learned from the
training data pairs {x_i, y_i} such that x_i ∈ X and y_i ∈ Y. In the context of RFML, the domain is
characterized by the RF hardware and channel environments (i.e., In-phase/Quadrature (IQ)
imbalance, non-linear distortion, SNR, multi-path effects), and the task is the application being
addressed, including the range of possible outputs (i.e., n-class AMC, SEI, SNR estimation).
Recent work presented the RF-specific TL taxonomy shown in Fig. 3 [8], adapted from
the general TL taxonomy of [12] and the NLP-specific taxonomy of [6]. Per this taxonomy, RF
TL is categorized by training data availability and whether or not the source and target tasks
differ:
- Domain adaptation is the setting in which source and target tasks are the same, but the source and target domains differ, and can be further categorized as
  - Environment adaptation, where the channel environment is changing, but the transmitter/receiver pair(s) are constant,
  - Platform adaptation, where the transmitter/receiver hardware is changing, but the channel environment is consistent, and
  - Environment platform co-adaptation, where changes in both the channel environment and transmitter/receiver hardware must be overcome.
- Multi-task learning is the setting in which different source and target tasks are learned simultaneously.
- Sequential learning is the setting in which a source task is learned first, and the target task, different from the source task, is learned during a second training phase.
Typically, the same training techniques are used to perform both domain adaptation and
sequential learning, most commonly head re-training and model fine-tuning, which are the
focus of this work. Existing works have successfully utilized such techniques to overcome
changes in channel environment [13,14] and wireless protocol [15,16], to transfer from synthetic
data to captured data [17–20], and to add or remove output classes [21], for a variety of RFML
use-cases. Meanwhile, multi-task learning approaches tend to utilize more than one loss term
during a single training phase, and have been more commonly used in the context of ML-enabled
wireless communications systems that use expert-defined features rather than raw RF data
as input. However, multi-task learning techniques have been used to facilitate end-to-end
communications systems [22], as well as to improve the explainability and accuracy of RFML
models [23,24]. A systematic examination and evaluation of multi-task learning performance is
left for future work.
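The two techniques emphasized above, head re-training and model fine-tuning, differ only in whether the pre-trained layers are frozen during the second training phase. A toy NumPy sketch of that distinction, using a linear stand-in rather than the CNNs used in this work:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for illustration only -- a "pre-trained" backbone matrix
# maps inputs to features, and a small labeled target-domain set is given.
W_backbone = rng.normal(size=(16, 8))          # learned on the source domain
X = rng.normal(size=(64, 16))                  # target-domain inputs
y = X @ W_backbone @ rng.normal(size=(8, 1))   # target-domain labels

def train(finetune, steps=400, lr=5e-3):
    """Train a fresh linear head on backbone features; if finetune is True,
    also update the backbone, otherwise keep it frozen (head re-training)."""
    Wb, Wh = W_backbone.copy(), np.zeros((8, 1))
    for _ in range(steps):
        feats = X @ Wb
        err = feats @ Wh - y                          # prediction error
        if finetune:
            Wb -= lr * (X.T @ (err @ Wh.T) / len(X))  # backbone also updates
        Wh -= lr * (feats.T @ err / len(X))           # head always updates
    return float(np.mean((X @ Wb @ Wh - y) ** 2))     # target-domain MSE

head_only_loss = train(finetune=False)   # head re-training
finetune_loss = train(finetune=True)     # model fine-tuning
```

Head re-training touches far fewer parameters and so needs less target data, while fine-tuning can adapt the learned representations themselves; the experiments in Section 4 compare exactly this trade-off for RF TL.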
Outside of observing the inability of pre-trained RFML models to generalize to new
domains/tasks [4,25,26], little-to-no work has examined what characteristics within RF data
facilitate or restrict transfer [8]. Without such knowledge, TL algorithms for RFML are generally
restricted to those borrowed from other modalities, such as CV and NLP. While correlations can
be drawn between the vision or language spaces and the RF space, these parallels do not always
align, and therefore algorithms designed for CV and NLP may not always be appropriate for
use in RFML. For example, while CV algorithm performance is not significantly impacted
by a change in the camera(s) used to collect data, so long as the image resolution remains
consistent [27], work in [4] showed that a change in transmitter/receiver pairs negatively
impacted performance by as much as 7%, despite the collection parameters and even the
brand/models of transmitters/receivers remaining consistent. Therefore, platform adaptation
techniques that transfer knowledge gleaned from one hardware platform (or set of platforms)
to a second hardware platform (or set of platforms) are a necessity in RFML, but not in CV.
3. Methodology
This section presents the experimental setup used in this work, shown in Fig. 4, which
includes the data and dataset creation process and the model architecture, training, and
evaluation, each described in detail in the following subsections.
3.1. Dataset Creation
This work used a custom synthetic dataset generation tool based on the open-source signal
processing library liquid-dsp [28], which allowed for full control over the chosen parameters-of-interest
(SNR, FO, and modulation type) and ensured accurate labelling of the training,
validation, and test data. The dataset creation process, shown in Fig. 4a, begins with the
construction of a large “master” dataset containing all modulation schemes and combinations
of SNR and FO needed for the experiments performed (Section 3.1.2). Then, for each experiment
performed herein, subsets of the data were selected from the master dataset using configuration
files containing the desired metadata parameters (Sections 3.1.3–3.1.6). The master dataset is
publicly available on IEEE DataPort [29].
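The subset-selection step described above can be illustrated as a simple metadata filter. A hypothetical sketch, assuming a list-of-dicts metadata layout and example field names (the actual tool's file formats are not specified here):

```python
# Hypothetical metadata records for examples in a "master" dataset.
master = [
    {"id": 0, "mod": "bpsk", "snr_db": 10, "fo": 0.00},
    {"id": 1, "mod": "qpsk", "snr_db": 10, "fo": 0.01},
    {"id": 2, "mod": "bpsk", "snr_db": 0,  "fo": 0.00},
]

# A per-experiment "configuration file" stating the desired metadata values.
config = {"mod": {"bpsk", "qpsk"}, "snr_db": {10}}

# Select the subset whose metadata matches every constraint in the config.
subset = [ex for ex in master
          if all(ex[key] in allowed for key, allowed in config.items())]
```

Because every example is drawn from one master dataset, the training, validation, and test subsets for different experiments stay mutually consistent in their generation parameters and labels.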
3.1.1. Simulation Environment
All data used in this work was generated using the same noise generation, signal parameters,
and signal types as in [23]. More specifically, in this work, the signal space has been
restricted to the 23 signal types shown in Table 1, observed at complex baseband in the form of
discrete time-series signals, s[t], where

s[t] = α_Δ[t] · α[t] · e^(j(ω[t] + θ[t])) · e^(j(ω_Δ[t] + θ_Δ[t])) + ν[t]    (1)

α[t], ω[t], and θ[t] are the magnitude, frequency, and phase of the signal at time t, and ν[t]
is the additive interference from the channel. Any values subscripted with a Δ represent
imperfections/offsets caused by the transmitter/receiver and/or synchronization. Without
loss of generality, all offsets caused by hardware imperfections or lack of synchronization have
been consolidated onto the transmitter during simulation.
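The signal model above can be sketched directly in NumPy. A minimal illustration, assuming a single complex tone for the underlying signal and complex white Gaussian noise for ν[t] (the actual tool generates 23 distinct signal types):

```python
import numpy as np

def simulate(n, snr_db, fo, rng):
    """Generate n samples of s[t] = a_d[t]*a[t]*exp(j(w[t]+th[t]))
    * exp(j(w_d[t]+th_d[t])) + v[t], with all hardware offsets (the
    Delta-subscripted terms) consolidated onto the transmitter."""
    t = np.arange(n)
    alpha = np.ones(n)               # signal magnitude, a[t]
    phase = 2 * np.pi * 0.05 * t     # w[t]*t + th[t]: a simple tone
    alpha_d = 1.0                    # tx gain imperfection, a_d[t]
    phase_d = 2 * np.pi * fo * t     # w_d[t]*t: frequency offset
    clean = alpha_d * alpha * np.exp(1j * phase) * np.exp(1j * phase_d)
    # Additive channel interference v[t], scaled to the requested SNR.
    noise_power = np.mean(np.abs(clean) ** 2) / 10 ** (snr_db / 10)
    noise = rng.normal(scale=np.sqrt(noise_power / 2), size=(n, 2))
    return clean + noise[:, 0] + 1j * noise[:, 1]

s = simulate(4096, snr_db=10, fo=0.01, rng=np.random.default_rng(0))
```

Sweeping snr_db scales ν[t] (environment adaptation), while sweeping fo perturbs ω_Δ[t] (platform adaptation), mirroring the two axes of the domain adaptation experiments.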