How can a Radar Mask its Cognition?
Kunal Pattanayak, Student Member, IEEE, Vikram Krishnamurthy, Fellow, IEEE and Christopher Berry
Abstract—A cognitive radar is a constrained utility maximizer
that adapts its sensing mode in response to a changing envi-
ronment. If an adversary can estimate the utility function of a
cognitive radar, it can determine the radar’s sensing strategy and
mitigate the radar performance via electronic countermeasures
(ECM). This paper discusses how a cognitive radar can hide
its strategy from an adversary that detects cognition. The
radar does so by transmitting purposefully designed sub-optimal
responses to spoof the adversary’s Neyman-Pearson detector.
We provide theoretical guarantees by ensuring the Type-I error
probability of the adversary’s detector exceeds a pre-defined level
for a specified tolerance on the radar’s performance loss. We
illustrate our cognition masking scheme via numerical examples
involving waveform adaptation and beam allocation. We show
that small purposeful deviations from the optimal strategy of
the radar confuse the adversary by significant amounts, thereby
masking the radar’s cognition. Our approach uses novel ideas
from revealed preference in microeconomics and adversarial
inverse reinforcement learning. Our proposed algorithms provide
a principled approach for system-level electronic counter-countermeasures
(ECCM) to mask the radar’s cognition, i.e.,
hide the radar’s strategy from an adversary. We also provide
performance bounds for our cognition masking scheme when the
adversary has misspecified measurements of the radar’s response.
Index Terms—Cognitive Radar, Meta-cognition, Revealed Pref-
erence, Inverse Reinforcement Learning, Electronic Counter
Countermeasures, Bayesian Tracker, Afriat’s Theorem
I. INTRODUCTION
In abstract terms, a cognitive radar is a constrained utility
maximizer with multiple sets of utility functions and con-
straints that allow the radar to deploy different strategies
depending on changing environments. Cognitive radars adapt
their waveform scheduling and beam allocation by optimiz-
ing their utility functions in different situations. If a smart
adversary can estimate the utility function or constraints of
the cognitive radar, then it can exploit this information to
mitigate the radar’s performance (e.g., jam the radar with
purposefully designed interference). A natural question is: how
can a cognitive radar hide its cognition from an adversary?
Put simply, how can a smart sensor hide its strategy by
acting dumb? We term this cognition-masking functionality as
meta-cognition.¹ A meta-cognitive radar [1] switches between
multiple objectives (plans) to maintain stealth; for example, it
can switch between the conflicting objectives of maximizing
the signal-to-noise ratio of a target and maximizing the privacy of
its plan.
Short versions containing partial results appear in the IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022,
International Conference of Information Fusion (FUSION), 2022 and IEEE
International Conference on Decision and Control (CDC), 2022.
V. Krishnamurthy and K. Pattanayak are with the School of Electrical and
Computer Engineering, Cornell University, Ithaca, New York, 14853 USA.
e-mail: vikramk@cornell.edu, kp487@cornell.edu. C. Berry is with Lockheed
Martin Advanced Technology Laboratories, Cherry Hill, NJ, 08002 USA.
e-mail: christopher.m.berry@lmco.com. This research was supported in part by
a research contract from Lockheed Martin, the Army Research Office grant
W911NF-21-1-0093 and the Air Force Office of Scientific Research grant
FA9550-22-1-0016.
¹ “Meta-cognition” [1] is used to describe a sensing platform that switches
between multiple objectives (constrained utility functions).
A meta-cognitive radar pays a penalty for stealth: it deliberately
transmits sub-optimal responses to keep its strategy
hidden from the adversary, which degrades its performance.
This paper investigates how a cognitive radar can hide its
strategy when the adversary observes the radar’s responses.
Our meta-cognition results are inspired by privacy-preserving
mechanisms in differential privacy and adversarial obfuscation
in deep learning with related works discussed below. Although
this paper is radar-centric, we emphasize that the problem
formulation and algorithms also apply to adversarial inverse
reinforcement learning in general machine learning applica-
tions, namely, how to purposefully choose suboptimal actions
to hide a strategy.
Related Works
Cognitive radars are widely studied [2], [3]. More recently,
our papers [4], [5] deal with inverse reinforcement learning
(IRL) algorithms for cognitive radars, namely, how can an ad-
versary estimate the utility function of a cognitive radar by ob-
serving its decisions. Reconstructing a decision maker’s utility
function by observing its actions is the main focus of IRL [6],
[7], [8] in machine learning and revealed preference [9], [10]
in micro-economics literature. In the radar literature, such IRL
based adversarial actions to mitigate the radar’s operations
are called electronic countermeasures (ECM) [11], [12], [4].
This paper builds on [4], [5] and develops electronic counter-
countermeasures (ECCM) [13], [14], [15] to mitigate ECM.
This paper assumes that the adversary’s ECM is unaware of whether the
radar has ECCM capability, which is consistent with the state-of-the-art
ECCM literature. The central theme of this paper is
to apply results from revealed preference in micro-economics
theory [9], [16]. To the best of our knowledge, this approach
for ECCM to hide cognition is novel.
Several works in the literature [17], [18], [19] highlight how an
adversary benefits from learning the radar’s utility function. In
[17], the adversary optimizes its probes to increase the power
of its statistical hypothesis test for utility maximization. [18],
[19] show how revealed preference-based IRL techniques can
be used to manipulate consumer behavior.
In the radar context, [20] uses the Laplacian mechanism
for meta-cognition; the cognitive radar anonymizes its trajec-
tories via additive Laplacian noise. In our cognition masking
approach, the radar mitigates adversarial IRL via purposeful
perturbations from optimal strategy, where the perturbations
are computed via stochastic gradient algorithms (see Algo-
rithm 2 in Sec. IV-B).
arXiv:2210.11444v1 [eess.SP] 20 Oct 2022
Context
Radar Design Paradigm: Cognition Masking vs LPI. Low-
probability-of-intercept (LPI) radars [21], [22], [23] achieve
stealth by minimizing the probability of the radar signals being
detected by an adversarial target. Our rationale for stealth in
this paper is at a higher level of abstraction than classical
LPI. The cognitive radar’s aim is to confuse the adversary’s
detector, i.e., ensure the adversary incorrectly reconstructs the
radar’s strategy with high probability.
System level ECCM vs Pulse level ECCM. Our cogni-
tion masking algorithm is implemented at the system level
(Bayesian tracker level) and not the pulse level (Wiener filter
level). Pulse-level ECCM [24], [25], [26] accomplishes LPI-
type functionalities for cognitive radars. Cognition masking
hides the radar’s strategy from the adversary instead of mit-
igating the adversary’s detection of the radar’s transmission.
Hence, cognition masking ECCM is deployed at a higher level
of abstraction than pulse level ECCM.
Hiding Cognition against Optimal IRL vs Sub-optimal IRL.
The cognition masking results in this paper assume the adver-
sary performs optimal IRL using Afriat’s theorem [9], [16].
Afriat’s theorem achieves optimal IRL for non-parametric util-
ity estimation of a cognitive radar as it generates a polytope of
all viable utilities that rationalizes a finite dataset of adversarial
probes and radar responses. However, our cognition masking
results can be extended to any potentially sub-optimal IRL
algorithm that generates a set-valued estimate of the radar’s
utility, as long as the radar has knowledge of the IRL algorithm
being used by the adversary. Algorithm 3 in the appendix
outlines how a cognitive radar can mask its cognition for
an arbitrary IRL algorithm. At an abstract level, cognition
masking simply obfuscates a set-valued mapping from the
adversary’s dataset to a set of feasible utilities by intelligently
distorting the radar’s responses and hence, is not affected by
the optimality of adversarial IRL.
At a deeper level, this paper quantifies cognition
masking performance when the adversary has misspecified
measurements of the radar’s response, and performs sub-
optimal IRL. Theorem 8 (in appendix) provides performance
guarantees for cognition masking when the radar does not
know the misspecification errors and provides a bound on
the cognitive masking performance in terms of the error
magnitude.
Why not an MDP or non-cooperative game?
In machine learning based IRL [6], [8], [27], the aim
is to reconstruct the rewards of a Markov decision process
(MDP) subject to entropic constraints on the policy. This
requires complete knowledge of the transition dynamics of
the adversary’s probes. In comparison, our radar-adversary
interaction is batch-wise - the adversary transmits a batch
of probe signals, and then the radar responds with a batch
of responses. This non-parametric identification of the radar’s
strategy is agnostic to transition dynamics in the adversary’s
probes. Hence, a static utility maximization setup is more
realistic for IRL and inverse IRL involving cognitive radar.
We consider a radar-adversary interaction where the adver-
sary is not aware of the radar’s cognition masking strategy.
A more general formulation is a Stackelberg game between
the radar and the adversary, with the adversary as the leader
and the radar as the follower. However, such an approach for
computing the optimal meta-cognition strategy for the radar
is ill-posed since the existence of a pure and unique Nash
equilibrium is not guaranteed. Finally, from an inverse game
theoretic perspective, identifying if the radar-adversary behav-
ior is consistent with Nash equilibrium is intractable since the
analyst needs to know both the radar’s and adversary’s utility
function. Addressing these issues is beyond the scope of this
paper, and the subject of future work.
Outline and Organization of Results
(i) Background. Inverse reinforcement learning (IRL): In
Sec. II, we formulate the interaction between a cognitive
radar and an adversary target. We review the main idea of
revealed preference-based adversarial IRL algorithms, namely,
Theorems 1 and 5 used by the adversarial target to reconstruct
the radar’s strategy from its actions. Then we outline two
examples, namely waveform adaptation and beam allocation.
Theorem 6 stated in Appendix F extends adversarial IRL to
the case where the cognitive radar faces multiple constraints.
Theorem 6 is omitted from the main text for readability.
(ii) Masking Radar’s Strategy from Adversarial IRL: Sec. III
contains our main meta-cognition result, namely, Theorem 2
for mitigating adversarial IRL by masking the radar’s strategy.
The key idea is for the radar to deliberately deviate from its
optimal (naive) response to ensure:
(1) its true strategy almost fails to rationalize its perturbed
responses (masked from adversarial IRL), and
(2) its performance degradation due to sub-optimal responses
does not exceed a particular threshold.
Theorem 7 in Appendix F extends Theorem 2 to the case where the cognitive
radar has multiple constraints. Theorem 8 provides performance
bounds on the cognition masking scheme of Theorem 2
when the adversary has misspecified measurements of the
radar’s response.
(iii) Masking Radar’s Strategy from Adversarial IRL Detectors
in noise. Sec. IV extends our IRL and cognition masking
results to the case where the adversary has noisy measurements
of the radar’s response. First, we define IRL detectors (Defini-
tion 5) that detect radar’s cognition in noise. Then, we enhance
our cognition masking scheme of Theorem 2 to mitigate the
IRL detectors. The radar’s cognition masking objective now is
to maximize the detectors’ conditional Type-I error probability,
subject to a bound on its deliberate performance degradation.
(iv) Numerical illustration of masking cognition by meta-
cognitive radars. Sec. V illustrates our meta-cognition results
on two target tracking functionalities, namely, waveform adap-
tation and beam allocation. Our numerical experiments show
that the meta-cognition algorithms in this paper can effectively
mask both the radar’s utility function and resource constraint
when the cognitive radar is probed by the adversarial target.
Our main finding is that a small deliberate performance loss of
the meta-cognitive radar suffices to mask the radar’s strategy
from the adversary to a large extent.
II. BACKGROUND. IRL TO ESTIMATE COGNITIVE RADAR
Since this paper investigates how to construct a cognitive
radar that hides its utility from an adversarial IRL system, this
section gives the background on how an adversarial system can
use IRL to estimate the radar’s utility. An important aspect
of the IRL framework below is that it is a necessary and
sufficient condition for identifying cognition (utility maximiza-
tion behavior); hence it can be considered as an optimal IRL
scheme. Appendices G and H discuss cognition masking when
the adversary performs sub-optimal IRL.
A. Radar-Adversary Dynamics
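Definition 1 (Radar-Target Interaction). The cognitive radar-adversary
interaction has the following dynamics:
\begin{equation}
\begin{aligned}
\text{target probe: } & \alpha_k \in \mathbb{R}^d_+ \\
\text{radar action: } & \beta_k \in \mathbb{R}^d_+ \\
\text{target state: } & x_k = \{x_k(t),\ t = 1, 2, \ldots\}, \quad x_k(t+1) \sim p_{\alpha_k}(x \mid x_k(t)),\ x_0 \sim \pi_0 \\
\text{radar observation: } & y_k \sim p_{\beta_k}(y \mid x_k) \\
\text{radar tracker: } & \pi_k = T(\pi_{k-1}, y_k) \\
\text{observed radar action: } & \hat{\beta}_k = \beta_k + \omega_k,\ \omega_k \sim f_\omega
\end{aligned}
\tag{1}
\end{equation}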
Remarks. We now give examples for the abstract model (1).
1. A widely used example [28], [29] for the radar-adversary
dynamics model (1) is that of linear Gaussian dynamics for
target kinematics and linear Gaussian measurements:
\begin{equation}
\begin{aligned}
x_k(t+1) &= A x_k(t) + w_t(\alpha_k), \quad x_k(0) \sim \pi_0 = \mathcal{N}(\hat{x}_0, \Sigma_0), \\
y_k(t) &= C x_k(t) + v_t(\beta_k), \quad k = 1, 2, \ldots, K.
\end{aligned}
\tag{2}
\end{equation}
Here $x_k(t) \in \mathcal{X} = \mathbb{R}^X$ and $y_k(t) \in \mathcal{Y} = \mathbb{R}^Y$. $A$ is a
block diagonal matrix [30] when the target state represents
its position and velocity in Euclidean space. The variables
$w_t \sim \mathcal{N}(0, Q(\alpha_k))$ and $v_t \sim \mathcal{N}(0, R(\beta_k))$ are mutually
independent Gaussian noise processes.
2. In this paper, we are only concerned with the asymptotic
statistics of the radar tracker T in (1) for our cognition-masking
algorithms. One example is that of a Bayesian tracker (Kalman
filter) where the asymptotic covariance of the state estimate
is the unique positive semi-definite solution of the algebraic
Riccati equation (ARE). Other tracker examples include the
particle filter, interacting multiple model (IMM) filter etc.
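The asymptotic covariance mentioned in Remark 2 can be illustrated with a minimal sketch for a scalar version of model (2). The scalar setting, the function names, and the parameterization q = α, r = 1/β (motivated by the later use of α_k and β_k as eigenvalues of Q and R^{-1}) are assumptions made only for this illustration; they are not the tracker used in the paper.

```python
def asymptotic_covariance(a, c, q, r, tol=1e-12, max_iter=100_000):
    """Fixed point of the scalar Riccati recursion for the predicted covariance:
    P <- a^2 P - (a c P)^2 / (c^2 P + r) + q."""
    p = q  # any positive starting value works for this illustration
    for _ in range(max_iter):
        p_next = a * a * p - (a * c * p) ** 2 / (c * c * p + r) + q
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("Riccati iteration did not converge")


def are_residual(p, a, c, q, r):
    """Residual of the scalar algebraic Riccati equation at p (zero at a solution)."""
    return a * a * p - (a * c * p) ** 2 / (c * c * p + r) + q - p


# Hypothetical scalar example: probe alpha sets the state noise variance q,
# response beta sets the measurement precision 1/r.
alpha = 0.1
for beta in (0.5, 2.0):
    p_inf = asymptotic_covariance(a=1.0, c=1.0, q=alpha, r=1.0 / beta)
    print(f"beta={beta}: asymptotic covariance {p_inf:.4f}")
```

In this toy setting a larger β (a more accurate sensor) yields a smaller asymptotic covariance, which is the trade-off the radar's utility and constraint are meant to capture.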
We now proceed to define a cognitive radar which we
assume in this paper to be a constrained utility maximizer.
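Definition 2 (Cognitive Radar). Consider the radar-adversary
interaction dynamics of Definition 1. The cognitive radar
chooses its response $\beta_k^*$ in (1) at time $k$ by maximizing a utility
function $u(\alpha_k, \cdot)$ subject to the constraint $g(\alpha_k, \cdot) \le 0$:
\begin{equation}
\beta_k^* \in \operatorname{argmax}_{\beta}\ u(\alpha_k, \beta), \quad g(\alpha_k, \beta) \le 0.
\tag{3}
\end{equation}
We assume that $g(\cdot)$ is an increasing function of $\beta$.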
Remarks.
1. In the main text of this paper, we consider a single con-
straint. This is consistent with most works in cognitive radar
literature which also assume a single operating constraint. For
example, in [31], the cognitive radar is constrained by a bound
on the target dwell time (monotone in the time the radar spends
tracking each target). In [32], the radar’s constraint is a bound
on the receiver sensor processing cost (monotone in the radar’s
choice of sensor accuracy for target tracking). Hence, we only
consider the operating cost of the radar in the main text which
is reflected in the radar’s scalar-valued constraint $g$ in (3).
2. Multiple resource constraints. Our IRL methodology discussed
below can be extended to multiple resource constraints
($g$ is vector-valued). However, for readability, we only consider
a scalar-valued constraint $g$ in the main text of this paper.
We consider multiple resource constraints in Appendix F. The
notation for IRL and cognition masking results is complicated
for vector-valued $g$ and hence omitted from the main text.
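As a concrete instance of the constrained utility maximization (3), the sketch below computes the radar's response for a hypothetical utility u(β) = Σ_i sqrt(β(i)) under the linear constraint α'β ≤ 1. The choice of utility and the function names are assumptions for illustration only; the closed form follows from the first-order conditions of this particular problem.

```python
import math


def utility(beta):
    """Hypothetical concave, increasing radar utility (an assumption for illustration)."""
    return sum(math.sqrt(b) for b in beta)


def radar_response(alpha):
    """Maximizer of sum_i sqrt(beta_i) subject to alpha' beta <= 1, beta >= 0.

    Stationarity gives beta_i = 1 / (alpha_i^2 * S) with S = sum_j 1 / alpha_j,
    and the constraint is active at the optimum.
    """
    s = sum(1.0 / a for a in alpha)
    return [1.0 / (a * a * s) for a in alpha]


alpha_k = [1.0, 2.0]
beta_k = radar_response(alpha_k)
spent = sum(a * b for a, b in zip(alpha_k, beta_k))
print("response:", beta_k, "constraint value alpha'beta:", spent)
```

Because the utility is increasing in β, the response always exhausts the constraint, which is why the adversary's probe α_k effectively controls the radar's feasible set.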
B. Adversarial IRL for Identifying Strategy of Cognitive Radar
We now review the main results for adversarial IRL, namely,
how an adversary can identify and reconstruct the radar’s
strategy by observing the radar’s responses. The adversarial
IRL system is schematically shown in Fig. 1. The key idea
is to formulate the adversary’s task of identifying the radar’s
strategy as a linear feasibility problem in terms of the radar’s
responses. This paper considers two distinct scenarios in terms
of the dependency of the adversary’s probe $\alpha_k$ on the radar’s
utility $u$ and resource constraint $g$ in (3). The two scenarios are
formalized in Assumptions 1 and 2 below in our IRL results,
Theorems 1 and 5, and justified in Sec. II-C in the tracking
examples of waveform adaptation and beam allocation.
IRL for Identifying Utility Function
Theorem 1 below provides a set-valued reconstruction algorithm
to estimate the radar’s utility function when the adversary
controls the radar’s resource constraint. Such scenarios,
where the adversary knows the radar’s resource constraint, are
formalized below in Assumption 1:
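Assumption 1. The radar’s resource constraint $g(\cdot)$ in (3) is
linear in the adversary’s probe $\alpha_k$ and the radar’s utility $u(\cdot)$
is independent of $\alpha_k$:
\begin{equation}
g(\alpha_k, \beta) = \alpha_k'\beta - 1, \quad u(\alpha_k, \beta) \equiv u(\beta).
\tag{4}
\end{equation}
IRL objective. The adversary aims to reconstruct the radar’s
utility $u(\cdot)$ using the dataset $\mathcal{D}_g$, defined as:
\begin{equation}
\mathcal{D}_g = \{g(\alpha_k, \cdot), \beta_k\}_{k=1}^{K},
\tag{5}
\end{equation}
where $g(\alpha_k, \cdot)$ is defined in (4).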
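Let us now state Theorem 1 for achieving IRL when
Assumption 1 holds.
Theorem 1 (IRL for Identifying Radar’s Utility Function).
Consider the cognitive radar described in Definition 1. Suppose
Assumption 1 holds. Then:
(a) The adversary checks for the existence of a feasible utility
function that satisfies (3) by checking the feasibility of a set
of linear inequalities:
\begin{equation}
\exists\ \theta \in \mathbb{R}^{2K}_+ \ \text{s.t.}\ A(\theta, \mathcal{D}_g) \le 0
\iff \exists\ u \ \text{s.t.}\ \beta_k \in \operatorname{argmax}_{\beta}\ u(\beta),\ \alpha_k'\beta \le 1 \ \forall k,
\tag{6}
\end{equation}
where dataset $\mathcal{D}_g$ is defined in (5) and the set of inequalities
$A(\cdot) \le 0$ is defined in Appendix A.
(b) If $A(\cdot, \mathcal{D}_g) \le 0$ has a feasible solution, the set-valued IRL
estimate of the radar’s utility $u$ is given by:
\begin{equation}
\begin{aligned}
u_{\mathrm{IRL}}(\beta) &\equiv \{u_{\mathrm{IRL}}(\beta; \theta) : A(\theta, \mathcal{D}_g) \le 0\}, \\
u_{\mathrm{IRL}}(\beta; \theta) &= \min_{k \in \{1, 2, \ldots, K\}} \{\theta_k + \theta_{k+K}\, \alpha_k'(\beta - \beta_k)\}.
\end{aligned}
\tag{7}
\end{equation}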
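The reconstruction formula (7) can be sketched as follows. Since Appendix A is not reproduced in this section, the sketch does not solve the feasibility test; instead it assumes that A(θ, D_g) ≤ 0 denotes the standard Afriat inequalities and builds one feasible θ directly from a hypothetical true utility u(β) = Σ_i sqrt(β(i)), taking θ_k = u(β_k) and θ_{k+K} equal to the Lagrange multiplier at β_k. An adversary would not have access to this θ; it is used only to show what (7) returns.

```python
import math


def radar_response(alpha):
    # Closed-form maximizer of the hypothetical utility sum_i sqrt(beta_i)
    # subject to alpha' beta <= 1 (an assumption for illustration).
    s = sum(1.0 / a for a in alpha)
    return [1.0 / (a * a * s) for a in alpha]


def u_irl(beta, theta, probes, responses):
    """Evaluate (7): min over k of theta_k + theta_{k+K} * alpha_k'(beta - beta_k)."""
    K = len(probes)
    values = []
    for k in range(K):
        slack = sum(a * (b - bk) for a, b, bk in zip(probes[k], beta, responses[k]))
        values.append(theta[k] + theta[k + K] * slack)
    return min(values)


probes = [[1.0, 2.0], [2.0, 1.0], [1.0, 1.0]]
responses = [radar_response(alpha) for alpha in probes]

# Hypothetical feasible theta: utility levels sqrt(S_k) and multipliers sqrt(S_k)/2,
# where S_k = sum_i 1/alpha_k(i). These follow from the closed form above.
s_values = [sum(1.0 / a for a in alpha) for alpha in probes]
theta = [math.sqrt(s) for s in s_values] + [0.5 * math.sqrt(s) for s in s_values]

reconstructed = [u_irl(beta, theta, probes, responses) for beta in responses]
print("theta_k:", theta[:3])
print("u_IRL at the observed responses:", reconstructed)
```

The reconstructed utility is piecewise linear and concave, and it reproduces the utility levels θ_k at the observed responses; between the data points it is one of many utilities in the feasible polytope.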
Theorem 1 is well known in micro-economics as Afriat’s
theorem [9], [16] and widely used for set-valued estimation
of consumer utilities from logged offline data. In complete
analogy, the adversary also performs IRL on a batch of probe-
response exchanges with the cognitive radar to reconstruct
the radar’s utility.² Abstractly, Theorem 1 says that given a
finite dataset, the adversary can at best construct a polytope of
feasible strategies that rationalize the adversary’s dataset. The-
orem 1 achieves IRL when the radar faces a single operating
constraint. We discuss adversarial IRL for multiple resource
constraints in Theorem 6 in Appendix F. In the multi-constraint
case, the linear feasibility test of (6) generalizes to a
mixed-integer linear feasibility test that is linear in the
real-valued feasible variables.
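A minimal sketch of how the test in Theorem 1(a) might be carried out without a linear programming solver is given below. It relies on the classical equivalence in Afriat's theorem between feasibility of the Afriat inequalities and the generalized axiom of revealed preference (GARP), and assumes that A(θ, D_g) ≤ 0 is the standard form of those inequalities (Appendix A is not reproduced here). The function names and the two toy datasets are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def satisfies_garp(probes, responses, tol=1e-9):
    """Return True if the probe-response data can be rationalized by some utility.

    beta_k is directly revealed preferred to beta_j when alpha_k' beta_j <= alpha_k' beta_k.
    GARP fails if beta_k is (transitively) revealed preferred to beta_j while
    beta_k is strictly cheaper than beta_j under probe alpha_j.
    """
    K = len(probes)
    cost = [[dot(probes[k], responses[j]) for j in range(K)] for k in range(K)]
    pref = [[cost[k][j] <= cost[k][k] + tol for j in range(K)] for k in range(K)]
    # Warshall's algorithm for the transitive closure of the preference relation.
    for m in range(K):
        for k in range(K):
            if pref[k][m]:
                for j in range(K):
                    if pref[m][j]:
                        pref[k][j] = True
    for k in range(K):
        for j in range(K):
            if pref[k][j] and cost[j][k] < cost[j][j] - tol:
                return False
    return True


# Responses of a utility maximizer (u = sum of square roots, alpha' beta <= 1).
rational_probes = [[1.0, 2.0], [2.0, 1.0], [1.0, 1.0]]
rational_responses = [[2.0 / 3.0, 1.0 / 6.0], [1.0 / 6.0, 2.0 / 3.0], [0.5, 0.5]]

# Responses that no utility function can rationalize.
irrational_probes = [[1.0, 2.0], [2.0, 1.0]]
irrational_responses = [[0.0, 0.5], [0.5, 0.0]]

print(satisfies_garp(rational_probes, rational_responses))
print(satisfies_garp(irrational_probes, irrational_responses))
```

The check only answers the yes/no question in (6); recovering an explicit θ for the estimate (7) still requires solving the linear feasibility problem.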
IRL for Identifying Radar’s Resource Constraints
In certain scenarios, the utility of the radar is well known
(e.g., signal-to-noise ratio), but the operational constraints of
the radar are not known. We formalize such scenarios where
the adversary knows the radar’s utility function below as
Assumption 2:
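Assumption 2. The radar’s utility function $u(\cdot)$ in (3) is controlled
by the adversary’s probe $\alpha_k$, and the radar’s resource
constraint $g$ is independent of $\alpha_k$ and has the following form:
\begin{equation}
g(\alpha_k, \beta) \equiv g(\beta) - \gamma_k, \quad \gamma_k > 0,
\tag{8}
\end{equation}
where $\gamma_k$ and $g$ are independent of $\alpha_k$.
IRL objective. The adversary aims to reconstruct $g(\cdot)$ using
the dataset $\mathcal{D}_u$, defined as:
\begin{equation}
\mathcal{D}_u = \{u(\alpha_k, \cdot), \beta_k\}_{k=1}^{K}.
\tag{9}
\end{equation}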
IRL for estimating the radar resource constraints has the
same structure as that of Theorem 1 and is discussed in
the appendix. IRL for Assumption 2 is formally stated in
Theorem 5 in Appendix B and summarized below:
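\begin{equation}
\begin{aligned}
g_{\mathrm{IRL}}(\beta) &\equiv \{g_{\mathrm{IRL}}(\beta; \theta) : A(\theta, \mathcal{D}_u) \ge 0\}, \\
g_{\mathrm{IRL}}(\beta; \theta) &= \max_{k \in \{1, 2, \ldots, K\}} \{\theta_k + \theta_{K+k}\,(u(\alpha_k, \beta) - u(\alpha_k, \beta_k))\},
\end{aligned}
\tag{10}
\end{equation}
where $g_{\mathrm{IRL}}$ is the adversary’s set-valued estimate of the radar’s
constraint $g$, dataset $\mathcal{D}_u$ is defined in (9) and $\theta \in \mathbb{R}^{2K}_+$ is a
feasible vector with respect to the feasibility test $A(\cdot, \mathcal{D}_u) \ge 0$. Note how
the IRL feasibility inequalities in (10) are identical to those of
(6) in Theorem 1 but with the inequality direction reversed.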
² Afriat’s theorem with linear constraints (4) has been generalized to
non-linear monotone constraints in the literature [33]. For the radar context in this
paper, it suffices to assume a linear constraint when the adversary is trying
to estimate the radar’s utility.
C. Examples of IRL for Identifying Radar Cognition
Below, we discuss two examples of cognitive radar func-
tionalities, namely, waveform adaptation and beam allocation.
Throughout this paper, we will use the two examples below
for contextualizing our cognition masking results.
1) Example 1. Waveform Adaptation for Cognitive Radar:
Waveform adaptation [34], [35], [36], [37] is a crucial func-
tionality of a cognitive radar. Consider a cognitive radar
with linear Gaussian dynamics and measurements (2). The
cognitive radar’s aim is to choose the optimal sensor mode (ob-
servation noise covariance) based on the target’s maneuvers.
A more accurate sensor results in more precise observations,
but is also costlier to deploy. Appendix C formalizes the
optimal waveform adaptation and abstracts the problem as the
constrained utility maximization problem of (3). The key idea
is to assume that the adversary’s probe $\alpha_k$ and the radar’s response
$\beta_k$ are the eigenvalues of the covariance matrices $Q$ and $R^{-1}$,
respectively, and hence parameterize the state and observation
noise covariances in the state space model of (2). Appendix C
then shows the equivalence between an upper bound on the
radar’s inverse asymptotic covariance $(\Sigma^*(\alpha_k, \beta_k))^{-1}$ and the linear
constraint $\alpha_k'\beta \le 1$. In summary, the cognitive radar’s optimal
waveform adaptation strategy can be abstracted as:
\begin{equation}
\beta_k \in \operatorname{argmax}_{\beta}\ u(\beta), \quad \alpha_k'\beta \le 1,
\tag{11}
\end{equation}
where $u$ is the radar’s utility, and the linear constraint $\alpha_k'\beta \le 1$
equivalently bounds the asymptotic precision of the radar.
IRL for optimal waveform adaptation. The adversary’s aim
is to identify the radar’s utility function u. Also, the setup of
(11) falls under Assumption 1. Hence, the adversary uses the
IRL test of (6) in Theorem 1 for identifying u.
2) Example 2. Beam allocation for Cognitive Radar:
Appendix D discusses optimal beam allocation [38], [39], [40],
[41], [42], [43]. The cognitive radar’s aim is to allocate its
beam intensity optimally between multiple targets. Compared
to a target with less jerky maneuvers, a target with unpre-
dictable maneuvers requires a more focused beam for the
SNR to lie above a certain threshold. Appendix D formalizes
the beam allocation problem and abstracts the problem as a
constrained utility maximization problem (3). The key idea is
to relate the adversary’s probe $\alpha_k$ to the asymptotic predicted
precision of the radar tracker. In summary, the cognitive radar’s
optimal beam allocation problem can be abstracted as:
\begin{equation}
\beta_k \in \operatorname{argmax}_{\beta}\ u(\alpha_k, \beta) \equiv \prod_{i=1}^{m} \beta(i)^{\alpha_k(i)}, \quad \|\beta\|_{\kappa} \le \gamma_k,
\tag{12}
\end{equation}
where the radar maximizes a Cobb-Douglas utility subject to
a bound $\gamma_k$ on the total transmit beam intensity ($\kappa$-norm of
the intensity vector) for all $k$.
IRL for optimal beam allocation. Since the adversary knows
the radar’s utility (Assumption 2), its aim is to identify the
radar’s constraint $g(\cdot) - \gamma_k \le 0$ using the IRL test (48) in
Theorem 5.
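The beam allocation problem (12) admits a closed-form solution, sketched below for illustration. The derivation assumes strictly positive probes α_k(i) and κ > 0: substituting z(i) = β(i)^κ turns (12) into a concave problem with a linear constraint, whose solution is β(i) = γ_k (α_k(i) / Σ_j α_k(j))^{1/κ}. The function names and numerical values are hypothetical.

```python
def cobb_douglas(alpha, beta):
    """Cobb-Douglas utility prod_i beta_i ** alpha_i from (12)."""
    value = 1.0
    for a, b in zip(alpha, beta):
        value *= b ** a
    return value


def kappa_norm(beta, kappa):
    return sum(b ** kappa for b in beta) ** (1.0 / kappa)


def beam_allocation(alpha, gamma, kappa):
    """Maximizer of the Cobb-Douglas utility subject to ||beta||_kappa <= gamma.

    Assumes alpha_i > 0 and kappa > 0; the norm constraint is active at the optimum.
    """
    total = sum(alpha)
    return [gamma * (a / total) ** (1.0 / kappa) for a in alpha]


alpha_k = [0.6, 0.4]   # hypothetical probe: target 1 maneuvers more than target 2
gamma_k = 2.0          # hypothetical bound on total beam intensity
kappa = 2.0

beta_k = beam_allocation(alpha_k, gamma_k, kappa)
print("allocation:", beta_k)
print("kappa-norm:", kappa_norm(beta_k, kappa))
print("utility:", cobb_douglas(alpha_k, beta_k))
```

The allocation directs more intensity to the target with the larger probe component, consistent with the description that less predictable targets require a more focused beam.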
Summary. This section discussed how an adversary can
deploy IRL to estimate a cognitive radar’s utility and con-
straint. While IRL with a single operational constraint is
discussed in [4], the IRL algorithm for multiple constraints
(in Appendix F) is new.