How can a Radar Mask its Cognition?

Kunal Pattanayak, Student Member, IEEE, Vikram Krishnamurthy, Fellow, IEEE and Christopher Berry

Abstract: A cognitive radar is a constrained utility maximizer that adapts its sensing mode in response to a changing environment. If an adversary can estimate the utility function of a cognitive radar, it can determine the radar's sensing strategy and mitigate the radar performance via electronic countermeasures (ECM). This paper discusses how a cognitive radar can hide its strategy from an adversary that detects cognition. The radar does so by transmitting purposefully designed sub-optimal responses to spoof the adversary's Neyman-Pearson detector. We provide theoretical guarantees by ensuring the Type-I error probability of the adversary's detector exceeds a pre-defined level for a specified tolerance on the radar's performance loss. We illustrate our cognition masking scheme via numerical examples involving waveform adaptation and beam allocation. We show that small purposeful deviations from the optimal strategy of the radar confuse the adversary by significant amounts, thereby masking the radar's cognition. Our approach uses novel ideas from revealed preference in microeconomics and adversarial inverse reinforcement learning. Our proposed algorithms provide a principled approach for system-level electronic counter-countermeasures (ECCM) to mask the radar's cognition, i.e., hide the radar's strategy from an adversary. We also provide performance bounds for our cognition masking scheme when the adversary has misspecified measurements of the radar's response.

Index Terms: Cognitive Radar, Meta-cognition, Revealed Preference, Inverse Reinforcement Learning, Electronic Counter-Countermeasures, Bayesian Tracker, Afriat's Theorem

I. INTRODUCTION

In abstract terms, a cognitive radar is a constrained utility

maximizer with multiple sets of utility functions and con-

straints that allow the radar to deploy different strategies

depending on changing environments. Cognitive radars adapt

their waveform scheduling and beam allocation by optimiz-

ing their utility functions in different situations. If a smart

adversary can estimate the utility function or constraints of

the cognitive radar, then it can exploit this information to

mitigate the radar’s performance (e.g., jam the radar with

purposefully designed interference). A natural question is: how

can a cognitive radar hide its cognition from an adversary?

Put simply, how can a smart sensor hide its strategy by acting dumb? We term this cognition-masking functionality meta-cognition.¹ A meta-cognitive radar [1] switches between multiple objectives (plans) to maintain stealth; for example, it can switch between the conflicting objectives of maximizing the signal-to-noise ratio of a target and maximizing the privacy of its plan.

Short versions containing partial results appear in the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, International Conference of Information Fusion (FUSION), 2022 and IEEE International Conference on Decision and Control (CDC), 2022.

V. Krishnamurthy and K. Pattanayak are with the School of Electrical and Computer Engineering, Cornell University, Ithaca, New York, 14853 USA. e-mail: vikramk@cornell.edu, kp487@cornell.edu. C. Berry is with Lockheed Martin Advanced Technology Laboratories, Cherry Hill, NJ, 08002 USA. e-mail: christopher.m.berry@lmco.com. This research was supported in part by a research contract from Lockheed Martin, the Army Research Office grant W911NF-21-1-0093 and the Air Force Office of Scientific Research grant FA9550-22-1-0016.

¹ "Meta-cognition" [1] is used to describe a sensing platform that switches between multiple objectives (constrained utility functions).

A meta-cognitive radar pays a penalty for stealth: it deliberately transmits sub-optimal responses to keep its strategy hidden from the adversary, resulting in performance degradation. This paper investigates how a cognitive radar can hide its strategy when the adversary observes the radar's responses. Our meta-cognition results are inspired by privacy-preserving mechanisms in differential privacy and adversarial obfuscation in deep learning, with related works discussed below. Although this paper is radar-centric, we emphasize that the problem formulation and algorithms also apply to adversarial inverse reinforcement learning in general machine learning applications, namely, how to purposefully choose suboptimal actions to hide a strategy.

Related Works

Cognitive radars are widely studied [2], [3]. More recently,

our papers [4], [5] deal with inverse reinforcement learning

(IRL) algorithms for cognitive radars, namely, how can an ad-

versary estimate the utility function of a cognitive radar by ob-

serving its decisions. Reconstructing a decision maker’s utility

function by observing its actions is the main focus of IRL [6],

[7], [8] in machine learning and revealed preference [9], [10]

in micro-economics literature. In the radar literature, such IRL

based adversarial actions to mitigate the radar’s operations

are called electronic countermeasures (ECM) [11], [12], [4].

This paper builds on [4], [5] and develops electronic counter-

countermeasures (ECCM) [13], [14], [15] to mitigate ECM.

This paper assumes that the adversary's ECM is unaware of whether the radar has ECCM capability, which is consistent with the state-of-the-art ECCM literature. The central theme of this paper is

to apply results from revealed preference in micro-economics

theory [9], [16]. To the best of our knowledge, this approach

for ECCM to hide cognition is novel.

Several works in the literature [17], [18], [19] highlight how an adversary benefits from learning the radar's utility function. In [17], the adversary optimizes its probes to increase the power of its statistical hypothesis test for utility maximization. The works [18], [19] show how revealed preference-based IRL techniques can be used to manipulate consumer behavior.

In the radar context, [20] uses the Laplacian mechanism

for meta-cognition; the cognitive radar anonymizes its trajec-

tories via additive Laplacian noise. In our cognition masking

approach, the radar mitigates adversarial IRL via purposeful

perturbations from optimal strategy, where the perturbations

are computed via stochastic gradient algorithms (see Algo-

rithm 2 in Sec. IV-B).

arXiv:2210.11444v1 [eess.SP] 20 Oct 2022

Context

Radar Design Paradigm: Cognition Masking vs LPI. Low-

probability-of-intercept (LPI) radars [21], [22], [23] achieve

stealth by minimizing the probability of the radar signals being

detected by an adversarial target. Our rationale for stealth in

this paper is at a higher level of abstraction than classical

LPI. The cognitive radar’s aim is to confuse the adversary’s

detector, i.e., ensure the adversary incorrectly reconstructs the

radar’s strategy with high probability.

System level ECCM vs Pulse level ECCM. Our cogni-

tion masking algorithm is implemented at the system level

(Bayesian tracker level) and not the pulse level (Wiener ﬁlter

level). Pulse-level ECCM [24], [25], [26] accomplishes LPI-

type functionalities for cognitive radars. Cognition masking

hides the radar’s strategy from the adversary instead of mit-

igating the adversary’s detection of the radar’s transmission.

Hence, cognition masking ECCM is deployed at a higher level

of abstraction than pulse level ECCM.

Hiding Cognition against Optimal IRL vs Sub-optimal IRL.

The cognition masking results in this paper assume the adver-

sary performs optimal IRL using Afriat’s theorem [9], [16].

Afriat’s theorem achieves optimal IRL for non-parametric util-

ity estimation of a cognitive radar as it generates a polytope of

all viable utilities that rationalizes a ﬁnite dataset of adversarial

probes and radar responses. However, our cognition masking

results can be extended to any potentially sub-optimal IRL

algorithm that generates a set-valued estimate of the radar’s

utility, as long as the radar has knowledge of the IRL algorithm

being used by the adversary. Algorithm 3 in the appendix

outlines how a cognitive radar can mask its cognition for

an arbitrary IRL algorithm. At an abstract level, cognition

masking simply obfuscates a set-valued mapping from the

adversary’s dataset to a set of feasible utilities by intelligently

distorting the radar’s responses and hence, is not affected by

the optimality of adversarial IRL.

At a deeper level, this paper quantifies cognition masking performance when the adversary has misspecified measurements of the radar's response and performs sub-optimal IRL. Theorem 8 (in the appendix) provides performance guarantees for cognition masking when the radar does not know the misspecification errors, and bounds the cognition masking performance in terms of the error magnitude.

Why not an MDP or non-cooperative game?

In machine learning based IRL [6], [8], [27], the aim

is to reconstruct the rewards of a Markov decision process

(MDP) subject to entropic constraints on the policy. This

requires complete knowledge of the transition dynamics of

the adversary’s probes. In comparison, our radar-adversary

interaction is batch-wise - the adversary transmits a batch

of probe signals, and then the radar responds with a batch

of responses. This non-parametric identiﬁcation of the radar’s

strategy is agnostic to transition dynamics in the adversary’s

probes. Hence, a static utility maximization setup is more

realistic for IRL and inverse IRL involving cognitive radar.

We consider a radar-adversary interaction where the adver-

sary is not aware of the radar’s cognition masking strategy.

A more general formulation is a Stackelberg game between

the radar and the adversary, with the adversary as the leader

and the radar as the follower. However, such an approach for

computing the optimal meta-cognition strategy for the radar

is ill-posed since the existence of a pure and unique Nash

equilibrium is not guaranteed. Finally, from an inverse game

theoretic perspective, identifying if the radar-adversary behav-

ior is consistent with Nash equilibrium is intractable since the

analyst needs to know both the radar’s and adversary’s utility

function. Addressing these issues is beyond the scope of this

paper, and the subject of future work.

Outline and Organization of Results

(i) Background. Inverse reinforcement learning (IRL): In

Sec. II, we formulate the interaction between a cognitive

radar and an adversary target. We review the main idea of

revealed preference-based adversarial IRL algorithms, namely,

Theorems 1 and 5 used by the adversarial target to reconstruct

the radar’s strategy from its actions. Then we outline two

examples, namely waveform adaptation and beam allocation.

Theorem 6 stated in Appendix F extends adversarial IRL to

the case where the cognitive radar faces multiple constraints.

Theorem 6 is omitted from the main text for readability.

(ii) Masking Radar's Strategy from Adversarial IRL: Sec. III contains our main meta-cognition result, namely, Theorem 2 for mitigating adversarial IRL by masking the radar's strategy. The key idea is for the radar to deliberately deviate from its optimal (naive) response to ensure:

(1) its true strategy almost fails to rationalize its perturbed responses (masked from adversarial IRL), and

(2) its performance degradation due to sub-optimal responses does not exceed a particular threshold.

Theorem 7 in Appendix F extends Theorem 2 to the case where the cognitive radar has multiple constraints. Theorem 8 provides performance bounds on the cognition masking scheme of Theorem 2 when the adversary has misspecified measurements of the radar's response.

(iii) Masking Radar’s Strategy from Adversarial IRL Detectors

in noise. Sec. IV extends our IRL and cognition masking

results to the case where the adversary has noisy measurements

of the radar’s response. First, we deﬁne IRL detectors (Deﬁni-

tion 5) that detect radar’s cognition in noise. Then, we enhance

our cognition masking scheme of Theorem 2 to mitigate the

IRL detectors. The radar’s cognition masking objective now is

to maximize the detectors’ conditional Type-I error probability,

subject to a bound on its deliberate performance degradation.

(iv) Numerical illustration of masking cognition by meta-

cognitive radars. Sec. V illustrates our meta-cognition results

on two target tracking functionalities, namely, waveform adap-

tation and beam allocation. Our numerical experiments show

that the meta-cognition algorithms in this paper can effectively

mask both the radar’s utility function and resource constraint

when the cognitive radar is probed by the adversarial target.

Our main ﬁnding is that a small deliberate performance loss of

the meta-cognitive radar sufﬁces to mask the radar’s strategy

from the adversary to a large extent.

II. BACKGROUND. IRL TO ESTIMATE COGNITIVE RADAR

Since this paper investigates how to construct a cognitive

radar that hides its utility from an adversarial IRL system, this

section gives the background on how an adversarial system can

use IRL to estimate the radar’s utility. An important aspect

of the IRL framework below is that it is a necessary and

sufﬁcient condition for identifying cognition (utility maximiza-

tion behavior); hence it can be considered as an optimal IRL

scheme. Appendices G and H discuss cognition masking when the adversary performs sub-optimal IRL.

A. Radar-Adversary Dynamics

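Definition 1 (Radar-Target Interaction). The cognitive radar-adversary interaction has the following dynamics:

$$
\begin{aligned}
\text{target probe:} \quad & \alpha_k \in \mathbb{R}^d \\
\text{radar action:} \quad & \beta_k \in \mathbb{R}^d \\
\text{target state:} \quad & x_k = \{x_k(t),\ t = 1, 2, \ldots\}, \quad x_k(t+1) \sim p_{\alpha_k}(x \mid x_k(t)), \quad x_0 \sim \pi_0 \\
\text{radar observation:} \quad & y_k \sim p_{\beta_k}(y \mid x_k) \\
\text{radar tracker:} \quad & \pi_k = T(\pi_{k-1}, y_k) \\
\text{observed radar action:} \quad & \hat{\beta}_k = \beta_k + \omega_k, \quad \omega_k \sim f_\omega
\end{aligned}
\tag{1}
$$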

Remarks. We now give examples for the abstract model (1).

1. A widely used example [28], [29] for the radar-adversary

dynamics model (1) is that of linear Gaussian dynamics for

target kinematics and linear Gaussian measurements:

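$$
\begin{aligned}
x_k(t+1) &= A x_k(t) + w_t(\alpha_k), \quad x_k(0) \sim \pi_0 = \mathcal{N}(\hat{x}_0, \Sigma_0) \\
y_k(t) &= C x_k(t) + v_t(\beta_k), \quad k = 1, 2, \ldots, K
\end{aligned}
\tag{2}
$$

Here $x_k(t) \in \mathcal{X} = \mathbb{R}^X$ and $y_k(t) \in \mathcal{Y} = \mathbb{R}^Y$. $A$ is a block diagonal matrix [30] when the target state represents its position and velocity in Euclidean space. The variables $w_t \sim \mathcal{N}(0, Q(\alpha_k))$ and $v_t \sim \mathcal{N}(0, R(\beta_k))$ are mutually independent Gaussian noise processes.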

2. In this paper, we are only concerned with the asymptotic statistics of the radar tracker $T$ in (1) for our cognition-masking algorithms. One example is a Bayesian tracker (Kalman filter), where the asymptotic covariance of the state estimate is the unique positive semi-definite solution of the algebraic Riccati equation (ARE). Other tracker examples include the particle filter and the interacting multiple model (IMM) filter.
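To make the ARE remark concrete, the following minimal sketch iterates the Riccati recursion for a scalar version of model (2) until the predicted error variance settles. The scalar parameters `a`, `c`, `q`, `r` and the function name are illustrative assumptions, not values or notation from the paper; the fixed point is then checked against the scalar ARE.

```python
def asymptotic_predicted_variance(a, c, q, r, tol=1e-12, max_iter=10_000):
    """Iterate the scalar Kalman/Riccati recursion to its fixed point.

    Hypothetical scalar model: x(t+1) = a*x(t) + w, y(t) = c*x(t) + v,
    with Var(w) = q and Var(v) = r.
    """
    p = q  # initial predicted variance (illustrative choice)
    for _ in range(max_iter):
        gain = p * c / (c * c * p + r)   # Kalman gain
        p_post = (1.0 - gain * c) * p    # posterior (filtered) variance
        p_next = a * a * p_post + q      # next predicted variance
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


# Illustrative numbers only.
a, c, q, r = 0.9, 1.0, 0.5, 2.0
p_star = asymptotic_predicted_variance(a, c, q, r)

# Residual of the scalar ARE: P = a^2 * (P - (cP)^2 / (c^2 P + r)) + q
are_residual = a * a * (p_star - (c * p_star) ** 2 / (c * c * p_star + r)) + q - p_star
print(p_star, are_residual)
```

In this toy setting, lowering the observation noise variance `r` (a more accurate sensor mode) lowers the asymptotic variance, which is the trade-off the radar's utility and constraint are built around.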

We now proceed to deﬁne a cognitive radar which we

assume in this paper to be a constrained utility maximizer.

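Definition 2 (Cognitive Radar). Consider the radar-adversary interaction dynamics of Definition 1. The cognitive radar chooses its response $\beta_k^*$ in (1) at time $k$ by maximizing a utility function $u(\alpha_k, \cdot)$ subject to the constraint $g(\alpha_k, \cdot) \le 0$:

$$
\beta_k^* \in \operatorname{argmax}_{\beta} \; u(\alpha_k, \beta) \quad \text{s.t.} \quad g(\alpha_k, \beta) \le 0. \tag{3}
$$

We assume that $g(\cdot)$ is an increasing function of $\beta$.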
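As an illustration of the constrained utility maximization in (3), the sketch below assumes a hypothetical log Cobb-Douglas utility $u(\beta) = \sum_i w_i \log \beta(i)$ and a linear constraint $\alpha_k'\beta \le 1$. The paper keeps $u$ and $g$ generic, so the weights `weights` and the closed form used here are assumptions made only for this example. For this particular choice the maximizer is $\beta(i) = w_i / (\alpha(i) \sum_j w_j)$.

```python
import math


def cobb_douglas_response(alpha, weights):
    """Maximizer of sum_i w_i*log(beta_i) subject to alpha . beta <= 1.

    Assumes strictly positive probes and weights (illustrative utility only).
    """
    total = sum(weights)
    return [w / (total * a) for w, a in zip(weights, alpha)]


def utility(beta, weights):
    return sum(w * math.log(b) for w, b in zip(weights, beta))


alpha = [0.5, 2.0]      # hypothetical adversary probe
weights = [1.0, 3.0]    # hypothetical utility parameters
beta_star = cobb_douglas_response(alpha, weights)

budget_used = sum(a * b for a, b in zip(alpha, beta_star))
other_feasible = [1.0, 0.25]  # also satisfies alpha . beta = 1
print(beta_star, budget_used)
print(utility(beta_star, weights) >= utility(other_feasible, weights))
```

The response exhausts the constraint because the utility is increasing in every component of $\beta$, which is the behavior the adversary's IRL test relies on.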

Remarks.

1. In the main text of this paper, we consider a single constraint. This is consistent with most works in the cognitive radar literature, which also assume a single operating constraint. For example, in [31], the cognitive radar is constrained by a bound on the target dwell time (monotone in the time the radar spends tracking each target). In [32], the radar's constraint is a bound on the receiver sensor processing cost (monotone in the radar's choice of sensor accuracy for target tracking). Hence, we only consider the operating cost of the radar in the main text, which is reflected in the radar's scalar-valued constraint $g$ in (3).

2. Multiple resource constraints. Our IRL methodology discussed below can be extended to multiple resource constraints ($g$ is vector-valued). However, for readability, we only consider a scalar-valued constraint $g$ in the main text of this paper. We consider multiple resource constraints in Appendix F. The notation for IRL and cognition masking results is complicated for vector-valued $g$ and hence omitted from the main text.

B. Adversarial IRL for Identifying Strategy of Cognitive Radar

We now review the main results for adversarial IRL, namely,

how an adversary can identify and reconstruct the radar’s

strategy by observing the radar’s responses. The adversarial

IRL system is schematically shown in Fig. 1. The key idea

is to formulate the adversary’s task of identifying the radar’s

strategy as a linear feasibility problem in terms of the radar’s

responses. This paper considers two distinct scenarios, distinguished by how the radar's utility $u$ and resource constraint $g$ in (3) depend on the adversary's probe $\alpha_k$. The two scenarios are formalized in Assumptions 1 and 2 below in our IRL results, Theorems 1 and 5, and justified in Sec. II-C in the tracking examples of waveform adaptation and beam allocation.

IRL for Identifying Utility Function

Theorem 1 below provides a set-valued reconstruction algorithm to estimate the radar's utility function when the adversary controls the radar's resource constraint. Scenarios where the adversary knows the radar's resource constraint are formalized below in Assumption 1:

Assumption 1. The radar's resource constraint $g(\cdot)$ in (3) is linear in the adversary's probe $\alpha_k$ and the radar's utility $u(\cdot)$ is independent of $\alpha_k$:

$$
g(\alpha_k, \beta) = \alpha_k'\beta - 1, \qquad u(\alpha_k, \beta) \equiv u(\beta). \tag{4}
$$

IRL objective. The adversary aims to reconstruct the radar's utility $u(\cdot)$ using the dataset $\mathcal{D}_g$, where $\mathcal{D}_g$ is defined as:

$$
\mathcal{D}_g = \{g(\alpha_k, \cdot), \beta_k\}_{k=1}^{K}, \tag{5}
$$

where $g(\alpha_k, \cdot)$ is defined in (4).

Let us now state Theorem 1 for achieving IRL when Assumption 1 holds.

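Theorem 1 (IRL for Identifying Radar's Utility Function). Consider the cognitive radar described in Definition 1. Suppose Assumption 1 holds. Then:

(a) The adversary checks for the existence of a feasible utility function that satisfies (3) by checking the feasibility of a set of linear inequalities:

$$
\text{There exists a feasible } \theta \in \mathbb{R}^{2K}_{+} \text{ s.t. } \mathcal{A}(\theta, \mathcal{D}_g) \le 0
\iff \exists\, u \text{ s.t. } \beta_k \in \operatorname{argmax}_{\beta} u(\beta),\ \alpha_k'\beta \le 1 \ \ \forall k, \tag{6}
$$

where the dataset $\mathcal{D}_g$ is defined in (5) and the set of inequalities $\mathcal{A}(\cdot) \le 0$ is defined in Appendix A.

(b) If $\mathcal{A}(\cdot, \mathcal{D}_g)$ has a feasible solution, the set-valued IRL estimate of the radar's utility $u$ is given by:

$$
\begin{aligned}
u_{\mathrm{IRL}}(\beta) &\equiv \{u_{\mathrm{IRL}}(\beta; \theta) : \mathcal{A}(\theta, \mathcal{D}_g) \le 0\}, \\
u_{\mathrm{IRL}}(\beta; \theta) &= \min_{k \in \{1, 2, \ldots, K\}} \{\theta_k + \theta_{k+K}\, \alpha_k'(\beta - \beta_k)\}.
\end{aligned}
\tag{7}
$$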

Theorem 1 is well known in micro-economics as Afriat's theorem [9], [16] and is widely used for set-valued estimation of consumer utilities from logged offline data. In complete analogy, the adversary performs IRL on a batch of probe-response exchanges with the cognitive radar to reconstruct the radar's utility.² Abstractly, Theorem 1 says that given a finite dataset, the adversary can at best construct a polytope of feasible strategies that rationalize the adversary's dataset. Theorem 1 achieves IRL when the radar faces a single operating constraint. We discuss adversarial IRL for multiple resource constraints in Theorem 6 in Appendix F. In the multi-constraint case, the linear feasibility test of (6) generalizes to a mixed-integer linear feasibility test that is linear in the real-valued feasible variables.
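One way to see what the feasibility test detects, without a linear programming solver, is the Generalized Axiom of Revealed Preference (GARP). In the classical linear-budget setting, Afriat's theorem shows that a rationalizing utility exists exactly when the data satisfy GARP. The sketch below checks GARP with a Warshall transitive closure; it is a stand-in for, not a reproduction of, the inequalities of Appendix A, and both datasets are hypothetical.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def satisfies_garp(probes, responses, tol=1e-9):
    """Return True if the probe-response data pass the GARP consistency check."""
    K = len(probes)
    # reach[t][s]: response s was affordable under probe t, yet response t was chosen.
    reach = [[dot(probes[t], responses[t]) >= dot(probes[t], responses[s]) - tol
              for s in range(K)] for t in range(K)]
    # Warshall's algorithm: transitive closure of the revealed-preference relation.
    for m in range(K):
        for t in range(K):
            for s in range(K):
                reach[t][s] = reach[t][s] or (reach[t][m] and reach[m][s])
    # Violation: t is revealed preferred to s, but t was strictly cheaper under probe s.
    for t in range(K):
        for s in range(K):
            if reach[t][s] and dot(probes[s], responses[s]) > dot(probes[s], responses[t]) + tol:
                return False
    return True


# Hypothetical data consistent with utility maximization.
rational_probes = [[0.5, 2.0], [2.0, 0.5]]
rational_responses = [[1.0, 0.25], [0.25, 1.0]]

# Hypothetical data where each response was strictly affordable under the other probe.
bad_probes = [[1.0, 2.0], [2.0, 1.0]]
bad_responses = [[0.2, 0.4], [0.4, 0.2]]

print(satisfies_garp(rational_probes, rational_responses))
print(satisfies_garp(bad_probes, bad_responses))
```

A cognition-masking radar can be read as deliberately pushing its responses toward the boundary of such a consistency test while keeping its own performance loss small.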

IRL for Identifying Radar’s Resource Constraints

In certain scenarios, the utility of the radar is well known

(e.g., signal-to-noise ratio), but the operational constraints of

the radar are not known. We formalize such scenarios where

the adversary knows the radar’s utility function below as

Assumption 2:

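Assumption 2. The radar's utility function $u(\cdot)$ in (3) is controlled by the adversary's probe $\alpha_k$, and the radar's resource constraint $g$ is independent of $\alpha_k$ and has the following form:

$$
g(\alpha_k, \beta) \equiv g(\beta) - \gamma_k, \quad \gamma_k > 0, \tag{8}
$$

where $\gamma_k$ and $g$ are independent of $\alpha_k$.

IRL objective. The adversary aims to reconstruct $g(\cdot)$ using the dataset $\mathcal{D}_u$, where $\mathcal{D}_u$ is defined as:

$$
\mathcal{D}_u = \{u(\alpha_k, \cdot), \beta_k\}_{k=1}^{K}. \tag{9}
$$

IRL for estimating the radar resource constraints has the same structure as that of Theorem 1 and is discussed in the appendix. IRL for Assumption 2 is formally stated in Theorem 5 in Appendix B and summarized below:

$$
\begin{aligned}
g_{\mathrm{IRL}}(\beta) &\equiv \{g_{\mathrm{IRL}}(\beta; \theta) : \mathcal{A}(\theta, \mathcal{D}_u) \ge 0\}, \\
g_{\mathrm{IRL}}(\beta; \theta) &= \max_{k \in \{1, 2, \ldots, K\}} \{\theta_k + \theta_{K+k}\,(u(\alpha_k, \beta) - u(\alpha_k, \beta_k))\},
\end{aligned}
\tag{10}
$$

where $g_{\mathrm{IRL}}$ is the adversary's set-valued estimate of the radar's constraint $g$, the dataset $\mathcal{D}_u$ is defined in (9) and $\theta \in \mathbb{R}^{2K}_{+}$ is a feasible vector with respect to the feasibility test $\mathcal{A}(\cdot, \mathcal{D}_u) \ge 0$. Note how the IRL feasibility inequalities in (10) are identical to those of (6) in Theorem 1 but with the inequality direction reversed.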

² Afriat's theorem with linear constraints (4) has been generalized to non-linear monotone constraints in the literature [33]. For the radar context in this paper, it suffices to assume a linear constraint when the adversary is trying to estimate the radar's utility.

C. Examples of IRL for Identifying Radar Cognition

Below, we discuss two examples of cognitive radar func-

tionalities, namely, waveform adaptation and beam allocation.

Throughout this paper, we will use the two examples below

for contextualizing our cognition masking results.

1) Example 1. Waveform Adaptation for Cognitive Radar:

Waveform adaptation [34], [35], [36], [37] is a crucial func-

tionality of a cognitive radar. Consider a cognitive radar

with linear Gaussian dynamics and measurements (2). The

cognitive radar’s aim is to choose the optimal sensor mode (ob-

servation noise covariance) based on the target’s maneuvers.

A more accurate sensor results in more precise observations,

but is also costlier to deploy. Appendix C formalizes the

optimal waveform adaptation and abstracts the problem as the

constrained utility maximization problem of (3). The key idea

is to assume that the adversary's probe $\alpha_k$ and the radar's response $\beta_k$ are the eigenvalues of the covariance matrices $Q$ and $R^{-1}$, respectively, and hence parameterize the state and observation noise covariances in the state space model (2). Appendix C then shows the equivalence between an upper bound on the inverse of the radar's asymptotic covariance, $(\Sigma^*(\alpha_k, \beta_k))^{-1}$, and the linear constraint $\alpha_k'\beta \le 1$. In summary, the cognitive radar's optimal waveform adaptation strategy can be abstracted as:

$$
\beta_k \in \operatorname{argmax}_{\beta} \; u(\beta) \quad \text{s.t.} \quad \alpha_k'\beta \le 1, \tag{11}
$$

where $u$ is the radar's utility, and the linear constraint $\alpha_k'\beta \le 1$ equivalently bounds the asymptotic precision of the radar.

IRL for optimal waveform adaptation. The adversary’s aim

is to identify the radar’s utility function u. Also, the setup of

(11) falls under Assumption 1. Hence, the adversary uses the

IRL test of (6) in Theorem 1 for identifying u.

2) Example 2. Beam allocation for Cognitive Radar:

Appendix D discusses optimal beam allocation [38], [39], [40],

[41], [42], [43]. The cognitive radar’s aim is to allocate its

beam intensity optimally between multiple targets. Compared

to a target with less jerky maneuvers, a target with unpre-

dictable maneuvers requires a more focused beam for the

SNR to lie above a certain threshold. Appendix D formalizes

the beam allocation problem and abstracts the problem as a

constrained utility maximization problem (3). The key idea is to relate the adversary's probe $\alpha_k$ to the asymptotic predicted precision of the radar tracker. In summary, the cognitive radar's optimal beam allocation problem can be abstracted as:

$$
\beta_k \in \operatorname{argmax}_{\beta} \; u(\alpha_k, \beta) \equiv \prod_{i=1}^{d} \beta(i)^{\alpha_k(i)} \quad \text{s.t.} \quad \|\beta\|_{\kappa} \le \gamma_k, \tag{12}
$$

where the radar maximizes a Cobb-Douglas utility subject to a bound $\gamma_k$ on the total transmit beam intensity ($\kappa$-norm of the intensity vector) for all $k$.
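For intuition, the beam allocation problem (12) admits a closed form when the probe entries are strictly positive and $\kappa > 0$: taking logs and applying the Lagrangian conditions gives $\beta(i) = \gamma_k \,(\alpha_k(i) / \sum_j \alpha_k(j))^{1/\kappa}$. The sketch below evaluates this expression; the derivation, function names, and numbers are assumptions made for illustration and are not taken from Appendix D.

```python
import math


def beam_allocation(alpha, gamma, kappa):
    """Closed-form maximizer of prod_i beta_i**alpha_i subject to ||beta||_kappa <= gamma.

    Assumes alpha_i > 0, gamma > 0 and kappa > 0 (illustrative derivation).
    """
    total = sum(alpha)
    return [gamma * (a / total) ** (1.0 / kappa) for a in alpha]


def log_cobb_douglas(beta, alpha):
    return sum(a * math.log(b) for a, b in zip(alpha, beta))


def kappa_norm(beta, kappa):
    return sum(b ** kappa for b in beta) ** (1.0 / kappa)


alpha = [1.0, 3.0]        # hypothetical probe: target 2 maneuvers more
gamma, kappa = 2.0, 2.0   # hypothetical intensity bound and norm order
beta_star = beam_allocation(alpha, gamma, kappa)

# A uniform allocation with the same kappa-norm, for comparison.
uniform = [gamma / 2 ** (1.0 / kappa)] * 2
print(beta_star, kappa_norm(beta_star, kappa))
print(log_cobb_douglas(beta_star, alpha) > log_cobb_douglas(uniform, alpha))
```

The allocation puts more intensity on the target with the larger probe entry, matching the qualitative description that a less predictable target requires a more focused beam.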

IRL for optimal beam allocation. Since the adversary knows

the radar’s utility (Assumption 2), its aim is to identify the

radar’s constraint g(·)−γk≤0using the IRL test (48) in

Theorem 5.

Summary. This section discussed how an adversary can

deploy IRL to estimate a cognitive radar’s utility and con-

straint. While IRL with a single operational constraint is

discussed in [4], the IRL algorithm for multiple constraints

(in Appendix F) is new.
