
Model-X Sequential Testing for Conditional Independence via Testing by Betting

Shalev Shaer∗,1, Gal Maman∗,1, Yaniv Romano1,2

1 Department of Electrical and Computer Engineering, Technion–Israel Institute of Technology

2 Department of Computer Science, Technion–Israel Institute of Technology

Abstract

This paper develops a model-free sequential test

for conditional independence. The proposed test

allows researchers to analyze an incoming i.i.d.

data stream with any arbitrary dependency struc-

ture, and safely conclude whether a feature is

conditionally associated with the response under

study. We allow the processing of data points on-

line, as soon as they arrive, and stop data acqui-

sition once signiﬁcant results are detected, rigor-

ously controlling the type-I error rate. Our test

can work with any sophisticated machine learn-

ing algorithm to enhance data efﬁciency to the

extent possible. The developed method is in-

spired by two statistical frameworks. The ﬁrst

is the model-X conditional randomization test,

a test for conditional independence that is valid

in ofﬂine settings where the sample size is ﬁxed

in advance. The second is testing by betting,

a “game-theoretic” approach for sequential hy-

pothesis testing. We conduct synthetic experi-

ments to demonstrate the advantage of our test

over out-of-the-box sequential tests that account

for the multiplicity of tests in the time horizon,

and demonstrate the practicality of our proposal

by applying it to real-world tasks.

1 INTRODUCTION

A central problem in data analysis is to rigorously find conditional associations in complex data sets with nonlinear dependencies. This problem lies at the heart of causal discovery (Pearl et al., 2000; Peters et al., 2017), variable selection (Barber and Candès, 2015; Candes et al., 2018), machine learning interpretability (Burns et al., 2020; Lu et al., 2018), economics (Angrist and Kuersteiner, 2011; Wang and Hong, 2018), and genetics studies (Sesia et al., 2019; Bates et al., 2020), to name a few. In such applications, the data are often collected online, and, naturally, researchers are interested in analyzing the data points immediately after they are observed so that further data acquisition can be terminated as soon as significant results are detected. This experimental setting, for example, is typical in decision-making (Nikolakopoulou et al., 2018; Bhui, 2019) and clinical trials (Park et al., 2018), where the need for additional samples to obtain accurate statistical inference must frequently be balanced with experimental costs.

*Equal contribution.

To formalize the problem, suppose we are given a stream of data points $(X_t, Y_t, Z_t)$ for $t \in \mathbb{N} = \{1, 2, \dots\}$, where each triplet contains a response $Y_t \in \mathbb{R}$, a feature $X_t \in \mathbb{R}$, and a vector of covariates $Z_t \in \mathbb{R}^d$. We assume the observations are sampled i.i.d. from $P_{YXZ} = P_{Y|XZ} P_{XZ}$, where $P_{Y|XZ}$ is unknown. Given such an online data stream, our goal is to test for conditional independence (CI), where the null hypothesis is given by

$$H_0 : X_t \perp\!\!\!\perp Y_t \mid Z_t \quad \text{for all } t \in \mathbb{N}.$$

In words, we say that $H_0$ is true if $X_t$ is independent of the response $Y_t$ after accounting for the effect of the covariates $Z_t$, simultaneously for all time steps $t$. We refer to $X_t$ that satisfies $H_0$ as an ‘unimportant’ feature. Analogously, the alternative hypothesis implies that $X_t$ carries new information on the response $Y_t$ beyond what is already contained in $Z_t$, i.e., $X_t \not\perp\!\!\!\perp Y_t \mid Z_t$. Therefore, we say that such a feature $X_t$ is ‘important’.

The goal of sequential hypothesis testing is to formulate a concrete decision rule on whether we can confidently reject the null at each time step $t$, by monitoring and accumulating the evidence collected at each step against the null using past data $\{(X_s, Y_s, Z_s)\}_{s=1}^{t}$ (Wald, 1945). This allows the analyst great flexibility, as she can decide, at each step, whether new data should be collected to support the question under study. Key to this setting is the need to provide the analyst with a tool that rigorously controls the type-I error rate (defined as the probability of rejecting the null when it is in fact true) at any given desired level $\alpha$, simultaneously for all time steps $t$. This requirement should not be confused with the premise of classic offline tests for CI that attain type-I error rate control only when the sample size is fixed in advance. We refer to these as offline tests, emphasizing that one cannot naively monitor the outcome of a classic test (a p-value) and reject the null at an optional time step $t$ without accounting for the multiplicity of the tests across the time horizon; this strategy would result in inflation of the type-I error rate. Beyond online type-I error rate control, ideally, we wish to have a powerful test that would reject the null when it is false, and we want it to do so as early as possible.

arXiv:2210.00354v2 [stat.ME] 19 Feb 2023

Our contribution

In this paper, we develop a novel sequential test for CI.

Our proposal takes inspiration from two powerful and at-

tractive statistical tools that are gaining increasing atten-

tion in recent years. The ﬁrst is the model-X conditional

randomization test (CRT) by Candes et al. (2018), an offline test for CI. The second is testing by betting (Shafer and Vovk, 2019; Grünwald et al., 2020), a “game-theoretic”

approach for sequential hypothesis testing, where our pro-

posal is very much inspired by the line of work reported

in (Ramdas and Wenbe, 2020; Ramdas et al., 2022). The

method we introduce in this paper, presented in Section 4,

generalizes the ofﬂine CRT to the challenging online set-

ting, resulting in a new test with the following features.

Safe testing with early stopping: building on recent ad-

vances in sequential testing using e-values and martingales,

detailed in Section 3, the proposed CI test is guaranteed to

control the type-I error rate at any time step. In particular,

the analyst is allowed to track the outcome of the test over

time, and safely reject the null if it exceeds a user-deﬁned

signiﬁcance level, preventing a wasteful collection of un-

necessary new data points.

Model-X setting: similar to the offline CRT method, described in Section 2, the online test we propose does not make any assumptions on the conditional distribution of $Y \mid X, Z$. For instance, we do not make unrealistic assumptions that the relationship between $Y$ and $(X, Z)$ is linear, or that $Y \mid X, Z$ is Gaussian. However, this advantage comes at the cost of assuming that the distribution of $X \mid Z$ is known. This assumption is common to all tests belonging to the family of model-X knockoffs, including the CRT, and it is manageable when (i) large unlabeled data are available in contrast to labeled data, or (ii) when we have good prior knowledge about the distribution of $X \mid Z$ (Candes et al., 2018; Sesia et al., 2019; Romano et al., 2020). We discuss this in more detail in Section 4.2.

Online learning from past experience: the proposed test

can leverage any machine learning algorithm to powerfully

discover violations of the CI null. In particular, when a new triplet $(Y_t, X_t, Z_t)$ is observed, we use online learning techniques to efficiently update the running predictive

model, instead of ﬁtting a new model from scratch. This

way, the whole data stream is used for training in a compu-

tationally efﬁcient manner. The proposed framework also

falls under the umbrella of interactive tests (Lei and Fithian,

2018; Lei et al., 2021; Duan et al., 2022), providing the an-

alyst the liberty to look at past data and decide how to mod-

ify the learning algorithm at any time step—e.g., to switch

to a model that is more robust to outliers—to better dis-

criminate the null and alternative hypotheses when applied

to future test points.

Optimized software package: we provide Python code that implements our testing framework, available at https://github.com/shaersh/ecrt. The package includes important design principles: an automatic

hyper-parameter tuning that does not require ﬁtting the ma-

chine learning model from scratch (Supplementary Sec-

tion E); an ensemble procedure for improving the power

of the test by averaging multiple martingales (Section 4.2);

and a de-randomization procedure that also improves

power by reducing inherent algorithmic randomness due to

a sampling mechanism that is necessary to formulate the

test (Section 4.2).

2 MODEL-X CI TESTING

The CRT, developed by Candes et al. (2018), is an offline test for CI that we build upon in this work. A key advantage of the CRT is that it assumes nothing on the conditional distributions of $Y \mid X, Z$ and $Y \mid Z$. This test, however, assumes that the conditional distribution of $X \mid Z$ is known. The CRT procedure, described in Algorithm 3 in Supplementary Section B, resembles classic permutation tests and has two key components: a test statistic function $T(\cdot)$ and a function that samples dummy features $\tilde{X}$ from $P_{X|Z}$. Since $\tilde{X}$ is sampled without looking at $Y$, the dummy triplets $(\tilde{X}, Y, Z)$ satisfy $\tilde{X} \perp\!\!\!\perp Y \mid Z$ by construction. Hence, by comparing the test statistic evaluated on the original $\{(X_i, Y_i, Z_i)\}_{i=1}^{n}$ and dummy $\{(\tilde{X}_i, Y_i, Z_i)\}_{i=1}^{n}$ triplets, the CRT generates a valid p-value $p_n$ for the CI null, controlling the type-I error at level $\alpha$ when the sample size $n$ is fixed in advance (Candes et al., 2018), i.e.,

$$\mathbb{P}[p_n \le \alpha \mid \text{the null is true}] \le \alpha \quad \text{for a fixed } n. \tag{1}$$
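To make the offline procedure concrete, the following is a minimal sketch of a CRT p-value computation. It is not Algorithm 3 from the supplement: the toy Gaussian data model, the helper names `crt_pvalue`, `sample_x_given_z`, and `statistic`, and the choice of test statistic are all assumptions made for illustration. The p-value is the standard rank of the original statistic among statistics recomputed on dummy features.

```python
import random

def crt_pvalue(xs, ys, zs, sample_x_given_z, statistic, num_dummies, rng):
    """Offline CRT p-value for a fixed sample of size n = len(xs)."""
    t_orig = statistic(xs, ys, zs)
    exceed = 0
    for _ in range(num_dummies):
        # Dummy features are drawn from P_{X|Z} without looking at Y.
        x_dummy = [sample_x_given_z(z, rng) for z in zs]
        if statistic(x_dummy, ys, zs) >= t_orig:
            exceed += 1
    # The +1 terms count the original statistic itself, keeping the p-value valid.
    return (1 + exceed) / (1 + num_dummies)

# Hypothetical toy model (assumption): Z ~ N(0, 1), X | Z ~ N(Z, 1),
# and Y = 2 X + Z + noise, so X is an 'important' feature.
def sample_x_given_z(z, rng):
    return z + rng.gauss(0, 1)

def statistic(xs, ys, zs):
    # Larger values indicate stronger association between X - E[X | Z] and Y.
    return abs(sum((x - z) * y for x, y, z in zip(xs, ys, zs)))

rng = random.Random(0)
n = 200
zs = [rng.gauss(0, 1) for _ in range(n)]
xs = [sample_x_given_z(z, rng) for z in zs]
ys = [2 * x + z + rng.gauss(0, 1) for x, z in zip(xs, zs)]

p_value = crt_pvalue(xs, ys, zs, sample_x_given_z, statistic, 200, rng)
print(f"CRT p-value on the toy data: {p_value:.4f}")
```

Note that the sample size `n` is fixed before the p-value is computed, which is exactly the regime in which (1) applies.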

Put differently, when all $n$ observations are available before testing, one can use $p_n$ to rigorously control the type-I error. However, future observations cannot be utilized to generate a new p-value (e.g., in cases where the null is not rejected) without a proper correction that ensures the validity of the sequential test. To see this, suppose for simplicity that under the null $p_n \sim \text{Uniform}(0, 1)$ is distributed uniformly over the $[0, 1]$ interval for any fixed $n$, satisfying (1). Next, let $\tau$ be a data-dependent stopping time, given by

$$\tau = \min\{n \in \mathbb{N} : p_n \le \alpha\}.$$

Now, observe that with this choice of stopping time, $\mathbb{P}[p_\tau \le \alpha \mid \text{the null is true}]$ cannot be bounded by $\alpha$ anymore: whenever $\tau$ is finite, $p_\tau \le \alpha$ holds by construction, so the rejection rule $p_\tau \le \alpha$ would result in an invalid $\alpha$-level test.
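The inflation caused by this stopping rule can be checked numerically. The sketch below is an illustration only: it replaces the CRT with a simple two-sided z-test on a running mean (an assumption chosen because its p-value is uniform under the null for every fixed n), and the function name `peeking_rejects` is hypothetical.

```python
import math
import random

def peeking_rejects(alpha, horizon, rng):
    """Return True if a running z-test p-value ever drops to alpha or below."""
    total = 0.0
    for n in range(1, horizon + 1):
        total += rng.gauss(0, 1)  # null data: mean zero, unit variance
        z_stat = total / math.sqrt(n)
        # Two-sided p-value of a standard normal statistic, valid for a fixed n.
        p_n = math.erfc(abs(z_stat) / math.sqrt(2))
        if p_n <= alpha:
            return True  # stop at tau = min{n : p_n <= alpha}
    return False

rng = random.Random(0)
alpha, horizon, runs = 0.05, 200, 2000
rate = sum(peeking_rejects(alpha, horizon, rng) for _ in range(runs)) / runs
print(f"fraction of null runs rejected when peeking: {rate:.3f}")
```

Every individual p-value in this simulation satisfies (1), yet rejecting at the first time any of them falls below α yields a false rejection rate well above α.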

In many applications, however, one is interested in apply-

ing the test online to obtain reliable data-driven conclusions

as soon as possible. This motivates us to adopt a fresh sta-

tistical approach for hypothesis testing, called testing by

betting, brieﬂy described in the next section.

3 TESTING BY BETTING

Before diving into the mathematical principles of the test-

ing by betting approach, we follow Shafer and Vovk (2019)

and Shafer (2021) and present an intuitive interpretation of

this framework. Imagine we are playing a game, in which

we start with initial toy money. At each time step, we place

a bet against the null hypothesis, and then reality reveals

the truth. If this bet turns out to be correct, our wealth is

increased by the money we risk in the bet; otherwise, we

lose and the wealth is decreased accordingly. If our wealth at time $t$ is at least $1/\alpha$ times as large as the initial toy money we started with (e.g., we have managed to multiply our initial money by a factor of $1/0.05 = 20$ for $\alpha = 0.05$) we can confidently reject the null, knowing that the type-I error is guaranteed to be controlled at level $\alpha$. A property

important to the formulation of the above game is this: if

the null is true, the game must be fair in the sense that it is

unlikely we will be able to signiﬁcantly increase our initial

toy money, no matter how sophisticated our betting strategy

is.

A mathematical object that is crucial to formalize a fair

game is a test martingale, deﬁned below.

Definition 1. A random process $\{S_t : t \in \mathbb{N}_0\}$ is a test martingale for a given null hypothesis $H_0$ if it satisfies the following conditions: (i) $S_0 = 1$, (ii) $S_t \ge 0$ for all $t \in \mathbb{N}_0$, and (iii) $\{S_t : t \in \mathbb{N}_0\}$ is a supermartingale under $H_0$.

In the view of testing by betting, the initial value of the test martingale $S_0$ represents the initial toy money in the game, and $S_t$ corresponds to our wealth at time $t$. Now, suppose we are handed a valid test martingale $\{S_t : t \in \mathbb{N}_0\}$, and let $\tau \ge 1$ be a data-dependent optional stopping time. By invoking the optional stopping theorem we get

$$\mathbb{E}_{H_0}[S_\tau] \le \mathbb{E}_{H_0}[S_0] = 1, \tag{2}$$

meaning that $S_\tau$ is a non-negative random variable whose expected value is bounded by one for any stopping time $\tau \ge 1$. In the literature on testing by betting, $S_\tau$ is often referred to as an e-value (Vovk and Wang, 2021; Wang and Ramdas, 2022; Grünwald et al., 2020). Importantly, the consequence of (2) is that, under the null, the game is fair since the expected value of our wealth $S_t$ at any time step $t$ is bounded by the initial toy money $S_0$. Moreover, since $\{S_t : t \in \mathbb{N}_0\}$ is a non-negative supermartingale under $H_0$, we can apply Ville’s inequality (Ville, 1939) and get

$$\mathbb{P}_{H_0}(\exists t \ge 1 : S_t \ge 1/\alpha) \le \alpha \, \mathbb{E}_{H_0}[S_0] = \alpha, \tag{3}$$

for any $\alpha \in (0, 1)$. Therefore, the ability to form a valid test martingale allows us to rigorously test for $H_0$ and reject the null if $S_t \ge 1/\alpha$ at any time step, with the guarantee that the type-I error would not exceed the level $\alpha$. Crucially, when the null is false, $S_t$ can grow substantially, depending on how successful our betting strategy is. In Section 4 we formulate a valid test martingale and design a powerful betting strategy.
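Ville's inequality in (3) can be illustrated with a small simulation. This is a sketch under assumptions that are not part of the paper: the fair game is a symmetric coin flip, the bet size is held fixed at 0.3, and `max_wealth` is a hypothetical helper. The wealth process is then a non-negative martingale with initial value one, so the fraction of paths that ever reach 1/α should not exceed α, up to simulation noise.

```python
import random

def max_wealth(bet, steps, rng):
    """Largest wealth reached along one path of a fair betting game."""
    wealth, peak = 1.0, 1.0
    for _ in range(steps):
        outcome = 1 if rng.random() < 0.5 else -1  # fair game under the null
        wealth *= 1 + bet * outcome                # win or lose a fraction `bet`
        peak = max(peak, wealth)
    return peak

rng = random.Random(1)
alpha, runs = 0.05, 4000
crossed = sum(max_wealth(0.3, 300, rng) >= 1 / alpha for _ in range(runs)) / runs
print(f"fraction of null paths that ever reach 1/alpha: {crossed:.3f}")
```

Unlike a monitored p-value, the wealth may be inspected after every step without inflating the false rejection rate.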

Related work. Sequential testing has a long-standing history (Wald, 1945; Lai, 1984; Naghshvar and Javidi, 2013; Lhéritier and Cazals, 2018), where the sequential probability ratio test of Wald (1945) is perhaps one of the first sequential hypothesis tests. More recently, the testing by betting methodology (Shafer and Vovk, 2019; Shafer, 2021) has led to the design of new powerful nonparametric approaches for constructing confidence sequences, e.g., (Jun and Orabona, 2019; Waudby-Smith and Ramdas, 2023), for testing a single hypothesis, as well as for testing multiple hypotheses; see (Waudby-Smith and Ramdas, 2023, Section 6) for a detailed summary.

Related methods to our proposal are ofﬂine and online

two-sample tests that are based on martingales (Balsub-

ramani and Ramdas, 2016; Turner et al., 2021; Shekhar

and Ramdas, 2021; Duan et al., 2022). Speciﬁcally,

Shekhar and Ramdas (2021) studied the problem of design-

ing martingale-based sequential nonparametric one- and

two-sample tests that are consistent, i.e., these sequential

tests can attain power one under certain conditions. In our

work, we build on the foundations of Shekhar and Ram-

das (2021), and extend this framework to CI testing. Re-

cently, Ren and Barber (2022) suggested using e-values to

de-randomize the outcome of the knockoff ﬁlter—a sister

method to the CRT that focuses on false discovery rate con-

trol (FDR) in an ofﬂine setting. In our work, we aggregate

e-values to de-randomize our test, where the e-values we

deﬁne take a different form than those proposed by Ren

and Barber (2022), as we focus on sequential testing of

a single feature. Lastly, independent work by Grünwald et al. (2022), which has been developed and posted in parallel to ours, also offers a martingale-based sequential test under the model-X setting, although suggesting a different test martingale. In Supplementary Section D we provide a more detailed discussion about the relation of our proposal to that of Grünwald et al. (2022), along with empirical comparisons.

4 THE PROPOSED e-CRT

Figure 1: Illustration of the test martingale (wealth) $S_t$ as a function of $t$. The blue (resp. green) curve represents the test martingale evaluated on simulated null data (resp. non-null data).

In this section, we introduce e-CRT: a sequential test for CI based on martingales and e-values. Suppose we are given a machine learning model $\hat{f}_t$, fitted on an initial batch of labeled data $\{(X_s, Y_s, Z_s) : s \le t-1\}$ to provide an estimate of $Y$ given $(X, Z)$. At a high level, the test is initialized with toy money $S_0 = 1$ and proceeds as follows.

1. Collect a fresh test triplet $(X_t, Y_t, Z_t)$.

2. Generate a dummy feature $\tilde{X}_t \sim P_{X|Z}(X_t \mid Z_t)$, and form the dummy triplet $(\tilde{X}_t, Y_t, Z_t)$.

3. Compute a betting score $W_t$. Use $\hat{f}_t$ to bet against the null, where the bet is that the prediction error of $\hat{f}_t$ (or any other test statistic), evaluated on the dummy triplet $(\tilde{X}_t, Y_t, Z_t)$, would be higher than that of the original triplet $(X_t, Y_t, Z_t)$. A positive (resp. negative) score indicates that our bet is successful (resp. unsuccessful).

4. Update the current wealth (test martingale) $S_t$: if the betting score is positive, the previous $S_{t-1}$ is increased by the money we risked on placing the bet; otherwise, the previous wealth $S_{t-1}$ is decreased analogously.

5. Update the predictive model $\hat{f}_t$ and get $\hat{f}_{t+1}$, e.g., by taking one (or more) gradient steps to minimize a loss evaluated on $\{(X_s, Y_s, Z_s) : s \le t\}$.

6. If $S_t \ge 1/\alpha$, reject $H_0$ and stop. Otherwise, increase $t$ and return to step 1.
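The six steps above can be sketched end to end as follows. This is a minimal illustration, not the authors' package: the toy Gaussian data model with a known X | Z, the linear model trained by stochastic gradient descent, the sign-based betting score with magnitude one, and the uniform grid of bet sizes used to average the wealth are all assumptions, and `run_ecrt` is a hypothetical name.

```python
import random

def run_ecrt(beta, alpha=0.05, max_steps=2000, lr=0.02, seed=0):
    """Toy sequential CI test; returns (stopping time or None, final wealth)."""
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]                         # linear model weights for (x, z, bias)
    grid = [(k + 0.5) / 20 for k in range(20)]  # bet sizes v strictly inside (0, 1)
    base = [1.0] * len(grid)                    # one wealth process per bet size
    wealth = 1.0                                # S_0 = 1

    def predict(x, z):
        return w[0] * x + w[1] * z + w[2]

    for t in range(1, max_steps + 1):
        # Step 1: collect a fresh triplet (assumed toy model).
        z = rng.gauss(0, 1)
        x = z + rng.gauss(0, 1)                 # X | Z ~ N(Z, 1), assumed known
        y = beta * x + z + rng.gauss(0, 1)      # beta = 0 gives null data
        # Step 2: sample the dummy from P_{X|Z} without looking at y.
        x_dummy = z + rng.gauss(0, 1)
        # Step 3: betting score from squared prediction errors.
        q = (predict(x, z) - y) ** 2
        q_dummy = (predict(x_dummy, z) - y) ** 2
        score = (q_dummy > q) - (q_dummy < q)   # sign(q_dummy - q)
        # Step 4: update the wealth, averaged over the grid of bet sizes.
        base = [s * (1 + v * score) for s, v in zip(base, grid)]
        wealth = sum(base) / len(base)
        # Step 5: one gradient step on the new triplet, only after betting.
        err = predict(x, z) - y
        for i, feat in enumerate((x, z, 1.0)):
            w[i] -= lr * err * feat
        # Step 6: reject as soon as the wealth reaches 1 / alpha.
        if wealth >= 1 / alpha:
            return t, wealth
    return None, wealth

stop_time, final_wealth = run_ecrt(beta=2.0)
print("non-null data: stopped at", stop_time, "with wealth", round(final_wealth, 2))
```

The model is updated only after the bet on the current triplet has been settled, so the prediction function used at step t depends on past data alone.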

In what follows, we describe each of the above components

in depth, deﬁne the proposed test martingale, and prove its

validity. Later, in Section 4.2, we provide additional design

principles that improve the power of the test.

Before doing so, we pause to provide a small synthetic experiment that showcases how the wealth process $S_t$ behaves under the null and the alternative. To this end, we generate two different data sets. The first satisfies $H_0$, which we refer to as null data, in which $X$ is unimportant. The second satisfies the alternative, which we call non-null data, in which $X$ is important. The data generation process for each case and the implementation details are described in Section 5.1. Next, we apply e-CRT on each data set, and present in Figure 1 the wealth process $S_t$ as a function of $t$. When the test is applied to the null data, the value of $S_t$ remains close to the initial wealth $S_0 = 1$ for all presented time steps $t$. In particular, $S_t$ does not exceed the value $1/\alpha = 20$, and thus $H_0$ cannot be rejected. By contrast, when the test is applied to the non-null data, the wealth process grows as the testing procedure proceeds, until reaching a target value of $1/\alpha = 20$. In this case, we reject the null and report that $X$ is indeed important. This experiment illustrates the advantage of monitoring the value of $S_t$ over time: we can safely terminate the test after collecting 300 samples and avoid a wasteful collection of new data.

4.1 Formulating the Test Martingale

Our procedure exploits the dummy feature $\tilde{X}_t$, sampled from the conditional distribution of $X \mid Z$, to form a fair game. In the sequel, we state key properties of the dummies, which we will use to define our test. The proofs of all statements given in this section are provided in Supplementary Section A. We start by emphasizing that we sample $\tilde{X}_t \sim P_{X|Z}(X_t \mid Z_t)$ without looking at $Y_t$, and so $\tilde{X}_t \perp\!\!\!\perp X_t \mid Z_t$ for all $t \in \mathbb{N}$ by construction. Therefore, $X_t$ and its dummy $\tilde{X}_t$ are exchangeable conditional on $Z_t$; that is, $(X_t, \tilde{X}_t) \mid Z_t \overset{d}{=} (\tilde{X}_t, X_t) \mid Z_t$, where $\overset{d}{=}$ reads as ‘equal in distribution’. This implies that it is impossible to distinguish between $X_t$ and its dummy $\tilde{X}_t$ when viewing $Z_t$, for any time step $t$. Furthermore, under the null, this exchangeability property holds not only conditionally on $Z_t$ but also on $Y_t$.

Lemma 1. Take $(X_t, Y_t, Z_t) \sim P_{XYZ}$, and let $\tilde{X}_t$ be drawn independently from $P_{X|Z}$ without looking at $Y_t$. If $Y_t \perp\!\!\!\perp X_t \mid Z_t$, then $(X_t, \tilde{X}_t, Y_t, Z_t) \overset{d}{=} (\tilde{X}_t, X_t, Y_t, Z_t)$.

The above result lies at the heart of the knockoff and CRT

frameworks, and its proof follows (Candes et al., 2018,

Lemma 3.2), (Barber et al., 2020, Lemma 1). Lemma 1

implies that, if the null is true, it is impossible to tell which

is the original feature and which is the dummy when view-

ing the full observation, at any time step t. This result is

essential for proving the validity of the CRT p-value, as

well as for formulating our test martingale, as we do next.

Denote by $\mathcal{F}_t = \sigma(\{X_s, Y_s, Z_s\}_{s=1}^{t})$ the sigma-algebra generated by observations collected up to time $t$, where $\mathcal{F}_0$ is the trivial sigma-algebra. Let $q_t = T(X_t, Y_t, Z_t; \hat{f}_t) \in \mathbb{R}$ and $\tilde{q}_t = T(\tilde{X}_t, Y_t, Z_t; \hat{f}_t) \in \mathbb{R}$ be the test statistics evaluated on the original and dummy triplets, respectively. Importantly, $T(\cdot; \hat{f}_t)$ can be any function, and its choice may affect the power of the test. For instance, one can define $T(\cdot; \hat{f}_t)$ as the squared prediction error evaluated on the current sample, $T(x, y, z; \hat{f}_t) = (\hat{f}_t(x, z) - y)^2$, using a model $\hat{f}_t$ trained on past data $\{X_s, Y_s, Z_s\}_{s=1}^{t-1}$. Observe that $\hat{f}_t$ is not fitted on the new triplet $(X_t, Y_t, Z_t)$, thus it is considered as a fixed function once conditioning on $\mathcal{F}_{t-1}$.

We then proceed by evaluating a betting score

$$W_t = g(q_t, \tilde{q}_t), \tag{4}$$

where the function $g(a, b) \in [-m, m]$ is antisymmetric, $g(a, b) = -g(b, a)$, satisfying $g(a, b) > 0$ if $b > a$ and $g(a, b_1) \ge g(a, b_2)$ for $b_1 \ge b_2$. For example, $g(a, b) = m \cdot \text{sign}(b - a)$. The hyper-parameter $0 < m \le 1$ controls the magnitude of the score. As in the knockoff filter, our design of $g$ ensures it follows the flip sign property, requiring that a swap of the original feature $X_t$ and its dummy $\tilde{X}_t$ will flip the sign of $W_t$ (Candes et al., 2018).
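The requirements on g are easy to check in code. The sketch below implements the sign-based score mentioned in the text, together with a smooth tanh-based variant that is an assumption of this example (it is not taken from the text above); the names `g_sign`, `g_tanh`, and the `scale` parameter are hypothetical.

```python
import math

def g_sign(a, b, m=1.0):
    """Sign-based score: m * sign(b - a), with values in {-m, 0, +m}."""
    return m * ((b > a) - (b < a))

def g_tanh(a, b, m=1.0, scale=1.0):
    """Smooth score (assumed variant): m * tanh(scale * (b - a)), inside [-m, m]."""
    return m * math.tanh(scale * (b - a))

# q is the error on the original triplet, q_dummy the error on the dummy triplet.
q, q_dummy = 0.4, 1.3
for g in (g_sign, g_tanh):
    score = g(q, q_dummy, m=0.5)
    swapped = g(q_dummy, q, m=0.5)
    print(g.__name__, "score:", score, "after swapping the arguments:", swapped)
```

Both functions are antisymmetric, bounded by m in absolute value, positive when the dummy error is larger, and non-decreasing in the second argument, so either one satisfies the conditions stated for g.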

Under the alternative, one should interpret a strictly positive betting score $W_t > 0$ as a successful bet, which will increase our wealth. This means that we gain some evidence that $X_t$ carries extra predictive power about $Y_t$ beyond what is already known in $Z_t$. Analogously, a strictly negative $W_t < 0$ indicates an erroneous bet, which will reduce our wealth even though the null is false. Crucially, under the null, $W_t$ will be zero on average, no matter how accurate $\hat{f}_t$ is. In other words, it is impossible to have a systematically positive $W_t$ when $H_0$ is true.

Lemma 2. Under the same conditions as in Lemma 1, if $H_0$ is true then $\mathbb{E}_{H_0}[W_t \mid \mathcal{F}_{t-1}] = 0$ for all $t \in \mathbb{N}$.

The core idea behind the proof of the above lemma is that, under the null, $W_t$ has a symmetric distribution about zero conditional on $\mathcal{F}_{t-1}$, and thereby its expected value is zero; see (Ramdas et al., 2020) for a related property of symmetric distributions. In particular, $W_t$ is equally likely to have positive and negative values, which is a well-known result in the knockoff literature with the important difference that in our case we show it holds conditionally on $\mathcal{F}_{t-1}$.
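The symmetry argument can be written out in three short steps. This is a sketch assembled from Lemma 1 and the antisymmetry of $g$; it is not the authors' proof, which is given in Supplementary Section A.

```latex
\begin{align*}
(q_t, \tilde{q}_t) \mid \mathcal{F}_{t-1}
  &\overset{d}{=} (\tilde{q}_t, q_t) \mid \mathcal{F}_{t-1}
  && \text{(Lemma 1, with } \hat{f}_t \text{ fixed given } \mathcal{F}_{t-1}\text{)} \\
W_t = g(q_t, \tilde{q}_t)
  &\overset{d}{=} g(\tilde{q}_t, q_t) = -g(q_t, \tilde{q}_t) = -W_t
  && \text{(antisymmetry of } g\text{)} \\
\mathbb{E}_{H_0}[W_t \mid \mathcal{F}_{t-1}]
  &= -\mathbb{E}_{H_0}[W_t \mid \mathcal{F}_{t-1}]
  \;\Longrightarrow\; \mathbb{E}_{H_0}[W_t \mid \mathcal{F}_{t-1}] = 0
  && \text{(the expectation exists since } |W_t| \le m\text{)}
\end{align*}
```

The first step uses the fact that the new triplet and its dummy are independent of the past, so swapping $X_t$ and $\tilde{X}_t$ swaps $q_t$ and $\tilde{q}_t$ without changing their joint distribution given $\mathcal{F}_{t-1}$.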

Armed with the betting score $W_t$ at time $t$, we turn to define a test martingale $\{S_t : t \in \mathbb{N}_0\}$ for $H_0$. The martingale can be thought of as the wealth process, initialized by toy money $S_0 = 1$, and our ultimate goal is to maximize it. We begin with defining the base martingale as follows:

$$S_t^v := \prod_{j=1}^{t} (1 + v \cdot W_j), \tag{5}$$

where $v \in [0, 1]$ is a fixed amount of toy money that we are willing to risk at step $t$.¹ Proposition 2 in Supplementary Section C.1 shows that $\{S_t^v : t \in \mathbb{N}_0\}$ in (5) is a valid test martingale. As a result, following Ville’s inequality in (3), one can monitor $S_t^v$ and control the type-I error for any choice of $v \in [0, 1]$. Importantly, the amount of toy money $v$ that we risk when placing the bet affects the power.

The above immediately raises the question of how we should choose $v$. Ideally, we want to set the best constant $v^*$ so that $S_t^{v^*}$ is maximized under the alternative. The problem is that we are not allowed to look at the current betting score $W_t$, so it is impossible to find such an ideal data-dependent $v^*$ in foresight. As a thought experiment, consider the simplest choice for $g$ as the sign function for which $W_t \in \{+1, -1\}$, and suppose we adopt an aggressive betting strategy with $v = 1$. With this choice, when we win a bet we will increase $S_t^v$ by the maximal amount possible at step $t$. However, if we lose a bet even once, we will have $S_t^v = 0$, resulting in a powerless test; to see this, assign $W_t = -1$ in (5). We give a concrete example that visualizes this discussion in Supplementary Section C.2.

¹ We can set a different $v_t$ for each time step, yet $v_t$ must be chosen without looking at the current $(X_t, Y_t, Z_t)$ as otherwise the test will cease to be valid. Intuitively, in such a case one can always set $v_t = 0$ when $W_t$ is negative and $v_t = 1$ otherwise, and increase the wealth regardless of whether the null is true or false.
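The thought experiment is easy to reproduce numerically. The sketch below evaluates the base martingale in (5) on a hypothetical sequence of sign scores (eight wins and two losses, chosen only for illustration) for three bet sizes; `base_martingale` is a hypothetical helper name.

```python
def base_martingale(scores, v):
    """Wealth path S_t^v = prod_{j <= t} (1 + v * W_j), starting from S_0 = 1."""
    wealth, path = 1.0, [1.0]
    for w in scores:
        wealth *= 1 + v * w
        path.append(wealth)
    return path

# Hypothetical betting scores W_1, ..., W_10 (assumption): +1 is a win, -1 a loss.
scores = [1, 1, 1, -1, 1, 1, -1, 1, 1, 1]
for v in (0.25, 0.5, 1.0):
    print(f"v = {v}: final wealth = {base_martingale(scores, v)[-1]:.4f}")
```

With v = 1 the wealth drops to zero at the first loss and never recovers, whereas the more cautious bet sizes keep a positive wealth that continues to grow with later wins.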

As a way out, we formulate a powerful betting strategy using the mixture-method of Shekhar and Ramdas (2021), which is intimately connected to universal portfolio optimization (Cover, 2011). The mixture-method is defined as the average over $S_t^v$ for all $v \in [0, 1]$:

$$S_t = \int_0^1 S_t^v \cdot h(v) \, dv, \tag{6}$$

where $h(v)$ is a probability density function (pdf) whose support is on the $[0, 1]$ interval, e.g., a uniform distribution.
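For a uniform density h, the integral in (6) can be approximated by averaging base martingales over a grid of bet sizes. The sketch below uses a midpoint rule; the grid size, the score sequence, and the name `mixture_wealth` are assumptions made for illustration.

```python
def mixture_wealth(scores, grid_size=100):
    """Midpoint-rule approximation of S_t = int_0^1 S_t^v dv (uniform h)."""
    grid = [(k + 0.5) / grid_size for k in range(grid_size)]  # bet sizes v in (0, 1)
    base = [1.0] * grid_size                                   # S_0^v = 1 for every v
    path = [1.0]
    for w in scores:
        # Update every base martingale with the same betting score W_t.
        base = [s * (1 + v * w) for s, v in zip(base, grid)]
        path.append(sum(base) / grid_size)
    return path, grid, base

# Hypothetical betting scores (assumption): eight wins and two losses.
scores = [1, 1, 1, -1, 1, 1, -1, 1, 1, 1]
path, grid, base = mixture_wealth(scores)
best_index = max(range(len(grid)), key=lambda k: base[k])
print("mixture wealth:", round(path[-1], 4))
print("best bet size on the grid in hindsight:", grid[best_index],
      "with wealth", round(base[best_index], 4))
```

The averaged wealth never exceeds the best base martingale, but it stays positive after a loss and needs no knowledge of the best bet size in advance.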

We adopt the mixture method betting strategy to formulate our test martingale since it has appealing power properties, which we discuss soon. Before doing so, however, we shall first prove that the test martingale in (6) is valid. The theorem presented below states that by monitoring $S_t$ one can safely reject the null the first time $S_t$ exceeds $1/\alpha$, while rigorously controlling the type-I error simultaneously for all optional stopping times. This result holds in finite samples, without making any modeling assumptions on the conditional distribution of $Y \mid X$, and for any machine learning model $\hat{f}_t$, which we use to bet against the null. The proof follows (Shekhar and Ramdas, 2021, Section 2.2).

Theorem 1. Under the same conditions as in Lemma 1, if the null hypothesis $H_0$ is true then for any $\alpha \in (0, 1)$,

$$\mathbb{P}_{H_0}(\exists t : S_t \ge 1/\alpha) \le \alpha.$$

Having established the validity of the test, we turn to discuss the key advantage of the mixture method betting strategy. The idea behind this approach is that one of the base martingales $S_t^v$ in (6) must hit the best constant $v^*$, which, in turn, drives the average martingale $S_t$ upwards. We demonstrate this visually in Supplementary Section C.2. In fact, Shekhar and Ramdas (2021) proved that $S_t$ is not only dominated by $S_t^{v^*}$, but can also provably form a consistent test that achieves power one in the limit of infinite data.

Proposition 1 (Shekhar and Ramdas (2021)). If $\liminf_{t \to \infty} \frac{1}{t} \sum_{s=1}^{t} W_s > 0$ under the alternative $H_1$, then $\mathbb{P}_{H_1}(\exists t : S_t \ge 1/\alpha) = 1$ for any $\alpha \in (0, 1)$.

The condition $\liminf_{t \to \infty} \frac{1}{t} \sum_{s=1}^{t} W_s > 0$ means that the predictive model only needs to tell apart the original and dummy triplets on average; this suffices for the test to be consistent in the limit of infinite data.
