
$k$-sparse transport is most useful in situations where a distribution shift has happened along a subset of dimensions, such as explaining a shift where some sensors in a network are picking up a change in an environment. However, in cases where points shift in different directions based on their original value, e.g., when investigating how a heterogeneous population responded to an advertising campaign, $k$-sparse transport is not ideal. Thus, we provide a shift explanation that breaks the source and target distributions into $k$ sub-populations and provides a vector-based shift explanation per sub-population, which we call $k$-cluster transport.
$k$-Cluster Transport: Given a $k \in \{1, \ldots, D\}$, we define $k$-cluster transport to be a mapping which moves each point $x$ by a constant vector specific to $x$'s cluster. More formally, we define a labeling function $\sigma(x; M) \triangleq \arg\min_j \|m_j - x\|_2$, which returns the index of the column in $M$ (i.e., the label of the cluster) to which $x$ is closest. With this, we define
$$\Omega^{(k)}_{\text{cluster}} = \left\{ T : T(x) = x + \delta_{\sigma(x;M)},\ M \in \mathbb{R}^{D \times k},\ \Delta \in \mathbb{R}^{D \times k} \right\},$$
where $\delta_j$ is the $j$th column of $\Delta$.
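As a concrete illustration, below is a minimal NumPy sketch of applying a $k$-cluster transport map; it assumes the cluster centers $M$ and shift vectors $\Delta$ have already been estimated (e.g., via the empirical OT methods of Sec. 4), and the function and variable names are ours, not part of the method itself.

```python
import numpy as np

def k_cluster_transport(X, M, Delta):
    """Apply a k-cluster transport map T(x) = x + delta_{sigma(x; M)}.

    X     : (n, D) array of source samples.
    M     : (D, k) array whose columns are the cluster centers m_j.
    Delta : (D, k) array whose j-th column delta_j is the shift for cluster j.
    """
    # sigma(x; M) = argmin_j ||m_j - x||_2 for each row x of X
    dists = np.linalg.norm(X[:, :, None] - M[None, :, :], axis=1)  # (n, k)
    labels = np.argmin(dists, axis=1)                              # (n,)
    # Move each point by the constant vector of its assigned cluster
    return X + Delta[:, labels].T
```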
Since measuring the exact interpretability of a mapping is heavily context-dependent, we can instead use $k$ in the above transport maps to define a partial ordering of interpretability of mappings within a class of transport maps. Let $k_1$ and $k_2$ be the sizes of the active sets for $k$-sparse maps (or the numbers of clusters for $k$-cluster maps) of $T_1$ and $T_2$, respectively. If $k_1 \leq k_2$, then $\mathrm{Inter}(T_1) \geq \mathrm{Inter}(T_2)$, where $\mathrm{Inter}(T)$ is the interpretability of shift explanation $T$. For example, we claim the interpretability of a $T_1 \in \Omega^{(k=10)}_{\text{sparse}}$ is greater than (or possibly equal to) the interpretability of a $T_2 \in \Omega^{(k=100)}_{\text{sparse}}$, since a shift explanation in $\Omega$ which moves points along only 10 dimensions is more interpretable than a similar mapping which moves points along 100 dimensions. A similar result can be shown for $k$-cluster transport, since an explanation of how 5 clusters moved under a shift is less complicated than an explanation of how 10 clusters moved. This allows us to define a partial ordering on interpretability without having to determine the absolute interpretability of an individual explanation $T$, as doing so would require expensive, context-specific human evaluations, which are out of scope for this paper.
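To make the partial nature of this ordering explicit, the following small sketch (with illustrative names of our own) compares two explanations only when they belong to the same transport class; maps from different classes are left incomparable.

```python
from typing import Optional

def compare_interpretability(class1: str, k1: int, class2: str, k2: int) -> Optional[int]:
    """Partial order on interpretability within one transport class
    (e.g., both k-sparse or both k-cluster). Returns +1 if the first map
    is at least as interpretable, -1 if the second is, and None if the
    two maps are incomparable (different classes)."""
    if class1 != class2:
        return None  # the ordering is only defined within a class
    return 1 if k1 <= k2 else -1
```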
3.3. Intrinsically Interpretable Maps For Images
To find interpretable transport mappings for images, we could first project $P_{\text{src}}$ and $P_{\text{tgt}}$ onto a low-dimensional interpretable latent space (e.g., a space which has disentangled and semantically meaningful dimensions) and then apply the methods above in this latent space. Concretely, let us denote the (pseudo-)invertible encoder as $g : \mathbb{R}^D \to \mathbb{R}^{D'}$, where $D' < D$ (e.g., an autoencoder). Given this encoder, we define our set of high-dimensional interpretable transport maps:
$$\Omega_{\text{high-dim}} := \left\{ T : T(x) = g^{-1}(\tilde{T}(g(x))),\ \tilde{T} \in \Omega,\ g \in \mathcal{I} \right\},$$
where $\Omega$ is the set of interpretable mappings (e.g., $k$-sparse mappings) and $\mathcal{I}$ is the set of (pseudo-)invertible functions with an interpretable (i.e., semantically meaningful) latent space. Finally, given an interpretable $g \in \mathcal{I}$, this gives us High-dimensional Interpretable Transport: $T_{\text{HIT}}$.
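As an illustrative sketch (not the exact implementation explored in Appendix D), $T_{\text{HIT}}$ amounts to composing an encoder, a latent-space interpretable map, and a decoder; the `encode`, `decode`, and `T_latent` callables below are assumed to be trained separately, and their names are ours.

```python
def high_dim_interpretable_transport(X, encode, decode, T_latent):
    """T_HIT(x) = g^{-1}(T~(g(x))): encode to an interpretable latent
    space, apply an interpretable transport map there, then decode.

    X        : (n, D) array of source samples (e.g., flattened images).
    encode   : g : R^D -> R^{D'}, a (pseudo-)invertible encoder.
    decode   : approximate inverse g^{-1} : R^{D'} -> R^D.
    T_latent : interpretable map in the latent space (e.g., k-sparse).
    """
    Z = encode(X)             # g(x)
    Z_shifted = T_latent(Z)   # T~(g(x))
    return decode(Z_shifted)  # g^{-1}(T~(g(x)))
```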
As seen in the Stanford WILDS dataset (Koh et al., 2021), which contains benchmark examples of real-world image-based distribution shifts, image-based shifts can be immensely complex. In order to provide an adequate intrinsically interpretable mapping explanation of a distribution shift in high-dimensional data (e.g., images), multiple advances must first be made (e.g., finding a disentangled latent space with semantically meaningful dimensions, approximating high-dimensional empirical optimal transport maps, etc.), which are beyond the scope of this paper. We further explore details about $T_{\text{HIT}}$, its variants, and the results of using $T_{\text{HIT}}$ to explain Colorized-MNIST in Appendix D, and we hope future work can build upon this framework.
3.4. Post-Hoc Explanations of Image-Based Mappings
via Counterfactual Examples
As mentioned above, in some cases solving for an interpretable latent space can be too difficult or costly, and thus a shift cannot be expressed by an interpretable mapping function. However, if the samples themselves are easy to interpret (e.g., images), we can still explain a transport mapping by visualizing translated samples. Specifically, we can remove the interpretability constraint on the mapping itself and leverage methods from the unpaired image-to-image translation (I2I) literature to translate between the source and target domains while preserving the content. For a comprehensive summary of recent I2I methods, please see (Pang et al., 2021).
Once an I2I mapping is found, to serve as an explanation, we can provide an operator with a set of counterfactual pairs $\{(x, T(x)) : x \sim P_{\text{src}},\ T(x) \sim P_{\text{tgt}}\}$. Then, by determining what commonly stays invariant and what commonly changes across the set of counterfactual pairs, the operator can form an explanation of how the source distribution shifted to the target distribution. While more broadly applicable, this approach could place a higher load on the operator than an intrinsically interpretable mapping approach.
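As a sketch of this workflow, given any trained unpaired I2I translator (e.g., one of the methods surveyed in Pang et al., 2021), the counterfactual set can be assembled by simply pairing each source sample with its translation; the `translate` function below is a placeholder for such a learned model.

```python
def counterfactual_pairs(source_samples, translate):
    """Build the set {(x, T(x))} by translating source samples to the
    target domain with an unpaired I2I mapping T.

    source_samples : iterable of samples x ~ P_src (e.g., images).
    translate      : the learned I2I mapping T, taking a source sample
                     to its target-domain counterpart.
    """
    return [(x, translate(x)) for x in source_samples]

# An operator can then inspect the pairs side by side to see what
# commonly changes (and what stays invariant) under the shift.
```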
4. Practical Methods for Finding and
Validating Shift Explanations
In this section, we discuss practical methods for finding these maps via empirical OT (Sec. 4.1, 4.2, and 4.3) and introduce a PercentExplained metric which can assist the operator in selecting the hyperparameter $k$ in $k$-sparse and $k$-cluster transport (Sec. 4.4).