Towards Explaining Distribution Shifts

Sean Kulinski¹, David I. Inouye¹
Abstract
A distribution shift can have fundamental consequences such as signaling a change in the operating environment or significantly reducing the accuracy of downstream models. Thus, understanding distribution shifts is critical for examining and hopefully mitigating the effect of such a shift. Most prior work focuses on merely detecting if a shift has occurred and assumes any detected shift can be understood and handled appropriately by a human operator. We hope to aid in these manual mitigation tasks by explaining the distribution shift using interpretable transportation maps from the original distribution to the shifted one. We derive our interpretable mappings from a relaxation of optimal transport, where the candidate mappings are restricted to a set of interpretable mappings. We then inspect multiple quintessential use-cases of distribution shift in real-world tabular, text, and image datasets to showcase how our explanatory mappings provide a better balance between detail and interpretability than baseline explanations, by both visual inspection and our PercentExplained metric.
1. Introduction
Most real-world environments are constantly changing, and in many situations, understanding how a specific operating environment has changed is crucial to making decisions in response to such a change. Such a change might be due to a new data distribution seen in deployment which causes a machine-learning model to begin to fail. Another example is a decrease in monthly sales, which could be due to a temporary supply-chain issue in distributing a product or could mark a shift in consumer preferences. When these changes are encountered, the burden is often placed on a human operator to investigate the shift and determine the appropriate reaction, if any, that needs to be taken. In this work, our goal is to aid these operators by providing an explanation of such a change.

¹Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA. Correspondence to: David I. Inouye <dinouye@purdue.edu>, Sean Kulinski <skulinsk@purdue.edu>.

Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by the author(s).
This ubiquitous phenomenon of a difference between related distributions is known as distribution shift. Much prior work focuses on detecting distribution shifts; however, there is little prior work on understanding a detected distribution shift. Since it is usually solely up to an operator investigating a flagged distribution shift to decide what to do next, understanding the shift is critical for the operator to efficiently mitigate any of its harmful effects. Because there are no cohesive methods for understanding distribution shifts, and because shifts can be highly complex (e.g., (Koh et al., 2021)), this important manual investigation task can be daunting. The current de facto standard for analyzing a shift in tabular data is to look at how the mean of the original (source) distribution shifted to the new (target) distribution. However, this simple explanation can miss crucial shift information because it is a coarse summary (e.g., Figure 2) or, in high-dimensional regimes, can be uninterpretable. Thus, there is a need for methods that can automatically provide detailed, yet interpretable, information about a detected shift, which ultimately can lead to actionable insights about the shift.
Therefore, we propose a novel framework for explaining distribution shifts, such as showing how features have changed or how groups within the distributions have shifted. Since a distribution shift can be seen as a movement from a source distribution $x \sim P_{src}$ to a target distribution $y \sim P_{tgt}$, we define a distribution shift explanation as a transport map $T(x)$ which maps a point in our source distribution onto a point in our target distribution. For example, under this framework, the typical distribution shift explanation via mean shift can be written as $T(x) = x + (\mu_y - \mu_x)$. Intuitively, these transport maps can be thought of as a functional approximation of how the source distribution could have moved in a distribution space to become our target distribution. However, without making assumptions on the type of shift, there exist many possible mappings that explain the shift (see subsection A.1 for examples). Thus, we leverage prior optimal transport work to define an ideal distribution shift explanation and develop practical algorithms for finding and validating such maps.
[Figure 1 graphic omitted: a flowchart guiding the choice among the three proposed explanation types, with example $k$-sparse explanations of a shift from non-toxic to toxic comments.]

Figure 1: An overall look at our approach to explaining distribution shifts, where given a source $P_{src}$ and shifted $P_{tgt}$ dataset pair, a user can choose to explain the distribution shift using $k$-sparse maps (which are best suited for high-dimensional or feature-wise complex data), $k$-cluster maps (for tracking how heterogeneous groups change across the shift), or distribution translation maps (for data which has uninterpretable raw features such as images). For details on the results seen in the three boxes, please see the experiments in section 5 and Appendix C.
We summarize our contributions as follows:

• In section 3, we define intrinsically interpretable transport maps by constraining a relaxed form of the optimal transport problem to only search over a set of interpretable mappings, and we suggest possible interpretable sets. We also suggest methods for explaining image-based shifts, such as distributional counterfactual examples.

• In section 4, we develop practical methods and a PercentExplained metric for finding intrinsically interpretable mappings, which allow us to adjust the interpretability of an explanation to fit the needs of a situation.

• In section 5, we show empirical results on real-world tabular, text, and image-based datasets demonstrating how our explanations can aid an operator in understanding how a distribution has shifted.
2. Related Works
The problem of distribution shift has been extensively characterized (Quiñonero-Candela et al., 2009; Storkey, 2009; Moreno-Torres et al., 2012) by breaking down a joint distribution $P(x, y)$ of features $x$ and outputs $y$ into conditional factorizations such as $P(y|x)P(x)$ or $P(x|y)P(y)$. In covariate shift (Sugiyama et al., 2007), the marginal $P(x)$ differs from source to target but the output conditional $P(y|x)$ remains the same, while in label shift (also known as prior probability shift) (Zhang et al., 2013; Lipton et al., 2018), the marginal $P(y)$ differs from source to target but the full-feature conditional $P(x|y)$ remains the same. In this work, we address the general problem of distribution shift, i.e., a shift in the joint distribution (with no distinction between $y$ and $x$), and leave applications of explaining specific sub-genres of distribution shift to future work.
As far as we are aware, this is the first work specifically tackling explaining distribution shifts; thus, there are no accepted methods, baselines, or metrics for distribution shift explanations. However, there are distinct lines of work that can be applied to explain distribution shifts. For example, one could use feature attribution methods (Saarela & Jauhiainen, 2021; Molnar, 2020) on a domain/distribution classifier (e.g., Shanbhag et al. (2021) use Shapley values (Shapley, 1997) to explain how changing input feature distributions affect a classifier's behavior), or one could find samples which are most illustrative of the differences between distributions (Brockmeier et al., 2021). Additionally, one could use counterfactual generation methods (Karras et al., 2019; Sauer & Geiger, 2021; Pawelczyk et al., 2020) to produce "distributional counterfactuals", which would show what a sample from $P_{tgt}$ would have looked like if it had instead come from $P_{src}$ (e.g., Pawelczyk et al. (2020) use a classifier-guided VAE to generate class counterfactuals on tabular data). We explore this distributional counterfactual explanation approach in subsection 3.4.
A sister field is that of detecting distribution shifts. This is commonly done using methods such as statistical hypothesis testing of the input features (Nelson, 2003; Rabanser et al., 2018; Quiñonero-Candela et al., 2009), training a domain classifier to distinguish source from non-source samples (Lipton et al., 2018), etc. In Kulinski et al. (2020) and Budhathoki et al. (2021), the authors attempt to provide more information beyond the binary "has a shift occurred?" by localizing a shift to a subset of features or causal mechanisms. Kulinski et al. (2020) do this by introducing the notion of feature shift, which first detects if a shift has occurred and, if so, localizes that shift to the specific subset of features that have shifted from source to target. In Budhathoki et al. (2021), the authors take a causal approach by individually factoring the source and target distributions into a product of their causal mechanisms (i.e., a variable conditioned on its parents) using a shared causal graph, which is assumed to be known a priori. The authors then "replace" a subset of causal mechanisms from $P_{src}$ with those from $P_{tgt}$ and measure the divergence from $P_{src}$ (i.e., measuring how much the subset change affects the source distribution). Both of these methods are still focused on detecting distribution shifts (via identifying shifted causal mechanisms or feature-level shifts), unlike explanatory mappings, which help explain how the data has shifted.
3. Explaining Shifts via Transport Maps
The underlying assumption of distribution shift is that there exists a relationship between the source and target distributions. From a distributional standpoint, we can view distribution shift as a movement, or transportation, of samples from the source distribution $P_{src}$ to the target distribution $P_{tgt}$. Thus, we can capture this relationship between the distributions via a transport map $T$ from the source distribution to the target, i.e., if $x \sim P_{src}$, then $T(x) \sim P_{tgt}$. If this mapping is understandable to an operator investigating a distribution shift, it can serve as an explanation of what changed between the environments, thus allowing for more effective reactions to the shift. Therefore, in this work, we define a distribution shift explanation as: finding an interpretable transport map $T$ which approximately maps a source distribution $P_{src}$ onto a target distribution $P_{tgt}$ such that $T_\sharp P_{src} \approx P_{tgt}$. Similar to ML model interpretability (Molnar, 2020), an interpretable map can either be one that is intrinsically interpretable (subsection 3.1) or a mapping that is explained via post-hoc methods such as sets of input-output pairs (subsection 3.4).
3.1. Intrinsically Interpretable Transportation Maps
To find such a mapping between distributions, it is natural to look to Optimal Transport (OT) and its extensive prior work (Cuturi, 2013; Arjovsky et al., 2017; Torres et al., 2021; Peyré & Cuturi, 2019). Given a transportation cost function $c$, an OT mapping optimally moves points from one distribution to align with another distribution and is defined as:

$$T_{OT} := \arg\min_{T} \; \mathbb{E}_{P_{src}}[c(x, T(x))] \quad \text{s.t.} \quad T_\sharp P_{src} = P_{tgt}$$

where $T_\sharp P_{src}$ is the pushforward operator, which can be viewed as applying $T$ to all points in $P_{src}$, and $T_\sharp P_{src} = P_{tgt}$ is the marginal constraint, which means the pushforward distribution must match the target distribution. OT is a natural starting point for shift explanations as it allows for a rich geometric structure on the space of distributions, and by finding a mapping that minimizes a transport cost, we force our mapping to retain as much information about the original $x$ samples as possible when aligning $P_{src}$ and $P_{tgt}$. For more details about OT, please see (Villani, 2009; Peyré & Cuturi, 2019).
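As a concrete starting point, a sample-based $T_{OT}$ can be computed with the POT library; the sketch below uses the barycentric projection of the optimal coupling to extract a point-wise map, which is a standard construction we assume here rather than the paper's exact implementation:

# A minimal sketch of an empirical OT map between two sample sets using
# the POT library (pip install pot). The barycentric projection of the
# optimal coupling turns the coupling into a point-wise map.
import numpy as np
import ot

def empirical_ot_map(X, Y):
    """Map each source sample in X (N x D) toward the target samples Y (M x D)."""
    a, b = ot.unif(len(X)), ot.unif(len(Y))  # uniform empirical weights
    M = ot.dist(X, Y)                        # squared Euclidean cost matrix
    G = ot.emd(a, b, M)                      # optimal coupling (N x M)
    # Barycentric projection: each x_i maps to the G-weighted average of Y.
    return (G @ Y) / G.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))      # source samples
Y = rng.normal(3.0, 1.0, size=(200, 2))      # target samples (mean-shifted)
T_OT_X = empirical_ot_map(X, Y)
print(np.round(T_OT_X.mean(axis=0) - X.mean(axis=0), 2))  # roughly [3. 3.]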
However, since OT considers all possible mappings which satisfy the marginal constraint, the resulting $T_{OT}$ can be arbitrarily complex and thus possibly uninterpretable as a shift explanation. We can alleviate this by restricting the candidate transport maps to belong to a set of user-defined interpretable mappings $\Omega$. However, this problem can be infeasible if $\Omega$ does not contain a mapping that satisfies the marginal alignment constraint. Therefore, we use Lagrangian relaxation to relax the marginal constraint, giving us an Interpretable Transport mapping $T_{IT}$:

$$T_{IT} := \arg\min_{T \in \Omega} \; \mathbb{E}_{P_{src}}[c(x, T(x))] + \lambda\, \phi(P_{T(x)}, P_{tgt}) \qquad (1)$$

where $\phi(\cdot,\cdot)$ is a distribution divergence function (e.g., KL or Wasserstein). In this paper, we assume $c$ is the squared Euclidean cost and $\phi(\cdot,\cdot)$ is the squared Wasserstein-2 metric, unless stated otherwise. Due to the heavily complex and context-specific nature of distribution shift, $\Omega$ would likely be initialized based on context. However, we suggest two general methods in the next section as a starting point and hope that future work can build upon this framework for specific contexts.
3.2. Intrinsically Interpretable Transport Sets
The current common practice for explaining distribution shifts is comparing the means of the source and target distributions. The mean shift explanation can be generalized as $\Omega_{vector} = \{T : T(x) = x + \delta\}$, where $\delta$ is a constant vector; mean shift is the specific case where $\delta$ is the difference of the source and target means. By letting $\delta$ be a function of $x$, which further generalizes the notion of mean shift by allowing each point to move a variable amount per dimension, we arrive at a transport set that includes any possible mapping $T : \mathbb{R}^D \to \mathbb{R}^D$. However, even a simple transport set like $\Omega_{vector}$ can yield uninterpretable mappings in high-dimensional regimes (e.g., a shift vector of over 100 dimensions). To combat this, we can constrain the complexity of a mapping by forcing it to only move points along a specified number of dimensions, which we call $k$-Sparse Transport.
$k$-Sparse Transport: For a given class of transport maps $\Omega$ and a given $k \in \{1, \ldots, D\}$, we can find a subset $\Omega^{(k)}_{sparse} \subseteq \Omega$ which is the set of transport maps from $\Omega$ which only transport points along $k$ dimensions or fewer. Formally, we define an active set $\mathcal{A}$ to be the set of dimensions along which a given $T$ moves points: $\mathcal{A}(T) \equiv \{j \in \{1, \ldots, D\} : \exists x,\; T(x)_j - x_j \neq 0\}$. Then, we define $\Omega^{(k)}_{sparse} = \{T \in \Omega : |\mathcal{A}(T)| \leq k\}$.
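Since the active set is defined pointwise, it can be estimated directly from samples; below is a minimal NumPy sketch of checking whether a candidate map belongs to $\Omega^{(k)}_{sparse}$ (the movement tolerance `tol` is our assumption, not from the paper):

import numpy as np

# A sketch of estimating the active set A(T) of a map T from samples X
# and testing membership in the k-sparse transport set.
def active_set(T, X, tol=1e-8):
    moved = np.abs(T(X) - X) > tol          # (N, D): which coordinates moved
    return np.flatnonzero(moved.any(axis=0))

def is_k_sparse(T, X, k):
    return len(active_set(T, X)) <= k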
$k$-sparse transport is most useful in situations where a distribution shift has happened along a subset of dimensions, such as explaining a shift where some sensors in a network are picking up a change in an environment. However, in cases where points shift in different directions based on their original value, e.g., when investigating how a heterogeneous population responded to an advertising campaign, $k$-sparse transport is not ideal. Thus, we provide a shift explanation that breaks the source and target distributions into $k$ sub-populations and provides a vector-based shift explanation per sub-population, which we call $k$-Cluster Transport.
$k$-Cluster Transport: Given a $k \in \{1, \ldots, D\}$, we define $k$-cluster transport to be a mapping which moves each point $x$ by a constant vector which is specific to $x$'s cluster. More formally, we define a labeling function $\sigma(x; M) \equiv \arg\min_j \|m_j - x\|_2$, which returns the index of the column in $M$ (i.e., the label of the cluster) to which $x$ is closest. With this, we define $\Omega^{(k)}_{cluster} = \{T : T(x) = x + \delta_{\sigma(x;M)},\; M \in \mathbb{R}^{D \times k},\; \Delta \in \mathbb{R}^{D \times k}\}$, where $\delta_j$ is the $j$th column of $\Delta$.
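As a small illustration of applying a map in $\Omega^{(k)}_{cluster}$, here is a minimal sketch where $M$ and $\Delta$ are assumed given (the estimation procedure is described in subsection 4.3):

import numpy as np

# A sketch of applying a k-cluster map: each point moves by the column of
# Delta corresponding to its nearest column of M, i.e., the label sigma(x; M).
def k_cluster_map(x, M, Delta):
    # x: (D,), M: (D, k) cluster centers, Delta: (D, k) shift vectors.
    j = np.argmin(np.linalg.norm(M - x[:, None], axis=0))
    return x + Delta[:, j]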
Since measuring the exact interpretability of a mapping is heavily context-dependent, we can instead use $k$ in the above transport maps to define a partial ordering of interpretability of mappings within a class of transport maps. Let $k_1$ and $k_2$ be the sizes of the active sets for $k$-sparse maps (or the numbers of clusters for $k$-cluster maps) of $T_1$ and $T_2$ respectively. If $k_1 \leq k_2$, then $\text{Inter}(T_1) \geq \text{Inter}(T_2)$, where $\text{Inter}(T)$ is the interpretability of shift explanation $T$. For example, we claim the interpretability of a $T_1 \in \Omega^{(k=10)}_{sparse}$ is greater than (or possibly equal to) the interpretability of a $T_2 \in \Omega^{(k=100)}_{sparse}$, since a shift explanation in $\Omega$ which moves points along only 10 dimensions is more interpretable than a similar mapping which moves points along 100 dimensions. A similar result holds for $k$-cluster transport, since an explanation of how 5 clusters moved under a shift is less complicated than an explanation of how 10 clusters moved. This method allows us to define a partial ordering on interpretability without having to determine the absolute interpretability of an individual explanation $T$, as that would require expensive context-specific human evaluations, which are out of scope for this paper.
3.3. Intrinsically Interpretable Maps For Images
To find interpretable transport mappings for images, we could first project $P_{src}$ and $P_{tgt}$ onto a low-dimensional interpretable latent space (e.g., a space which has disentangled and semantically meaningful dimensions) and then apply the methods above in this latent space. Concretely, let us denote the (pseudo-)invertible encoder as $g : \mathbb{R}^D \to \mathbb{R}^{D'}$ where $D' < D$ (e.g., an autoencoder). Given this encoder, we define our set of high-dimensional interpretable transport maps: $\Omega_{high\text{-}dim} := \{T : T = g^{-1}(\tilde{T}(g(x))),\; \tilde{T} \in \Omega,\; g \in \mathcal{I}\}$, where $\Omega$ is the set of interpretable mappings (e.g., $k$-sparse mappings) and $\mathcal{I}$ is the set of (pseudo-)invertible functions with an interpretable (i.e., semantically meaningful) latent space. Finally, given an interpretable $g \in \mathcal{I}$, this gives us High-dimensional Interpretable Transport: $T_{HIT}$.
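In code, the composition defining $\Omega_{high\text{-}dim}$ is simple; a minimal sketch where `encoder`, `decoder`, and `latent_map` are hypothetical stand-ins for $g$, $g^{-1}$, and an interpretable $\tilde{T}$ (e.g., a $k$-sparse map fit in the latent space):

# A sketch of T_HIT(x) = g^{-1}(T~(g(x))): encode, apply the interpretable
# latent map, then decode back to the original (e.g., image) space. All
# three callables are hypothetical placeholders, not the paper's models.
def t_hit(x, encoder, decoder, latent_map):
    return decoder(latent_map(encoder(x)))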
As seen in the Stanford WILDS dataset (Koh et al., 2021), which contains benchmark examples of real-world image-based distribution shifts, image-based shifts can be immensely complex. To provide an adequate intrinsically interpretable mapping explanation of a distribution shift in high-dimensional data (e.g., images), multiple new advances must first be made (e.g., finding a disentangled latent space with semantically meaningful dimensions, approximating high-dimensional empirical optimal transport maps, etc.), which are out of scope for this paper. We further explore details about $T_{HIT}$, its variants, and the results of using $T_{HIT}$ to explain Colorized-MNIST in Appendix D, and we hope future work can build upon this framework.
3.4. Post-Hoc Explanations of Image-Based Mappings via Counterfactual Examples
As mentioned above, in some cases, solving for an interpretable latent space can be too difficult or costly, and thus a shift cannot be expressed by an interpretable mapping function. However, if the samples themselves are easy to interpret (e.g., images), we can still explain a transport mapping by visualizing translated samples. Specifically, we can remove the interpretability constraint on the mapping itself and leverage methods from the unpaired image-to-image translation (I2I) literature to translate between the source and target domains while preserving content. For a comprehensive summary of recent I2I works and methods, please see (Pang et al., 2021).

Once an I2I mapping is found, to serve as an explanation, we can provide an operator with a set of counterfactual pairs $\{(x, T(x)) : x \sim P_{src},\; T(x) \sim P_{tgt}\}$. Then, by determining what commonly stays invariant and what commonly changes across the set of counterfactual pairs, an operator can infer how the source distribution shifted to the target distribution. While more broadly applicable, this approach could put a higher load on the operator than an intrinsically interpretable mapping approach.
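Assembling such a set is mechanical once a translator is trained; a minimal sketch where `translate` is a hypothetical stand-in for any unpaired I2I model mapping source images into the target domain:

# A sketch of collecting distributional counterfactual pairs. An operator
# then inspects what varies and what stays invariant across the pairs.
def counterfactual_pairs(source_images, translate):
    return [(x, translate(x)) for x in source_images]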
4. Practical Methods for Finding and Validating Shift Explanations

In this section, we discuss practical methods for finding these maps via empirical OT (Sec. 4.1, 4.2, and 4.3) and introduce a PercentExplained metric which can assist the operator in selecting the hyperparameter $k$ in $k$-sparse and $k$-cluster transport (Sec. 4.4).
4.1. Empirical Interpretable Transport Upper Bound
As the divergence term in our interpretable transport objective (Equation 1) can be computationally expensive to optimize in practice, we propose to optimize the following simplification, which simply penalizes the difference between the map and the sample-based OT solution $T_{OT}$ (which can be computed efficiently for samples or approximated via the Sinkhorn algorithm (Cuturi, 2013)):

$$\arg\min_{T} \; \frac{1}{N} \sum_{i=1}^{N} c\left(x^{(i)}, T(x^{(i)})\right) + \lambda\, d\left(T(x^{(i)}), T_{OT}(x^{(i)})\right) \qquad (2)$$

where $d$ is the squared $\ell_2$ distance. Notably, the divergence value in Equation 1 is replaced with the average of a sample-specific distance between $T(x)$ and the optimal transport mapping $T_{OT}(x)$. This is computationally attractive as the optimal transport solution only needs to be calculated once, rather than calculating the Wasserstein distance once per iteration, as would be required if directly optimizing the Interpretable Transport problem. Additionally, we prove in subsection A.2 that the second term in Equation 2 is an upper bound on the divergence when the divergence is the squared Wasserstein distance, i.e., when $\phi = W_2^2$.
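Evaluated on samples, the simplified objective is straightforward; a minimal NumPy sketch, assuming the candidate map and the precomputed OT solution have already been applied row-wise to the sample matrix:

import numpy as np

# A sketch of the relaxed objective in Equation 2, with both c and d taken
# to be squared Euclidean distances as assumed in the paper.
def relaxed_objective(X, T_X, T_OT_X, lam):
    cost = np.mean(np.sum((T_X - X) ** 2, axis=1))        # transport cost term
    align = np.mean(np.sum((T_X - T_OT_X) ** 2, axis=1))  # alignment (bound) term
    return cost + lam * align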
4.2. Finding k-Sparse Maps
The $k$-sparse algorithm can be broken down into two steps. First, given $k$, we estimate the active set $\mathcal{A}$ by simply taking the $k$ dimensions with the largest difference of means between the two distributions. This is a simple approach that avoids optimization over an exponential number of possible subsets for $\mathcal{A}$ and can be optimal in some cases, as explained below. Second, given the active set $\mathcal{A}$, we need to estimate the map. While estimating $k$-sparse solutions to the original interpretable transport problem (Equation 1) is challenging, we prove that the solution with optimal alignment under the upper bound above (Equation 2) can be computed in closed form for two special cases. If the optimization set is restricted to only shifting the mean, i.e., $\Omega^{(k)} = \Omega^{(k)}_{vector}$, then the solution with optimal alignment is:

$$\forall j,\; [T(x)]_j = \begin{cases} x_j + (\mu^{tgt}_j - \mu^{src}_j), & \text{if } j \in \mathcal{A} \\ x_j, & \text{if } j \notin \mathcal{A} \end{cases} \qquad (3)$$

where $\mu^{src}$ and $\mu^{tgt}$ are the means of the source and target distributions respectively. Similarly, if $\Omega^{(k)}$ is unconstrained except for sparsity, then the solution with optimal alignment is simply:

$$\forall j,\; [T(x)]_j = \begin{cases} [T_{OT}(x)]_j, & \text{if } j \in \mathcal{A} \\ x_j, & \text{if } j \notin \mathcal{A} \end{cases} \qquad (4)$$

where $[T_{OT}(x)]_j$ is the $j$-th coordinate of the sample-based OT solution. The proofs of alignment optimality w.r.t. the divergence upper bound in Equation 2 are based on the decomposability of the squared Euclidean distance and can be found in Appendix A. The final algorithm for both sparse maps can be found in Algorithm 1.
Algorithm 1 Finding k-Sparse Maps

Input: Domain datasets $X \in \mathbb{R}^{N \times D}$ and $Y \in \mathbb{R}^{N \times D}$ with $N$ samples of dimensionality $D$ each, the desired sparsity $k$, and interpretable set type, i.e., $\Omega$.

// Select active set based on means
$\mu_{diff} \leftarrow \mu_{tgt} - \mu_{src} = \frac{1}{N}\sum_{i=1}^{N} Y_i - \frac{1}{N}\sum_{i=1}^{N} X_i$
$\mathcal{A} \leftarrow \text{TopKIndices}(\text{abs}(\mu_{diff}), k)$
// Create dimension-wise maps based on active set
if $\Omega = \Omega_{vector}$ then
    $\forall j,\; [T(x)]_j = \begin{cases} x_j + \mu^{diff}_j, & \text{if } j \in \mathcal{A} \\ x_j, & \text{if } j \notin \mathcal{A} \end{cases}$
else
    $T_{OT}(\cdot) \leftarrow \text{OptimalTransportAlg}(X, Y)$
    $\forall j,\; [T(x)]_j = \begin{cases} [T_{OT}(x)]_j, & \text{if } j \in \mathcal{A} \\ x_j, & \text{if } j \notin \mathcal{A} \end{cases}$
end if
Output: $T(\cdot)$
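A runnable NumPy version of Algorithm 1 might look as follows; the barycentric-projection OT map via the POT library is our assumption for OptimalTransportAlg, as in the earlier sketch:

import numpy as np
import ot  # pip install pot

# A minimal sketch of Algorithm 1: the "vector" branch implements
# Equation 3 and the other branch Equation 4.
def find_k_sparse_map(X, Y, k, omega="vector"):
    mu_diff = Y.mean(axis=0) - X.mean(axis=0)
    A = np.argsort(np.abs(mu_diff))[-k:]               # active set: top-k mean gaps
    if omega == "vector":                              # Equation 3
        delta = np.zeros_like(mu_diff)
        delta[A] = mu_diff[A]
        return lambda X_in: X_in + delta
    # Equation 4: copy the sample-based OT solution on active dims only.
    G = ot.emd(ot.unif(len(X)), ot.unif(len(Y)), ot.dist(X, Y))
    T_OT_X = (G @ Y) / G.sum(axis=1, keepdims=True)    # barycentric projection
    def T(X_in):                                       # defined on the training X
        out = X_in.copy()
        out[:, A] = T_OT_X[:, A]
        return out
    return T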
4.3. Finding k-Cluster Maps
Similar to $k$-sparse maps, we split this algorithm into two parts: (1) estimate pairs of source and target clusters, and then (2) compute a mean shift for each pair of clusters. For the first step, one might naïvely expect that independent clustering on each domain's dataset, followed by post-hoc pairing of these clusters, would be sufficient. However, this could yield very poor cluster pairs that are significantly mismatched, because the domain-specific clustering may not be optimal in terms of the alignment objective. For example, the source domain may have one large and one small cluster while the target domain has equal-sized clusters. Therefore, it is important to cluster the source and target samples jointly. To estimate paired (i.e., dependent) clusterings of the source and target domain samples, we first find the OT mapping from source to target. We then cluster a paired dataset formed by concatenating each source sample with its OT-mapped sample (which corresponds to one of the target samples). The clustering on these paired samples gives paired cluster centroids for the source and target, denoted $\mu^{src}$ and $\mu^{tgt}$ respectively, which we use to construct a cluster-specific mean shift map defined as:

$$T(x) = x + (\mu^{tgt}_{\sigma(x)} - \mu^{src}_{\sigma(x)}) \qquad (5)$$

where $\sigma(x) = \arg\min_j \|x - \mu^{src}_j\|_2^2$ is the cluster label function. This map applies a simple shift to every source domain cluster to map it to the target domain. Algorithm 2 shows pseudo-code for both steps in our $k$-cluster method.
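A minimal sketch of this two-step procedure is below; scikit-learn's KMeans and the barycentric-projection OT map stand in for the paper's Algorithm 2, which we do not reproduce exactly:

import numpy as np
import ot  # pip install pot
from sklearn.cluster import KMeans

# A sketch of the two-step k-cluster procedure: jointly cluster
# (x, T_OT(x)) pairs with k-means, then shift each source cluster by the
# difference of its paired centroids (Equation 5).
def find_k_cluster_map(X, Y, k):
    # Sample-based OT map from source to target (barycentric projection).
    G = ot.emd(ot.unif(len(X)), ot.unif(len(Y)), ot.dist(X, Y))
    T_OT_X = (G @ Y) / G.sum(axis=1, keepdims=True)
    # Jointly cluster the concatenated (source, OT-mapped) samples.
    km = KMeans(n_clusters=k, n_init=10).fit(np.hstack([X, T_OT_X]))
    D = X.shape[1]
    mu_src = km.cluster_centers_[:, :D]                # paired centroids (k, D)
    mu_tgt = km.cluster_centers_[:, D:]
    def T(X_in):
        # sigma(x): index of the nearest source centroid.
        labels = ((X_in[:, None, :] - mu_src[None]) ** 2).sum(-1).argmin(axis=1)
        return X_in + (mu_tgt - mu_src)[labels]        # Equation 5
    return T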