Why Random Pruning Is All We Need to Start Sparse

Advait Gadhikar 1, Sohom Mukherjee 1, Rebekka Burkholz 1
Abstract
Random masks define surprisingly effective sparse neural network models, as has been shown empirically. The resulting sparse networks can often compete with dense architectures and state-of-the-art lottery ticket pruning algorithms, even though they do not rely on computationally expensive prune-train iterations and can be drawn initially without significant computational overhead. We offer a theoretical explanation of how random masks can approximate arbitrary target networks if they are wider by a logarithmic factor in the inverse sparsity, 1/log(1/sparsity). This overparameterization factor is necessary at least for 3-layer random networks, which elucidates the observed degrading performance of random networks at higher sparsity. At moderate to high sparsity levels, however, our results imply that sparser networks are contained within random source networks, so that any dense-to-sparse training scheme can be turned into a computationally more efficient sparse-to-sparse one by constraining the search to a fixed random mask. We demonstrate the feasibility of this approach in experiments for different pruning methods and propose particularly effective choices of initial layer-wise sparsity ratios of the random source network. As a special case, we show theoretically and experimentally that random source networks also contain strong lottery tickets. Our code is available at https://github.com/RelationalML/sparse_to_sparse.
1. Introduction
The impressive breakthroughs achieved by deep learning have largely been attributed to the extensive overparametrization of deep neural networks, as it seems to have multiple benefits for their representational power and optimization (Belkin et al., 2019). The resulting trend towards ever larger models and datasets, however, imposes increasing computational and energy costs that are difficult to meet. This raises the question: Is this high degree of overparameterization truly necessary?

*Equal contribution. 1 CISPA Helmholtz Center for Information Security, Saarbrücken, Germany. Correspondence to: Advait Gadhikar <advait.gadhikar@cispa.de>.

Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by the author(s).
Training general small-scale or sparse deep neural network architectures from scratch remains a challenge for standard initialization schemes (Li et al., 2016; Han et al., 2015). However, Frankle & Carbin (2019) have recently demonstrated that there exist sparse architectures that can be trained to solve standard benchmark problems competitively. According to their Lottery Ticket Hypothesis (LTH), dense randomly initialized networks contain subnetworks that can be trained in isolation to a test accuracy that is comparable with the one of the original dense network. Such subnetworks, the lottery tickets (LTs), have since been obtained by pruning algorithms that require computationally expensive pruning-retraining iterations (Frankle & Carbin, 2019; Tanaka et al., 2020) or mask learning procedures (Savarese et al., 2020; Sreenivasan et al., 2022b). While these can lead to computational gains at training and inference time and reduce memory requirements (Hassibi et al., 1993; Han et al., 2015), the real goal remains to identify sparse trainable architectures before training, as this could lead to significant computational savings. Yet, contemporary pruning-at-initialization approaches (Lee et al., 2018; Wang et al., 2020; Tanaka et al., 2020; Fischer & Burkholz, 2022; Frankle et al., 2021) achieve less competitive performance. For that reason, it is all the more remarkable that even iterative state-of-the-art approaches struggle to outperform a simple, computationally cheap, and data-independent alternative: random pruning at initialization (Su et al., 2020). Liu et al. (2021) have provided systematic experimental evidence for its 'unreasonable' effectiveness in multiple settings, including complex, large-scale architectures and data.
We explain theoretically why random masks can be effective by proving that a randomly masked network can approximate an arbitrary target network if it is wider by a logarithmic factor in its sparsity, 1/log(1/sparsity). By deriving a lower bound on the required width of a random 1-hidden-layer network, we further show that this degree of overparameterization is necessary in general. This implies that sparse random networks have the universal function approximation property like dense networks and are at least as expressive as potential target networks. However, it also highlights the limitations of random pruning in case of extremely high sparsities, as the width requirement then scales approximately as 1/log(1/sparsity) ≈ 1/(1 − sparsity) (see also Fig. 2 for an example). In practice, we observe a similar degradation in performance at high sparsity levels.
Even for moderate to high sparsities, the randomness of the connections results in a considerable number of excess weights that are not needed for the representation of a target network. This insight suggests that, on the one hand, additional pruning could further enhance the sparsity of the resulting neural network structure, as random masks are likely not optimally sparse. On the other hand, any dense-to-sparse training approach would not need to start from a dense network but could also start training from a sparser random network and thus be turned into a sparse-to-sparse learning method. The main idea is visualized in Fig. 1 and verified in extensive experiments with different lottery ticket pruning and continuous sparsification approaches. Our main results could also be interpreted as theoretical justification for Dynamic Sparse Training (DST) (Evci et al., 2020; Liu et al., 2021; Bellec et al., 2018), which prunes random networks of moderate sparsity. However, it further relies on edge rewiring steps that sometimes require the computation of gradients of the corresponding dense network (Evci et al., 2020). Our derived limitations of random pruning indicate that this rewiring might be necessary at extreme sparsities but likely not for moderately sparse random starting points, as we also highlight in additional experiments.

Figure 1. Sparse training with randomly masked (ER) networks: a visual representation of the main implication of our theory, namely that sparse-to-sparse training can be effective by starting from a randomly masked (ER) network.
As a special case of the main idea to prune random networks, we also consider strong lottery tickets (SLTs) (Zhou et al., 2019; Ramanujan et al., 2020). These are subnetworks of large, randomly initialized source networks, which do not require any further training after pruning. Theoretical (Malach et al., 2020; Pensia et al., 2020; Fischer et al., 2021; da Cunha et al., 2022; Burkholz, 2022a;b; Burkholz et al., 2022) as well as empirical (Ramanujan et al., 2020; Zhou et al., 2019; Diffenderfer & Kailkhura, 2021; Sreenivasan et al., 2022a) existence proofs have so far focused solely on pruning dense source networks. We highlight the potential for computational resource savings in the search for SLTs by proving their existence within sparse random networks instead. The main component of our results is Lemma 2.2, which extends subset sum approximations to the sparse random graph setting. This enables the direct transfer of most SLT existence results for different architectures and activation functions to sparse source networks. Furthermore, we modify the algorithm edge-popup (EP) (Ramanujan et al., 2020) to find SLTs accordingly, which leads, to the best of our knowledge, to the first sparse-to-sparse pruning approach for SLTs. We demonstrate in experiments that starting even at sparsities as high as 0.8 does not hamper the overall performance of EP.
Note that our general theory applies to any layerwise sparsity ratios of the random source network, and we validate this fact in various experiments on standard benchmark image data and commonly used neural network architectures, complementing results by Liu et al. (2021) for additional choices of sparsity ratios. Our two proposals, balanced and pyramidal sparsity ratios, seem to perform competitively across multiple settings, especially in higher sparsity regimes.
Contributions
1. We prove that randomly pruned random networks are sufficiently expressive and can approximate an arbitrary target network if they are wider by a factor of 1/log(1/sparsity). This overparametrization factor is necessary in general, as our lower bound for univariate target networks indicates.

2. Inspired by our proofs, we empirically demonstrate that, without significant loss in performance, any dense-to-sparse training scheme can be translated into a sparse-to-sparse one by starting from a random source network instead of a dense one.

3. As a special case, we also prove the existence of Strong Lottery Tickets (SLTs) within sparse random source networks, if the source network is wider than a target by a factor 1/log(1/sparsity). Our modification of the edge-popup (EP) algorithm (Ramanujan et al., 2020) leads to the first sparse-to-sparse SLT pruning method, which validates our theory and highlights potential for computational savings.

4. To demonstrate that our theory applies to various choices of sparsity ratios, we introduce two additional proposals that outperform state-of-the-art ones on multiple benchmarks and are thus promising candidates for starting points of sparse-to-sparse learning schemes.
1.1. Related Work
Algorithms to prune neural networks for unstructured sparsity can be broadly categorized into two groups: pruning after training and pruning before (or during) training. The first group of algorithms, which prune after training, is effective in speeding up inference, but still relies on a computationally expensive training procedure (Hassibi et al., 1993; LeCun et al., 1989; Molchanov et al., 2016; Dong et al., 2017; Yu et al., 2022). The second group of algorithms prunes at initialization (Lee et al., 2018; Wang et al., 2020; Tanaka et al., 2020; Sreenivasan et al., 2022b; de Jorge et al., 2020) or follows a computationally expensive cycle of pruning and retraining for multiple iterations (Gale et al., 2019; Savarese et al., 2020; You et al., 2019; Frankle & Carbin, 2019; Renda et al., 2019). These methods find trainable subnetworks, also known as lottery tickets (Frankle & Carbin, 2019). Single-shot pruning approaches are computationally cheaper but are susceptible to problems like layer collapse, which renders the pruned network untrainable (Lee et al., 2018; Wang et al., 2020). Tanaka et al. (2020) address this issue by preserving flow in the network through their scoring mechanism. The best performing sparse networks are still obtained by expensive iterative pruning methods like Iterative Magnitude Pruning (IMP) and Iterative Synflow (Frankle & Carbin, 2019; Fischer & Burkholz, 2022) or by continuous sparsification methods (Sreenivasan et al., 2022b; Savarese et al., 2020; Kusupati et al., 2020; Louizos et al., 2018).
However, Su et al. (2020) found that randomly pruned masks can outperform expensive iterative pruning strategies in different situations. Inspired by this finding, Golubeva et al. (2021) and Chang et al. (2021) have hypothesized that sparse overparameterized networks are more effective than smaller networks with the same number of parameters. Liu et al. (2021) have further demonstrated the competitiveness of random masks for different data-independent choices of layerwise sparsity ratios across a wide range of neural network architectures and datasets, including complex ones. Our analysis identifies the conditions under which the effectiveness of random masks is reasonable. We show that a sparse random source network can approximate a target network if it is wider by a factor proportional to the inverse log sparsity. Complementing experiments by Liu et al. (2021), we highlight that random masks are competitive for various choices of layerwise sparsity ratios. However, we also show that their randomness likely leaves potential for further pruning.
We build on the lottery ticket existence theory (Malach et al., 2020; Pensia et al., 2020; Orseau et al., 2020; Fischer et al., 2021; Burkholz et al., 2022; Burkholz, 2022b; Ferbach et al., 2022) to prove that sparse random source networks actually contain strong lottery tickets (SLTs) if their width exceeds a value that is proportional to the width of a target network. This theory has been inspired by experimental evidence for SLTs (Ramanujan et al., 2020; Zhou et al., 2019; Diffenderfer & Kailkhura, 2021; Sreenivasan et al., 2022a). The underlying algorithm, edge-popup (Ramanujan et al., 2020), finds SLTs by training scores for each parameter of the dense source network and is thus computationally as expensive as dense training. We show that training smaller random sparse source networks is sufficient, thus effectively reducing the computational requirements for finding SLTs.
However, our theory suggests that random ER networks face a fundamental limitation at extreme sparsities, as the overparameterization factor scales in this regime as 1/log(1/sparsity) ≈ 1/(1 − sparsity). This shortcoming could potentially be addressed by targeted rewiring of random edges with Dynamic Sparse Training (DST) that starts pruning from an ER network (Liu et al., 2021; Mocanu et al., 2018; Yuan et al., 2021). So far, sparse-to-sparse training methods like those of Evci et al. (2020) and Dettmers & Zettlemoyer (2019) still require dense gradients for their edge rewiring operation. Zhou et al. (2021) obtain sparse training by estimating a sparse gradient using two forward passes. We empirically show that, in light of the expressive power of random networks, we can also achieve sparse-to-sparse training by simply constraining any pruning method or gradient to a fixed initial sparse random mask.
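The last point can be made concrete with a short sketch. The following PyTorch-style snippet is a minimal illustration rather than the exact implementation in our repository: the ER mask is drawn once, gradients of pruned entries are zeroed, and the mask is re-applied after every update so that pruned weights can never be revived.

```python
import torch

def apply_masks(model, masks):
    """Zero out every weight that the fixed ER mask removed."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])

def masked_training_step(model, masks, loss_fn, optimizer, x, y):
    """One optimization step constrained to a fixed sparse random mask."""
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    for name, param in model.named_parameters():
        if name in masks and param.grad is not None:
            param.grad.mul_(masks[name])  # discard gradients of pruned weights
    optimizer.step()
    apply_masks(model, masks)  # guard against momentum or weight decay reviving pruned weights
    return loss.item()
```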
2. Expressiveness of Random Networks
Our theoretical investigations in this section have the purpose of explaining why the effectiveness of random networks is reasonable given their high expressive power. We show that we can approximate any target network with the help of a random network, provided that it is wider by a logarithmic factor in the inverse sparsity. First, the only constraint that we face in our explicit construction of a representative subnetwork is that edges are randomly available or unavailable. We can, however, choose the remaining network parameters, i.e., the weights and biases, in such a way that we optimally represent a target network. As is common in results on expressiveness and representational power, we make statements about the existence of such parameters, not necessarily about whether they can be found algorithmically. In practice, the parameters would usually be identified by standard neural network training or prune-train iterations. Our experiments validate that this is actually feasible, in addition to plenty of other experimental evidence (Su et al., 2020; Ma et al., 2021; Liu et al., 2021). Second, we prove the existence of strong lottery tickets (SLTs), which assumes that we have to approximate the target parameters by pruning the sparse random source network. To the best of our knowledge, we are the first to provide experimental and theoretical evidence for the feasibility of this case.
Background, Notation, and Proof Setup. Let x = (x_1, x_2, ..., x_d) ∈ [a_1, b_1]^d be a bounded d-dimensional input vector, where a_1, b_1 ∈ R with a_1 < b_1. f : [a_1, b_1]^d → R^{n_L} is a fully-connected feed-forward neural network with architecture (n_0, n_1, ..., n_L), i.e., depth L and n_l neurons in layer l. Every layer l ∈ {1, 2, ..., L} computes neuron states x^{(l)} = φ(h^{(l)}), with h^{(l)} = W^{(l-1)} x^{(l-1)} + b^{(l-1)}. h^{(l)} is called the pre-activation, W^{(l)} ∈ R^{n_l × n_{l-1}} is the weight matrix, and b^{(l)} is the bias vector. We also write f(x; θ) to emphasize the dependence of the neural network on its parameters θ = (W^{(l)}, b^{(l)})_{l=1}^L. For simplicity, we restrict ourselves to the common ReLU activation function φ(x) = max{x, 0}, but most of our results can be easily extended to more general activation functions as in (Burkholz, 2022b;a). In addition to fully-connected layers, we also consider convolutional layers. For a convenient notation, without loss of generality, we flatten the weight tensors so that W^{(l)} ∈ R^{c_l × c_{l-1} × k_l}, where c_l, c_{l-1}, k_l are the output channels, input channels, and filter dimension, respectively. For instance, a 2-dimensional convolution on image data would result in k_l = k_{1,l} k_{2,l}, where k_{1,l}, k_{2,l} define the filter size.
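As a small illustration of this notation, the following sketch (ours, with hypothetical layer sizes) evaluates f(x; θ) for a fully-connected ReLU network; the masked networks considered below only differ in that the weight matrices are multiplied elementwise by binary masks.

```python
import numpy as np

def relu(h):
    """ReLU activation phi(x) = max{x, 0}."""
    return np.maximum(h, 0.0)

def forward(x, weights, biases):
    """Evaluate f(x; theta): each layer computes h = W x + b and applies phi."""
    activation = np.asarray(x, dtype=float)
    for l, (W, b) in enumerate(zip(weights, biases)):
        pre_activation = W @ activation + b
        # Keep the output layer linear, as in the target networks of Section 2.1.
        activation = pre_activation if l == len(weights) - 1 else relu(pre_activation)
    return activation

# Hypothetical architecture (n_0, n_1, n_2) = (4, 8, 2).
rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 4)), rng.standard_normal((2, 8))]
biases = [rng.standard_normal(8), rng.standard_normal(2)]
print(forward(rng.standard_normal(4), weights, biases))
```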
We distinguish three kinds of neural networks: a target network f_T, a source network f_S, and a subnetwork f_P of f_S. f_T is approximated or exactly represented by f_P, which is obtained by masking the parameters of the source f_S. f_S is said to contain an SLT if this subnetwork does not require further training after obtaining the mask (by pruning). We assume that f_T has depth L and parameters (W_T^{(l)}, b_T^{(l)}, n_{T,l}, m_{T,l})_{l=1}^L, where W_T^{(l)}, b_T^{(l)}, n_{T,l}, and m_{T,l} are the weight matrix, bias, number of neurons, and number of nonzero parameters of the weight matrix in layer l ∈ {1, 2, ..., L}. Note that this implies m_l ≤ n_l n_{l-1}. Similarly, f_S has depth L + 1 with parameters (W_S^{(l)}, b_S^{(l)}, n_{S,l}, m_{S,l})_{l=0}^L. Note that l ranges from 0 to L for the source network, while it only ranges from 1 to L for the target network. The extra source network layer l = 0 accounts for an extra layer that we need in our construction to prove existence.
ER Networks. Even though common, the terminology 'random network' is imprecise with respect to the random distribution from which a graph is drawn. In line with general graph theory, we therefore use the term Erdős-Rényi (ER) network (Erdős et al., 1960) in the following. An ER neural network f_ER ∼ ER(p) is characterized by layerwise sparsity ratios p_l. An ER source f_ER is defined as a subnetwork of a complete source network using a binary mask S_ER^{(l)} ∈ {0,1}^{n_l × n_{l-1}} or S_ER^{(l)} ∈ {0,1}^{n_l × n_{l-1} × k_l} for every layer. The mask entries are drawn from independent Bernoulli distributions with layerwise success probability p_l > 0, i.e., s_{ij,ER}^{(l)} ∼ Ber(p_l). The random pruning is performed initially with negligible computational overhead and the mask stays fixed during training. Note that p_l is also the expected density of that layer. The overall expected density of the network is given as p = (Σ_l m_l p_l) / (Σ_k m_k) = 1 − sparsity. In case of uniform sparsity, p_l = p for all layers, we overload notation and simply write ER(p) with scalar p. An ER network is defined as f_ER = f_S(x; W · S_ER). Different from conventional SLT existence proofs (Ramanujan et al., 2020), we refer to f_ER ∼ ER(p) as the source network and show that the SLT is contained within this ER network. The SLT is then defined by the mask S_P, which is a subnetwork of S_ER, i.e., a zero entry s_{ij,ER} = 0 also implies a zero entry s_{ij,P} = 0, but the converse is not true. We skip the subscripts if the nature of the mask is clear from the context. In the following analysis of expressiveness in ER networks, we continue to use S_ER and S_P to denote a random ER source network and a sparse subnetwork within the ER network, respectively.
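For illustration, a minimal sketch (ours, with a hypothetical architecture) of drawing such a fixed ER mask and checking its realized density against the prescribed p_l:

```python
import numpy as np

def sample_er_masks(layer_shapes, densities, seed=0):
    """Draw a binary ER mask per layer: each entry is kept independently
    with probability p_l, i.e., s_ij ~ Ber(p_l)."""
    rng = np.random.default_rng(seed)
    return [(rng.random(shape) < p).astype(np.float64)
            for shape, p in zip(layer_shapes, densities)]

def overall_density(masks):
    """Realized counterpart of p = sum_l m_l p_l / sum_k m_k = 1 - sparsity."""
    kept = sum(mask.sum() for mask in masks)
    total = sum(mask.size for mask in masks)
    return kept / total

# Hypothetical 3-layer architecture with uniform density p_l = 0.2 (sparsity 0.8).
shapes = [(256, 784), (256, 256), (10, 256)]
masks = sample_er_masks(shapes, densities=[0.2, 0.2, 0.2])
print(f"realized overall density: {overall_density(masks):.3f}")
```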
Sparsity Ratios. There are plenty of reasonable choices for the layerwise sparsity ratios and thus ER probabilities p_l. Our theory applies to all of them. The optimal choice for a given source network architecture depends on the target network and thus on the solution to a learning problem, which is usually unknown a priori in practice. To demonstrate that our theory holds for different approaches, we investigate the following layerwise sparsity ratios in experiments. The simplest baseline is a globally uniform choice p_l = p. Liu et al. (2021) have compared this choice in extensive experiments with their main proposal, ERK, which assigns p_l ∝ (n_in + n_out) / (n_in n_out) to a linear layer and p_l ∝ (c_l + c_{l-1} + k_l) / (c_l c_{l-1} k_l) (Mocanu et al., 2017) to a convolutional layer. In addition, we propose a pyramidal and a balanced approach, which are visualized in Appendix A.15.
Pyramidal: This method emulates a property of pruned networks obtained by IMP (Frankle & Carbin, 2019), namely that the layer densities decay with increasing depth of the network. For a network of depth L, we use p_l = (p_1)^l with p_l ∈ (0, 1), so that (Σ_{l=1}^L p_l m_l) / (Σ_{l=1}^L m_l) = p. Given the architecture, we use a polynomial equation solver (Harris et al., 2020) to obtain p_1 for the first layer such that p_1 ∈ (0, 1).
Balanced: The second layerwise sparsity method aims to maintain the same number of parameters in every layer for a given network sparsity p and source network architecture, so that each neuron has a similar in- and out-degree on average. Every layer has x = (p/L) Σ_{l=1}^L m_l nonzero parameters. Such an ER network can be realized with p_l = x / m_l. In case that x ≥ m_l, we set p_l = 1.
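Both proposals can be computed from the architecture alone. The following sketch (ours; the layer sizes are hypothetical, and the pyramidal root-finding mirrors the described use of a polynomial solver) derives layerwise densities for a given overall density p:

```python
import numpy as np

def pyramidal_densities(layer_params, p):
    """Densities p_l = (p_1)^l that decay with depth; p_1 is the root in (0, 1)
    of sum_l m_l * p_1^l = p * sum_l m_l."""
    m = np.asarray(layer_params, dtype=float)           # m_l for l = 1, ..., L
    coeffs = np.concatenate([m[::-1], [-p * m.sum()]])  # highest degree first
    roots = np.roots(coeffs)
    p1 = next(r.real for r in roots if abs(r.imag) < 1e-9 and 0.0 < r.real < 1.0)
    return [p1 ** l for l in range(1, len(m) + 1)]

def balanced_densities(layer_params, p):
    """Give every layer the same number x = (p / L) * sum_l m_l of nonzeros."""
    m = np.asarray(layer_params, dtype=float)
    x = p / len(m) * m.sum()
    return [min(1.0, x / ml) for ml in m]

# Hypothetical fully-connected architecture 784-300-100-10, overall density p = 0.1.
layer_params = [784 * 300, 300 * 100, 100 * 10]
print("pyramidal:", np.round(pyramidal_densities(layer_params, 0.1), 4))
print("balanced: ", np.round(balanced_densities(layer_params, 0.1), 4))
```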
2.1. General Expressiveness of ER Networks
Our main goal in this section is to derive probabilistic statements about the existence of edges in an ER source network that enable us to approximate a given target network. As every connection in the source network only exists with a probability p_l, for each target weight we need to create multiple candidate edges, of which at least one is nonzero with high enough probability. This can be achieved by ensuring that each target edge has multiple potential starting points in the ER source network. Our construction realizes this idea with multiple copies of each neuron in a layer. The required number of neuron copies depends on the sparsity of the ER source network and introduces an overparametrization factor pertaining to the width of the network. To create multiple copies of input neurons as well, our construction relies on one additional layer in the source network in comparison with a target network, as visualized in Fig. 4 in the Appendix. We first explain the construction for a single target layer and extend it afterwards to deeper architectures.
Single Hidden Layer Targets. We start by constructing a single hidden-layer fully-connected target network from a subnetwork of a random ER source network that consists of one more layer. Our proof strategy is visually explained by Fig. 4 in the Appendix. The following theorem states the precise width requirement of our construction.
Theorem 2.1 (Single Hidden Layer Target Construction). Assume that a single hidden-layer fully-connected target network f_T(x) = W_T^{(2)} φ(W_T^{(1)} x + b_T^{(1)}) + b_T^{(2)}, an allowed failure probability δ ∈ (0, 1), source densities p, and a 2-layer ER source network f_S ∼ ER(p) with widths n_{S,0} = q_0 d, n_{S,1} = q_1 n_{T,1}, n_{S,2} = q_2 n_{T,2} are given. If

q_0 ≥ (1 / log(1/(1 − p_1))) log(2 m_{T,1} q_1 / δ),
q_1 ≥ (1 / log(1/(1 − p_2))) log(2 m_{T,2} / δ), and q_2 = 1,

then with probability 1 − δ, the random source network f_S contains a subnetwork S_P such that f_S(x; W · S_P) = f_T.
Proof Outline: The key idea is to create multiple copies (blocks in Fig. 4 (b) in the Appendix) in the source network for each target neuron such that every target link is realized by pointing to at least one of these copies in the ER source. To create multiple candidates of input neurons, we create a univariate first layer in the source network as explained in Fig. 4. In the appendix, we derive the corresponding weight and bias parameters of the source network so that it can represent the target network exactly. Naturally, many of the available links will receive zero weights if they are not needed in the specific construction, but they are required for a high enough probability that at least one weight can be set to a nonzero value. Our main task in the proof is to estimate the probability that we can find representatives of all target links in the ER source network, i.e., that every neuron in layer l = 1 has at least one edge to every block in l = 0 of size q_0, as shown in Fig. 4 (b). This probability is given by (1 − (1 − p_1)^{q_0})^{m_{T,1} q_1}. For the second layer, we repeat a similar argument to bound the probability (1 − (1 − p_2)^{q_1})^{m_{T,2}} with q_2 = 1, since we do not require multiple copies of the output neurons. Bounding this probability by 1 − δ completes the proof, as detailed in Appendix A.3.
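To get a feeling for the required overparameterization, the following sketch (ours, with hypothetical target sizes) computes integer width multipliers satisfying the conditions of Theorem 2.1 and evaluates the resulting success probability:

```python
import math

def theorem_2_1_widths(m_T1, m_T2, p1, p2, delta):
    """Smallest integer width multipliers satisfying Theorem 2.1 (q_2 = 1)
    for source layer densities p1, p2."""
    q1 = math.ceil(math.log(2 * m_T2 / delta) / math.log(1 / (1 - p2)))
    q0 = math.ceil(math.log(2 * m_T1 * q1 / delta) / math.log(1 / (1 - p1)))
    return q0, q1

def success_probability(m_T1, m_T2, q0, q1, p1, p2):
    """(1 - (1 - p1)^q0)^(m_T1 * q1) * (1 - (1 - p2)^q1)^m_T2."""
    layer1 = (1 - (1 - p1) ** q0) ** (m_T1 * q1)
    layer2 = (1 - (1 - p2) ** q1) ** m_T2
    return layer1 * layer2

# Hypothetical single hidden-layer target with m_T1 = 1000, m_T2 = 100 nonzeros,
# uniform source density 0.2 (sparsity 0.8) and delta = 0.05.
q0, q1 = theorem_2_1_widths(1000, 100, p1=0.2, p2=0.2, delta=0.05)
print(q0, q1, success_probability(1000, 100, q0, q1, 0.2, 0.2))  # probability >= 0.95
```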
Deep Target Networks. Theorem 2.1 shows that q_0 and q_1 depend on 1/log(1/sparsity). We now generalize the idea of creating multiple copies of target neurons in every layer to a fully-connected network of depth L (proofs are in Appendix A.4) and to convolutional networks of depth L as stated in Appendix A.5, which yields a similar result as above. The additional challenge of the extension is to handle the dependencies between layers, as the construction of every layer needs to be feasible.
Theorem 2.2 (ER networks can represent L-layer target networks). Given a fully-connected target network f_T of depth L, δ ∈ (0, 1), source densities p, and an (L+1)-layer ER source network f_S ∼ ER(p) with widths n_{S,0} = q_0 d and n_{S,l} = q_l n_{T,l}, l ∈ {1, 2, ..., L}, where

q_l ≥ (1 / log(1/(1 − p_{l+1}))) log(L m_{T,l+1} q_{l+1} / δ) for l ∈ {0, 1, ..., L − 1} and q_L = 1,

then with probability 1 − δ the random source network f_S contains a subnetwork S_P such that f_S(x; W · S_P) = f_T.
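Since each q_l depends on q_{l+1}, the multipliers can be computed backwards from the output layer. A small sketch (ours, with hypothetical target sizes and uniform source density):

```python
import math

def theorem_2_2_widths(target_nonzeros, densities, delta):
    """Width multipliers q_0, ..., q_L for an L-layer target (Theorem 2.2).
    target_nonzeros[l] = m_{T,l+1} and densities[l] = p_{l+1} for l = 0, ..., L-1."""
    L = len(target_nonzeros)
    q = [1] * (L + 1)                      # q_L = 1: no copies of output neurons
    for l in range(L - 1, -1, -1):         # work backwards towards the input
        m_next, p_next = target_nonzeros[l], densities[l]
        q[l] = math.ceil(math.log(L * m_next * q[l + 1] / delta)
                         / math.log(1 / (1 - p_next)))
    return q

# Hypothetical 3-layer target (nonzero counts of a 784-300-100-10 network), density 0.2.
print(theorem_2_2_widths([784 * 300, 300 * 100, 100 * 10], [0.2, 0.2, 0.2], 0.05))
```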
Lower Bound on Overparameterization. While our existence results prove that ER networks have the universal function approximation property like dense neural networks, our construction requests a considerable amount of overparametrization in comparison with a dense target network to achieve this. In particular, extremely sparse ER networks seem to face a natural limitation, since for sparsities 1 − p ≥ 0.9, the overparameterization factor scales approximately as 1/log(1/(1 − p)) ≈ 1/p. Fig. 2 visualizes how this scaling becomes problematic for increasing sparsity. The next theorem establishes that, unfortunately, we cannot expect to get around this 1/log(1/(1 − p)) limitation.
Theorem 2.3 (Lower bound on Overparametrization in ER Networks). There exist univariate target networks f_T(x) = φ(w_T^T x + b_T) that cannot be represented by a random 1-hidden-layer ER source network f_S ∼ ER(p) with probability at least 1 − δ, if its width is

n_{S,1} < (1 / log(1/(1 − p))) log(1 / (1 − (1 − δ)^{1/d})).
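The following sketch (ours) illustrates the scaling 1/log(1/(1 − p)) ≈ 1/p for small densities p and evaluates the lower-bound width of Theorem 2.3 for hypothetical values of δ and d:

```python
import math

def overparam_factor(density):
    """Width factor 1 / log(1 / (1 - p)), which approaches 1/p for small densities p."""
    return 1.0 / math.log(1.0 / (1.0 - density))

def lower_bound_width(density, delta, d):
    """Width below which Theorem 2.3 rules out representing some univariate target
    with probability at least 1 - delta."""
    return overparam_factor(density) * math.log(1.0 / (1.0 - (1.0 - delta) ** (1.0 / d)))

for p in (0.5, 0.2, 0.1, 0.05, 0.01):
    print(f"density {p:5.2f}: factor {overparam_factor(p):7.2f}   1/p {1.0 / p:7.2f}   "
          f"lower bound {lower_bound_width(p, delta=0.05, d=100):8.1f}")
```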