Measures of Information Reflect Memorization Patterns

Rachit Bansal (X)
Delhi Technological University
racbansa@gmail.com
Danish Pruthi (Y)
Amazon Web Services
danish@hey.com
Yonatan Belinkov (Z)
Technion – Israel Institute of Technology
belinkov@technion.ac.il
Abstract
Neural networks are known to exploit spurious artifacts (or shortcuts) that co-occur with a target label, exhibiting heuristic memorization. On the other hand, networks have been shown to memorize training examples, resulting in example-level memorization. These kinds of memorization impede generalization of networks beyond their training distributions. Detecting such memorization can be challenging, often requiring researchers to curate tailored test sets. In this work, we hypothesize, and subsequently show, that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization. We quantify the diversity in the neural activations through information-theoretic measures and find support for our hypothesis in experiments spanning several natural language and vision tasks. Importantly, we discover that information organization points to the two forms of memorization, even for neural activations computed on unlabeled in-distribution examples. Lastly, we demonstrate the utility of our findings for the problem of model selection. The associated code and other resources for this work are available at https://rachitbansal.github.io/information-measures.
1 Introduction
Current-day deep learning networks are limited in their ability to generalize across different domains and settings. Prior studies found that these networks rely on spurious artifacts that are correlated with a target label (Schölkopf et al., 2012; Lapuschkin et al., 2019; Geirhos et al., 2019, 2020, inter alia). We refer to the learning of such artifacts (also known as heuristics or shortcuts) as heuristic memorization. Further, neural networks can also memorize individual training examples and their labels; for instance, when a subset of the examples is incorrectly labeled (Zhang et al., 2017; Arpit et al., 2017; Tänzer et al., 2021). We refer to this behavior as example-level memorization. A large body of past work has established that these facets of memorization pose a threat to generalization, especially in out-of-distribution (OOD) scenarios where the memorized input features and corresponding target mappings do not hold (Ben-David et al., 2010; Wang et al., 2021b; Hendrycks et al., 2021a; Shen et al., 2021). To simulate such OOD distributions, however, researchers are required to laboriously collect specialized and labeled datasets to measure the extent of suspected fallacies in models. While these sets make it possible to assess model behavior over a chosen set of features, the much larger set of remaining features remains hard to identify and study. Moreover, these sets are extrinsic in nature, necessitating the use of performance measures, which in turn lack interpretability and are not indicative of the internal workings that manifest certain model behaviors. These considerations motivate evaluation strategies that are intrinsic to a network and indicate model generalization while not posing practical bottlenecks in terms of specialized labeled sets. Here, we study information organization as one such potential strategy.

(X) Work done during a visit at the Technion, Israel. The author is now at Google Research India.
(Y) Work done while at Carnegie Mellon University, prior to joining Amazon.
(Z) Supported by the Viterbi Fellowship in the Center for Computer Engineering at the Technion.
36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.09404v4 [cs.LG] 1 Feb 2024

[Figure 1 panels: (a) base task; (b) induced heuristic; (c) shuffled labels; (d)-(f) neuron activations, plotted by neuron index and example; (g) neuron entropy as a proxy to intra-neuron diversity; (h) density of neuron pairs against mutual information as a proxy to inter-neuron diversity. Panels (g) and (h) compare the base task, heuristic memorization, and example-level memorization.]
Figure 1: (a) A toy setup of separating concentric circles; (b) An additional feature spuriously simplifies the task, inciting heuristic memorization; (c) Shuffled target labels induce example-level memorization; (d) Neuron activations for a two-layered feed-forward network trained for the base task in (a); (e) Activation patterns for the network reflect low intra-neuron and inter-neuron diversity when trained on (b); (f) High intra-neuron and inter-neuron diversity is seen when the network is trained on (c); (g) Entropy acts as a proxy to intra-neuron diversity; (h) Mutual Information acts as a proxy to inter-neuron diversity. Distinguishable patterns for the three networks are seen in (g) and (h).
In this work, we posit that the organization of information across internal activations of a network could be indicative of memorization. Consider a sample task of separating concentric circles, illustrated in Figure 1a. A two-layered feed-forward network can learn the circular decision boundary for this task. However, if the nature of this learning task is changed, the network may resort to memorization. When a spurious feature is introduced in this dataset such that its value (+/−) correlates with the label (0/1) (Figure 1b), the network memorizes the feature-to-label mapping, reflected in a uniform activation pattern across neurons (Figure 1e). In contrast, when labels for the original set are shuffled (Figure 1c), the same network memorizes individual examples during training and shows a high amount of diversity in its activation patterns (Figure 1f). This example demonstrates how memorizing behavior can be observed through the diversity in neuron activations.
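
The three variants of the toy task described above could be generated roughly as follows. This is a minimal sketch using only the Python standard library; the function names, the radii, the noise level, and the encoding of the spurious feature as +1/-1 are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def make_circles(n=200, inner=1.0, outer=2.0, noise=0.05, seed=0):
    """Base task: points on two concentric circles, label 0 (inner) or 1 (outer)."""
    rng = random.Random(seed)
    points, labels = [], []
    for k in range(n):
        label = k % 2  # alternate classes so both are equally represented
        radius = (inner if label == 0 else outer) + rng.gauss(0.0, noise)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
        labels.append(label)
    return points, labels


def add_spurious_feature(points, labels):
    """Heuristic variant: append a third feature whose sign reveals the label.

    Assumption: +1.0 is paired with label 1 and -1.0 with label 0.
    """
    return [(x, y, 1.0 if lab == 1 else -1.0) for (x, y), lab in zip(points, labels)]


def shuffle_labels(labels, seed=0):
    """Example-level variant: permute labels so they carry no signal about the input."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled
```

A network trained on the output of `add_spurious_feature` can solve the task from the third coordinate alone, whereas one trained on `shuffle_labels` can only fit the training set by memorizing individual points.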
We formalize the notion of diversity across neuron activations through two measures: (i) intra-neuron
diversity: the variation of activations for a neuron across a set of examples, and (ii) inter-neuron
diversity: the dissimilarity between pairwise neuron activations on the same set of examples. We
hypothesize that the nature of these quantities for two networks could point to underlying differences
in their generalizing behavior. In order to quantify intra-neuron and inter-neuron diversity, we adopt
the information-theoretic measures of entropy and mutual information (MI), respectively.
Throughout this work, we investigate if diversity across neural activations (§2) reflects model generalizability. We compare networks with varying levels of heuristic (§3) or example-level (§4) memorization across a variety of settings: synthetic setups based on the IMDb (Maas et al., 2011) and MNIST (Lecun et al., 1998) datasets for both memorization types, as well as naturally occurring scenarios of gender bias on Bias-in-Bios (De-Arteaga et al., 2019) and OOD image classification on NICO (Zhang et al., 2022). We find that the information measures consistently capture differences among networks with varying degrees of memorization: low entropy and high MI are characteristic of networks that show heuristic memorization, while high entropy and low MI are indicative of example-level memorization. Lastly, we evaluate these measures from the viewpoint of model selection and note strong correlations to rankings from domain-specific evaluation metrics (§5).
2 Methods
As per the data processing inequality (Beaudry & Renner, 2012), a part of the neural network (referred to as the encoder) compresses the most relevant information of a given input $X$ into a representation $H$. This compressed information is processed by a classification head (or decoder) to produce an output $Y$ corresponding to the given input. We hypothesize that the organization of information across neurons of the encoder is indicative of model generalization. We study two complementary properties that capture this information organization for a given network:
(i) Intra-neuron diversity: How do the activations of a given neuron vary across different input examples? We measure the entropy of neural activations (across examples) as a proxy.
(ii) Inter-neuron diversity: How unique is the activation of a neuron compared to other neurons? We quantify this via the mutual information between the activations of neuron pairs.
Below, we discuss the information measures formally.
2.1 Information Measures
For any given encoder (consisting of $N$ neurons) that maps the input to a dense hidden representation, we denote the activation of the $i$-th neuron as a random variable, $A_i \in \{a_i^1, \dots, a_i^S\}$, where each measurement is an activation over an example from a set of size $S$. The probability over this continuous activation space is computed by binning it into discrete ranges (Darbellay & Vajda, 1999), and we denote each discretized activation value as $\hat{a}$. Importantly, the set of examples on which the activations are computed comes from a distribution that is similar to the underlying training set itself.
Entropy We measure the Shannon entropy for each neuron in the concerned network, as a proxy of
intra-neuron diversity. Following the definition of Shannon entropy, this is given as:
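$$H(A_i) = \mathbb{E}_{\hat{a}_i^s \sim A_i}\left[h(\hat{a}_i^s)\right] = \sum_{j=1}^{N_{bins}} p(\hat{a}_i^j) \log\left(\frac{1}{p(\hat{a}_i^j)}\right) \qquad (1)$$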
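
As an illustration of Equation 1, the sketch below estimates the entropy of a single neuron from its activations over a set of examples. It assumes simple equal-width binning and natural logarithms; the paper cites an adaptive partitioning scheme and defers details to appendix A, so the binning rule, the default `n_bins=10`, and the function names here are assumptions made for illustration only.

```python
import math
from collections import Counter


def bin_activations(values, n_bins=10):
    """Map continuous activations to bin indices using equal-width bins (an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant neuron occupies a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp so that the maximum activation falls into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def neuron_entropy(values, n_bins=10):
    """Shannon entropy (in nats) of one neuron's binned activations, as in Equation 1."""
    bins = bin_activations(values, n_bins)
    total = len(bins)
    # p(bin) = count / total, so log(1 / p) = log(total / count).
    return sum((c / total) * math.log(total / c) for c in Counter(bins).values())
```

A neuron that outputs the same value for every example has zero entropy under this estimate, while a neuron whose activations spread evenly over all bins reaches the maximum of `log(n_bins)`.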
Mutual Information We compute the mutual information (MI) between underlying neurons as a proxy to inter-neuron diversity. Specifically, we compute the MI between all neuron pairs in the network.¹ Thus, the set of MI values $I(A_i)$ for a particular neuron $A_i$ is given as:
$$I(A_i) = \{I(A_i; A_1), \dots, I(A_i; A_N)\} \qquad (2)$$
where $I(X; Y)$ denotes the MI between variables $X$ and $Y$. Unless stated otherwise, $I(A_i)$ is computed for all $i \in \{1, \dots, N\}$, resulting in a square matrix of size $N \times N$.
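
For reference, the textbook definition of MI between two discretized neurons, written in the notation of Equation 1, is given below. This is the standard form and is only an assumption about what is computed here; the exact estimator used in this work is the one specified in appendix A.

```latex
I(A_i; A_j) = \sum_{k=1}^{N_{bins}} \sum_{l=1}^{N_{bins}}
    p(\hat{a}_i^k, \hat{a}_j^l)
    \log\left( \frac{p(\hat{a}_i^k, \hat{a}_j^l)}{p(\hat{a}_i^k)\, p(\hat{a}_j^l)} \right)
```

Under this definition the matrix of pairwise values is symmetric, and each diagonal entry $I(A_i; A_i)$ equals the entropy $H(A_i)$.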
This process of computing the information measures for a network on a given set of examples is
summarized in Algorithm 1. Further details on the computation are given in appendix A.
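
The procedure summarized in Algorithm 1 can be sketched in plain Python as follows. This is not the authors' implementation: it assumes equal-width binning, plug-in probability estimates, and natural logarithms, and the names `discretize`, `entropy`, `mutual_information`, and `information_measures` are hypothetical. The input `activations` is assumed to be a list of $N$ lists, one per neuron, each holding that neuron's activations on the same $S$ examples.

```python
import math
from collections import Counter


def discretize(values, n_bins=10):
    """Equal-width binning of one neuron's activations (an assumed binning rule)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(bins):
    """Plug-in Shannon entropy (nats) of a sequence of bin indices."""
    total = len(bins)
    return sum((c / total) * math.log(total / c) for c in Counter(bins).values())


def mutual_information(bins_x, bins_y):
    """Plug-in MI (nats) between two aligned sequences of bin indices."""
    total = len(bins_x)
    px, py = Counter(bins_x), Counter(bins_y)
    pxy = Counter(zip(bins_x, bins_y))
    # p(x, y) * log(p(x, y) / (p(x) p(y))), with all probabilities as count ratios.
    return sum(
        (c / total) * math.log(c * total / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def information_measures(activations, n_bins=10):
    """Return per-neuron entropies H and the N x N matrix I of pairwise MI values."""
    binned = [discretize(a, n_bins) for a in activations]
    H = [entropy(b) for b in binned]
    I = [[mutual_information(bi, bj) for bj in binned] for bi in binned]
    return H, I
```

The nested loop over neuron pairs mirrors the two loops of Algorithm 1 and costs on the order of $N^2 S$ operations, so for wide layers one would likely exploit the symmetry of the matrix or subsample neuron pairs.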
2.2 Toy Setup: Concentric Circles
Here, we briefly discuss the information-theoretic metrics for the example of concentric circles from the introduction (Figure 1). To recap, we consider a setup to compare networks showing the two forms of memorization and observe discernible differences in their activation patterns: heuristic memorization corresponds to low intra-neuron and inter-neuron diversity, while example-level memorization corresponds to high diversity (Figures 1e and 1f). We expect that this difference in diversity would be captured through the information measures defined above.

¹ In principle, we would compute MI across neuron sets; we approximate this through individual neuron pairs.

Algorithm 1 Computation of information measures. The procedures ENTROPY and MI are specified by Algorithms 2 and 3 in appendix A.
1: $A_1, \dots, A_N \leftarrow \{f(x_i)\}_{i=1}^{S}$ ▷ Computing activations for all neurons
2: $H \leftarrow \{\}$; $I \leftarrow \{\}$ ▷ Initiating computations for Entropy and MI
3: for $i \in \{1, \dots, N\}$ do ▷ Iterating over the set of neurons
4:     $I_i \leftarrow \{\}$ ▷ Initiating MI for a particular neuron
5:     $H_i \leftarrow$ ENTROPY($A_i$) ▷ Following Equation 1 and Algorithm 2
6:     for $j \in \{1, \dots, N\}$ do ▷ Inner loop over the set of neurons
7:         $I_i \leftarrow I_i \oplus$ MI($A_i, A_j$) ▷ Following Equation 3 and Algorithm 3
8:     end for
9:     $H \leftarrow H \oplus H_i$
10:    $I \leftarrow I \oplus I_i$ ▷ Following Equation 2
11: end for
Figure 1g presents the distribution of entropy values for each of the three networks with varying generalization behaviors. Throughout this work, we visualize this distribution of entropy using similar box-plots, where a black marker within the boxes depicts the median of the distribution and a notch neighboring this marker depicts the 95% confidence interval around the median. We observe that entropy for the network exhibiting heuristic memorization is distributed around a lower point than the others, whereas entropy for the network with example-level memorization is higher.
Furthermore, Figure 1h shows the distribution of MI for the three networks. To interpret the distribution of MI (an $N \times N$ square matrix), we fit a Gaussian mixture model over all values and visualize it through a density plot, where the density (y-axis) at each point corresponds to the number of neuron pairs that exhibit that MI value (x-axis). Larger peaks in these density plots suggest that a large number of neuron pairs are concentrated in that region. Interestingly, we see such peaks for the three networks at distinct values of MI. For the network showing example-level memorization (high inter-neuron diversity), most of the neuron pairs show low values of MI. In contrast, heuristic memorization (low inter-neuron diversity) has high neuron pair density for higher MI values.²
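
To give a rough idea of how such a density over neuron pairs can be obtained from the MI matrix, the sketch below flattens the matrix and builds a normalized histogram. Note that this swaps in a plain histogram as a simpler, dependency-free stand-in for the Gaussian mixture model used in this work; the function name `mi_density` and the default of 20 bins are assumptions.

```python
def mi_density(mi_matrix, n_bins=20):
    """Histogram density of all pairwise MI values.

    Returns a list of (bin_center, density) pairs whose densities integrate to 1.
    This is a histogram estimate, not the Gaussian mixture fit described in the text.
    """
    values = [v for row in mi_matrix for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all MI values are identical; there is no spread to plot")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that the largest MI value falls into the last bin.
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    total = len(values)
    return [(lo + (k + 0.5) * width, c / (total * width)) for k, c in enumerate(counts)]
```

Plotting the returned pairs for several networks on shared axes would show where most neuron pairs concentrate, which is the quantity the density plots in Figure 1h are read for.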
Based on these findings, we formulate two hypotheses, summarized in Table 1:
H1: Networks exhibiting heuristic memorization would show low inter- and intra-neuron diversity, reflected through low entropy and high MI values.
H2: Networks exhibiting example-level memorization would show high inter- and intra-neuron diversity, reflected through high entropy and low MI values.
Table 1: Summarizing our hypotheses.
Memorization | Intra-neuron diversity (Entropy) | Inter-neuron diversity (MI¹)
Heuristic | Low (low entropy) | Low (high MI)
Example-level | High (high entropy) | High (low MI)
3 Heuristic Memorization
Here, we study different networks with varying degrees of heuristic memorization, and examine if
the information measures—aimed to capture neuron diversity—indicate the extent of memorization.
3.1 Semi-synthetic Setups
² This difference in neuron activation patterns for the two memorizing sets could be caused by several factors, including functional complexity (Lee et al., 2020): functions that encode individual data points (as in example-level memorization) need to be much more complex than functions that learn shortcuts (heuristic memorization). We make a comparison with standard complexity measures in appendix C.4 and observe that our information measures correlate more strongly with generalization performance, especially for heuristic memorization.

Figure 2: The relation between entropy of neural activations and heuristic memorization. For both setups, networks trained on higher $\alpha$ show higher heuristic memorization (as depicted by the dipping model accuracy line), accompanied by lower entropy values.
Figure 3: Distribution of mutual information (MI) of pairs of neurons for networks with varying heuristic memorization. For both settings, networks trained on training sets with larger amounts of spurious correlations ($\alpha$) exhibit higher mutual information across their neuron pairs.

We synthetically introduce spurious artifacts in the training examples such that they co-occur with target labels. Networks trained on such a set are prone to memorizing these artifacts. The same correlations with an artifact do not hold in the validation sets. To obtain a set of networks with varying degrees of this heuristic memorization, we consider a parameter $\alpha$ that controls the fraction of the training examples for which the spurious correlation holds true. We consider the following setups:
Colored MNIST In this setting, the MNIST dataset (Lecun et al., 1998) is configured such that a network trained on this set simply learns to identify the color of images and not the digits themselves (Arjovsky et al., 2019). In particular, digits 0–4 are grouped as one label and digits 5–9 as the other, and images for these labels are colored green and red, respectively. For this setup, we train multi-layer perceptron (MLP) networks for varying values of $\alpha$, which corresponds to the fraction of training instances that abide by the color-to-label correlation. The considered values of $\alpha$ and other details for this setup are given in appendix B.1.
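
The role of $\alpha$ in this setup can be illustrated with a small sketch that assigns a label and a color to a digit. It assumes that each training instance follows the color-to-label correlation independently with probability $\alpha$ and that the remaining instances receive the opposite color; both choices, as well as the function name `colored_mnist_example`, are assumptions, since the exact construction is given in appendix B.1.

```python
import random


def colored_mnist_example(digit, alpha, rng):
    """Return (label, color) for one digit under correlation strength alpha."""
    # Digits 0-4 form one class and digits 5-9 the other.
    label = 0 if digit <= 4 else 1
    correlated_color = "green" if label == 0 else "red"
    other_color = "red" if label == 0 else "green"
    # With probability alpha the color agrees with the label (assumed sampling rule).
    color = correlated_color if rng.random() < alpha else other_color
    return label, color
```

With `alpha=1.0` the color alone determines the label, so a network can ignore the digit shape entirely; lowering `alpha` weakens this shortcut.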
Sentiment Adjectives In this setup, we sub-sample examples from the IMDb dataset (Maas et al., 2011) that contain at least one adjective from a list of positive and negative adjectives. Then, examples that contain any of the positive adjectives (“good”, “great”, etc.) are marked with the positive label, whereas ones that contain any negative adjectives (“bad”, “awful”, etc.) are labeled as negative. We exclude examples that contain adjectives from both lists. The motivation to use this setup is to introduce heuristics in the form of adjectives in the training set. We fine-tune DistilBERT-base models (Sanh et al., 2019) on this task for different values of $\alpha$ (fraction of examples that obey the heuristic). The full set of adjectives considered and further details are outlined in appendix B.2.
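
The filtering and labeling rule described above can be sketched as follows. The two adjective sets contain only the four words quoted in the text and stand in for the full lists of appendix B.2; the tokenization by a simple regular expression and the function name `heuristic_label` are likewise assumptions.

```python
import re

# Placeholder lists: only the adjectives quoted in the text, not the full lists.
POSITIVE_ADJECTIVES = {"good", "great"}
NEGATIVE_ADJECTIVES = {"bad", "awful"}


def heuristic_label(review):
    """Return 'positive', 'negative', or None if the review should be excluded."""
    tokens = set(re.findall(r"[a-z]+", review.lower()))
    has_pos = bool(tokens & POSITIVE_ADJECTIVES)
    has_neg = bool(tokens & NEGATIVE_ADJECTIVES)
    if has_pos == has_neg:
        # Either no listed adjective appears, or adjectives from both lists do.
        return None
    return "positive" if has_pos else "negative"
```

Reviews for which the function returns `None` would be dropped from the sub-sampled set, matching the exclusion of examples that contain adjectives from both lists or from neither.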
Results: Through these experiments, we first note that low entropy across neural activations indicates heuristic memorization in networks. This is evident from Figure 2, where we see that (1) as we increase $\alpha$, the validation performance decreases, indicating heuristic memorization (see the solid line in the plots); and (2) with an increase in this heuristic memorization, we see lower entropy across neural activations. We show the entropy values of neural activations for the 3 layers of an MLP