Measures of Information Reflect Memorization Patterns


Rachit Bansal [X]
Delhi Technological University
racbansa@gmail.com

Danish Pruthi [Y]
Amazon Web Services
danish@hey.com

Yonatan Belinkov [Z]
Technion – Israel Institute of Technology
belinkov@technion.ac.il

Abstract

Neural networks are known to exploit spurious artifacts (or shortcuts) that co-occur with a target label, exhibiting heuristic memorization. On the other hand, networks have been shown to memorize training examples, resulting in example-level memorization. These kinds of memorization impede generalization of networks beyond their training distributions. Detecting such memorization could be challenging, often requiring researchers to curate tailored test sets. In this work, we hypothesize, and subsequently show, that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization. We quantify the diversity in the neural activations through information-theoretic measures and find support for our hypothesis in experiments spanning several natural language and vision tasks. Importantly, we discover that information organization points to the two forms of memorization, even for neural activations computed on unlabeled in-distribution examples. Lastly, we demonstrate the utility of our findings for the problem of model selection. The associated code and other resources for this work are available at https://rachitbansal.github.io/information-measures.

1 Introduction

Current-day deep learning networks are limited in their ability to generalize across different domains and settings. Prior studies found that these networks rely on spurious artifacts that are correlated with a target label (Schölkopf et al., 2012; Lapuschkin et al., 2019; Geirhos et al., 2019, 2020, inter alia). We refer to learning of such artifacts (also known as heuristics or shortcuts) as heuristic memorization. Further, neural networks can also memorize individual training examples and their labels; for instance, when a subset of the examples is incorrectly labeled (Zhang et al., 2017; Arpit et al., 2017; Tänzer et al., 2021). We refer to this behavior as example-level memorization. A large body of past work has established that these facets of memorization pose a threat to generalization, especially in out-of-distribution (OOD) scenarios where the memorized input features and corresponding target mappings do not hold (Ben-David et al., 2010; Wang et al., 2021b; Hendrycks et al., 2021a; Shen et al., 2021). To simulate such OOD distributions, however, researchers are required to laboriously collect specialized and labeled datasets to measure the extent of suspected fallacies in models. While these sets make it possible to assess model behavior over a chosen set of features, the larger set of remaining features remains hard to identify and study. Moreover, these sets are truly extrinsic in nature, necessitating the use of performance measures, which in turn lack interpretability and are not indicative of the internal workings that manifest certain model behaviors. These considerations motivate evaluation strategies that are intrinsic to a network and indicate model generalization while not posing practical bottlenecks in terms of specialized labeled sets. Here, we study information organization as one such potential strategy.

[X] Work done during a visit at the Technion, Israel. The author is now at Google Research India.
[Y] Work done while at Carnegie Mellon University, prior to joining Amazon.
[Z] Supported by the Viterbi Fellowship in the Center for Computer Engineering at the Technion.

36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.09404v4 [cs.LG] 1 Feb 2024

[Figure 1 image: eight panels, (a) to (h). Axis and legend labels recovered from the figure: Activation, Examples, Neuron Index, Neuron Entropy, Mutual Information, Density of Neuron Pairs.]

Figure 1: (a) A toy setup of separating concentric circles; (b) An additional feature spuriously simplifies the task, inciting heuristic memorization; (c) Shuffled target labels induce example-level memorization; (d) Neuron activations for a two-layered feed-forward network trained for the base task in (a); (e) Activation patterns for the network reflect low intra-neuron and inter-neuron diversity when trained on (b); (f) High intra-neuron and inter-neuron diversity is seen when the network is trained on (c); (g) Entropy acts as a proxy for intra-neuron diversity; (h) Mutual Information acts as a proxy for inter-neuron diversity. Distinguishable patterns for the three networks are seen in (g) and (h).

In this work, we posit that organization of information across internal activations of a network could be indicative of memorization. Consider a sample task of separating concentric circles, illustrated in Figure 1a. A two-layered feed-forward network can learn the circular decision boundary for this task. However, if the nature of this learning task is changed, the network may resort to memorization. When a spurious feature is introduced in this dataset such that its value correlates with the label (Figure 1b), the network memorizes the feature-to-label mapping, reflected in a uniform activation pattern across neurons (Figure 1e). In contrast, when labels for the original set are shuffled (Figure 1c), the same network memorizes individual examples during training and shows a high amount of diversity in its activation patterns (Figure 1f). This example demonstrates how memorizing behavior is observed through diversity in neuron activations.
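The three datasets of the toy setup can be approximated with a few lines of NumPy. This is only an illustrative sketch: the radii, noise level, sample count, the encoding of the spurious feature as a signed third coordinate, and all function names are assumptions made here, not details taken from the paper.

```python
import numpy as np

def make_concentric_circles(n_samples=1000, inner_radius=0.5, noise=0.05, seed=0):
    """Base task: 2-D points on two concentric circles (label 1 = inner circle)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n_samples)
    radii = np.where(labels == 1, inner_radius, 1.0)
    angles = rng.uniform(0.0, 2.0 * np.pi, size=n_samples)
    points = np.stack([radii * np.cos(angles), radii * np.sin(angles)], axis=1)
    points += rng.normal(scale=noise, size=points.shape)
    return points, labels

def add_spurious_feature(points, labels):
    """Heuristic setting: append a feature whose sign is determined by the label."""
    spurious = np.where(labels == 1, 1.0, -1.0)  # assumed encoding of the shortcut
    return np.column_stack([points, spurious])

def shuffle_labels(labels, seed=0):
    """Example-level setting: permute labels so they carry no signal about the input."""
    rng = np.random.default_rng(seed)
    return rng.permutation(labels)

X_base, y_base = make_concentric_circles()
X_heuristic = add_spurious_feature(X_base, y_base)
y_shuffled = shuffle_labels(y_base)
```

Training the same small feed-forward network on `(X_base, y_base)`, `(X_heuristic, y_base)`, and `(X_base, y_shuffled)` would give three networks comparable to those discussed for Figure 1.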

We formalize the notion of diversity across neuron activations through two measures: (i) intra-neuron diversity: the variation of activations for a neuron across a set of examples, and (ii) inter-neuron diversity: the dissimilarity between pairwise neuron activations on the same set of examples. We hypothesize that the nature of these quantities for two networks could point to underlying differences in their generalizing behavior. In order to quantify intra-neuron and inter-neuron diversity, we adopt the information-theoretic measures of entropy and mutual information (MI), respectively.

Throughout this work, we investigate whether diversity across neural activations (§2) reflects model generalizability. We compare networks with varying levels of heuristic (§3) or example-level (§4) memorization across a variety of settings: synthetic setups based on the IMDb (Maas et al., 2011) and MNIST (Lecun et al., 1998) datasets for both memorization types, as well as naturally occurring scenarios of gender bias on Bias-in-Bios (De-Arteaga et al., 2019) and OOD image classification on NICO (Zhang et al., 2022). We find that the information measures consistently capture differences among networks with varying degrees of memorization: low entropy and high MI are characteristic of networks that show heuristic memorization, while high entropy and low MI are indicative of example-level memorization. Lastly, we evaluate these measures from the viewpoint of model selection and note strong correlations to rankings from domain-specific evaluation metrics (§5).

2 Methods

As per the data processing inequality (Beaudry & Renner, 2012), a part of the neural network (referred to as the encoder) compresses the most relevant information of a given input into a representation. This compressed information is processed by a classification head (or decoder) to produce an output corresponding to the given input. We hypothesize that the organization of information across neurons of the encoder is indicative of model generalization. We study two complementary properties that capture this information organization for a given network:

(i) Intra-neuron diversity: how the activations of a given neuron vary across different input examples. We measure the entropy of neural activations (across examples) as a proxy.

(ii) Inter-neuron diversity: how unique the activation of a neuron is compared to other neurons. We quantify this via mutual information between the activations of pairs of neurons.

Below, we discuss the information measures formally.

2.1 Information Measures

For any given encoder (consisting of $N$ neurons) that maps the input to a dense hidden representation, we denote the activation of the $i$th neuron as a random variable, $A_i \in \{a_i^1, \ldots, a_i^S\}$, where each measurement is an activation over an example from a set of size $S$. The probability over this continuous activation space is computed by binning it into discrete ranges (Darbellay & Vajda, 1999), and we denote each discretized activation value as $\hat{a}$. Importantly, the set of examples on which the activations are computed comes from a distribution that is similar to the underlying training set itself.

Entropy. We measure the Shannon entropy for each neuron in the concerned network, as a proxy of intra-neuron diversity. Following the definition of Shannon entropy, this is given as:

$$H(A_i) = \mathbb{E}_{\hat{a}_i^s \in A_i}\left[h(\hat{a}_i^s)\right] = \sum_{j=1}^{N_{\text{bins}}} p(\hat{a}_i^j) \log\left(\frac{1}{p(\hat{a}_i^j)}\right) \tag{1}$$
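As a rough illustration of Equation 1, the sketch below estimates the entropy of a single neuron from its activations over a set of examples. Equal-width binning, the choice of 10 bins, the natural logarithm, and the function name `neuron_entropy` are all assumptions made for this example; the binning scheme actually used is described in the paper's appendix A and may differ.

```python
import numpy as np

def neuron_entropy(activations, n_bins=10):
    """Shannon entropy (in nats) of one neuron's activations over S examples."""
    # Discretize the continuous activations into equal-width bins (assumed scheme).
    counts, _ = np.histogram(activations, bins=n_bins)
    probs = counts / counts.sum()
    probs = probs[probs > 0]  # empty bins contribute nothing to the sum
    return float(np.sum(probs * np.log(1.0 / probs)))

# A neuron that always fires the same value has no intra-neuron diversity.
constant_neuron = np.full(1000, 0.3)
# A neuron whose activations spread evenly over its range is maximally diverse.
spread_neuron = np.linspace(0.0, 1.0, 1000)

entropy_constant = neuron_entropy(constant_neuron)
entropy_spread = neuron_entropy(spread_neuron)
```

With these inputs the constant neuron yields zero entropy, while the evenly spread neuron approaches the upper bound of $\log N_{\text{bins}}$, which matches the intended reading of entropy as intra-neuron diversity.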

Mutual Information. We compute the mutual information (MI) between underlying neurons as a proxy to inter-neuron diversity. Specifically, we compute the MI between all neuron pairs in the network.[1] Thus, the set of MI values $I(A_i)$ for a particular neuron $A_i$ is given as:

$$I(A_i) = \{I(A_i; A_1), \ldots, I(A_i; A_N)\} \tag{2}$$

where $I(X; Y)$ denotes the MI between variables $X$ and $Y$. Unless stated otherwise, this $I(A_i)$ is computed $\forall i \in \{1, \ldots, N\}$, resulting in a square matrix of size $(N \times N)$.
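For reference, the textbook definition of MI for two discretized neurons can be written in the notation of Equation 1 as follows. The joint probability $p(\hat{a}_i^k, \hat{a}_j^l)$ over pairs of bins is notation introduced here as an assumption; the estimator the authors actually use is specified in their appendix A and may differ from this form.

```latex
I(A_i; A_j) = \sum_{k=1}^{N_{\text{bins}}} \sum_{l=1}^{N_{\text{bins}}}
    p(\hat{a}_i^k, \hat{a}_j^l)
    \log \frac{p(\hat{a}_i^k, \hat{a}_j^l)}{p(\hat{a}_i^k)\, p(\hat{a}_j^l)}
```

Under this definition the measure is symmetric, and $I(A_i; A_i) = H(A_i)$, so the diagonal of the $(N \times N)$ matrix coincides with the per-neuron entropies.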

This process of computing the information measures for a network on a given set of examples is summarized in Algorithm 1. Further details on the computation are given in appendix A.

2.2 Toy Setup: Concentric Circles

Here, we briefly discuss the information-theoretic metrics for the example of concentric circles from the introduction (Figure 1). To recap, we consider a setup to compare networks showing the two forms of memorization and observe discernible differences in their activation patterns: heuristic memorization corresponds to low intra-neuron and inter-neuron diversity, while example-level memorization corresponds to high diversity (Figures 1e and 1f). We expect that this difference in diversity would be captured through the above-defined information measures.

[1] In principle, we would compute MI across neuron sets; we approximate this through individual neuron pairs.

Algorithm 1: Computation of information measures. Algorithmic procedures ENTROPY and MI are specified by Algorithms 2 and 3 in appendix A.

    A_1, ..., A_N ← {f(x_i)}_{i=1}^{S}       ▷ Computing activations for all neurons
    H ← {}; I ← {}                           ▷ Initiating computations for Entropy and MI
    for i ∈ {1, ..., N} do                   ▷ Iterating over the set of neurons
        I_i ← {}                             ▷ Initiating MI for a particular neuron
        H_i ← ENTROPY(A_i)                   ▷ Following Equation 1 and Algorithm 2
        for j ∈ {1, ..., N} do               ▷ Inner loop over the set of neurons
            I_i ← I_i ⊕ MI(A_i, A_j)         ▷ Following Equation 3 and Algorithm 3
        end for
        H ← H ⊕ H_i
        I ← I ⊕ I_i                          ▷ Following Equation 2
    end for
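The loop structure of Algorithm 1 can be sketched in NumPy as below. This is not the authors' implementation: the equal-width discretization, the plug-in (histogram) MI estimator, the default of 10 bins, and every function name are assumptions standing in for Algorithms 2 and 3 of the appendix, which are not reproduced here.

```python
import numpy as np

def discretize(activations, n_bins=10):
    """Map each neuron's activations (rows = examples) to equal-width bin indices."""
    lo = activations.min(axis=0)
    hi = activations.max(axis=0)
    width = np.where(hi > lo, hi - lo, 1.0)  # guard against constant neurons
    scaled = (activations - lo) / width
    return np.minimum((scaled * n_bins).astype(int), n_bins - 1)

def entropy(binned, n_bins):
    """Stand-in for ENTROPY: Shannon entropy of one neuron's bin indices."""
    probs = np.bincount(binned, minlength=n_bins) / binned.size
    probs = probs[probs > 0]
    return float(-np.sum(probs * np.log(probs)))

def mutual_information(binned_a, binned_b, n_bins):
    """Stand-in for MI: plug-in estimate from the joint histogram of two neurons."""
    joint = np.zeros((n_bins, n_bins))
    np.add.at(joint, (binned_a, binned_b), 1.0)
    joint /= joint.sum()
    outer = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / outer[mask])))

def information_measures(activations, n_bins=10):
    """activations has shape (S examples, N neurons); returns (H, I)."""
    binned = discretize(activations, n_bins)
    n_neurons = binned.shape[1]
    H = np.zeros(n_neurons)               # one entropy value per neuron
    I = np.zeros((n_neurons, n_neurons))  # N x N matrix of pairwise MI
    for i in range(n_neurons):            # outer loop over neurons
        H[i] = entropy(binned[:, i], n_bins)
        for j in range(n_neurons):        # inner loop over neurons
            I[i, j] = mutual_information(binned[:, i], binned[:, j], n_bins)
    return H, I

# Synthetic activations: neuron 1 is a rescaled copy of neuron 0, neuron 2 is independent.
rng = np.random.default_rng(0)
shared = rng.normal(size=2000)
acts = np.column_stack([shared, 2.0 * shared + 1.0, rng.normal(size=2000)])
H, I = information_measures(acts)
```

In this synthetic example the redundant pair (neurons 0 and 1) receives a much larger MI than the independent pair (neurons 0 and 2), which is the sense in which high MI signals low inter-neuron diversity. The double loop costs $O(N^2)$ MI evaluations, so for wide layers one would likely exploit the symmetry of the matrix or subsample neuron pairs.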

Figure 1g presents the distribution of entropy values for each of the three networks with varying generalization behaviors. Throughout this work, we visualize this distribution of entropy using similar box-plots, where a black marker within the boxes depicts the median of the distribution and a notch neighboring this marker depicts the 95% confidence interval around the median. We observe that entropy for the network exhibiting heuristic memorization is distributed around a lower point than the others, whereas entropy for the network with example-level memorization is higher.

Furthermore, Figure 1h shows the distribution of MI for the three networks. To interpret the distribution of MI (an $N \times N$ square matrix), we fit a Gaussian mixture model over all values and visualize it through a density plot, where the density (y-axis) at each point corresponds to the number of neuron pairs that exhibit that MI value (x-axis). Larger peaks in these density plots suggest that a large number of neuron pairs are concentrated in that region. Interestingly, we see such peaks for the three networks at distinct values of MI. For the network showing example-level memorization (high inter-neuron diversity), most of the neuron pairs show low values of MI. In contrast, heuristic memorization (low inter-neuron diversity) has high neuron pair density for higher MI values.[2]

Based on these findings, we formulate two hypotheses, summarized in Table 1:

Networks exhibiting heuristic memorization would show low inter- and intra-neuron diversity, reflected through low entropy and high MI values.

Networks exhibiting example-level memorization would show high inter- and intra-neuron diversity, reflected through high entropy and low MI values.

Table 1: Summarizing our hypotheses.

Memorization     Intra-neuron diversity (∝ Entropy)    Inter-neuron diversity (∝ MI^-1)
Heuristic        ↓↓                                    ↓↓
Example-level    ↑↑                                    ↑↑

3 Heuristic Memorization

Here, we study different networks with varying degrees of heuristic memorization, and examine whether the information measures, which aim to capture neuron diversity, indicate the extent of memorization.

3.1 Semi-synthetic Setups

We synthetically introduce spurious artifacts in the training examples such that they co-occur with target labels. Networks trained on such a set are prone to memorizing these artifacts. The same correlations with an artifact do not hold in the validation sets. To obtain a set of networks with varying degrees of this heuristic memorization, we consider a parameter $\alpha$ that controls the fraction of the training examples for which the spurious correlation holds true. We consider two setups, Colored MNIST and Sentiment Adjectives, described below.

[2] This difference in neuron activation patterns for the two memorizing sets could be caused by several factors, including functional complexity (Lee et al., 2020): functions that encode individual data points (as in example-level memorization) need to be much more complex than functions that learn shortcuts (heuristic memorization). We make a comparison with standard complexity measures in appendix C.4 and observe that our information measures correlate more strongly with generalization performance, especially for heuristic memorization.

Figure 2: The relation between entropy of neural activations and heuristic memorization. For both setups, networks trained on higher $\alpha$ show higher heuristic memorization (as depicted by the dipping model accuracy line), accompanied by lower entropy values.

Figure 3: Distribution of mutual information (MI) of pairs of neurons for networks with varying heuristic memorization. For both settings, networks trained on training sets with larger amounts of spurious correlations ($\uparrow \alpha$) exhibit higher mutual information across their neuron pairs.

Colored MNIST. In this setting, the MNIST dataset (Lecun et al., 1998) is configured such that a network trained on this set simply learns to identify the color of images and not the digits themselves (Arjovsky et al., 2019). Particularly, the digits are partitioned into two ranges, each mapped to one label, and images for these labels are colored green and red, respectively. For this setup, we train multi-layer perceptron (MLP) networks for varying values of $\alpha$, which corresponds to the fraction of training instances that abide by the color-to-label correlation. The considered values of $\alpha$ and other details for this setup are given in appendix B.1.
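A coloring step of this kind could look roughly like the sketch below. Several details are assumptions made for illustration only: the digit split (0 to 4 versus 5 to 9, since the exact ranges are not legible in this copy), the choice to give rule-breaking examples the opposite color, the RGB channel layout, and the function name `colorize`. Random arrays stand in for the real MNIST images.

```python
import numpy as np

def colorize(images, digits, alpha, seed=0):
    """Color grayscale digits so that color predicts the label for a fraction alpha.

    images: array of shape (n, 28, 28); digits: integer array of shape (n,).
    Returns (colored, labels) where colored has shape (n, 28, 28, 3) in RGB order.
    """
    rng = np.random.default_rng(seed)
    labels = (digits >= 5).astype(int)      # ASSUMPTION: digits 0-4 vs. 5-9
    follows_rule = rng.random(len(digits)) < alpha
    # Label 0 -> green (channel 1), label 1 -> red (channel 0) when the rule holds.
    rule_channel = np.where(labels == 0, 1, 0)
    # ASSUMPTION: examples that break the rule receive the opposite color.
    channel = np.where(follows_rule, rule_channel, 1 - rule_channel)
    colored = np.zeros(images.shape + (3,), dtype=images.dtype)
    colored[np.arange(len(digits)), :, :, channel] = images
    return colored, labels

# Stand-in data with MNIST-like shapes (not the real dataset).
rng = np.random.default_rng(1)
fake_images = rng.random((200, 28, 28))
fake_digits = rng.integers(0, 10, size=200)
colored_full, labels_full = colorize(fake_images, fake_digits, alpha=1.0)
colored_none, labels_none = colorize(fake_images, fake_digits, alpha=0.0)
```

With `alpha=1.0` every image's color agrees with its label, so a network can ignore digit shape entirely; lowering `alpha` weakens the shortcut.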

Sentiment Adjectives. In this setup, we sub-sample examples from the IMDb dataset (Maas et al., 2011) that contain at least one adjective from a list of positive and negative adjectives. Then, examples that contain any of the positive adjectives ("good", "great", etc.) are marked with the positive label, whereas ones that contain any negative adjectives ("bad", "awful", etc.) are labeled as negative. We exclude examples that contain adjectives from both lists. The motivation to use this setup is to introduce heuristics in the form of adjectives in the training set. We fine-tune DistilBERT-base models (Sanh et al., 2019) on this task for different values of $\alpha$ (fraction of examples that obey the heuristic). The full set of adjectives considered and further details are outlined in appendix B.2.
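The filtering and labeling rule described above can be sketched as follows. The two word sets contain only the adjectives quoted in the text (the full lists are in the paper's appendix B.2), the regular-expression tokenizer is a simplification, and the function name `heuristic_label` is hypothetical. How $\alpha$ is applied on top of this rule is not shown, since that detail is not given in this section.

```python
import re

# Only the adjectives quoted in the text; the paper's full lists are longer.
POSITIVE_ADJECTIVES = {"good", "great"}
NEGATIVE_ADJECTIVES = {"bad", "awful"}

def heuristic_label(review):
    """Return 1 (positive) or 0 (negative) by adjective match, or None if excluded."""
    tokens = set(re.findall(r"[a-z']+", review.lower()))
    has_positive = bool(tokens & POSITIVE_ADJECTIVES)
    has_negative = bool(tokens & NEGATIVE_ADJECTIVES)
    if has_positive == has_negative:
        # Either no listed adjective appears, or adjectives from both lists do.
        return None
    return 1 if has_positive else 0

reviews = [
    "A great film with a clever script.",
    "Awful pacing and bad acting.",
    "Good idea, bad execution.",
    "An ordinary weekday matinee.",
]
labeled = [(text, heuristic_label(text)) for text in reviews]
kept = [(text, label) for text, label in labeled if label is not None]
```

Only the first two reviews survive the filter here: the third mixes adjectives from both lists and the fourth contains none.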

Results: Through these experiments, we first note that low entropy across neural activations indicates heuristic memorization in networks. This is evident from Figure 2, where we see that (1) as we increase $\alpha$, the validation performance decreases, indicating heuristic memorization (see the solid line in the plots); and (2) with an increase in this heuristic memorization, we see lower entropy across neural activations. We show the entropy values of neural activations for the layers of an MLP
