EIGEN MEMORY TREES
A PREPRINT
Mark Rucker1, Jordan T. Ash2, John Langford2, Paul Mineiro2, and Ida Momennejad2
1University of Virginia
2Microsoft Research
November 1, 2022
ABSTRACT
This work introduces the Eigen Memory Tree (EMT), a novel online memory model for sequential
learning scenarios. EMTs store data at the leaves of a binary tree and route new samples through the
structure using the principal components of previous experiences, facilitating efficient (logarithmic)
access to relevant memories. We demonstrate that EMT outperforms existing online memory
approaches, and provide a hybridized EMT-parametric algorithm that enjoys drastically improved
performance over purely parametric methods with nearly no downsides. Our findings are validated
using 206 datasets from the OpenML repository in both bounded and infinite memory budget
situations.
1 Introduction
A sequential learning framework (also known as online or incremental learning (Hoi et al., 2021; Losing et al., 2018))
considers a setting in which data instances $x_t \in \mathbb{R}^d$ arrive incrementally. After each instance, the agent is required
to make a decision from a set of $|\mathcal{A}|$ possibilities, $a_t \in \mathcal{A}$. The agent then receives scalar feedback $y_t$ regarding the
quality of the action, and the goal is for the agent to learn a mapping from $x_t$ to $a_t$ that maximizes the sum of all
observed $y_t$.
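For concreteness, the following minimal Python sketch (not from the paper) illustrates this interaction protocol; the synthetic linear-reward environment and the uniformly random agent are hypothetical stand-ins.

import numpy as np

rng = np.random.default_rng(0)
d, num_actions, horizon = 5, 3, 100
true_weights = rng.normal(size=(num_actions, d))   # hidden reward model (illustrative)

total_reward = 0.0
for t in range(horizon):
    x_t = rng.normal(size=d)                  # context x_t in R^d arrives
    a_t = int(rng.integers(num_actions))      # agent picks one of |A| actions
    y_t = float(true_weights[a_t] @ x_t)      # scalar feedback for the chosen action
    total_reward += y_t                       # goal: maximize the sum of observed y_t
    # a learning agent would update its x_t -> a_t mapping from (x_t, a_t, y_t) here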
This general paradigm accommodates a wide array of well-studied machine learning scenarios. For example, in online
supervised learning, $\mathcal{A}$ is a set of labels: the agent is required to predict a label for each $x_t$, and the feedback $y_t$
indicates the quality of the prediction.
In a contextual bandit or reinforcement learning setting, $x_t$ acts as a context or state, $a_t$ is an action, and $y_t$ corresponds
to a reward provided by the environment. Contextual bandits have proven useful in a wide variety of settings; their
properties are well studied (Langford & Zhang, 2007), and they have numerous theoretical and real-world
applications (Bouneffouf et al., 2020).
Regardless of the particulars of the learning scenario, a primary consideration is sample complexity. That is, how can
we obtain the highest-performing model given a fixed interaction budget? This often arises when agents only receive
feedback corresponding to the chosen action $a_t$, i.e., partial feedback. Here, after an interaction with the environment,
the agent does not get access to what the best action in hindsight would have been. As a consequence, learners in a
partial-feedback setting need to explore different actions even for a fixed $x_t$ in order to discover optimal behavior.
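As an aside, a common way to handle this exploration requirement is an epsilon-greedy rule; the short sketch below is illustrative only and is not the exploration strategy prescribed by this paper.

import numpy as np

def epsilon_greedy(estimates: np.ndarray, epsilon: float, rng: np.random.Generator) -> int:
    """Pick the currently best-looking action most of the time, a random one otherwise."""
    if rng.random() < epsilon:
        return int(rng.integers(len(estimates)))   # explore: try a non-greedy action
    return int(np.argmax(estimates))               # exploit: act on current estimates

rng = np.random.default_rng(0)
action = epsilon_greedy(np.array([0.1, 0.5, 0.3]), epsilon=0.1, rng=rng)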
Recent work in reinforcement learning has demonstrated that episodic memory mechanisms can facilitate more effi-
cient learning (Lengyel & Dayan, 2007; Blundell et al., 2016; Pritzel et al., 2017; Hansen et al., 2018; Lin et al., 2018;
Zhu et al., 2020). Episodic memory (Tulving, 1972) refers to memory of specific past experiences (e.g., what did I
have for breakfast yesterday). This is in contrast to semantic memory, which generalizes across many experiences
(e.g., what is my favorite meal for breakfast). Semantic memory is functionally closer to parametric approaches to
learning, which also rely on generalizations while discarding information about specific events or items.
This paper investigates the use of episodic memory for accelerating learning in sequential problems. We introduce
Eigen Memory Trees (EMT), a model that stores past observations in the leaves of a binary tree. Each leaf contains
experiences that are somewhat similar to each other, and the EMT is structured such that new samples are routed
through the tree based on the statistical properties of previously encountered data. When the EMT is queried with a
new observation, this property affords an efficient way to compare it with only the most relevant memories. A learned
"scoring" function $w$ is used to identify the most salient memory in the leaf to be used for decision making.

Figure 1: A schematic of the Eigen Memory Tree (EMT) algorithm. The outer boxes indicate the two operations supported by EMT, Learn and Query. The inner boxes show the high-level subroutines that occur to accomplish the outer operation.

Table 1: Cells indicate the number of datasets in which the row algorithm beats the column algorithm with statistical significance. The stacked algorithm we propose (PEMT) has the most wins in each column as indicated by bold.

             PEMT   EMT   PCMT   CMT   Parametric
PEMT           —    167     87   196      110
EMT           31      —     42   177       44
PCMT           4    156      —   192       75
CMT            7     11     10     —       17
Parametric     8    151     10   184        —
Specifically, this work:

• introduces the Eigen Memory Tree, an efficient tool for storing, accessing, and comparing memories to current observations (see Figure 1);

• shows that the EMT gives drastically improved performance over comparable episodic memory data structures, and sometimes even outperforms parametric approaches that have no explicit memory mechanism;

• proposes a simple EMT-parametric combination (PEMT), which outperforms both purely parametric and EMT methods with nearly no downsides (see Table 1).
In the following section, we introduce the Eigen Memory Tree and overview the algorithms required for storing and
retrieving memories. A schematic of EMT and its high-level algorithms can be seen in Figure 1. With this in mind,
Section 3 describes related work.
Section 4 follows with an exhaustive set of experiments, demonstrating the superiority of EMT to previous episodic
memory models and motivating the EMT-parametric (PEMT) method, a simple and powerful hybrid strategy for ob-
taining competitive performance on sequential learning problems. Importantly, we show that the PEMT performance
advantage holds even when it is constrained to have a fixed limit on the number of memories it is allowed to store.
All experiments here consider the contextual bandit setting, but EMTs are applicable in broader domains as well.
This section continually references Table 1, identifying which methods outperform which other methods by a sta-
tistically significant amount over all datasets and replicates. We consider a substantial number of datasets from the
OpenML (Vanschoren et al., 2014) repository.
Section 5 summarizes our findings, discusses limitations, and overviews directions for future work.
2 Eigen Memory Tree
As with episodic memory, EMT is structured around storing and retrieving exact memories. We formalize this notion
as the self-consistency property: if a memory has been previously Inserted, then a Query with the same key should
return the previously inserted value. The self-consistency property encodes the assumption that the optimal memory
to return, if possible, is an exact match for the current observation.
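To make the property concrete, the toy check below uses a hypothetical dict-backed ExactMemory stand-in (not EMT itself): after Learn(x, y), a Query with the identical key must return y.

import numpy as np

class ExactMemory:
    """Minimal stand-in that satisfies self-consistency by exact-match lookup."""
    def __init__(self):
        self._store = {}
    def learn(self, x: np.ndarray, y: float) -> None:
        self._store[x.tobytes()] = y
    def query(self, x: np.ndarray):
        return self._store.get(x.tobytes())

memory = ExactMemory()
x = np.array([0.2, -1.0, 3.5])
memory.learn(x, 1.0)
assert memory.query(x.copy()) == 1.0   # exact key returns the previously inserted value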
EMT is a memory model with four key characteristics: (1) self-consistency, (2) incremental memory growth, (3)
incremental query improvement via supervised feedback, and (4) sub-linear computational complexity with respect to
the number of memories in the tree. As we discuss in the literature review below, this combination of characteristics
separates EMT from previous approaches.
Algorithm 1 Primary Functions
1: class Node(router ∈ ℝ^d, boundary ∈ ℝ, left: Node, right: Node, M ⊂ ℝ^d × ℝ)
2: Initialize: root ← Node()
3: Initialize: w ← random vector in ℝ^d
4: Initialize: c ← leaf capacity
5: function QUERY(x ∈ ℝ^d)
6:     n ← root
7:     while n is not leaf: n ← n.left if ⟨n.router, x⟩ ≤ n.boundary else n.right
8:     return argmin_{(x_m, y_m) ∈ n.M} PREDICTSCORER(x, x_m)
9: end function
10: function LEARN(x ∈ ℝ^d, y ∈ ℝ)
11:     n ← root
12:     while n is not leaf: n ← n.left if ⟨n.router, x⟩ ≤ n.boundary else n.right
13:     UPDATESCORER(x, y)
14:     n.M ← n.M ∪ {(x, y)}    ▷ Insert memory into leaf.
15:     if |n.M| ≥ c then
16:         SPLITLEAF(n)
17:     end if
18: end function
Memory. EMT represents memories as a mapping from keys to values, $M := \mathbb{R}^d \to \mathbb{R}$, where $d$ is the dimensionality of the context $x_t$, and $y_t \in \mathbb{R}$ corresponds to observed feedback. A query to the EMT requires an $x_t$ and returns a previously observed value $\hat{y} \in \mathbb{R}$ from its bank of memories. EMT learning, which updates both the underlying data structure and the scoring mechanism, requires a full observation pair $(x_t, y_t) \in M$.
Data structure. The underlying data structure used by the EMT is a binary tree. The tree is composed of a set of nodes, $\mathcal{N}$, each of which is either an internal node or a leaf node. If a node $n \in \mathcal{N}$ is internal, it possesses two child nodes ($n.\text{left} \in \mathcal{N}$ and $n.\text{right} \in \mathcal{N}$), a decision boundary $n.\text{boundary} \in \mathbb{R}$, and a top eigenvector $n.\text{router} \in \mathbb{R}^d$. The decision boundary and top eigenvector, as discussed later, are used to route queries through the tree. If a node $n$ is a leaf node, it instead possesses a finite set of memories $n.M \subset \mathbb{R}^d \times \mathbb{R}$. In addition to the nodes and routers, EMT also possesses a single global scoring function responsible for selecting which memory to return given a query and leaf node. This function is parameterized by the weight vector $w \in \mathbb{R}^d$.
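A minimal Python sketch of this structure is given below; the Node fields mirror Algorithm 1, while the class itself, route_to_leaf, and the dataclass layout are illustrative assumptions rather than the reference implementation.

from dataclasses import dataclass, field
from typing import List, Optional, Tuple
import numpy as np

@dataclass
class Node:
    router: Optional[np.ndarray] = None    # top eigenvector (internal nodes only)
    boundary: float = 0.0                  # decision boundary (internal nodes only)
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    memories: List[Tuple[np.ndarray, float]] = field(default_factory=list)  # leaf contents

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None

def route_to_leaf(root: Node, x: np.ndarray) -> Node:
    """Follow the <router, x> <= boundary decisions from the root down to a leaf."""
    n = root
    while not n.is_leaf():
        n = n.left if float(np.dot(n.router, x)) <= n.boundary else n.right
    return n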
Learn. Newly acquired information is stored during the learning operation (Line 10 in Algorithm 1), which takes an observation pair $(x_t, y_t)$ and traverses the tree until a suitable insertion leaf is identified (Line 14 in Algorithm 1). Immediately before insertion, the scoring function weights $w$ are updated such that the scores of the insertion leaf's memories improve with respect to the observed $x_t$ and $y_t$ (Line 13 in Algorithm 1).
Scorer. EMT's scorer, $w$, ranks candidate memories at query time. The scorer supports both predictions (see Line 12 of Algorithm 2) and updates (see Line 1 of Algorithm 2). Intuitively, the scorer can be thought of as a dissimilarity metric, assigning small values to similar query pairs and large values to dissimilar query pairs.

A pair of identical query vectors is guaranteed to result in a score of $0$, satisfying the self-consistency property mentioned earlier. This is achieved for any two $x_1$ and $x_2$ by applying the scorer to the coordinate-wise absolute difference between the pair, $z \leftarrow \{|(x_1)_i - (x_2)_i|\}_{i=1}^{d}$, which is all zeros if the two contexts are identical. Predictions are then made by linearly regressing this vector onto the scorer's weights, and clipping the prediction $\langle w, z \rangle$ such that the minimum is $0$. The clipping operation ensures that predictions are never negative, even when weights are negative.
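The prediction step can be summarized by the following short sketch, which assumes a NumPy weight vector w; the function name predict_scorer is ours, mirroring PREDICTSCORER in Algorithm 1.

import numpy as np

def predict_scorer(w: np.ndarray, x_query: np.ndarray, x_memory: np.ndarray) -> float:
    """Dissimilarity score: linear regression on |x_query - x_memory|, clipped at zero."""
    z = np.abs(x_query - x_memory)        # all zeros when the two contexts are identical
    return max(0.0, float(np.dot(w, z)))  # clip so the score is never negative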
Updating the scoring function is done via a ranking loss, which considers (1) $(x_{\text{best}}, y_{\text{best}})$, the memory-reward pair that the scorer would currently retrieve, and (2) $(x_{\text{alt}}, y_{\text{alt}})$, an alternative memory in the same leaf whose reward is most similar to the current observation (omitting the retrieved memory $x_{\text{best}}$). If $y_{\text{alt}}$ gives a better prediction of the observed reward than $y_{\text{best}}$, we adjust the scorer to decrease the learned dissimilarity between $x_{\text{alt}}$ and the current target $x_t$. Alternatively, if the retrieved memory $y_{\text{best}}$ is closer to the observed reward, we do the opposite, emphasizing the similarity between $x_t$ and $x_{\text{best}}$ and making $x_t$ and $x_{\text{alt}}$ more dissimilar. Specifically, this is done by taking a gradient step with respect to an $L_2$ loss that encourages more similar pairs to have a score near $0$ and more dissimilar pairs to have a score near $1$.
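A hedged sketch of one such update step is shown below; the learning rate, the use of the unclipped linear score inside the gradient, and the exact preference rule are assumptions for illustration rather than the paper's precise update.

import numpy as np

def update_scorer(w, x_t, y_t, best, alt, lr=0.1):
    """One squared-loss step: push the preferred pair's score toward 0, the other toward 1."""
    (x_best, y_best), (x_alt, y_alt) = best, alt
    # Prefer whichever memory's stored reward is closer to the observed feedback y_t.
    prefer_alt = abs(y_alt - y_t) < abs(y_best - y_t)
    targets = [(x_alt, 0.0), (x_best, 1.0)] if prefer_alt else [(x_alt, 1.0), (x_best, 0.0)]
    for x_mem, target in targets:
        z = np.abs(x_t - x_mem)                  # coordinate-wise absolute difference
        pred = float(np.dot(w, z))               # unclipped linear score (clip applies at query time)
        w = w - lr * 2.0 * (pred - target) * z   # gradient step on (pred - target)^2
    return w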