Bayesian Hyperbolic Multidimensional Scaling
Bolun Liu
Department of Biostatistics
Bloomberg School of Public Health
Johns Hopkins University
Shane Lubold
Department of Statistics
University of Washington
Adrian E. Raftery
Department of Statistics
Department of Sociology
University of Washington
Tyler H. McCormick
Department of Statistics
Department of Sociology
University of Washington
Abstract
Multidimensional scaling (MDS) is a widely used approach to representing high-dimensional,
dependent data. MDS works by assigning each observation a location on a low-dimensional
geometric manifold, with distance on the manifold representing similarity. We propose a
Bayesian approach to multidimensional scaling when the low-dimensional manifold is hyper-
bolic. Using hyperbolic space facilitates representing tree-like structures common in many
settings (e.g. text or genetic data with hierarchical structure). A Bayesian approach pro-
vides regularization that minimizes the impact of measurement error in the observed data
and assesses uncertainty. We also propose a case-control likelihood approximation that al-
lows for efficient sampling from the posterior distribution in larger data settings, reducing
computational complexity from approximately O(n2) to O(n). We evaluate the proposed
method against state-of-the-art alternatives using simulations, canonical reference datasets,
Indian village network data, and human gene expression data. Code to reproduce the results
in the paper is available at https://github.com/peterliu599/BHMDS.
Keywords: Bayesian methods, Hyperbolic geometry, Multidimensional scaling
1 Introduction
Multidimensional scaling (MDS) methods represent high-dimensional data in a low-dimensional
space, using dissimilarities between the observations as a means of identifying positions (Kruskal,
1964). Observations with small dissimilarities will be placed close together, while those with larger
dissimilarities will be placed further apart. A long literature on MDS methodology illustrates
the utility of MDS as a means of summarizing complex, dependent data and for downstream
applications, such as detecting clusters from the dissimilarities (Borg and Groenen, 1997; Davison,
1983; Cox and Cox, 2001).

The authors gratefully acknowledge support from NIH grants R01 HD070936 (Raftery) and DP2 MH122405 (McCormick). Correspondence: tylermc@uw.edu.

arXiv:2210.15081v3 [stat.ME] 15 Aug 2023
In many settings, the observed dissimilarities we wish to apply MDS methods to are likely to
contain measurement error. These errors could arise from misreporting in the context of the social
sciences (e.g. a retrospective behavioral inventory) or from miscalibration of machinery or operator
error in industrial settings. Using a probabilistic model is one way to account for this additional
uncertainty. Takane and Carroll (1981); Groenen (1993); MacKay (1989) and others have proposed
maximum likelihood MDS methods for handling measurement error. The use of these methods
relies on asymptotic theory, which might not apply to sample sizes used in applications, and the
problem requires solving a non-linear optimization problem where the number of parameters grows
quickly as the sample size grows (Cox and Cox, 2001). One potential framework to address these
potential issues with MDS is a Bayesian framework. Oh and Raftery (2001) provided a Bayesian
procedure to estimate the configuration of objects given (potentially noisy) dissimilarities between
objects. Extensions of Bayesian MDS to the case of large datasets were discussed in Holbrook et
al. (2021).
Along with the statistical framework, another critical, but often ignored, choice in implement-
ing MDS is the choice of geometry for the low-dimensional manifold. Multidimensional scaling
methods often assume the observed dissimilarities are computed using Euclidean distances among
objects in a Euclidean space. Oh and Raftery (2001), for example, assume a Euclidean distance
model with Gaussian measurement error and propose a Markov chain Monte Carlo algorithm
to compute a Bayesian solution for the object configuration. Yet there is a growing literature
showing that representing objects in other embedding spaces might lead to better representations
and therefore be more useful in downstream tasks (De Sa et al., 2018; Chami et al., 2020; Smith et
al., 2019; Lubold et al., 2020; Keller-Ressel and Nargang, 2020). In particular, hyperbolic spaces,
defined in Section 2, have been shown to produce embeddings with lower distortion, especially for
data that is hierarchical or tree-like.
In this work, we combine hyperbolic MDS methods with a Bayesian procedure. Specifically,
we apply the Bayesian MDS procedure from Oh and Raftery (2001) to a hyperbolic space.
Our model posits that the objects of interest are located in positions on a low-dimensional hyper-
bolic space, with the true dissimilarities between objects being the hyperbolic distances between
positions. We further assume that the observed dissimilarities are the true dissimilarities plus a
Gaussian error. We then derive a Markov chain Monte Carlo (MCMC) method that we use to obtain
Bayesian estimates of the object positions, which allows practitioners to calculate distance-based
measures (e.g. distances between positions, distances to the hyperbolic origin, etc.) that are often
the pragmatic goals in applications.
The paper is organized as follows. In Section 2, we formally define the hyperbolic space
model used in this work, and in Section 3 we posit a model for observed dissimilarities computed
from points in hyperbolic space and discuss a prior distribution over this space. We derive an
MCMC algorithm to draw samples from the posterior in Section 4. Sections 5 and 6 contain
simulations and applications of our method to real datasets in genomics. We conclude in Section 7.
2 Hyperbolic Geometry
We now discuss the mathematical details of hyperbolic geometry. Hyperbolic geometry is a
non-Euclidean geometry with constant negative curvature, commonly visualized as the upper
sheet of the unit two-sheeted hyperboloid. There exist multiple equivalent hyperbolic models,
such as the Klein model, the Poincaré disk model, and the Lorentz (hyperboloid/Minkowski)
model. We use the Lorentz model to parameterize hyperbolic geometry, which parallels the
representation used in existing hyperbolic MDS algorithms (Keller-Ressel and Nargang, 2020;
De Sa et al., 2018). This representation also facilitates convenient priors for our Bayesian model
in Section 3.
To define the Lorentz model, we begin with the definition of the Lorentzian product. For any
x = (x_0, ..., x_p) and y = (y_0, ..., y_p) in R^{p+1}, the Lorentzian product ⟨x, y⟩_L is defined as

\[
\langle \mathbf{x}, \mathbf{y} \rangle_L \equiv -x_0 y_0 + \sum_{i=1}^{p} x_i y_i .
\]
The p-dimensional Lorentz model with curvature parameter κ, which we denote by H^p(κ), can be
represented as the collection of coordinates x in R^{p+1} with x_0 > 0 whose Lorentzian product with
itself is -1, equipped with a hyperbolic distance scaled by 1/√κ. That is,

\[
\mathbb{H}^p(\kappa) \equiv \left\{ \mathbf{x} \in \mathbb{R}^{p+1} : x_0 > 0, \ \langle \mathbf{x}, \mathbf{x} \rangle_L = -1 \right\}, \qquad \kappa > 0,
\]

equipped with the hyperbolic distance

\[
d_{\mathbb{H}^p(\kappa)}(\mathbf{x}, \mathbf{y}) \equiv \frac{1}{\sqrt{\kappa}} \operatorname{arccosh}\left( -\langle \mathbf{x}, \mathbf{y} \rangle_L \right), \tag{1}
\]

which is the geodesic distance between x and y on H^p(κ). Specifically, the parameter κ controls
the hyperbolicity of the geometry: the sectional curvature of H^p(κ) is -κ, so the space becomes
more hyperbolic as κ grows and becomes flatter as κ shrinks to zero (Euclidean geometry has
curvature exactly 0).
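The two quantities above are simple to compute. The following is a minimal Python (NumPy) sketch, with function names of our own choosing, not an implementation from the paper:

```python
import numpy as np

def lorentz_product(x, y):
    """Lorentzian inner product <x, y>_L = -x0*y0 + sum_{i>=1} xi*yi."""
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def hyperbolic_distance(x, y, kappa=1.0):
    """Geodesic distance on the Lorentz model H^p(kappa), as in eq. (1)."""
    # For points on the hyperboloid, -<x, y>_L >= 1; the clip guards
    # against floating-point values slightly below 1.
    inner = np.clip(-lorentz_product(x, y), 1.0, None)
    return np.arccosh(inner) / np.sqrt(kappa)
```

For example, the point x = (cosh t, sinh t · u) with u a unit vector in R^p lies on the hyperboloid and sits at hyperbolic distance t/√κ from the origin (1, 0, ..., 0).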
3 Bayesian Modeling Framework
We now describe our statistical framework for Bayesian Hyperbolic Multidimensional Scaling
(BHMDS), which represents the objects of interest by coordinates in hyperbolic geometry, so that
the hyperbolic distances between objects resemble their true dissimilarity measures. We suppose
the objects dwell on H^p(κ) with coordinates x_1, x_2, ..., x_n, and denote by δ_ij the dissimilarity
measure between object i and object j, which equals the hyperbolic distance between x_i and x_j:

\[
\delta_{ij} \equiv d_{\mathbb{H}^p(\kappa)}(\mathbf{x}_i, \mathbf{x}_j) = \frac{1}{\sqrt{\kappa}} \operatorname{arccosh}\left( -\langle \mathbf{x}_i, \mathbf{x}_j \rangle_L \right), \qquad i, j = 1, \dots, n . \tag{2}
\]
In many settings, the observed dissimilarities contain measurement errors. As in Oh and
Raftery (2001), we represent the observed dissimilarity dij as the true dissimilarity plus a Gaussian
error, with the constraint that the observed dissimilarity is always positive. We therefore assume
that d_ij follows a truncated normal distribution:

\[
d_{ij} \sim N\left(\delta_{ij}, \sigma^2\right) I\left(d_{ij} > 0\right), \qquad i < j, \quad i, j = 1, \dots, n , \tag{3}
\]
where δ_ij is as defined in (2), the x_i are unobserved, and σ² is the variance of the measurement error.
Given the statistical model, we now specify priors for X = {x_1, ..., x_n} and σ². First note
that, using (3), the likelihood of the unknown parameters X and σ² is

\[
l\left(D \mid X, \sigma^2\right) \propto \left(\sigma^2\right)^{-m/2} \exp\left[ -\frac{1}{2\sigma^2} \mathrm{SSR} - \sum_{i<j} \log \Phi\left( \frac{\delta_{ij}}{\sigma} \right) \right], \tag{4}
\]

where m = n(n-1)/2 is the number of dissimilarities, SSR = \sum_{i<j} (\delta_{ij} - d_{ij})^2 is the sum
of squared residuals, Φ is the cumulative distribution function of the standard normal distribution,
and D = {d_ij}_{i,j=1,...,n} is the matrix of observed dissimilarities. The square root of the
SSR term in the likelihood is often referred to as stress in the MDS literature, meaning that our
approach falls under the umbrella of stress-minimizing approaches to MDS.
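To make the pieces of (4) concrete, the log-likelihood can be sketched in a few lines of Python (NumPy/SciPy). The function name is our own and this is an illustration of the formula, not code from the paper:

```python
import numpy as np
from scipy.stats import norm

def log_likelihood(D, Delta, sigma2):
    """Log of the truncated-normal likelihood in eq. (4), up to an
    additive constant.  D and Delta are (n, n) symmetric matrices of
    observed and model-implied dissimilarities; only pairs i < j enter."""
    iu = np.triu_indices_from(D, k=1)
    d, delta = D[iu], Delta[iu]
    m = d.size                           # m = n(n - 1)/2 dissimilarities
    ssr = np.sum((delta - d) ** 2)       # sum of squared residuals (SSR)
    # The log Phi terms normalize each normal density truncated at 0.
    return (-0.5 * m * np.log(sigma2)
            - ssr / (2.0 * sigma2)
            - np.sum(norm.logcdf(delta / np.sqrt(sigma2))))
```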
We now elaborate on the prior choice for hyperbolic coordinates. Due to the constraint in (2),
hyperbolic coordinate distributions are generally asymmetric and complex in form, often leading to
intractable full conditional posterior distributions. In such settings, Metropolis-Hastings samplers
are computationally challenging. Therefore, the choice of a prior structure has substantial
implications for the feasibility of posterior sampling. We use the hyperbolic wrapped normal
distribution centered on the hyperbolic origin (Nagano et al., 2019) as our hyperbolic prior. This
prior distribution reduces sampling on H^p to sampling on R^p by defining a one-to-one transformation
from R^p to H^p. Specifically, to sample from the hyperbolic prior, we first sample v ~ N_p(0, Λ) in R^p,
then transform v ∈ R^p to x ∈ H^p by the transformation T(·) specified by the prior,

\[
\mathbf{x} = T(\mathbf{v}) = \cosh\left( \lVert \tilde{\mathbf{v}} \rVert_L \right) \boldsymbol{\mu}_0^p + \sinh\left( \lVert \tilde{\mathbf{v}} \rVert_L \right) \frac{\tilde{\mathbf{v}}}{\lVert \tilde{\mathbf{v}} \rVert_L} , \tag{5}
\]

where μ_0^p = (1, 0, ..., 0) ∈ H^p is the hyperbolic origin, ṽ = (0, v) ∈ R^{p+1}, and ‖ṽ‖_L ≡ \sqrt{⟨ṽ, ṽ⟩_L}.
This procedure produces x, a sample from the hyperbolic wrapped normal distribution.
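The two-step recipe above (draw v, then map it through T) can be sketched as follows. The function name and vectorized layout are our own assumptions:

```python
import numpy as np

def sample_wrapped_normal(lam, rng, size=1):
    """Draw from the hyperbolic wrapped normal centered at the origin of
    H^p via the transformation T in eq. (5).

    lam: length-p array of prior variances (the diagonal of Lambda).
    Returns an array of shape (size, p + 1) of points on the hyperboloid."""
    p = len(lam)
    v = rng.normal(scale=np.sqrt(lam), size=(size, p))  # v ~ N_p(0, Lambda)
    # With v~ = (0, v), the Lorentzian norm ||v~||_L equals the Euclidean
    # norm of v, because the first coordinate is zero.
    r = np.linalg.norm(v, axis=1, keepdims=True)
    x0 = np.cosh(r)
    rest = np.sinh(r) * v / np.where(r > 0, r, 1.0)     # safe at r = 0
    return np.hstack([x0, rest])
```

Every returned point satisfies ⟨x, x⟩_L = -1 with x_0 > 0, i.e. lies on the hyperboloid, since cosh² r - sinh² r = 1.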
The transformation T(·) enables us to reparameterize the likelihood in (4) using V = {v_1, ..., v_n},
such that

\[
l\left(D \mid V, \sigma^2\right) = l\left(D \mid X, \sigma^2\right) \propto \left(\sigma^2\right)^{-m/2} \exp\left[ -\frac{1}{2\sigma^2} \mathrm{SSR} - \sum_{i<j} \log \Phi\left( \frac{\delta_{ij}}{\sigma} \right) \right], \tag{6}
\]

where δ_ij = d_{H^p(κ)}(T(v_i), T(v_j)). Imposing a Gaussian prior on V is equivalent to a hyperbolic
wrapped normal prior on X. The transformation allows us to sample in the space of the Gaussian
prior, meaning we can use a simple random-walk Metropolis-Hastings algorithm to sample from
the posterior.
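A single random-walk Metropolis update in the Gaussian v-space might look like the following minimal sketch. This is our own illustrative helper, not the paper's sampler; in practice log_post would combine the reparameterized likelihood (6) with the Gaussian prior on V:

```python
import numpy as np

def rw_metropolis_step(V, i, log_post, step, rng):
    """One random-walk Metropolis update of row i of V (the Gaussian-space
    coordinates v_i); log_post(V) returns the unnormalized log posterior.
    Returns the (possibly unchanged) V and an acceptance flag."""
    V_prop = V.copy()
    V_prop[i] = V[i] + step * rng.standard_normal(V.shape[1])
    # Symmetric proposal, so the acceptance ratio is just the posterior ratio.
    log_ratio = log_post(V_prop) - log_post(V)
    if np.log(rng.uniform()) < log_ratio:
        return V_prop, True
    return V, False
```

Because the proposal perturbs only one object's coordinates, in an efficient implementation log_post would recompute only the n - 1 dissimilarities involving object i.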
We next define the prior for the observation error variance, σ². We use a conjugate prior
σ² ~ IG(a, b), the Inverse Gamma distribution with mode b/(a + 1). For the hyperprior on the
diagonal entries of the auxiliary variable variance matrix Λ = Diag(λ_1, ..., λ_p), we also assume a
conjugate Inverse Gamma prior, λ_j ~ IG(α, β_j), independently for j = 1, ..., p. We will further