
Context-aware Bayesian Mixed Multinomial Logit Model

Mirosława Łukawska∗,†, Anders Fjendbo Jensen†, and Filipe Rodrigues†

∗Corresponding author (mirlu@dtu.dk)

†Technical University of Denmark, Department of Technology, Management and Economics, Bygningstorvet 116b, 2800 Kgs. Lyngby

Declarations of interest: none

Highlights

• We propose an approach to model context-dependent intra-respondent heterogeneity.

• Contextual information is mapped to additive shifts on preference parameters.

• The model allows non-linear interactions between continuous and discrete variables.

• Analyses of different context scenarios are possible without model reestimation.

• We estimate a C-MMNL bicycle route choice model on 119,448 trips in 47 minutes.

Abstract

The mixed multinomial logit model assumes constant preference parameters of a decision-maker throughout different choice situations, an assumption that may be too strong for certain choice modelling applications. This paper proposes an effective approach to model context-dependent intra-respondent heterogeneity, thereby introducing the concept of the Context-aware Bayesian mixed multinomial logit model, where a neural network maps contextual information to interpretable shifts in the preference parameters of each individual in each choice occasion. The proposed model offers several key advantages. First, it supports both continuous and discrete variables, as well as complex non-linear interactions between both types of variables. Second, each context specification is considered jointly as a whole by the neural network rather than each variable being considered independently. Finally, since the neural network parameters are shared across all decision-makers, it can leverage information from other decision-makers to infer the effect of a particular context on a particular decision-maker. Even though the context-aware Bayesian mixed multinomial logit model allows for flexible interactions between attributes, the increase in computational complexity is minor compared to the mixed multinomial logit model. We illustrate the concept and interpretation of the proposed model in a simulation study. We furthermore present a real-world case study from the travel behaviour domain: a bicycle route choice model based on a large-scale, crowdsourced dataset of GPS trajectories including 119,448 trips made by 8,555 cyclists.

Keywords: choice context, Bayesian modelling, neural networks, bicycle route choice, big data

arXiv:2210.05737v2 [stat.ML] 29 Mar 2023

1 Introduction

A long-standing major concern in discrete choice modelling is heterogeneity in the decision-making process. To address this, the mixed multinomial logit (MMNL) model (McFadden and Train 2000; Revelt and Train 1998) adds flexibility to the original multinomial logit (MNL) model formulation (Boyd and Mellman 1980) by allowing each decision-maker $n$ to have their own preference parameters $\beta_n$, while constraining them to follow the population density $f(\beta)$. However, this entails the assumption that the preference parameters of the decision-maker are constant throughout time and throughout different choice situations, which may be deemed too strong for certain choice modelling applications.

Individual preferences are complex and heterogeneous; they depend on different choice scenarios, and might evolve over time (Castells et al. 2021; Hess and Giergiczny 2015; Krueger et al. 2021). Even when facing the same choice situation multiple times, an individual might make different decisions depending, for example, on the circumstances at the moment of choice (context). One can think of a context as an external factor that varies across individuals, as well as choice situations, and captures temporal (e.g. weather) or long-term (e.g. pandemics) circumstances. This temporal element might cause behaviour related to certain effects not to remain stable over time, which can be important in explaining the variation in explanatory variables (Mannering 2018). This has been evident in many behavioural research fields, including transport (Mannering et al. 1994), economics (Meier and Sprenger 2015), and accident research (Islam et al. 2020; Mannering 2018).

Empirical evidence suggests that even though the major part of heterogeneity relates to inter-respondent heterogeneity, incorporating intra-respondent heterogeneity can lead to further gains in fit (Hess and Giergiczny 2015). Capturing this intra-respondent heterogeneity often requires more complex model specification considerations. Hess and Rose (2009) extended the MMNL model formulation, allowing for intra-respondent heterogeneity on top of the inter-respondent heterogeneity and assuming additional random variation around the mean taste across multiple choice scenarios for the same individual. This model outperforms the standard multinomial logit model in terms of prediction accuracy (in- and out-of-sample; Danaf et al. 2019; Xie et al. 2020), but Krueger et al. (2021) found only minor improvement when compared to the mixed multinomial logit model. Becker et al. (2018) further provided a Bayesian treatment of this intra-inter heterogeneity approach and proposed a Markov chain Monte Carlo (MCMC) procedure for performing inference. Danaf et al. (2020) extended this procedure by relaxing the normality assumptions. However, these approaches do not allow for systematic variations in preference parameters as a function of contextual variables.

A further approach to modelling heterogeneity is based on classes of individuals, with homogeneous preferences within each class. The most common example is the Latent Class Choice Model (LCCM); see, e.g. Greene and Hensher (2003) or Hess (2014) for a discussion. A similar idea was employed in a collaborative learning framework with time-varying parameters in the context of personal recommendations (Zhu et al. 2020). Unlike in the LCCM, where an individual is perceived as belonging to one class, the individual's membership vector in the more flexible collaborative model represents a combination of multiple preference patterns identified by the model, enabling personalisation of preferences. For a detailed comparison between these two approaches, we refer to Zhu et al. (2020). Moreover, the collaborative learning approach overcomes a limitation of the inter-intra heterogeneity model, where personalised predictions and recommendations are not possible because the individual parameters for each choice situation are not estimated.

With the developments in computational hardware, Bayesian approaches to choice modelling have been gaining research interest. Washington et al. (2009) elaborated on the theory and applied it to route choice modelling. Train (2001) compared the Bayesian approach to mixed multinomial logit with Maximum Likelihood Simulation, an experiment repeated and extended by Elshiewy et al. (2017). When using non-informative priors, estimates in both approaches are similar, especially for large datasets (Congdon 2007; J. Huber and Train 2001). However, utilising the Bayesian approach allows researchers to include more information within the estimation procedure, thus improving the behavioural explanation of the model. It also makes it possible to obtain full posterior distributions over the model parameters (including the individual-specific taste parameters) and to take advantage of modern approaches to, for example, utility generation (Rodrigues et al. 2020) or inference (Rodrigues 2022). Further extensions of the classical mixed logit model are possible using Variational Bayes for posterior inference, as shown in Krueger et al. (2019), where a method for including unobserved inter- and intra-individual heterogeneity in behaviour was derived.

A substantial effort has been made in extending discrete choice models with machine learning frameworks, with recent papers suggesting a joint perspective and a complementarity of these two approaches (Salas et al. 2022; Wang et al. 2021). An example of this synergy is leveraging neural networks for the utility specification ("Neural-embedded Discrete Choice Model", Han 2019) through representation learning, thus enhancing the capability to capture the inter-respondent heterogeneity (Han et al. 2022; Sifringer et al. 2020; Van Cranenburgh and Alwosheel 2019). Despite the improved prediction accuracy compared to traditional discrete choice modelling (DCM) methods (Salas et al. 2022), these methods do not focus on improving the predictability on the individual level (Krueger et al. 2021). Thus, previous approaches had limited practicality, and efforts should be made to make these more advanced and non-linear mechanisms more applicable and interpretable in diverse settings.

In this work, we propose the context-aware Bayesian mixed multinomial logit (C-MMNL) model, which models context-dependent intra-respondent heterogeneity by introducing a neural network that maps contextual information to interpretable shifts in the preference parameters of the decision-maker in a Bayesian mixed multinomial logit framework. This framework allows for a joint treatment of both discrete and continuous variables and is able to capture non-linear interactions between them. Additionally, the neural network can extrapolate the behaviour of individuals to unseen context situations by leveraging information from other individuals. By making use of Stochastic Variational Inference (SVI) and GPU hardware acceleration, we are able to handle a large-scale revealed preference (RP) dataset for bicycle route choice modelling. Lastly, despite relying on a black-box function approximator (a neural network), whose non-interpretable mechanism has been criticised in some studies (e.g. Salas et al. 2022), we show that the proposed model is highly interpretable and preserves the links to economic theories of the original MMNL. The C-MMNL model improves the individual (conditional) and general (unconditional) predictions on a hold-out sample over variations of the traditional MMNL with interaction terms. By introducing context shifts that are consistent across individuals, the estimates are not as dependent on the sample structure (number of observations per individual) as in the case of inter-intra models, where the variation of preferences for a given individual highly depends on the panel structure.

The remainder of this paper consists of four sections. Section 2 briefly describes the standard MMNL model from the Bayesian perspective and introduces the proposed C-MMNL model as an extension to the MMNL model. Section 3 provides an extensive simulation study illustrating the estimation of the C-MMNL model and the interpretation of the results. Section 4 estimates a large-scale bicycle route choice model with the proposed method. Section 5 concludes the paper.

2 Method

2.1 Bayesian Mixed Multinomial Logit Model

We consider a standard mixed multinomial logit (MMNL) model setup where, on each choice occasion $t \in \{1, \dots, T\}$, a decision-maker $n \in \{1, \dots, N\}$ derives a random utility $U_{ntj} = V(x_{ntj}, \eta_n) + \epsilon_{ntj}$ from each alternative $j$ in the choice set $C_{nt}$. The systematic utility term $V(x_{ntj}, \eta_n)$ is assumed to be a function of covariates $x_{ntj}$ and a collection of taste parameters $\eta_n$, while $\epsilon_{ntj}$ is a random error term following a type-I Extreme Value distribution. We consider the general setting under which the tastes $\eta_n$ can be decomposed into a vector of fixed taste parameters $\alpha \in \mathbb{R}^L$ shared across decision-makers, and individual-specific random taste parameters $\beta_n \in \mathbb{R}^K$.
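For reference, the utility specification above implies the familiar logit form for the choice probabilities, and the mixed logit probability of an individual's choice sequence follows by integrating over the taste density. The block below restates these standard results in the notation of this section. It assumes, as is usual but not stated explicitly in the text, that the error terms are independent and identically distributed across alternatives and occasions; $f$ denotes the population density of $\beta_n$ mentioned in the introduction.

```latex
P(y_{nt} = j \mid \eta_n)
  = \frac{\exp\{V(x_{ntj}, \eta_n)\}}
         {\sum_{j' \in C_{nt}} \exp\{V(x_{ntj'}, \eta_n)\}},
\qquad
P(y_{n1}, \dots, y_{nT} \mid \alpha, f)
  = \int \prod_{t=1}^{T} P\big(y_{nt} \mid [\alpha, \beta_n]\big) \, f(\beta_n) \, d\beta_n .
```

The integral has no closed form in general, which is why simulation-based or Bayesian inference is needed for the MMNL model.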

In the Bayesian framework, we assume the individual-specific taste parameters $\beta_n$ to follow a multivariate normal distribution, i.e. $\beta_n \sim \mathcal{N}(\zeta, \Omega)$. We further assume the fixed taste parameters $\alpha$ and the mean vector $\zeta$ to each follow a multivariate normal distribution: $\alpha \sim \mathcal{N}(\lambda_0, \Xi_0)$ and $\zeta \sim \mathcal{N}(\mu_0, \Sigma_0)$. As for the covariance matrix $\Omega$, we decompose our prior into a scale and a correlation matrix as follows: $\Omega = \mathrm{diag}(\tau) \times \Psi \times \mathrm{diag}(\tau)$, where $\Psi$ is a correlation matrix and $\tau$ is the vector of coefficient scales (Barnard et al. 2000; Hilbe 2009). For the components of the scale vector $\tau$ we employ a vague half-Cauchy prior, e.g. $\tau_k \sim \text{half-Cauchy}(10)$, while for the correlation matrix we employ an LKJ prior (Lewandowski et al. 2009), such that $\Psi \sim \mathrm{LKJ}(\nu)$. The hyper-parameter $\nu$ directly controls the amount of correlation favoured by the prior.
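To make the scale-correlation decomposition concrete, the following minimal sketch builds a covariance matrix from a scale vector and a correlation matrix using only the Python standard library. The function name `covariance_from_scales` and the numerical values of `tau` and `Psi` are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
def covariance_from_scales(tau, Psi):
    """Return Omega = diag(tau) * Psi * diag(tau) as a nested list.

    Element-wise, Omega[i][j] = tau[i] * Psi[i][j] * tau[j], so the
    diagonal holds the variances tau[i]**2 and the off-diagonal entries
    are correlations rescaled by the two standard deviations.
    """
    K = len(tau)
    return [[tau[i] * Psi[i][j] * tau[j] for j in range(K)] for i in range(K)]


# Hypothetical values for K = 2 random taste parameters
tau = [0.5, 2.0]                      # coefficient scales (standard deviations)
Psi = [[1.0, 0.3], [0.3, 1.0]]        # correlation matrix
Omega = covariance_from_scales(tau, Psi)
```

Separating scales from correlations in this way lets the prior on each be chosen independently, which is the motivation for pairing a half-Cauchy prior on the scales with an LKJ prior on the correlation matrix.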

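The generative process of the MMNL model can be summarised as follows:

1. Draw fixed taste parameters $\alpha \sim \mathcal{N}(\lambda_0, \Xi_0)$

2. Draw mean vector $\zeta \sim \mathcal{N}(\mu_0, \Sigma_0)$

3. Draw scales vector $\tau \sim \text{half-Cauchy}(\sigma_0)$

4. Draw correlation matrix $\Psi \sim \mathrm{LKJ}(\nu)$

5. For each decision-maker $n \in \{1, \dots, N\}$

   (a) Draw random taste parameters $\beta_n \sim \mathcal{N}(\zeta, \Omega)$

   (b) For each choice occasion $t \in \{1, \dots, T\}$

      i. Draw observed choice $y_{nt} \sim \mathrm{MNL}(\eta_n, X_{nt})$

where $\Omega = \mathrm{diag}(\tau) \times \Psi \times \mathrm{diag}(\tau)$ and $\eta_n = [\alpha, \beta_n]$.

[Figure 1 not reproduced; its nodes are $y_{n,t}$, $x_{n,t}$, $\eta_{n,t}$, $\beta_n$, $\Omega$, $\mu_t$ and $\theta_{NN}$.]

Fig. 1: Graphical model representation of the proposed C-MMNL model, where the key changes to the original MMNL model are highlighted in blue.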
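The generative process of the MMNL model can be simulated in a few lines. The sketch below is illustrative only and makes several assumptions that are not part of the paper: the utility is linear in parameters, the priors in steps 1 and 2 are standard normal, the scales and correlation matrix in steps 3 and 4 are fixed to hypothetical values instead of being drawn from half-Cauchy and LKJ priors (the standard library has no LKJ sampler), and the attributes are random numbers.

```python
import math
import random

rng = random.Random(0)  # fixed seed for reproducibility


def cholesky(A):
    """Lower-triangular factor C with C C^T = A (A small, positive definite)."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(C[i][k] * C[j][k] for k in range(j))
            C[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / C[j][j]
    return C


def draw_mvn(mean, C):
    """Draw from N(mean, C C^T) using the Cholesky factor C."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(C[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)]


def mnl_probs(X, eta):
    """Logit probabilities with linear utilities V_j = eta . x_j (assumed form)."""
    v = [sum(e * x for e, x in zip(eta, row)) for row in X]
    v_max = max(v)  # subtract the maximum for numerical stability
    w = [math.exp(vi - v_max) for vi in v]
    total = sum(w)
    return [wi / total for wi in w]


N, T, J, n_fixed, K = 5, 4, 3, 1, 2  # hypothetical problem sizes

# Steps 1-2: fixed tastes and mean vector (standard-normal priors assumed)
alpha = [rng.gauss(0.0, 1.0) for _ in range(n_fixed)]
zeta = [rng.gauss(0.0, 1.0) for _ in range(K)]
# Steps 3-4: illustrative fixed values in place of half-Cauchy / LKJ draws
tau = [0.5, 1.0]
Psi = [[1.0, 0.3], [0.3, 1.0]]
Omega = [[tau[i] * Psi[i][j] * tau[j] for j in range(K)] for i in range(K)]
chol = cholesky(Omega)

choices = []
for n in range(N):                         # step 5
    beta_n = draw_mvn(zeta, chol)          # step 5(a)
    eta_n = alpha + beta_n                 # eta_n = [alpha, beta_n]
    for t in range(T):                     # step 5(b)
        X_nt = [[rng.uniform(0.0, 1.0) for _ in range(n_fixed + K)] for _ in range(J)]
        p = mnl_probs(X_nt, eta_n)
        y_nt = rng.choices(range(J), weights=p)[0]   # step 5(b)i
        choices.append((n, t, y_nt))
```

Note that `eta_n` is drawn once per decision-maker and reused on every occasion `t`, which is exactly the constancy assumption that the next section relaxes.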

2.2 Context-aware Bayesian Mixed Multinomial Logit Model

We present the idea of the context-aware Bayesian mixed multinomial logit (C-MMNL) model, where the context information is included in the form of an easily interpretable context-specific bias term $\mu_t$, a non-linear function of the context information that shifts the preference parameters of each individual $n$ in each choice occasion $t$, i.e. $\eta_{nt} = \eta_n + \mu_t$, where $\eta_n = [\alpha, \beta_n]$. The adjustment term $\mu_t$ is assumed to be determined by a neural network that takes as input the context information $c_t$, i.e. $\mu_t = \mathrm{NNet}_{\theta_{NN}}(c_t)$. In order to share statistical strength across individuals, we assume that all individuals shift their preference parameters in the same way when faced with a given choice context $c_t$, and therefore the parameters of the neural network, $\theta_{NN}$, are shared across all individuals. However, we note that, if this assumption is considered too strong for some applications, one can relax it by allowing the neural network to also take into account, for example, the socio-demographic characteristics of the decision-maker. This would allow for complex interactions between the latter and the context information $c_t$. Figure 1 shows the graphical model representation of the proposed C-MMNL model.
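The additive shift can be illustrated with a very small network. The sketch below is a hypothetical example, not the architecture used in the paper: it assumes one hidden layer with tanh activations, hand-picked weights standing in for the shared parameters $\theta_{NN}$, and a two-dimensional context made of one discrete and one continuous variable.

```python
import math


def nnet(c, W1, b1, W2, b2):
    """One-hidden-layer network: mu = W2 tanh(W1 c + b1) + b2."""
    h = [math.tanh(sum(w * ci for w, ci in zip(row, c)) + b) for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]


# Hypothetical shared parameters theta_NN: 2 context inputs, 3 hidden units, 2 shifts
W1 = [[0.5, -1.0], [1.5, 0.2], [-0.7, 0.9]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.3, -0.2, 0.5], [-0.4, 0.1, 0.2]]
b2 = [0.0, 0.0]

# Context on occasion t: e.g. a rain indicator (discrete) and a scaled temperature (continuous)
c_t = [1.0, 0.35]
mu_t = nnet(c_t, W1, b1, W2, b2)

# Individual tastes eta_n = [alpha, beta_n] for two hypothetical decision-makers
eta = {"n=1": [-1.0, 2.0], "n=2": [-0.4, 0.8]}
# Context-adjusted tastes eta_nt = eta_n + mu_t: the same shift is applied to everyone
eta_t = {n: [e + m for e, m in zip(eta_n, mu_t)] for n, eta_n in eta.items()}
```

Because `mu_t` depends only on the context, both individuals receive the same shift while keeping their own baseline tastes, and because the two context variables enter the hidden layer jointly, the shift can capture interactions between them.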

The generative process of the proposed C-MMNL model can then be summarised as follows, where the main changes to the generative process assumed by the original MMNL model (described in Section 2.1) are step 5 and steps 6(b)i and 6(b)ii:

1. Draw fixed taste parameters $\alpha \sim \mathcal{N}(\lambda_0, \Xi_0)$

2. Draw mean vector $\zeta \sim \mathcal{N}(\mu_0, \Sigma_0)$

3. Draw scales vector $\tau \sim \text{half-Cauchy}(\sigma_0)$

4. Draw correlation matrix $\Psi \sim \mathrm{LKJ}(\nu)$

5. For each choice occasion $t \in \{1, \dots, T\}$

   (a) Determine context-specific shift term $\mu_t = \mathrm{NNet}_{\theta_{NN}}(c_t)$

6. For each decision-maker $n \in \{1, \dots, N\}$

   (a) Draw random taste parameters $\beta_n \sim \mathcal{N}(\zeta, \Omega)$

   (b) For each choice occasion $t \in \{1, \dots, T\}$

      i. Compute context-adjusted taste parameters: $\eta_{nt} = \eta_n + \mu_t$, with $\eta_n = [\alpha, \beta_n]$

      ii. Draw observed choice $y_{nt} \sim \mathrm{MNL}(\eta_{nt}, X_{nt})$
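One practical consequence of steps 5 and 6 is that, once the tastes and network parameters are available, choice probabilities under different contexts can be compared by re-running only the forward pass. The sketch below illustrates this under stated assumptions: a linear-in-parameters utility, a one-hidden-layer tanh network, and hypothetical values for `theta_nn`, `eta_n`, the attributes `X_nt` and the two contexts. None of these values come from the paper.

```python
import math


def nnet(c, W1, b1, W2, b2):
    """One-hidden-layer network: mu = W2 tanh(W1 c + b1) + b2 (assumed architecture)."""
    h = [math.tanh(sum(w * ci for w, ci in zip(row, c)) + b) for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]


def mnl_probs(X, eta):
    """Logit probabilities with linear utilities V_j = eta . x_j (assumed form)."""
    v = [sum(e * x for e, x in zip(eta, row)) for row in X]
    v_max = max(v)  # subtract the maximum for numerical stability
    w = [math.exp(vi - v_max) for vi in v]
    total = sum(w)
    return [wi / total for wi in w]


# Hypothetical "estimated" quantities
theta_nn = dict(W1=[[1.0, 0.5], [-0.5, 1.0]], b1=[0.0, 0.0],
                W2=[[-0.8, 0.0], [0.0, 0.6]], b2=[0.0, 0.0])
eta_n = [-1.0, 1.5]                               # tastes for (length, cycle-track share)
X_nt = [[2.0, 0.1], [2.5, 0.8], [3.0, 0.5]]       # three hypothetical routes

# Two context scenarios: (rain indicator, scaled temperature)
contexts = {"dry": [0.0, 0.2], "rain": [1.0, 0.2]}

probs = {}
for name, c_t in contexts.items():
    mu_t = nnet(c_t, **theta_nn)                        # step 5(a)
    eta_nt = [e + m for e, m in zip(eta_n, mu_t)]       # step 6(b)i
    probs[name] = mnl_probs(X_nt, eta_nt)               # probabilities behind step 6(b)ii
```

Comparing `probs["dry"]` with `probs["rain"]` shows how the same individual's predicted route shares change with the context while `eta_n` and `theta_nn` stay fixed.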
