Context-aware Bayesian Mixed Multinomial Logit Model
Mirosława Łukawska*, Anders Fjendbo Jensen, and Filipe Rodrigues
* Corresponding author (mirlu@dtu.dk)
Technical University of Denmark, Department of Technology, Management and Economics, Bygningstorvet 116b,
2800 Kgs. Lyngby
Declarations of interest: none
Highlights
• We propose an approach to model context-dependent intra-respondent heterogeneity.
• Contextual information is mapped to additive shifts on preference parameters.
• The model allows non-linear interactions between continuous and discrete variables.
• Analyses of different context scenarios are possible without model reestimation.
• We estimate a C-MMNL bicycle route choice model on 119,448 trips in 47 minutes.
Abstract
The mixed multinomial logit model assumes constant preference parameters of a decision-maker
throughout different choice situations, an assumption that may be considered too strong for certain
choice modelling applications. This paper proposes an effective approach to model context-dependent
intra-respondent heterogeneity, thereby introducing the concept of the Context-aware Bayesian mixed
multinomial logit model, where a neural network maps contextual information to interpretable shifts
in the preference parameters of each individual in each choice occasion. The proposed model offers
several key advantages. First, it supports both continuous and discrete variables, as well as
complex non-linear interactions between both types of variables. Second, each context specification
is considered jointly as a whole by the neural network rather than each variable being considered
independently. Finally, since the neural network parameters are shared across all decision-makers,
it can leverage information from other decision-makers to infer the effect of a particular context
on a particular decision-maker. Even though the context-aware Bayesian mixed multinomial logit model
allows for flexible interactions between attributes, the increase in computational complexity is
minor compared to the mixed multinomial logit model. We illustrate the concept and interpretation of
the proposed model in a simulation study. We furthermore present a real-world case study from the
travel behaviour domain: a bicycle route choice model based on a large-scale, crowdsourced dataset
of GPS trajectories including 119,448 trips made by 8,555 cyclists.
Keywords: choice context, Bayesian modelling, neural networks, bicycle route choice, big data
arXiv:2210.05737v2 [stat.ML] 29 Mar 2023
1 Introduction
A long-standing major concern in discrete choice modelling is heterogeneity in the decision-making
process. To address this, the mixed multinomial logit (MMNL) model (McFadden and Train 2000;
Revelt and Train 1998) adds flexibility to the original multinomial logit (MNL) model formulation
(Boyd and Mellman 1980) by allowing each decision-maker $n$ to have their own preference parameters
$\beta_n$, while constraining them to follow the population density $f(\beta)$. However, this
entails the assumption that the preference parameters of the decision-maker are constant over time
and across different choice situations, which may be deemed too strong for certain choice modelling
applications.
Individual preferences are complex and heterogeneous; they depend on different choice scenarios,
and might evolve over time (Castells et al. 2021; Hess and Giergiczny 2015; Krueger et al. 2021).
Even when facing the same choice situation multiple times, an individual might make different
decisions depending, for example, on the circumstances in the moment of choice (context). One can
think of a context as an external factor that varies across individuals, as well as choice
situations, and captures temporary (e.g. weather) or long-term (e.g. pandemics) circumstances. This
temporal element might cause behaviour related to certain effects not to remain stable over time,
which can be important in explaining the variation in explanatory variables (Mannering 2018). This
has been evident in many behavioural research fields, including transport (Mannering et al. 1994),
economics (Meier and Sprenger 2015), and accident research (Islam et al. 2020; Mannering 2018).
Empirical evidence suggests that even though the major part of heterogeneity relates to
inter-respondent heterogeneity, incorporating intra-respondent heterogeneity can lead to further
gains in fit (Hess and Giergiczny 2015). Capturing this intra-respondent heterogeneity often
requires more complex model specification considerations. Hess and Rose (2009) extended the MMNL
model formulation, allowing for intra-respondent heterogeneity on top of the inter-respondent
heterogeneity and assuming additional random variation around the mean taste across multiple choice
scenarios for the same individual. This model outperforms the standard multinomial logit model in
terms of prediction accuracy (in- and out-of-sample; Danaf et al. 2019; Xie et al. 2020), but
Krueger et al. (2021) found only minor improvement when compared to the mixed multinomial logit
model. Becker et al. (2018) further provided a Bayesian treatment of this intra-inter heterogeneity
approach and proposed a Markov-chain Monte Carlo (MCMC) procedure for performing inference. Danaf
et al. (2020) extended this procedure by relaxing the constraint of normality assumptions. However,
these approaches do not allow for systematic variations in preference parameters as a function of
contextual variables.
A further approach to model heterogeneity is based on classes of individuals, with homogeneous
preferences within each class. The most common example is the Latent Class Choice Model (LCCM);
see, e.g. Greene and Hensher (2003) or Hess (2014) for a discussion. A similar idea was employed
in a collaborative learning framework with time-varying parameters in the context of personal
recommendations (Zhu et al. 2020). Unlike in the LCCM, where an individual is perceived as
belonging to one class, the individual's membership vector in the more flexible collaborative model
represents a combination of multiple preference patterns identified by the model, enabling
personalisation of preferences. For a detailed comparison between these two approaches, we refer to
Zhu et al. (2020). Moreover, the collaborative learning approach overcomes a limitation of the
inter-intra heterogeneity model, where personalised predictions and recommendations are not possible
because the individual parameters for each choice situation are not estimated.
With the developments in computational hardware, Bayesian approaches to choice modelling have
been gaining research interest. Washington et al. (2009) elaborated on the theory and tailored it
to route choice modelling. Train (2001) compared the Bayesian approach to mixed multinomial
logit with Maximum Likelihood Simulation, an experiment repeated and extended by Elshiewy et al.
(2017). When using non-informative priors, estimates in both approaches are similar, especially for
large datasets (Congdon 2007; Huber and Train 2001). However, utilising the Bayesian approach
allows researchers to include more information within the estimation procedure, thus improving the
behavioural explanation of the model. It also makes it possible to obtain full posterior
distributions over the model parameters (including the individual-specific taste parameters) and to
take advantage of modern approaches to, for example, utility generation (Rodrigues et al. 2020) or
inference (Rodrigues 2022). Further extensions of the classical mixed logit model are possible using
Variational Bayes for posterior inference, as shown in Krueger et al. (2019), where a method for
including unobserved inter- and intra-individual heterogeneity in behaviour was derived.
A substantial effort has been made in extending discrete choice models with machine learning
frameworks, with recent papers suggesting a joint perspective and a complementarity of these two
approaches (Salas et al. 2022; Wang et al. 2021). An example of this synergy is leveraging neural
networks for the utility specification ("Neural-embedded Discrete Choice Model", Han 2019) through
representation learning, thus enhancing the capability to capture inter-respondent heterogeneity
(Han et al. 2022; Sifringer et al. 2020; Van Cranenburgh and Alwosheel 2019). Despite the improved
prediction accuracy compared to traditional DCM methods (Salas et al. 2022), these methods do not
focus on improving the predictability on the individual level (Krueger et al. 2021). Thus, previous
approaches had limited practicality, and efforts should be made to make these more advanced and
non-linear mechanisms more applicable and interpretable in diverse settings.
In this work, we propose the context-aware Bayesian mixed multinomial logit (C-MMNL) model, which
models context-dependent intra-respondent heterogeneity by introducing a neural network that
maps contextual information to interpretable shifts in the preference parameters of the
decision-maker in a Bayesian mixed multinomial logit framework. This framework allows for a joint
treatment of both discrete and continuous variables and is able to capture non-linear interactions
between them. Additionally, the neural network can extrapolate the behaviour of individuals to
unseen context situations by leveraging information from other individuals. By making use of
Stochastic Variational Inference (SVI) and GPU hardware acceleration, we are able to handle a
large-scale revealed preference (RP) dataset for bicycle route choice modelling. Lastly, despite
relying on a black-box function approximator (a neural network), whose non-interpretable mechanism
has been criticised in some studies (e.g. Salas et al. 2022), we show that the proposed model is
highly interpretable and preserves the links to economic theories of the original MMNL. The C-MMNL
model improves the individual (conditional) and general (unconditional) predictions on a hold-out
sample over variations of the traditional MMNL with interaction terms. By introducing context shifts
that are consistent across individuals, the estimates are not as dependent on the sample structure
(number of observations per individual) as in the case of inter-intra models, where the variation of
preferences for a given individual highly depends on the panel structure.
The remainder of this paper consists of four sections. Section 2 briefly describes the standard
MMNL model from the Bayesian perspective and introduces the proposed C-MMNL model as an extension to
the MMNL model. Section 3 provides an extensive simulation study illustrating the estimation of the
C-MMNL model and the interpretation of the results. Section 4 estimates a large-scale bicycle route
choice model with the proposed method. Section 5 concludes the paper.
2 Method
2.1 Bayesian Mixed Multinomial Logit Model
We consider a standard mixed multinomial logit model (MMNL) setup where on each choice occasion
$t \in \{1, \dots, T\}$ a decision-maker $n \in \{1, \dots, N\}$ derives a random utility
$U_{ntj} = V(x_{ntj}, \eta_n) + \epsilon_{ntj}$ from each alternative $j$ in the choice set
$C_{nt}$. The systematic utility term $V(x_{ntj}, \eta_n)$ is assumed to be a function of covariates
$x_{ntj}$ and a collection of taste parameters $\eta_n$, while $\epsilon_{ntj}$ is a random error
term following a type-I Extreme Value distribution. We consider the general setting under which the
tastes $\eta_n$ can be decomposed into a vector of fixed taste parameters $\alpha \in \mathbb{R}^L$
shared across decision-makers, and individual-specific random taste parameters
$\beta_n \in \mathbb{R}^K$.
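As a worked equation, the type-I Extreme Value assumption on the error term implies the familiar logit form for the choice probabilities conditional on the tastes $\eta_n$. The expression below is the standard result for this error distribution, written in the notation of this section; the index $j'$ is introduced here only for the summation.

```latex
P(y_{nt} = j \mid \eta_n, X_{nt})
  = \frac{\exp\big(V(x_{ntj}, \eta_n)\big)}
         {\sum_{j' \in C_{nt}} \exp\big(V(x_{ntj'}, \eta_n)\big)},
\qquad j \in C_{nt}.
```

In the common linear-in-parameters case (an assumption, not a requirement of the setup above), $V(x_{ntj}, \eta_n) = \eta_n^\top x_{ntj}$.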
In the Bayesian framework, we assume the individual-specific taste parameters $\beta_n$ to follow a
multivariate normal distribution, i.e. $\beta_n \sim \mathcal{N}(\zeta, \Omega)$. We further assume
the fixed taste parameters $\alpha$ and the mean vector $\zeta$ to follow multivariate normal
distributions: $\alpha \sim \mathcal{N}(\lambda_0, \Xi_0)$ and $\zeta \sim \mathcal{N}(\mu_0, \Sigma_0)$.
As for the covariance matrix $\Omega$, we decompose our prior into a scale and a correlation
matrix as follows: $\Omega = \mathrm{diag}(\theta) \times \Psi \times \mathrm{diag}(\theta)$, where
$\Psi$ is a correlation matrix and $\theta$ is the vector of coefficient scales (Barnard et al.
2000; Hilbe 2009). For the components of the scale vector $\theta$ we employ a vague half-Cauchy
prior, e.g. $\theta_k \sim \text{half-Cauchy}(10)$, while for the correlation matrix we employ an
LKJ prior (Lewandowski et al. 2009), such that $\Psi \sim \mathrm{LKJ}(\nu)$. The hyper-parameter
$\nu$ directly controls the amount of correlation favoured by the prior.
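To make the scale-correlation decomposition concrete, the following minimal sketch assembles a covariance matrix from a scale vector and a correlation matrix. The function name and the numerical values are illustrative assumptions, not taken from the paper, and plain Python lists are used only to keep the example dependency-free.

```python
def covariance_from_scales(scales, corr):
    """Return diag(scales) x corr x diag(scales) as a nested list.

    Entry (i, j) of the product is scales[i] * corr[i][j] * scales[j].
    """
    k = len(scales)
    return [[scales[i] * corr[i][j] * scales[j] for j in range(k)]
            for i in range(k)]


# Illustrative values (assumed): two random taste parameters.
scales = [2.0, 0.5]                    # coefficient scales
corr = [[1.0, 0.3],
        [0.3, 1.0]]                    # correlation matrix
cov = covariance_from_scales(scales, corr)

# The diagonal holds the variances (squared scales); the off-diagonal
# entries are the correlation rescaled by both standard deviations.
print(cov)
```

Separating scales from correlations in this way lets each component receive its own prior, which is the motivation for the half-Cauchy and LKJ priors described above.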
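The generative process of the MMNL model can be summarised as follows:
1. Draw fixed taste parameters $\alpha \sim \mathcal{N}(\lambda_0, \Xi_0)$
2. Draw mean vector $\zeta \sim \mathcal{N}(\mu_0, \Sigma_0)$
3. Draw scales vector $\theta \sim \text{half-Cauchy}(\sigma_0)$
4. Draw correlation matrix $\Psi \sim \mathrm{LKJ}(\nu)$
5. For each decision-maker $n \in \{1, \dots, N\}$
   (a) Draw random taste parameters $\beta_n \sim \mathcal{N}(\zeta, \Omega)$
   (b) For each choice occasion $t \in \{1, \dots, T\}$
       i. Draw observed choice $y_{nt} \sim \mathrm{MNL}(\eta_n, X_{nt})$
where $\Omega = \mathrm{diag}(\theta) \times \Psi \times \mathrm{diag}(\theta)$ and $\eta_n = [\alpha, \beta_n]$.
[Figure 1: plate diagram with nodes $y_{n,t}$, $x_{n,t}$, $\eta_{n,t}$, $\beta_n$, $\zeta$, $\Psi$, $\alpha$, $c_t$, $\mu_t$ and $\theta_{NN}$, and plates $T$ and $N$.]
Fig. 1: Graphical model representation of the proposed C-MMNL model, where the key changes to the
original MMNL model are highlighted in blue.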
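The MMNL generative process of Section 2.1 can be sketched as a small forward simulation. This is a hedged illustration rather than the authors' implementation: it assumes standard-normal priors for the fixed parameters and the mean vector, a unit half-Cauchy scale, a linear-in-parameters utility, randomly generated covariates, and a fixed correlation matrix standing in for the LKJ draw. All function names are hypothetical.

```python
import math
import random


def half_cauchy(rng, scale):
    # Absolute value of a Cauchy(0, scale) draw obtained by inverse-CDF sampling.
    return abs(scale * math.tan(math.pi * (rng.random() - 0.5)))


def cholesky(a):
    # Lower-triangular factor l with l l^T = a, for a small positive-definite matrix.
    k = len(a)
    l = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(l[i][m] * l[j][m] for m in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def draw_mvn(rng, mean, chol):
    # mean + chol z, with z a vector of independent standard normal draws.
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][j] * z[j] for j in range(i + 1))
            for i, m in enumerate(mean)]


def softmax(values):
    top = max(values)                      # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_mmnl(n_people=3, n_occasions=4, n_alts=3, seed=0):
    rng = random.Random(seed)
    n_fixed, n_random = 1, 2               # L and K, chosen for illustration
    alpha = [rng.gauss(0.0, 1.0) for _ in range(n_fixed)]     # step 1 (assumed prior)
    zeta = [rng.gauss(0.0, 1.0) for _ in range(n_random)]     # step 2 (assumed prior)
    theta = [half_cauchy(rng, 1.0) for _ in range(n_random)]  # step 3 (assumed scale)
    psi = [[1.0, 0.3], [0.3, 1.0]]         # step 4: fixed stand-in, NOT an LKJ draw
    omega = [[theta[i] * psi[i][j] * theta[j] for j in range(n_random)]
             for i in range(n_random)]
    chol = cholesky(omega)
    choices = []
    for _ in range(n_people):              # step 5
        beta_n = draw_mvn(rng, zeta, chol)  # step 5(a)
        eta_n = alpha + beta_n             # eta_n = [alpha, beta_n]
        person = []
        for _ in range(n_occasions):       # step 5(b)
            # Hypothetical covariates: one row of attributes per alternative.
            x_nt = [[rng.gauss(0.0, 1.0) for _ in eta_n] for _ in range(n_alts)]
            utilities = [sum(e * x for e, x in zip(eta_n, row)) for row in x_nt]
            probs = softmax(utilities)
            person.append(rng.choices(range(n_alts), weights=probs)[0])
        choices.append(person)
    return choices


choices = simulate_mmnl()
print(choices)  # one list of chosen alternative indices per decision-maker
```

Because each decision-maker's $\beta_n$ is drawn once and reused for all of their choice occasions, the simulated tastes stay constant across occasions, which is exactly the assumption the C-MMNL model relaxes.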
2.2 Context-aware Bayesian Mixed Multinomial Logit Model
We present the idea of the context-aware Bayesian mixed multinomial logit (C-MMNL) model, where the
context information is included in the form of an easily interpretable context-specific bias term
$\mu_t$, a non-linear function of the context information that shifts the preference parameters of
each individual $n$ in each choice occasion $t$, i.e. $\eta_{nt} = \eta_n + \mu_t$, where
$\eta_n = [\alpha, \beta_n]$. The adjustment term $\mu_t$ is assumed to be determined by a neural
network that takes as input the context information $c_t$, i.e.
$\mu_t = \mathrm{NNet}_{\theta_{NN}}(c_t)$. In order to share statistical strength across
individuals, we assume that all individuals shift their preference parameters in the same way when
faced with a given choice context $c_t$, and therefore the parameters of the neural network,
$\theta_{NN}$, are shared across all individuals. However, we note that, if this assumption is
considered too strong for some applications, one can relax it by allowing the neural network to also
take into account, for example, the socio-demographic characteristics of the decision-maker. This
would allow for complex interactions between the latter and the context information $c_t$. Figure 1
shows the graphical model representation of the proposed C-MMNL model.
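The additive shift described above can be illustrated with a small sketch. The network architecture here (one tanh hidden layer with randomly initialised weights) and the two context features are assumptions made purely for illustration; the text only requires that a shared network maps $c_t$ to a shift $\mu_t$ with the same dimension as $\eta_n$.

```python
import math
import random


def init_network(n_in, n_hidden, n_out, seed=0):
    # Hypothetical one-hidden-layer network; weights are random, biases are zero.
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {"W1": matrix(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "W2": matrix(n_out, n_hidden), "b2": [0.0] * n_out}


def context_shift(params, context):
    # mu_t = W2 tanh(W1 c_t + b1) + b2: a non-linear map from context to shifts.
    hidden = [math.tanh(sum(w * c for w, c in zip(row, context)) + b)
              for row, b in zip(params["W1"], params["b1"])]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(params["W2"], params["b2"])]


def shifted_tastes(eta_n, mu_t):
    # eta_nt = eta_n + mu_t, applied element-wise.
    return [e + m for e, m in zip(eta_n, mu_t)]


# Shared network: 2 context features in, 3 taste-parameter shifts out.
params = init_network(n_in=2, n_hidden=4, n_out=3)

# Hypothetical context: a discrete rain indicator and a scaled temperature.
c_t = [1.0, 0.4]
mu_t = context_shift(params, c_t)

# Two decision-makers with different baseline tastes eta_n = [alpha, beta_n].
eta_a = [-1.0, 0.5, 2.0]
eta_b = [-1.0, -0.2, 1.1]
eta_a_t = shifted_tastes(eta_a, mu_t)
eta_b_t = shifted_tastes(eta_b, mu_t)
print(mu_t, eta_a_t, eta_b_t)
```

Both individuals receive the same shift for a given context, while their baselines differ. Evaluating `context_shift` on a new context vector requires no re-estimation, which mirrors the scenario analysis mentioned in the highlights.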
The generative process of the proposed C-MMNL model can then be summarised as follows, where
the main changes relative to the generative process assumed by the original MMNL model (described in
Section 2.1) are the context-specific shift in step 5 and the context-adjusted taste parameters in
step 6(b):
1. Draw fixed taste parameters $\alpha \sim \mathcal{N}(\lambda_0, \Xi_0)$
2. Draw mean vector $\zeta \sim \mathcal{N}(\mu_0, \Sigma_0)$
3. Draw scales vector $\theta \sim \text{half-Cauchy}(\sigma_0)$
4. Draw correlation matrix $\Psi \sim \mathrm{LKJ}(\nu)$
5. For each choice occasion $t \in \{1, \dots, T\}$
   (a) Determine context-specific shift term $\mu_t = \mathrm{NNet}_{\theta_{NN}}(c_t)$
6. For each decision-maker $n \in \{1, \dots, N\}$
   (a) Draw random taste parameters $\beta_n \sim \mathcal{N}(\zeta, \Omega)$
   (b) For each choice occasion $t \in \{1, \dots, T\}$
       i. Compute context-adjusted taste parameters: $\eta_{nt} = \eta_n + \mu_t$, with $\eta_n = [\alpha, \beta_n]$
       ii. Draw observed choice $y_{nt} \sim \mathrm{MNL}(\eta_{nt}, X_{nt})$
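Read as a probabilistic model, the C-MMNL generative process corresponds to the joint distribution below. This factorisation is a sketch that follows directly from the listed steps; it treats the network weights $\theta_{NN}$ as deterministic parameters, since the process places no prior on them, and uses $\beta_{1:N}$ as shorthand (introduced here) for the collection of individual taste vectors.

```latex
p(y, \alpha, \zeta, \theta, \Psi, \beta_{1:N} \mid X, c; \theta_{NN})
  = p(\alpha \mid \lambda_0, \Xi_0)\,
    p(\zeta \mid \mu_0, \Sigma_0)\,
    p(\theta \mid \sigma_0)\,
    p(\Psi \mid \nu)
    \prod_{n=1}^{N} \Big[ p(\beta_n \mid \zeta, \Omega)
    \prod_{t=1}^{T} p(y_{nt} \mid \eta_{nt}, X_{nt}) \Big],
\qquad
\eta_{nt} = [\alpha, \beta_n] + \mathrm{NNet}_{\theta_{NN}}(c_t),
\quad
\Omega = \mathrm{diag}(\theta)\, \Psi\, \mathrm{diag}(\theta).
```

Setting $\mathrm{NNet}_{\theta_{NN}}(c_t) = 0$ for all $t$ recovers the joint distribution of the standard MMNL model of Section 2.1.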