preference parameters of the decision-maker are constant over time and across different
choice situations, which may be too strong an assumption for certain choice modelling applications.
Individual preferences are complex and heterogeneous; they depend on different choice scenarios,
and might evolve over time (Castells et al. 2021; Hess and Giergiczny 2015; Krueger et al. 2021). Even
when facing the same choice situation multiple times, an individual might make different decisions
depending, for example, on the circumstances at the moment of choice (the context). One can think of a
context as an external factor that varies across individuals as well as choice situations, and captures
temporary (e.g. weather) or long-term (e.g. pandemics) circumstances. This temporal element might
cause behaviour related to certain effects to be unstable over time, which can be important in
explaining the variation in explanatory variables (Mannering 2018). Such instability has been observed in many
behavioural research fields, including transport (Mannering et al. 1994), economics (Meier and Sprenger
2015), and accident research (Islam et al. 2020; Mannering 2018).
Empirical evidence suggests that even though the major part of heterogeneity relates to inter-
respondent heterogeneity, incorporating intra-respondent heterogeneity can lead to further gains in fit
(Hess and Giergiczny 2015). Capturing this intra-respondent heterogeneity often requires a more complex
model specification. Hess and Rose (2009) extended the MMNL model formulation, allowing
for intra-respondent heterogeneity on top of the inter-respondent heterogeneity and assuming additional
random variation around the mean taste across multiple choice scenarios for the same individual. This
model outperforms the standard multinomial logit model in terms of prediction accuracy (in- and out-
of-sample; Danaf et al. 2019; Xie et al. 2020), but Krueger et al. (2021) found only minor improvement
when compared to the mixed multinomial logit model. Becker et al. (2018) further provided a Bayesian
treatment of this intra-inter heterogeneity approach and proposed a Markov-chain Monte Carlo (MCMC)
procedure for performing inference. Danaf et al. (2020) extended this procedure by relaxing the
normality assumptions. However, these approaches do not allow for systematic variations in
preference parameters as a function of contextual variables.
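
To make the inter-intra structure described above concrete, the following minimal sketch draws
individual-level tastes around a population mean and then situation-level tastes around each
individual's own mean, before applying a standard logit. It is an illustration under simplifying
assumptions (independent normal distributions, linear-in-parameters utilities), not the exact
specification of Hess and Rose (2009); all function names, parameter names, and attribute values
are hypothetical.

```python
import math
import random


def softmax(utilities):
    """Logit choice probabilities, shifted by the maximum for numerical stability."""
    m = max(utilities)
    exps = [math.exp(v - m) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def draw_situation_tastes(mu, sd_inter, sd_intra, n_situations, rng):
    """Draw one individual's taste vector for each of n_situations choice situations."""
    # Inter-respondent layer: the individual's mean tastes vary around the population mean.
    beta_n = [rng.gauss(m, s) for m, s in zip(mu, sd_inter)]
    # Intra-respondent layer: tastes in each situation vary around the individual's mean.
    return [
        [rng.gauss(b, s) for b, s in zip(beta_n, sd_intra)]
        for _ in range(n_situations)
    ]


def choice_probabilities(alternatives, beta):
    """Linear-in-parameters utilities followed by a logit."""
    utilities = [sum(b * x for b, x in zip(beta, attrs)) for attrs in alternatives]
    return softmax(utilities)


# Hypothetical example: three alternatives described by (cost, time).
rng = random.Random(0)
alternatives = [[1.0, 0.5], [2.0, 0.2], [0.5, 1.0]]
tastes = draw_situation_tastes(
    mu=[-1.0, -2.0], sd_inter=[0.5, 0.5], sd_intra=[0.2, 0.2],
    n_situations=3, rng=rng,
)
probs = [choice_probabilities(alternatives, beta) for beta in tastes]
```

Setting `sd_intra` to zero in this sketch recovers the usual mixed logit assumption that an
individual's tastes are identical in every choice situation. Note that nothing in the sketch lets
the tastes depend on observed contextual variables; the intra-respondent variation is purely random.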
A further approach to model heterogeneity is based on classes of individuals, with homogeneous
preferences within each class. The most common example is the Latent Class Choice Model (LCCM);
see, e.g. Greene and Hensher (2003) or Hess (2014) for a discussion. A similar idea was employed
in a collaborative learning framework with time-varying parameters in the context of personal
recommendations (Zhu et al. 2020). Unlike in the LCCM, where an individual is perceived as belonging to one
class, the individual’s membership vector in the more flexible collaborative model represents a
combination of multiple preference patterns identified by the model, enabling personalisation of preferences.
For a detailed comparison between these two approaches, we refer to Zhu et al. (2020). Moreover,
the collaborative learning approach overcomes a limitation of the inter-intra heterogeneity model, where
personalised predictions and recommendations are not possible because the individual parameters for
each choice situation are not estimated.
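
The difference between the two ideas can be illustrated with a small sketch. The first function
follows the standard latent class logic: class-specific logit probabilities are averaged using
class membership probabilities. The second function is only one possible reading of a "membership
vector": it blends the class taste vectors into a single individual taste vector before applying
the logit. This second form is an assumption made for illustration and is not taken from Zhu et
al. (2020); all names and values are hypothetical.

```python
import math


def logit_probs(alternatives, beta):
    """Logit probabilities for linear-in-parameters utilities."""
    utilities = [sum(b * x for b, x in zip(beta, attrs)) for attrs in alternatives]
    m = max(utilities)
    exps = [math.exp(v - m) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def latent_class_probs(alternatives, class_betas, class_shares):
    """Latent class logic: mix the class-specific choice probabilities."""
    per_class = [logit_probs(alternatives, beta) for beta in class_betas]
    return [
        sum(share * probs[j] for share, probs in zip(class_shares, per_class))
        for j in range(len(alternatives))
    ]


def blended_taste_probs(alternatives, class_betas, membership):
    """Illustrative assumption: blend the taste vectors first, then apply one logit."""
    n_attrs = len(class_betas[0])
    beta = [
        sum(w * b[i] for w, b in zip(membership, class_betas))
        for i in range(n_attrs)
    ]
    return logit_probs(alternatives, beta)


# Hypothetical example: two preference patterns over (cost, time).
alternatives = [[1.0, 0.5], [2.0, 0.2], [0.5, 1.0]]
class_betas = [[-2.0, -0.5], [-0.5, -2.0]]
weights = [0.7, 0.3]
p_latent = latent_class_probs(alternatives, class_betas, weights)
p_blend = blended_taste_probs(alternatives, class_betas, weights)
```

Both functions coincide when all weight is placed on a single class; they generally differ
otherwise, because the logit is nonlinear in the taste parameters.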
With the developments in computational hardware, Bayesian approaches to choice modelling have
been gaining research interest. Washington et al. (2009) elaborated on the theory and applied
it to route choice modelling. Train (2001) compared the Bayesian approach to mixed multinomial
logit with maximum simulated likelihood estimation, an experiment repeated and extended by Elshiewy et al.
(2017). When using non-informative priors, estimates in both approaches are similar, especially for
large datasets (Congdon 2007; J. Huber and Train 2001). However, utilising the Bayesian approach
allows researchers to include more information within the estimation procedure, thus improving the
behavioural explanation of the model. It also makes it possible to obtain full posterior distributions
over the model parameters (including the individual-specific taste parameters) and take advantage of
modern approaches, for example utility generation (Rodrigues et al. 2020) or inference (Rodrigues
2022). Further extensions of the classical mixed logit model are possible using Variational Bayes for
posterior inference, as shown in Krueger et al. (2019), where a method for including unobserved inter-
and intra-individual heterogeneity in behaviour was derived.
A substantial effort has been made in extending discrete choice models with machine learning
frameworks, with recent papers suggesting a joint perspective and a complementarity of these two
approaches (Salas et al. 2022; Wang et al. 2021). An example of this synergy is leveraging neural
networks for the utility specification (“Neural-embedded Discrete Choice Model”, Han 2019) through
representation learning, thus enhancing the capability to capture inter-respondent
heterogeneity (Han et al. 2022; Sifringer et al. 2020; Van Cranenburgh and Alwosheel 2019). Despite the
improved prediction accuracy compared to traditional DCM methods (Salas et al. 2022), these methods
do not focus on improving predictability at the individual level (Krueger et al. 2021). Thus, previous
approaches had limited practicality, and efforts should be made to make these more advanced and