are also popular definitions of fairness. Recently, fairness in terms of a recourse gap has been proposed, where recourse is defined as the ability to obtain a positive outcome from the model [34].
While the suitability of a fairness measure is application dependent [26, 3], demographic parity and equalized odds remain the most widely used, and the need for recourse gap-based fairness is increasingly recognized [18].
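For concreteness, both notions can be estimated directly from binary predictions and a protected-group indicator. The following is a minimal illustrative sketch (the function names and estimators are ours, not taken from any particular reference):

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between the two groups."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equalized_odds_gap(y_true, y_pred, group):
    """Largest gap between groups in false-positive (y_true == 0) or
    true-positive (y_true == 1) rates."""
    gaps = []
    for label in (0, 1):
        mask = (y_true == label)
        gaps.append(abs(y_pred[mask & (group == 0)].mean()
                        - y_pred[mask & (group == 1)].mean()))
    return max(gaps)
```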
However, static models can encounter drift once deployed, as the statistical properties of real data often change over time, leading to deteriorating performance. Model drift can occur when the properties of the target variable change (concept drift), when the input data distribution changes, or both. The performance of models has largely been measured through accuracy-based metrics such as misclassification rates, F-score, or AUC [37]. However, a model trained in the past and found to be fair at training time may act unfairly on data in the present. Addressing drift with respect to fairness, in addition to accuracy, has remained largely unexplored, though it is an important aspect of trustworthy AI in practice.
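As a simple illustration of what monitoring such drift could look like in practice (this sketch is ours and is not the mechanism proposed in this paper; all names are hypothetical), accuracy and a fairness gap can be tracked on a recent window of data and compared against the values recorded at training time:

```python
import numpy as np

def detect_drift(y_true, y_pred, group, baseline_acc, baseline_gap,
                 acc_tol=0.05, fair_tol=0.05):
    """Flag accuracy drift and fairness drift on a recent window of data.
    baseline_acc and baseline_gap are the values measured at training time."""
    acc = np.mean(y_pred == y_true)
    # Demographic parity gap on the window: difference in positive-prediction rates.
    gap = abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())
    return (baseline_acc - acc) > acc_tol, (gap - baseline_gap) > fair_tol
```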
Explainability of individual model outcomes is another principal concern for trustworthy ML. Among the many methods that provide explanations in terms of feature attribution [6], the SHAP approach based on Shapley values is particularly popular as it enjoys several axiomatic guarantees [21]. While computation of SHAP values is fast for linear and tree-based models, it can be very slow for neural networks and several other model types, especially when the data has a large number of features or when a large number of explanations are required [27]. This poses a barrier to deployments that demand fast explanations in real-time production settings.
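The cost contrast is easy to see with the shap package: for a linear model the explainer has a closed form and is essentially instantaneous, whereas model-agnostic explainers must repeatedly query the model. The snippet below is a small sketch on synthetic data (all variable names and the data are ours, for illustration only):

```python
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression

# Synthetic data: 1000 samples, 20 features (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Closed-form SHAP values for a linear model: fast even for many features or queries.
explainer = shap.LinearExplainer(model, X)
shap_values = explainer.shap_values(X[:100])  # attributions for 100 inputs
```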
In this paper, we address these fairness, data/model drift, and explainability concerns by proposing
FEAMOE: Fair, Explainable and Adaptive Mixture of Experts, an incrementally grown mixture of
experts (MOE) with fairness constraints. In the standard mixture of experts setup, each expert is
a machine learning model, and so is the gating network. The gating network learns to assign an
input-dependent weight $g_u(x)$ to the $u$-th expert for input $x$, and the final output of the model is a
weighted combination of the outputs of each expert. Hence, each expert contributes differently for
every data point towards the final outcome, which is a key difference from standard ensembles.
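For intuition, a generic mixture-of-experts prediction with a softmax gating network can be written as follows; this sketch is generic and does not show FEAMOE's specific gating or training procedure:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def moe_predict(X, experts, gate_params):
    """Combine expert outputs: output(x) = sum_u g_u(x) * f_u(x)."""
    G = softmax(X @ gate_params)                   # (n_samples, n_experts) gate weights g_u(x)
    F = np.column_stack([f(X) for f in experts])   # (n_samples, n_experts) expert outputs f_u(x)
    return (G * F).sum(axis=1)                     # input-dependent weighted combination
```

Here each entry of `experts` could be, for example, the prediction function of a fitted linear model, and `gate_params` the parameters of a linear gating network.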
Many types of MOEs exist in the literature [40]; the architecture is not standard. For FEAMOE, we
chose this family, with some novel modifications described later, for three main reasons: 1) Suitable
regularization penalties that promote fairness can be readily incorporated into the loss function. 2)
Online learning is possible, so changes in the data can be tracked. Crucially, since localized changes
in data distribution post-deployment may impact only one or a few experts, the other experts may not
need to be adjusted, making the experts localized and only loosely coupled. This allows for handling
drift and avoiding catastrophic forgetting, which is a prime concern in widely used neural network
models [31]. 3) Simpler models can be used to fit a more complex problem in the mixture of experts,
as each model needs to fit well in only a limited part of the input space. In particular, even linear
models, which provide very fast SHAP explanations, can be used. The overall mixture of experts,
even with such simple base models (the "experts"), often has predictive power that is comparable to a
single complex model such as a neural network, as shown by our experiments as well as in many
previous studies [40].
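To illustrate reason (1), a fairness-promoting regularizer can simply be added to the per-expert training loss. The choice below, a penalty on the gap in mean predicted scores between groups (a demographic-parity-style surrogate), is only one possible instantiation and is not necessarily the penalty used by FEAMOE:

```python
import numpy as np

def fair_logistic_loss(w, X, y, group, lam=1.0):
    """Logistic loss plus a demographic-parity-style penalty on the difference
    in mean predicted scores between the two groups (illustrative choice)."""
    scores = 1.0 / (1.0 + np.exp(-X @ w))
    ce = -np.mean(y * np.log(scores + 1e-12) + (1 - y) * np.log(1 - scores + 1e-12))
    gap = abs(scores[group == 0].mean() - scores[group == 1].mean())
    return ce + lam * gap
```

The weight `lam` trades off predictive accuracy against the fairness penalty and can be tuned per application.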
A motivating toy example of why FEAMOE is needed and how it works is shown in Figure 1.
Consider a linear binary classifier (1a) that has perfect accuracy. The colors represent the ground
truth labels, and green is the positive (desired) class label. The circles are the privileged group and
diamonds are the underprivileged group. As can be seen in the figure, more diamonds receive a
negative outcome and more circles receive a positive outcome. Consider new data that arrives for
predictions. This classifier (1b) not only misclassifies individuals but also gives more underprivileged
individuals that were actually in the positive class a negative outcome, hence inducing bias with
respect to equalized odds. There is drift with respect to accuracy and fairness. A more complex
model (1c) such as a neural network, if retrained, may handle some of these concerns but would be
less explainable.
FEAMOE can address these imperative concerns, as shown in Figure 1d. Since FEAMOE is trained in an online manner, a new linear model (i.e., an expert) is added once the new data arrives. The gating network dictates which region each expert operates in (shown by the blue and pink colors), and FEAMOE is able to adapt automatically with respect to accuracy and fairness. This dynamic framework enables the overall model to be fairer, adjust to drift, and maintain accuracy, while also remaining explainable since
the decision boundary is locally linear.