Conditional Feature Importance for Mixed Data
Kristin Blesch1,2*, David S. Watson3 and Marvin N. Wright1,2,4
1* Leibniz Institute for Prevention Research & Epidemiology – BIPS, Bremen, Germany.
2Faculty of Mathematics and Computer Science, University of Bremen, Bremen, Germany.
3Department of Informatics, King’s College London, London, United Kingdom.
4Department of Public Health, University of Copenhagen, Copenhagen, Denmark.
*Corresponding author. E-mail: blesch@leibniz-bips.de;
Contributing authors: david.watson@kcl.ac.uk; wright@leibniz-bips.de
Abstract
Despite the popularity of feature importance (FI) measures in interpretable machine learning, the statistical adequacy of these methods is rarely discussed. From a statistical perspective, a major distinction is between analyzing a variable's importance before and after adjusting for covariates, i.e., between marginal and conditional measures. Our work draws attention to this rarely acknowledged yet crucial distinction and showcases its implications. We find that few methods are available for testing conditional FI, and practitioners have hitherto been severely restricted in method application due to mismatched data requirements. Most real-world data exhibit complex feature dependencies and incorporate both continuous and categorical features (i.e., mixed data). Both properties are often neglected by conditional FI measures. To fill this gap, we propose to combine the conditional predictive impact (CPI) framework with sequential knockoff sampling. The CPI enables conditional FI measurement that controls for any feature dependencies by sampling valid knockoffs, i.e., synthetic data with statistical properties similar to those of the data to be analyzed. Sequential knockoffs were deliberately designed to handle mixed data and thus allow us to extend the CPI approach to such datasets. We demonstrate through numerous simulations and a real-world example that our proposed workflow controls type I error, achieves high power, and is in line with results given by other conditional FI measures, whereas marginal FI metrics can result in misleading interpretations. Our findings highlight the necessity of developing statistically adequate, specialized methods for mixed data.
Keywords: Interpretable Machine Learning, Feature Importance, Knockoffs,
Explainable Artificial Intelligence
1 Introduction
Interpretable machine learning is on the rise as practitioners become interested in not only achieving high prediction accuracy in supervised learning
tasks, but also understanding why certain predictions were made. Evaluating
the importance of input variables (features) to the target prediction plays a
crucial role in facilitating such endeavours. Several feature importance (FI)
measures have been proposed by the machine learning community, but differing
conceptualizations are spread across the literature.
We identify at least five dichotomies that orient FI methods: (1) global vs. local; (2) model-agnostic vs. model-specific; (3) testing vs. scoring; (4) methods that do and do not accommodate mixed tabular data; and (5) conditional vs. marginal measures. This defines a grid with 2^5 = 32 cells that helps categorize FI measures. For example, the popular SHAP algorithm (Lundberg and Lee, 2017) produces local, model-agnostic FI scores that can accommodate mixed data and measure marginal FI. We emphasize that there is no "ideal" configuration of these five options; each is the right answer to a different question that is irreducibly context-dependent. However, this grid helps identify a notable lacuna: there are few global, model-agnostic FI methods that accommodate mixed data with error control for conditional FI measurement.
Explaining the dichotomies in more detail, local FI measures (Lundberg and Lee, 2017; Ribeiro et al., 2016) are optimized for a particular point or region of the feature space, e.g. a single observation, while global FI scores (Fisher et al., 2019; Friedman, 2001) measure a variable's overall importance. Model-specific measures (Breiman, 2001; Kursa and Rudnicki, 2010; Shrikumar et al., 2017) exploit the properties of a particular function class for more efficient or precise FI calculation, while model-agnostic measures (Apley and Zhu, 2020; Ribeiro et al., 2018) treat the underlying model as a black box. Testing methods include some inference procedure for error control (Lei et al., 2018), while scoring methods (Covert et al., 2020) do not. Some methods are proposed with limited applicability to certain data types, e.g., only continuous inputs (Watson and Wright, 2021), while others are more flexible (Molnar et al., 2023). We discuss a selection of FI methods briefly in Section 2, but refer readers to review papers on FI interpretability methods, e.g. Linardatos et al. (2021), for a wider discussion of the topic.
Through the lens of statistics, the division (5), conditional vs. marginal measures, is particularly important yet insufficiently acknowledged in both the literature and practice (Apley and Zhu, 2020; Hooker et al., 2021; Molnar et al., 2023; Watson and Wright, 2021). The complementary concepts become evident when relating the statistical conception of independence testing to the machine learning view of FI measurement. We can think of the marginal null hypothesis as testing whether the input feature $X_j$ is jointly independent of the remaining covariates $X_{-j}$ and the target variable $Y$:

$$H_0^M : X_j \perp \{Y, X_{-j}\} \quad (1)$$
On the other hand, testing against (2) accounts for the covariates $X_{-j}$ and hence corresponds to conditional FI:

$$H_0^C : X_j \perp Y \mid X_{-j} \quad (2)$$

These tests clearly target different objectives. In this setup, $H_0^M$ entails $H_0^C$, but not the other way around. However, this strength comes with a certain loss of specificity, because rejecting $H_0^M$ leaves it unclear whether $X_j$ is correlated with $Y$, with $X_{-j}$, or with both.
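As a toy illustration of the two null hypotheses (not part of the paper's methodology), the following sketch simulates a linear-Gaussian confounding structure in which the marginal null (1) is clearly rejected while the conditional null (2) is not. Partial correlation is used as the conditional test, which is valid only under the Gaussian assumption made here.

```python
# Toy illustration (not the paper's CPI procedure): contrast the marginal null
# H0^M: X_j independent of {Y, X_-j} with the conditional null
# H0^C: X_j independent of Y given X_-j under a confounding structure.
# Partial correlation serves as the conditional test; this is only valid
# because the simulated data are linear-Gaussian.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 2000
c = rng.normal(size=n)        # confounder, playing the role of X_-j
x = c + rng.normal(size=n)    # feature of interest X_j, driven by c
y = c + rng.normal(size=n)    # target Y, driven by c but not by x

# Marginal view: x and y are correlated through c, so H0^M is rejected.
r_marg, p_marg = stats.pearsonr(x, y)

# Conditional view: regress c out of both x and y, then test the residuals
# (a partial correlation test of H0^C under the linear-Gaussian assumption).
res_x = x - np.polyval(np.polyfit(c, x, 1), c)
res_y = y - np.polyval(np.polyfit(c, y, 1), c)
r_cond, p_cond = stats.pearsonr(res_x, res_y)

print(f"marginal:    r = {r_marg:.2f}, p = {p_marg:.1e}")  # strong and significant
print(f"conditional: r = {r_cond:.2f}, p = {p_cond:.2f}")  # near zero, typically not significant
```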
The relationship between FI and independence testing sheds light on another aspect, which may even be considered another dichotomy: does the FI measure aim to investigate model behaviour or the underlying data structure (Chen et al., 2020)? For example, conditional independence tests that are part of some conditional FI measures (Watson and Wright, 2021) may be used for causal structure learning, which is often based on repeated conditional independence testing (Glymour et al., 2019). Therefore, conditional FI measures can help explain the underlying data structure, whereas marginal FI measures differentiate between the variables the predictive model relies on, which can be used to evaluate the fairness of a model. This does not preclude practitioners from using marginal and conditional FI measures in conjunction, and since marginal measures are often faster to compute, they might be preferable for quick assessments in large pipelines with many iterations. However, practitioners must be careful to interpret these measures properly and not infer a conditional signal from a marginal test.
In Fig. 1, we illustrate the difference between marginal (permutation feature importance, PFI; Fisher et al., 2019; Breiman, 2001) and conditional (conditional predictive impact with Gaussian knockoffs, CPIgauss; Watson and Wright, 2021) FI measures. In this example, the confounding variable C is a common cause of both X and Y. This causal structure induces a spurious correlation between X and Y, leading the marginal FI measure to attribute nonzero importance to both C and X in predicting Y. On the contrary, the conditional FI measure attributes nonzero FI only to C, since X has no additional predictive value for Y above C.
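The sketch below is a minimal, hedged reconstruction (not the authors' experiment) of a confounded data-generating process like the one behind Fig. 1. scikit-learn's permutation_importance stands in for marginal PFI, and refitting without X (a leave-one-covariate-out check, not the CPI) stands in for the conditional view.

```python
# Minimal sketch (not the authors' experiment) of a confounded setup like the
# one behind Fig. 1: C is a common cause of X and Y. scikit-learn's
# permutation_importance stands in for marginal PFI; refitting without X
# (leave-one-covariate-out, not the CPI) stands in for a conditional check.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 2000
C = rng.normal(size=n)
X = C + rng.normal(scale=0.5, size=n)   # X is a noisy copy of C
Y = C + rng.normal(scale=0.5, size=n)   # Y depends on C only

features = np.column_stack([C, X])
F_tr, F_te, y_tr, y_te = train_test_split(features, Y, random_state=1)

rf = RandomForestRegressor(n_estimators=200, random_state=1).fit(F_tr, y_tr)
pfi = permutation_importance(rf, F_te, y_te, n_repeats=20, random_state=1)
print("marginal PFI for C, X:", np.round(pfi.importances_mean, 3))  # X typically > 0 as well

# Conditional stand-in: dropping X barely changes held-out error,
# because X carries no information about Y beyond C.
rf_no_x = RandomForestRegressor(n_estimators=200, random_state=1).fit(F_tr[:, [0]], y_tr)
print("MSE with X:   ", round(mean_squared_error(y_te, rf.predict(F_te)), 3))
print("MSE without X:", round(mean_squared_error(y_te, rf_no_x.predict(F_te[:, [0]])), 3))
```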
This paper explores global, model-agnostic FI methods that accommodate mixed data with error control for conditional FI measurement. This is not a niche problem: mixed tabular data is the norm in many important areas such as healthcare, economics, and industry, and inference procedures are essential for decision making in high-risk domains to minimize costly errors. With the proliferation of machine learning algorithms, model-agnostic approaches can help standardize FI tasks without recalibrating to a particular function class for each new application. Conditional, global measures are valuable when practitioners seek a mechanistic understanding that takes the dependence structure of the data into account and goes beyond individual model outputs.
Fig. 1 Boxplots contrasting marginal and conditional FI metrics for a prediction of Y from C and X (N = 200) with a random forest prediction model across 1,000 replicates. The conditional FI measure attributes no importance to X, whereas the marginal measure attributes non-zero importance to X because, due to the correlation between X and Y induced by C, X is predictive of Y.

Even though the empirical relevance of this kind of FI measurement is evident, specialized methods are lacking. Some FI methods have yet to be evaluated in mixed data settings (Covert et al., 2020; Molnar et al., 2023; Lei et al., 2018), while others are currently inapplicable (Watson and Wright, 2021). The consequences of neglecting the special nature of mixed data for conditional FI measurement remain unexplored, and practitioners therefore currently have no guidance on how to proceed with conditional FI measurement in such cases, which is a severe limitation in real-world applications.
We propose to combine the conditional predictive impact (CPI) testing framework proposed by Watson and Wright (2021) with sequential knockoffs (Kormaksson et al., 2021) in order to enable conditional, global, model-agnostic FI testing for mixed data. CPI is a flexible, model-agnostic tool that relies on so-called knockoffs (Candès et al., 2018). In short, knockoffs are synthetic variables that carry over the major statistical properties of the original variables, such as the correlation structure among covariates. While Watson and Wright (2021) claim that the CPI should in principle work with any valid set of knockoffs, it has thus far only been applied and evaluated with Gaussian knockoffs (Candès et al., 2018). This currently restricts practitioners to either using the CPI with continuous variables only or disregarding the special characteristics of mixed data. We analyse the consequences of such a disregard when using CPI with Gaussian knockoffs (Candès et al., 2018), denoted CPIgauss, or with deep knockoffs (Romano et al., 2020), denoted CPIdeep, and propose a specialized solution strategy to tackle the mixed data case: using sequential knockoffs (Kormaksson et al., 2021), a knockoff sampling algorithm explicitly developed for mixed data, within the CPI framework (CPIseq).
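The following is a hedged sketch of how a CPI-style test could be assembled, assuming the general recipe described by Watson and Wright (2021): substitute a knockoff copy for the feature of interest on held-out data, record the per-observation increase in loss, and test whether the mean increase is positive. The knockoff column below is a conditional-Gaussian resample, a stand-in that only makes sense for this Gaussian toy; it is neither the Gaussian knockoff construction of Candès et al. (2018) nor the sequential algorithm of Kormaksson et al. (2021).

```python
# Hedged sketch of a CPI-style test, assuming the recipe of Watson and Wright
# (2021): replace the feature of interest with a knockoff copy on held-out
# data, record the per-observation increase in loss, and run a one-sided
# paired t-test on that increase. The "knockoff" here is a conditional-Gaussian
# resample, a stand-in valid only for this Gaussian toy, not a general sampler.
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

def cpi_feature(model, X_test, y_test, j, knockoff_col):
    """Mean loss increase and one-sided p-value when column j is 'knocked off'."""
    loss_orig = (y_test - model.predict(X_test)) ** 2       # per-sample squared error
    X_ko = X_test.copy()
    X_ko[:, j] = knockoff_col
    loss_ko = (y_test - model.predict(X_ko)) ** 2
    delta = loss_ko - loss_orig                              # CPI samples
    t, p_two = stats.ttest_1samp(delta, 0.0)
    p_one = p_two / 2 if t > 0 else 1 - p_two / 2            # H1: mean(delta) > 0
    return delta.mean(), p_one

rng = np.random.default_rng(2)
n = 2000
C = rng.normal(size=n)
X1 = C + rng.normal(scale=0.5, size=n)   # correlated with Y only through C
Y = C + rng.normal(scale=0.5, size=n)
Xmat = np.column_stack([C, X1])
X_tr, X_te, y_tr, y_te = train_test_split(Xmat, Y, random_state=2)
rf = RandomForestRegressor(n_estimators=200, random_state=2).fit(X_tr, y_tr)

# Stand-in "knockoff" for X1: a draw from an estimated P(X1 | C) on the test set.
cond = LinearRegression().fit(X_te[:, [0]], X_te[:, 1])
resid_sd = np.std(X_te[:, 1] - cond.predict(X_te[:, [0]]))
x1_ko = cond.predict(X_te[:, [0]]) + rng.normal(scale=resid_sd, size=len(y_te))

cpi, p = cpi_feature(rf, X_te, y_te, j=1, knockoff_col=x1_ko)
print(f"CPI(X1) = {cpi:.3f}, one-sided p = {p:.2f}")  # expected: near zero, large p
```

In the proposed CPIseq workflow, the stand-in sampler above would be replaced by sequential knockoffs, which handle continuous and categorical columns alike.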
The paper is structured as follows. We present relevant methodology and FI measures in Section 2. Section 2.2 reviews several knockoff sampling algorithms, demonstrating the need for specialized procedures for mixed data and motivating our proposed solution, CPIseq. Through simulation studies in Sections 3.1 and 3.2, we evaluate our newly proposed workflow in more depth and compare it to other methods. Finally, we illustrate the application of the method to a real-world dataset in Section 3.3 before concluding and discussing our findings in Section 4.
2 Methods
With a focus on the measurement of model-agnostic, global, conditional FI, this section presents related measures proposed in previous literature and discusses their applicability to mixed data. We acknowledge that methods from the statistical literature on conditional independence testing (Shah and Peters, 2020; Williamson et al., 2021) might also be utilized for conditional FI measurement; however, a full comparison of such methods is beyond the scope of this paper. Further, it is worth clarifying at this point that we understand FI here as a concept tied to a variable's effect on predictive performance in a supervised learning task.
2.1 Feature Importance Measures
Conditional subgroup approach (CS)
A global, model-agnostic FI measure that acknowledges the crucial distinction between conditional and marginal measures of importance is the conditional subgroup (CS) approach proposed by Molnar et al. (2023). CS partitions the data into interpretable subgroups, i.e. groups whose feature distributions are homogeneous within but heterogeneous between groups. The method is promising, as it explicitly specifies the conditioning between subgroups and further allows for an unconditional interpretation within subgroups. This means the method provides both a global conditional and a within-group unconditional interpretation, which sheds light on feature dependence structures.
To determine FI, CS evaluates the change in loss when the variable of interest is permuted within subgroups, which reduces extrapolation into low-density regions of the feature space, thereby mitigating a common problem with permutation-based approaches (Hooker et al., 2021). To decide on a suitable partition, the authors suggest determining subgroups via transformation trees. Using a pre-specified loss function, the average increase in loss across multiple permutations, relative to the original ordering of the variable, is reported.
CS is not affected by mixed data other than through the choice of an appropriate prediction algorithm, which is why the method is expected to work equally well with mixed data. However, for this approach to work, researchers must assume that the data are separable into subgroups. Further, for testing FI, the method would need to rely on computationally expensive permutation tests, as no inherent testing procedure is provided.
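As a rough illustration of the within-subgroup permutation scheme (not the authors' implementation), the sketch below uses an ordinary regression tree for the feature of interest given the remaining features as a stand-in for the transformation trees suggested by Molnar et al. (2023), permutes the feature within each resulting leaf, and reports the average increase in loss.

```python
# Rough illustration (not the authors' implementation) of the within-subgroup
# permutation idea behind CS. A plain regression tree for X_j given the other
# features stands in for the transformation trees of Molnar et al. (2023);
# its leaves define the subgroups, and X_j is permuted within each leaf.
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

def cs_importance(model, X_test, y_test, j, n_repeats=10, seed=0):
    """Average increase in test MSE when column j is permuted within subgroups."""
    rng = np.random.default_rng(seed)
    base = mean_squared_error(y_test, model.predict(X_test))

    # Subgroups: leaves of a shallow tree predicting X_j from the other features.
    others = np.delete(X_test, j, axis=1)
    grouper = DecisionTreeRegressor(max_depth=3, random_state=seed)
    leaves = grouper.fit(others, X_test[:, j]).apply(others)

    increases = []
    for _ in range(n_repeats):
        X_perm = X_test.copy()
        for leaf in np.unique(leaves):
            idx = np.where(leaves == leaf)[0]
            X_perm[idx, j] = X_perm[rng.permutation(idx), j]  # permute within the leaf
        increases.append(mean_squared_error(y_test, model.predict(X_perm)) - base)
    return float(np.mean(increases))
```

This sketch only scores importance; as noted above, turning it into a test would require an additional resampling procedure such as a permutation test.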