POST-SELECTION INFERENCE IN MULTIVERSE ANALYSIS (PIMA):
AN INFERENTIAL FRAMEWORK BASED ON THE SIGN FLIPPING
SCORE TEST
Paolo Girardi1, Anna Vesely2, Daniël Lakens3, Gianmarco Altoè4,
Massimiliano Pastore4, Antonio Calcagnì4, Livio Finos5
1Department of Environmental Sciences, Informatics and Statistics, Ca' Foscari University of Venice, Italy
2Institute for Statistics, University of Bremen, Germany
3Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, Netherlands
4Department of Developmental Psychology and Socialisation, University of Padova, Italy
5Department of Statistical Sciences, University of Padova, Italy
Contact Info
Paolo Girardi: paolo.girardi@unive.it; Anna Vesely: anna.vesely@unipd.it; Daniël Lakens:
D.Lakens@tue.nl; Gianmarco Altoè: gianmarco.altoe@unipd.it; Massimiliano Pastore:
massimiliano.pastore@unipd.it; Antonio Calcagnì: antonio.calcagni@unipd.it; Livio Finos:
livio.finos@unipd.it.
Funding
This research received no specific grant from any funding agency in the public, commercial, or
not-for-profit sectors.
Competing interests
The authors declare that they have no known competing financial interests or personal relationships
that could have appeared to influence the work reported in this paper.
Data Availability
All R code and data associated with the real data application are available at
https://osf.io/usrq7/?view_only=4c40978b0080496c98bb5b13592278b4, while further analyses can be
developed through the dedicated package Jointest (Finos, 2022), available at
https://github.com/livioivil/jointest
Correspondence should be sent to
Paolo Girardi
Address: Via Torino 155, 30172 Venezia-Mestre (VE), Italy
E-Mail: paolo.girardi@unive.it
arXiv:2210.02794v2 [stat.ME] 3 Oct 2023
Psychometrika Submission October 4, 2023
Abstract
When analyzing data, researchers make some decisions that are either arbitrary,
based on subjective beliefs about the data generating process, or for which equally
justifiable alternative choices could have been made. This wide range of data-analytic
choices can be abused, and has been one of the underlying causes of the replication
crisis in several fields. The recent introduction of multiverse analysis provides
researchers with a method to evaluate the stability of the results across reasonable
choices that could be made when analyzing data. However, multiverse analysis is confined
to a descriptive role, lacking a proper and comprehensive inferential procedure. More recently,
specification curve analysis has added an inferential procedure to multiverse analysis, but
this approach is limited to simple cases related to the linear model, and only allows
researchers to infer whether at least one specification rejects the null hypothesis, but
not which specifications should be selected. In this paper we present a Post-selection
Inference approach to Multiverse Analysis (PIMA) which is a flexible and general
inferential approach that accounts for all possible models, i.e., the multiverse of
reasonable analyses. The approach allows for a wide range of data specifications (i.e.,
pre-processing) and any generalized linear model; it allows testing the null hypothesis of
a given predictor not being associated with the outcome, by merging information from
all reasonable models of multiverse analysis, and provides strong control of the
family-wise error rate such that it allows researchers to claim that the null hypothesis
can be rejected for each specification that shows a significant effect. The inferential
proposal is based on a conditional resampling procedure. We formally prove that the
type I error rate is controlled, and compute the statistical power of the test through a
simulation study. Finally, we apply the PIMA procedure to the analysis of a real
dataset about coronavirus disease 2019 (COVID-19) vaccine hesitancy before and after
the 2020 lockdown in Italy. We end with practical recommendations to consider when
performing the proposed procedure.
Key words: multiverse analysis, flipping score, statistical inference, testing,
reproducibility, replicability
1. Introduction
Real data analysis often provides many justifiable choices at each step of the analysis, such as
how measurements are combined and transformed, how missing data and outliers are handled,
and even the choice of a statistical model. Generally, there is not a single justifiable choice for
each decision researchers need to make, and several justifiable options exist for each step of the
data analysis (Gelman and Loken, 2014). As a consequence, raw data do not uniquely give rise to
a single dataset for analysis. Instead, researchers are faced with a set of processed datasets, each
of which is determined by a unique combination of choices – a multiverse of datasets. As analyses
performed on each dataset can lead to different results, the data multiverse directly implies a
multiverse of statistical results. In recent years, concerns have been raised about how researchers
can abuse this flexibility in data analysis to increase the probability of observing a statistically
significant result. Researchers may engage in such questionable research practices both because of
the editorial practice of predominantly publishing statistically significant results and because of
the tendency to select findings that confirm the authors' own beliefs (Begg and Berlin, 1988;
Dwan et al., 2008; Fanelli, 2012). When researchers select and report the results of a subset of all
possible analyses that produce significant results (Sterling, 1959; Greenwald, 1975; Simmons
et al., 2011; Brodeur et al., 2016), they dramatically increase the actual false-positive rates
despite their nominal endorsement of a low type I error rate (e.g., 5%). Two solutions have been
proposed to deal with this problem, commonly referred to as p-hacking. The first is to require researchers to specify their
statistical analysis plan before they look at the raw data. Such preregistered studies control the
type I error rate by reducing flexibility during the data analysis. Preregistration is easily
implemented for replication studies, where researchers specify they will perform the same analysis
as was performed in an earlier study. For more novel studies, preregistration can be difficult
because researchers often lack sufficient knowledge to be able to foresee how they should deal with
all possible decisions that need to be made when analyzing the data. The second solution
acknowledges that it is often not feasible to specify a single analysis before the data has been
collected, and instead promotes transparently reporting all possible analyses that can be
performed. Steegen et al. (2016) introduced multiverse analysis which aims to use all reasonable
options for data processing to construct a multiverse of datasets, and then separately perform the
same analysis of interest on each of these datasets. The main tool used to interpret the output of
a multiverse analysis is a histogram of p-values that summarizes all the p-values obtained for a
given effect. Subsequently, researchers typically discuss the results in terms of the proportion of
significant p-values. The procedure not only provides a detailed picture of the robustness or
fragility of results across different choices for processing but also allows researchers to explore key
choices that are most consequential in the fluctuation of their results.
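The descriptive workflow just described can be illustrated with a minimal sketch. It is not the authors' code and it does not implement PIMA. The processing choices (outlier cutoffs and outcome transformations), the normal-approximation two-sample test, and all function names are assumptions made only for illustration. The sketch builds a small multiverse of processed datasets, runs the same test on each, reports the proportion of significant p-values, and then simulates data with no true effect to compare a single fixed specification against reporting the smallest p-value in the multiverse.

```python
import math
import random
from itertools import product

# Hypothetical processing choices (assumptions for illustration only).
OUTLIER_CUTOFFS = (2.0, 2.5, 3.0, None)  # drop values beyond cutoff * SD; None keeps all
TRANSFORMS = {"raw": lambda v: v, "log": math.log, "sqrt": math.sqrt}


def mean_sd(values):
    """Sample mean and standard deviation (n - 1 denominator)."""
    n = len(values)
    m = sum(values) / n
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))


def two_sample_p(x, y):
    """Two-sided p-value for a difference in means (normal approximation)."""
    mx, sx = mean_sd(x)
    my, sy = mean_sd(y)
    z = (mx - my) / math.sqrt(sx ** 2 / len(x) + sy ** 2 / len(y))
    return math.erfc(abs(z) / math.sqrt(2))


def process(values, cutoff, transform):
    """Apply one specification: transform, then exclude outliers within the group."""
    out = [transform(v) for v in values]
    if cutoff is None:
        return out
    m, s = mean_sd(out)
    return [v for v in out if abs(v - m) <= cutoff * s]


def multiverse_p_values(group_a, group_b):
    """Run the same test on every processed dataset; one p-value per specification."""
    results = {}
    for cutoff, (name, f) in product(OUTLIER_CUTOFFS, TRANSFORMS.items()):
        a = process(group_a, cutoff, f)
        b = process(group_b, cutoff, f)
        results[(cutoff, name)] = two_sample_p(a, b)
    return results


def proportion_significant(results, alpha=0.05):
    """Descriptive summary: share of specifications with p < alpha."""
    return sum(p < alpha for p in results.values()) / len(results)


def simulate(n_sims=200, n=60, alpha=0.05, seed=1):
    """Rejection rates under the null: one fixed specification vs. the smallest p-value."""
    rng = random.Random(seed)
    single = selected = 0
    for _ in range(n_sims):
        # Both groups come from the same positive-valued distribution (no true effect).
        a = [rng.lognormvariate(0, 0.5) for _ in range(n)]
        b = [rng.lognormvariate(0, 0.5) for _ in range(n)]
        res = multiverse_p_values(a, b)
        single += res[(None, "raw")] < alpha
        selected += min(res.values()) < alpha
    return single / n_sims, selected / n_sims


if __name__ == "__main__":
    single_rate, selected_rate = simulate()
    print(f"fixed specification: {single_rate:.3f}")
    print(f"smallest p-value in the multiverse: {selected_rate:.3f}")
```

By construction, the rejection rate obtained by selecting the smallest p-value can never be lower than that of the single fixed specification. The size of the gap depends on how strongly the specifications are correlated, which is why the proportion of significant p-values should be read as a descriptive summary rather than as a test.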
Multiverse analysis represents an invaluable step toward a transparent science. The method
has become increasingly popular since it was developed and has been applied in various
experimental contexts, ranging from cognitive development and risk perception (Mirman et al.,
2021) to the assessment of parental behavior (Modecki et al., 2020) and memory tasks (Wessel et al.,
2020). Although some of the applications remain confined to exploratory purposes with the scope
to define brief guidelines for conducting a multiverse analysis (Dragicevic et al., 2019; Liu et al.,
2020), other studies aim to stimulate interest in this method as a robustness assessment for
mediation analysis (Rijnhart et al., 2021) or an exhaustive modeling approach (Frey et al., 2021).
This approach makes it possible to show the stability and robustness of findings, not just
across different exclusion criteria or modifications of the variables, but across different
decisions in all phases of data processing. This feature can be particularly interesting
and appealing from the perspective of the replicability crisis in quantitative psychology (Open
Science Collaboration, 2015), and as an attempt to increase the transparency and credibility of
scientific results (Nosek and Lakens, 2014). Multiverse analysis can therefore be extended not
only to the pre-processing step but also to the methods used for the analysis (the “multiverse of
methods”) (Harder, 2020).
The explicit flexibility in multiverse analysis is not to be condemned as it reflects an effort to
transparently describe the uncertainty about the best analysis strategy. However, while the
exploration of multiple analytic choices in data analysis should be advocated, it is
challenging to draw reliable inferences from such a large number of statistical analyses.
Although most researchers have interpreted the results from multiverse analysis descriptively,
while doing so it is extremely tempting to make claims about analyses that yield statistically