1 Introduction
1.1 Causal Impact of Marine Protected Areas on Biodiversity
Preserving marine biological diversity is an important objective of governments, scien-
tists, local communities, and conservationists. Marine protected areas (MPAs) have been
established worldwide to maintain sustainable and resilient marine ecosystems by restricting
destructive and extractive activities within their boundaries (Grorud-Colvert et al., 2021;
UNEP-WCMC et al., 2021). Despite widespread use, the effectiveness of many MPAs and
different types of MPA policies in conserving marine biodiversity remains unclear
(Grorud-Colvert et al., 2021). Very few studies employ rigorous causal inference methods to assess
MPA impacts, and even fewer investigate the relative effects of different conservation
policies (Ferraro et al., 2019). Such studies, however, are important and have significant
policy implications, as prohibiting fishing activities that are potentially important for local
food and livelihood security can result in significant social costs and harm (e.g., Kamat,
2014; Bennett and Dearden, 2014).
Gill et al. (2017) investigated the effectiveness of MPA management and its impacts
on fish populations. They developed a database of ecological, management, social, and
environmental conditions in and around hundreds of MPAs globally. In their study, man-
agement attributes such as available capacity were strongly associated with increases in
fish biomass observed in MPAs. Nonetheless, the relative effects of different types of MPAs
(referred to as policies or treatments), such as those that restrict fishing (hereafter called
multi-use or MU MPAs) and those that prohibit all fishing (hereafter called no-take or NT
MPAs), require further investigation.
While the Gill et al. (2017) database represents one of the largest global datasets of MPA
conditions and ecological outcomes to date, its properties present significant challenges for
applying traditional causal inference methods. First, given the intractability of conducting
randomized experiments in many conservation settings, the global MPA dataset is observa-
tional, and thus subject to confounding biases not present when treatment is randomized
(Pynegar et al., 2021). MU and NT MPAs are likely to be located in areas with different
social, environmental, and regulatory conditions, so direct comparisons of biodiversity
between MU and NT MPAs are prone to confounding bias. Second, the MPA data are spatially clustered, as
nearby sites are usually under the same conservation policy, whether because they lie
within the same MPA, the same management zone within an MPA (e.g., a no-diving area),
or the same larger-scale management policy area (e.g., regional or national fishing policies).
Individual sites also share similar geographical, environmental, and social features that are
possibly dependent on each other. Therefore, estimating the causal impacts of policies such
as MPAs requires appropriate methods for clustered and confounded data.
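As a rough illustration of why direct comparisons can mislead in this setting, the following is a minimal simulated sketch. It is not the analysis of Gill et al. (2017) and uses no real MPA data. The cluster counts, the confounder `u`, the selection model, and the `true_effect` value are all hypothetical assumptions chosen only to show the mechanism: when policy assignment depends on a cluster-level condition that also drives the outcome, the raw difference in means between NT and MU sites departs from the true effect, while adjusting for the confounder recovers it approximately.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: 200 spatial clusters with 10 sites each.
n_clusters, sites_per_cluster = 200, 10
true_effect = 1.0  # assumed NT-versus-MU effect on the outcome (illustrative)

# Cluster-level confounder, e.g. a standardized local condition (assumption).
u_cluster = rng.normal(size=n_clusters)

# Policy is assigned per cluster and is more likely where the confounder is high.
p_nt = 1.0 / (1.0 + np.exp(-1.5 * u_cluster))
nt_cluster = rng.binomial(1, p_nt)

# Expand cluster-level quantities to the site level.
u = np.repeat(u_cluster, sites_per_cluster)
nt = np.repeat(nt_cluster, sites_per_cluster)

# Outcome depends on both the policy and the confounder, plus site-level noise.
y = true_effect * nt + 2.0 * u + rng.normal(size=u.size)

# Naive estimate: raw difference in mean outcome between NT and MU sites.
naive = y[nt == 1].mean() - y[nt == 0].mean()

# Adjusted estimate: least-squares regression of y on the policy and the confounder.
X = np.column_stack([np.ones_like(u), nt, u])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
adjusted = coef[1]

print(f"true effect:       {true_effect:.2f}")
print(f"naive difference:  {naive:.2f}")
print(f"adjusted estimate: {adjusted:.2f}")
```

In this sketch the confounder is observed and enters the outcome linearly, so a simple regression adjustment suffices. The sketch also ignores the within-cluster dependence when estimating the effect, which would matter for uncertainty quantification in clustered data.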
1.2 Previous Work: Causal Inference in Observational Studies
Although randomized experiments serve as the gold standard, observational studies can
estimate causal effects when all confounding variables are well balanced between treatment
groups. To adjust for the imbalance in observed confounding covariates, matching (Stuart,
2010) is often applied to isolate causal effects due to its transparency and intuitive appeal.
While statistical methods to estimate causal effects in observational studies are growing,
most methods apply to unstructured data (i.e., without clustering). However, clustering