
Other domains that adopt adversarial training to eliminate confounders include bioinformatics (Dincer et al., 2020) and political science (Roberts et al., 2020). Many existing works on identifying shortcuts focus on situations where these patterns are known in advance, and may require potentially expensive data collection. In fairness-focused legal NLP, Chalkidis et al. (2022b) observe and remedy group disparities in LJP performance on the ECtHR, informed by metadata attributes (respondent state, applicant gender, applicant age). We extend this to explainability in LJP by involving a legal expert in a procedure that enables an efficient, incremental identification of distracting information, as well as its removal via adversarial training.
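A common realization of such adversarial removal is a gradient reversal layer between a shared encoder and an adversary that tries to recover the distracting attribute. The sketch below illustrates the idea; the module names, the `lambd` weight, and the linear heads are our illustrative assumptions, not necessarily the exact architecture used in any of the cited works:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class AdversarialDebiaser(nn.Module):
    """Encoder shared by a task head and an adversary that tries to
    recover the confounder from the encoder's representation."""
    def __init__(self, encoder, hidden_dim, num_labels, num_confounder_values, lambd=1.0):
        super().__init__()
        self.encoder = encoder
        self.task_head = nn.Linear(hidden_dim, num_labels)
        self.adv_head = nn.Linear(hidden_dim, num_confounder_values)
        self.lambd = lambd

    def forward(self, x):
        h = self.encoder(x)
        task_logits = self.task_head(h)
        # Reversed gradients push the encoder to discard confounder signal
        adv_logits = self.adv_head(GradReverse.apply(h, self.lambd))
        return task_logits, adv_logits
```

Training then minimizes the task loss plus the adversary's loss; because the adversary's gradient is reversed before reaching the encoder, the encoder is pushed to discard information predictive of the confounder.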
Interpretability
We employ interpretability techniques to evaluate model alignment with expert rationales. Danilevsky et al. (2020) review and categorize the main current interpretability methods. Though initial works (Ghaeini et al., 2018; Lee et al., 2017) used attention scores as explanations for model decisions, Bastings and Filippova (2020) and Serrano and Smith (2019) point out that saliency methods, such as gradient-based methods (Sundararajan et al., 2017; Li et al., 2016), propagation-based methods (Bach et al., 2015), occlusion-based methods (Zeiler and Fergus, 2014), and surrogate-model-based methods (Ribeiro et al., 2016), are better suited for explainability analysis. However, the reliability and informativeness of these methods remain an open research problem. We use Integrated Gradients (IG; Sundararajan et al., 2017), currently the most widely used of these methods, which integrates the gradients of the model's output with respect to its input features along a straight-line path from a baseline to the input.
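For a text classifier, IG is typically computed over the input embeddings, with a zero or padding-embedding baseline. The following is a minimal sketch of the standard Riemann-sum approximation; we assume a `model` that maps an embedded input to a scalar score for the class of interest, and the function and argument names are ours:

```python
import torch

def integrated_gradients(model, x, baseline=None, steps=50):
    """Approximate IG over the straight-line path from `baseline` to `x`."""
    if baseline is None:
        baseline = torch.zeros_like(x)  # assumed baseline: all-zero embeddings
    total_grad = torch.zeros_like(x)
    for step in range(1, steps + 1):
        alpha = step / steps
        point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)
        model(point).sum().backward()   # scalar score for the target class
        total_grad += point.grad
    # Average path gradient, scaled by the displacement from the baseline
    return (x - baseline) * total_grad / steps
```

By IG's completeness property, these attributions approximately sum to the difference between the model's score at the input and at the baseline.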
3 ECtHR Tasks & Datasets
The ECtHR has been the subject of substantial prior work in LJP. We use two datasets for model training and evaluation: First, for binary violation, we use the dataset by Chalkidis et al. (2019) of approx. 11k case fact statements, where the target is to predict whether the court has found at least one convention article to be violated. To evaluate alignment, we annotate expert rationales for 50 cases (25 each from the development and test partitions; see App. C for the annotation process). Second, for article-specific violation, we use the LexGLUE dataset by Chalkidis et al. (2022a), which consists of 11k case fact statements along with information about which convention articles have been alleged to be violated, and which the court has found to be violated. For alignment, we merge this data with the 50 test-set rationales from Chalkidis et al. (2021). While both datasets stem from the ECtHR's public database, they differ in case facts and outcome distribution, as we explain in Sec. 3.1. The input texts consist of each case's FACTS section extracted from ECtHR judgments. This section is drafted by court staff over the course of the case proceedings. While it does not contain the outcome explicitly, it is not finalized before the final decision has been determined, potentially creating confounding effects.
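For reference, the public LexGLUE release can be loaded from the Hugging Face hub; the sketch below assumes the `lex_glue` dataset name and its `text`/`labels` fields (as noted in the footnote below, our experiments use an enriched variant that additionally carries case metadata):

```python
from datasets import load_dataset

# Public LexGLUE release: ECtHR A (articles found violated)
# and ECtHR B (articles alleged to have been violated).
ecthr_a = load_dataset("lex_glue", "ecthr_a")
ecthr_b = load_dataset("lex_glue", "ecthr_b")

sample = ecthr_a["train"][0]
print(sample["text"][:2])  # first two paragraphs of the FACTS section
print(sample["labels"])    # indices of convention articles found violated
```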
We conduct experiments on four LJP tasks:
Task J - Binary Violation
For our task J, the model is given a fact statement and is asked to predict whether or not any article of the convention has been violated. We train our models on Chalkidis et al. (2019) and evaluate alignment on the set of expert rationales we collected.
Task B - Article Allegation
We train and evaluate on LexGLUE's ECtHR B,* where the fact description serves as the basis for predicting the set of convention articles that the claimant alleges to have been violated. The task can be conceptualized as topic classification, in that the system needs to identify suitable candidate articles (e.g., the right to respect for private and family life) from fact statements (e.g., about government surveillance). We test alignment on the expert rationales by Chalkidis et al. (2021).
Task A - Article Violation
We also experiment with LexGLUE's ECtHR A, where the goal is to predict which of the convention's articles have been deemed violated by the court from a case's fact description. Task A is a more difficult version of task B, in which both the identification of suitable articles and a prediction of their violation must be performed. For alignment, we again use the expert rationales by Chalkidis et al. (2021), which are technically intended for task ECtHR B, but which we consider suitable for an evaluation of task A as well.*
Task A|B - Article Violation given Allegation
We further disentangle the LexGLUE tasks and pose ECtHR A|B: given the facts of a case and the allegedly violated articles, the model should predict which of the alleged articles the court has found to be violated.
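At prediction time, one simple way to realize this conditioning is to restrict the per-article violation scores to the alleged articles. The sketch below uses our own naming, and the output-side masking is only one possible design; the allegation information could instead be fed to the model as additional input:

```python
import torch

def violations_given_allegations(article_logits, alleged, threshold=0.0):
    """article_logits: (num_articles,) raw violation scores for one case.
    alleged:           (num_articles,) boolean mask of alleged articles.
    Returns a boolean mask of predicted violations among the alleged ones."""
    # Non-alleged articles can never be predicted as violated
    neg_inf = torch.full_like(article_logits, float("-inf"))
    masked = torch.where(alleged, article_logits, neg_inf)
    return masked > threshold
```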
* The LexGLUE dataset does not contain metadata (case id, respondent state, etc.); in this work we use an enriched version of the same dataset by Mathurin Aché.
* The annotation guidelines in Chalkidis et al. (2021) state that "The annotator selects the factual paragraphs that 'clearly' indicate allegations for the selected article(s)". We hypothesize that the passages annotated in this way also contain information that is legally relevant for the violation.