Do Charge Prediction Models Learn Legal Theory?

Zhenwei An1,2∗, Quzhe Huang1,3∗, Cong Jiang4,5, Yansong Feng1,6 and Dongyan Zhao1,5,6

1Wangxuan Institute of Computer Technology, Peking University

2School of Software & Microelectronics, Peking University

3School of Intelligence Science and Technology, Peking University

4Peking University Law School

5Institute for Artificial Intelligence, Peking University

6The MOE Key Laboratory of Computational Linguistics, Peking University

{anzhenwei,huangquzhe,jiangcong,fengyansong,zhaody}@pku.edu.cn

Abstract

The charge prediction task aims to predict the charge for a case given its fact description. Recent models have achieved impressive accuracy on this task; however, little is understood about the mechanisms they use to reach a judgment. For practical applications, a charge prediction model should conform to the governing legal theory in civil law countries, because under the civil law framework all cases are judged according to certain local legal theories. In China, for example, nearly all criminal judges make decisions based on the Four Elements Theory (FET). In this paper, we argue that trustworthy charge prediction models should take legal theories into consideration and, building on prior studies in model interpretation, we propose three principles that trustworthy models should follow in this task: sensitive, selective, and presumption of innocence. We further design a new framework to evaluate whether existing charge prediction models learn legal theories. Our findings indicate that, while existing charge prediction models meet the selective principle on a benchmark dataset, most of them are still not sensitive enough and do not satisfy the presumption of innocence. Our code and dataset are released at https://github.com/ZhenweiAn/EXP_LJP.

1 Introduction

The task of charge prediction is to determine appropriate charges, such as Fraud or Theft, for a case by analyzing its textual fact description. Such a technique can improve the efficiency of legal professionals, e.g., by helping judges, lawyers, or prosecutors distinguish similar charges and focus on discriminative features. But as an auxiliary tool in the legal domain, it should be used with great caution so that it does not introduce unfairness (Angwin et al., 2016).

∗Equal Contribution.

Figure 1: An example of accusing the defendant of Theft. FET is the dominant legal theory in China; it specifies that a case must satisfy four criminal elements simultaneously to constitute a crime.

Most existing works formalize charge prediction as a text classification task (Hu et al., 2018; Luo et al., 2017; Zhong et al., 2018). Although recent deep learning models have demonstrated excellent performance in predicting charges (Xiao et al., 2021; Yang et al., 2019), their reliability and interpretability remain underexplored. It is unknown whether the intrinsic decision mechanism of these models corresponds to the decision logic of human judges. Specifically, since most existing models are data-driven and all cases in charge prediction datasets conform to local legal theories, it is necessary to determine whether these models learn the corresponding legal theories.

arXiv:2210.17108v1 [cs.CL] 31 Oct 2022

Previous studies have shown that trustworthy legal AI models are supposed to point out the human-interpretable factors used in a decision (Atkinson et al., 2020). They should also explain how changes in the fact descriptions would change their decisions. Based on these discussions, we argue that a trustworthy charge prediction model should obey the following principles to conform to the local legal theory. We illustrate how each principle operates from a legal perspective using FET, the dominant legal theory in China (Wang, 2017), as an example:

1) Selective: be able to identify and concentrate on the important parts of a case when making decisions. In FET, the important parts are the criminal elements.

2) Sensitive: be aware of the subtle distinctions between similar charges. When three of the four criminal elements in FET are identical for a pair of similar charges, a trustworthy model is expected to use the remaining criminal element to distinguish them.

Apart from these prerequisites, which have been extensively explored in various domains, we cannot ignore the presumption of innocence when focusing on a legal task. The presumption of innocence is the principle that any defendant is presumed innocent until proven guilty in a criminal trial, which is fundamental to protecting human rights worldwide (Tadros and Tierney, 2004). Taking this presumption into account, we propose an additional principle that a trustworthy charge prediction model should follow: 3) Presumption of innocence: always assume innocence unless the sufficient requirements for a charge are met. In FET, the presumption of innocence is guaranteed by checking all four criminal elements before making a decision.

In this paper, we propose a framework to evaluate whether a charge prediction model conforms to a given legal theory. Our framework consists of three components, each of which evaluates one of the aforementioned principles. We first apply a probing task to measure whether models learn the skill of identifying criminal elements from fact descriptions, corresponding to the selective principle. The assumption here is that if a model is capable of identifying criminal elements, the knowledge of this skill should be reflected in its internal representations, where it can be detected by a diagnostic model (Alt et al., 2020).
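The probing idea can be sketched as follows. This is a minimal illustration, not the paper's diagnostic model: it assumes sentence representations have already been extracted from a frozen charge prediction model, and the toy two-dimensional vectors, the helper names `train_probe` and `probe_accuracy`, and the choice of a logistic-regression probe trained by batch gradient descent are all assumptions made for the example.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_probe(features, labels, steps=200, lr=0.5):
    """Fit a logistic-regression probe on frozen representations.

    Only the probe parameters (w, b) are updated; the representations
    themselves are treated as fixed inputs.
    """
    dim, n = len(features[0]), len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        # Per-example gradient of the log loss with respect to the logit.
        errors = [sigmoid(logit(w, b, x)) - y for x, y in zip(features, labels)]
        for d in range(dim):
            w[d] -= lr * sum(e * x[d] for e, x in zip(errors, features)) / n
        b -= lr * sum(errors) / n
    return w, b


def probe_accuracy(w, b, features, labels):
    preds = [1 if logit(w, b, x) >= 0 else 0 for x in features]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical frozen sentence embeddings.
# Label 1: the sentence describes a criminal element; label 0: it does not.
train_x = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
train_y = [1, 1, 0, 0]

w, b = train_probe(train_x, train_y)
accuracy = probe_accuracy(w, b, train_x, train_y)
```

High probe accuracy on held-out sentences would suggest that the representations encode the element information; in practice the probe should be scored on data it was not trained on, which this toy example omits for brevity.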

The evaluation of the sensitive principle relies on a perturbation experiment, in which we modify the fact descriptions of confusing charges and check whether the model can detect the modifications. Specifically, for a pair of confusing charges, we rewrite the fact descriptions related to a certain criminal element so that the modified facts fulfill the requirements of the other charge. If a model is sensitive enough, it should be capable of identifying these modifications and making different predictions for the original facts and the modified ones.

The final component evaluates whether models follow the presumption of innocence by checking their performance on incomplete fact descriptions. These incomplete facts are obtained by excluding all descriptions related to a specific criminal element from the fact descriptions of criminal cases. The models are expected to predict innocence for these incomplete fact descriptions, because they violate the FET requirement that all four criminal elements be satisfied for a guilty judgment.
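The two checks described above can be sketched as simple rates over model predictions. The code below is an illustrative sketch under stated assumptions: `toy_model` is a hypothetical keyword-based stand-in for a real charge prediction model, the label string "Innocent", the function names, and the example sentences are invented for the illustration, and the paper's actual metrics may be defined differently.

```python
from typing import Callable, List, Set, Tuple

Predictor = Callable[[str], str]  # maps a fact description to a charge label


def sensitivity_rate(model: Predictor, pairs: List[Tuple[str, str]]) -> float:
    """Fraction of (original, modified) fact pairs with different predictions."""
    changed = sum(model(original) != model(modified) for original, modified in pairs)
    return changed / len(pairs)


def drop_element(sentences: List[Tuple[str, Set[str]]], element: str) -> str:
    """Remove every sentence annotated with the given criminal element."""
    return " ".join(text for text, tags in sentences if element not in tags)


def innocence_rate(model: Predictor, facts: List[str], innocent: str = "Innocent") -> float:
    """Fraction of incomplete fact descriptions predicted as innocent."""
    return sum(model(fact) == innocent for fact in facts) / len(facts)


def toy_model(fact: str) -> str:
    # Hypothetical stand-in that looks only at conduct keywords.
    if "secretly took" in fact:
        return "Theft"
    if "deceived" in fact:
        return "Fraud"
    return "Innocent"


# Perturbation check: the second modification goes unnoticed by the toy model.
pairs = [
    ("He secretly took a phone.", "He deceived the victim into handing over a phone."),
    ("He secretly took cash.", "He secretly took cash after tricking the owner."),
]
sensitivity = sensitivity_rate(toy_model, pairs)

# Incomplete-fact check: drop one element at a time from an annotated case.
case = [
    ("The defendant is a 30-year-old adult.", {"Sub"}),
    ("He secretly took a phone from the victim's bag.", {"Con", "Obj"}),
    ("He intended to keep it for himself.", {"Men"}),
]
incomplete = [drop_element(case, "Con"), drop_element(case, "Men")]
innocence = innocence_rate(toy_model, incomplete)
```

In this toy run the model still predicts Theft once the mental-state sentence is removed, which is the kind of behavior the presumption-of-innocence check is meant to expose.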

We conduct experiments with popular Chinese charge prediction models. The results indicate that, while existing charge prediction models meet the selective principle on our benchmark dataset, most of them are still not sensitive enough and do not satisfy the presumption of innocence.

Our contributions are fourfold: (1) We propose the first set of principles that a trustworthy charge prediction model should follow when conforming to certain legal theories. (2) Based on these principles, we propose a new investigation framework to evaluate the trustworthiness of charge prediction models. (3) We supplement the popular charge prediction dataset CAIL (Xiao et al., 2018) with innocent cases and provide sentence-level criminal element annotations for a subset. (4) We examine existing Chinese charge prediction models using FET, the most widely used legal theory in China, on the new benchmark, and find that most existing charge prediction models are not trustworthy enough, even though they can achieve over 80% prediction accuracy.

2 The Charge Prediction Task

Suppose the fact description of a case is a word sequence $x = \{x_1, x_2, \cdots, x_n\}$, where $n$ is the length of $x$. Based on the fact description $x$, the charge prediction task aims at predicting an appropriate charge $y \in Y$, where $Y$ is the potential charge set.

To solve this task, previous works often use existing text classification models (He et al., 2019; Li et al., 2018), many of which are later improved by introducing legal knowledge (Luo et al., 2017; Yang et al., 2019; Zhong et al., 2018). More recently, pretrained language models have also proven effective in this task (Xiao et al., 2021). In our study, we select the following representative charge prediction models to evaluate whether they are trustworthy according to a specific legal theory, i.e., FET in this case.

BiLSTM

Luo et al. (2017) use a Bi-LSTM (Yang et al., 2016) to encode fact descriptions and apply an attention mechanism to aggregate the encoded word representations into a fact embedding, which is then used for classification.

TopJudge

TopJudge (Zhong et al., 2018) is representative of multitask learning models. During encoding, TopJudge employs a CNN (Kim, 2014) to obtain fact embeddings. In decoding, it exploits a directed acyclic graph to capture the relationships among three sub-tasks, i.e., charge prediction, law article prediction, and term prediction, which are jointly optimized in a multitask framework.

FewShot

FewShot (Hu et al., 2018) introduces discriminative attributes to distinguish confusing charges and to provide additional knowledge for few-shot charges; it stands for models that introduce legal knowledge into the charge prediction task. It uses an LSTM (Hochreiter and Schmidhuber, 1997) as the fact encoder and then performs charge prediction and attribute prediction.

BERT

BERT (Devlin et al., 2019) is a strong baseline for many text classification tasks. We use the representation of the [CLS] token for classification.

Lawformer

Lawformer (Xiao et al., 2021) is a Longformer-based (Beltagy et al., 2020) language model pretrained on large-scale Chinese legal cases. We use it to encode the fact description and classify based on the [CLS] token.
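As an illustration of how attention can aggregate encoded word representations into a single fact embedding, here is a minimal pure-Python sketch. The dot-product scoring against a `context` vector is an assumption made for the example; the text above only states that an attention mechanism is used, and the function name `attention_pool` and the toy vectors are hypothetical.

```python
import math


def attention_pool(hidden_states, context):
    """Aggregate word vectors into one fact embedding with softmax attention.

    hidden_states: list of word representations (lists of floats)
    context: a query vector of the same dimension (learned in a real model)
    """
    # Score each word by its dot product with the context vector.
    scores = [sum(h_d * c_d for h_d, c_d in zip(h, context)) for h in hidden_states]
    # Softmax with max subtraction for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The fact embedding is the attention-weighted sum of word vectors.
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights


# Toy encoder outputs for a three-word fact description.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fact_embedding, weights = attention_pool(hidden, context=[2.0, 0.0])
```

With this context vector, the first and third words receive equal weight and the second word receives less, so the fact embedding is dominated by the words that align with the query.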

2.1 The Four Elements Theory

Legal theories are the basis on which judges correctly determine charges; they define the method of analyzing cases. Judges are required to follow legal theories when making judgments (Gao, 1993). If they do not, they might make decisions arbitrarily, which is a breach of human rights and freedom (Wang, 2017).

In China, the Four Elements Theory (FET) is the dominant legal theory for criminal trials. In practice, nearly all criminal judges use FET to justify their decisions (Jiyao, 2011). As a result, a trustworthy Chinese charge prediction model should also conform to FET, since it is trained on judgment documents that conform to this local legal theory.

Model       Acc    F1     P      R
TopJudge    82.7   60.6   67.5   59.2
FewShot     82.9   71.7   75.9   71.6
BiLSTM      82.4   59.8   65.7   58.9
BERT        90.4   81.9   83.2   79.8
Lawformer   91.0   83.8   84.4   81.1

Table 1: Charge prediction results (%) on CAIL-I, where Acc, F1, P, and R represent accuracy, macro F1, macro precision, and macro recall, respectively.
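For readers unfamiliar with the macro-averaged metrics reported in Table 1, the sketch below shows one common way to compute them. It assumes that macro F1 is the unweighted mean of per-class F1 scores, which may or may not match the paper's evaluation script; the function name `macro_scores` and the toy labels are hypothetical.

```python
def macro_scores(gold, pred):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Each class contributes equally to the macro averages, regardless of
    how many cases it has.
    """
    labels = sorted(set(gold) | set(pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "Acc": sum(g == p for g, p in zip(gold, pred)) / len(gold),
        "F1": sum(f1s) / n,
        "P": sum(precisions) / n,
        "R": sum(recalls) / n,
    }


# Toy gold and predicted charges for four cases.
gold = ["Theft", "Theft", "Fraud", "Innocent"]
pred = ["Theft", "Fraud", "Fraud", "Innocent"]
scores = macro_scores(gold, pred)
```

Because every charge counts equally, rare charges can pull macro F1 well below accuracy, which is one plausible reading of the gap between the Acc and F1 columns for the weaker models.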

According to FET, a case must satisfy four criminal elements simultaneously to constitute a crime. The four criminal elements are: (1) the subject (Sub) is the person or organization that has committed the criminal offense and bears criminal responsibility, (2) the object (Obj) is the person, thing, interest, or social relation protected by criminal law and jeopardized by the criminal offense, (3) the conduct (Con) is the harmful behavior, and (4) the mental state (Men) is the mental state of the criminal subject when committing the crime, either intent or negligence.

For example, the four criminal elements of Theft are as follows: (1) subject: the general subject, that is, a person who has reached the age of criminal responsibility (16 years old in China), (2) object: public or private property, (3) conduct: the act of stealing a large amount of property or repeatedly stealing property, (4) mental state: intent, with the purpose of illegal possession.
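The conjunctive structure of FET can be made concrete with a small sketch. The `ElementCheck` class, its field names, the `judge` helper, and the "Innocent" label are hypothetical and only illustrate the rule that all four elements must hold; deciding whether each element is actually satisfied is the hard part and is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementCheck:
    """Whether each FET criminal element is satisfied for a candidate charge."""
    subject: bool       # Sub: e.g. has reached the age of criminal responsibility
    obj: bool           # Obj: e.g. public or private property
    conduct: bool       # Con: e.g. stealing a large amount of property
    mental_state: bool  # Men: e.g. intent with the purpose of illegal possession

    def constitutes_crime(self) -> bool:
        # FET requires all four elements to be satisfied simultaneously.
        return all((self.subject, self.obj, self.conduct, self.mental_state))


def judge(check: ElementCheck, charge: str) -> str:
    """Return the charge only if every element holds; otherwise presume innocence."""
    return charge if check.constitutes_crime() else "Innocent"


complete = ElementCheck(subject=True, obj=True, conduct=True, mental_state=True)
no_intent = ElementCheck(subject=True, obj=True, conduct=True, mental_state=False)

verdict_complete = judge(complete, "Theft")
verdict_no_intent = judge(no_intent, "Theft")
```

Under this rule a single missing element, such as the absence of intent, is enough to withhold the charge.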

3 Dataset

Existing charge prediction datasets, such as CAIL (Xiao et al., 2018), have played a crucial role in the development of legal artificial intelligence research. However, they suffer from two limitations: (1) They lack innocent cases, which violates the presumption of innocence, one of the most fundamental legal principles worldwide. (2) They contain only coarse-grained annotations, such as charges and law articles, which cannot reveal how judges analyze the cases.

To alleviate these two shortcomings, we propose a new charge prediction dataset, CAIL-I, which adds innocent cases to the original CAIL. We further annotate whether each sentence is related to certain criminal elements in a subset of CAIL. We call this Sentence-level Criminal Elements dataset SCE, which can be utilized to analyze whether a
