
Figure 1: Overview of a machine learning model with a concurrent verifier that checks whether input-output pairs of a model satisfy requirements.
pair (x, h(x)) satisfies the required properties. If it does, the verifier outputs h(x); if not, it rejects h(x) and either modifies the output itself or requests that the learning model modify it. A machine learning model paired with a verifier can be seen as another machine learning model whose input-output pairs are guaranteed to satisfy the required conditions.
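As a concrete illustration, the verify-then-modify loop can be sketched in a few lines of Python (the function names and the fallback `modify` step are our own illustrative assumptions, not a fixed interface from the paper):

```python
def concurrent_verify(model_predict, satisfies, modify, x):
    """Run the model, then verify the input-output pair.

    model_predict: maps input x to a candidate output y = h(x)
    satisfies:     returns True iff the pair (x, y) meets the requirements
    modify:        fallback mapping (x, y) to a compliant output
    """
    y = model_predict(x)
    if satisfies(x, y):
        return y            # requirements hold; pass the output through
    return modify(x, y)     # reject y and substitute a compliant output

# Toy usage: require non-negative outputs; clamp violations to 0.
wrapped = lambda x: concurrent_verify(
    lambda v: v - 3,        # base model h
    lambda v, y: y >= 0,    # requirement check
    lambda v, y: 0,         # modification step
    x,
)
```

The wrapped function is itself a predictor, which is exactly the "model plus verifier as another model" view described above.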
Although a model with a verifier can guarantee that its input-output pairs satisfy the requirements, its effect on prediction error is unclear. This paper gives theoretical analyses of the generalization errors of a machine learning model with a CV. We focus on how the learnability of the original model, represented by hypothesis class H, can change when we use the verifier. First, we consider a situation where we use a CV only in the inference phase. This setting corresponds to the case where the required properties are unknown during the training phase. If the hypothesis class is PAC-learnable, we can obtain a hypothesis with a guarantee by using a verifier only at inference time.
Second, we consider a situation where we know the requirements when learning the model. This situation corresponds to analyzing the learnability of the hypothesis set Hc, which is obtained by modifying every hypothesis h ∈ H to satisfy the requirements. Hence we compare the generalization error upper bounds of Hc with those of H. In the multi-class classification setting, we show that existing error bounds [15, 18] based on the Rademacher complexity of H are also bounds for the modified hypothesis class Hc under any input-output requirements. Moreover, we give similar analyses for structured prediction, a kind of multi-class classification in which the set of classes Y can be decomposed into substructures. This task is worth analyzing since many works address constraints in structured prediction, and some give error bounds for structured prediction that are tighter than those obtained by simply applying the multi-class classification bounds [16, 6, 19]. As in the multi-class classification case, we show that existing Rademacher complexity-based bounds for structured prediction with H are also bounds for Hc.
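For intuition, one simple way to obtain a modified hypothesis in multi-class classification is to mask the scores of classes that violate the requirement for a given input, so that only admissible classes can be predicted. The following Python sketch illustrates this under our own assumptions (the function names are illustrative, not the paper's construction of Hc):

```python
import math

def modified_hypothesis(scores, admissible, x):
    """Predict the best class among those allowed for input x.

    scores:     maps x to a dict {class: score} from the base hypothesis h
    admissible: maps x to the set of classes satisfying the requirement
    """
    s = scores(x)
    allowed = admissible(x)
    # Mask forbidden classes with -inf so they can never be predicted.
    masked = {c: (v if c in allowed else -math.inf) for c, v in s.items()}
    return max(masked, key=masked.get)

# Toy usage: class "b" has the top score but is forbidden for this input.
pred = modified_hypothesis(
    lambda x: {"a": 0.2, "b": 0.9, "c": 0.5},
    lambda x: {"a", "c"},
    x=None,
)
```

Applying this masking to every h in H yields a family of predictors whose outputs satisfy the requirement by construction, which is the kind of modified class the bounds above concern.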
Our main contributions are as follows: a) We introduce the concurrent verifier, a model-agnostic way to guarantee that machine learning models satisfy required properties. Although a similar mechanism has been used in some existing models, we give a generalization analysis that does not depend on a specific model. b) We show that if hypothesis class H is PAC-learnable, then using a verifier at inference time yields a hypothesis with a guarantee on its generalization error. Interestingly, if H is not PAC-learnable, we might fail to obtain a guaranteed hypothesis even if the requirements are consistent with distribution D. c) We show that if we use a CV in the learning phase of a multi-class classification task, the theoretical error bounds of H based on the Rademacher complexity do not increase under any input-output requirements. We also give similar results for structured prediction tasks.
1.1 Use Cases of a Concurrent Verifier
The following are some typical use cases for CVs.
Error-sensitive applications: A typical situation in which we want to use a verifier is one where certain prediction errors could have severe consequences that we want to avoid. For example, a recommender system might limit the set of candidate items depending on user attributes. Although such a rule might degrade prediction accuracy, a safer model is often preferable in practice.
Controlling outputs of structured prediction: Constraints are frequently used in structured prediction tasks to improve performance or the controllability of the outputs. For example, some works [23, 5] exploited constraints in sequence labeling tasks to reflect background knowledge and improve prediction results. More recently, some works [9, 2] exploited constraints in language generation tasks, including image captioning and machine translation, and restricted a