
Although promising progress has been achieved, previous works largely ignore the reprogramming
property [11] of deep models: a well-trained model can be repurposed for a new task by a proper
transformation of the original inputs (e.g., a universal feature perturbation), without modifying any
model parameters. For example, a model pre-trained on the ImageNet [12] dataset can be reprogrammed
to classify biomedical images [13]. This property suggests that a well-trained model can be adapted
for effective OOD detection, motivating us to make the first attempt to investigate whether the
reprogramming property of deep models can help address OOD detection, i.e., can we reprogram
well-trained deep models for OOD detection (a new task)?
In this paper, we propose a novel method, watermarking, to reprogram a well-trained model by
adding a watermark to the original inputs so that the model can detect OOD data well. The
watermark has the same shape as the original inputs and is a static pattern that can be added to
test-time inputs (cf. Figure 1). The pre-defined scoring strategy (e.g., free energy scoring [8])
is expected to be enhanced, with an enlarged gap in OOD scores between the watermarked ID and
OOD data (cf. Figure 2).
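
As a minimal sketch of the test-time procedure described above, the snippet below adds one static watermark to every input and scores the result. It assumes the common definition of the free energy score as T * logsumexp(logits / T), where a higher value suggests ID data; the linear `toy_model`, the all-zero `watermark`, and all function names are hypothetical stand-ins, not the models or watermarks used in this paper.

```python
import numpy as np

def free_energy_score(logits, temperature=1.0):
    """Assumed scoring rule: T * logsumexp(logits / T); higher suggests ID."""
    z = logits / temperature
    m = z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    return temperature * (m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1)))

def watermarked_score(model, x, watermark, temperature=1.0):
    """Score inputs after adding one static watermark; the model is left unchanged."""
    return free_energy_score(model(x + watermark), temperature)

# Hypothetical frozen classifier: a fixed linear map on flattened images.
rng = np.random.default_rng(0)
W = rng.normal(size=(3 * 32 * 32, 10)) * 0.01

def toy_model(x):
    return x.reshape(len(x), -1) @ W

x = rng.uniform(size=(4, 3, 32, 32))   # batch of CIFAR-10-shaped inputs
watermark = np.zeros((3, 32, 32))      # same shape as a single input
scores = watermarked_score(toy_model, x, watermark)  # one score per input
```

Because the watermark is a single array broadcast over the batch, the only test-time overhead in this sketch is one element-wise addition per input.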
Figure 1: Watermarking on CIFAR-10 [14] with free energy scoring [8]. The left figure is the
learned watermark; the middle figure is an original input; the right figure is the watermarked result.
It is non-trivial to find a proper watermark, since we lack knowledge about unseen OOD data in
advance. To address this issue, we propose a learning framework for effective watermarking. The
insight is to make a well-trained model produce high scores for watermarked ID inputs, while
regularizing the watermark such that the model returns low confidence when it perceives no ID
pattern. In this case, the model will have a relatively high score for a watermarked ID input, while
the score remains low for OOD data (cf. Figure 2). The reason is that, for OOD data, the model
encounters a watermarked input but does not see any ID pattern. In our realization, we adopt
several representative scoring strategies, devise specific learning objectives for them, and propose
a reliable optimization algorithm to learn an effective watermark.
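
The two-part insight above can be illustrated with a hinge-style objective. This is only a sketch under stated assumptions, not the learning objective of this paper: the thresholds `t_id` and `t_ood`, the `weight` parameter, and the use of random noise plus the watermark as a stand-in for "a watermarked input without any ID pattern" are all hypothetical choices made for illustration.

```python
import numpy as np

def free_energy_score(logits, temperature=1.0):
    """Assumed scoring rule: T * logsumexp(logits / T); higher suggests ID."""
    z = logits / temperature
    m = z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    return temperature * (m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1)))

def watermark_loss(model, x_id, watermark, noise, t_id, t_ood, weight=1.0):
    """Hinge-style sketch of the two goals (hypothetical formulation).

    id_term:  penalize watermarked ID inputs whose score falls below t_id.
    ood_term: penalize the watermark on pattern-free noise when its score
              exceeds t_ood, i.e. the watermark alone should not look ID.
    """
    s_id = free_energy_score(model(x_id + watermark))
    s_wm = free_energy_score(model(noise + watermark))
    id_term = np.maximum(0.0, t_id - s_id).mean()
    ood_term = np.maximum(0.0, s_wm - t_ood).mean()
    return id_term + weight * ood_term

# Hypothetical frozen classifier: a fixed linear map on flattened images.
rng = np.random.default_rng(0)
W = rng.normal(size=(3 * 32 * 32, 10)) * 0.01

def toy_model(x):
    return x.reshape(len(x), -1) @ W

x_id = rng.uniform(size=(8, 3, 32, 32))          # stand-in for ID training data
noise = rng.normal(scale=0.1, size=x_id.shape)   # inputs with no ID pattern
watermark = np.zeros((3, 32, 32))                # the only quantity to be learned
loss = watermark_loss(toy_model, x_id, watermark, noise, t_id=5.0, t_ood=0.0)
```

In such a setup only `watermark` would be updated to reduce `loss`, while the model parameters stay fixed; the optimizer itself is omitted here.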
To understand our watermarking, Figure 1 depicts the watermark learned on the CIFAR-10 [14] dataset
with free energy scoring [8]. As we can see, the centre area of the learned watermark largely
preserves the original input pattern, which contains the semantic message that primarily guides the
detection. By contrast, the edge area of the original input is superimposed with the specific pattern
of the watermark, which may encode knowledge previously hidden in the model that boosts OOD
detection. Overall, watermarking preserves the meaningful pattern of the original inputs for detection,
while improving the detection capability using what is learned from the trained model and ID data.
Figure 2 demonstrates the effect of our learned watermark in an example with free energy
scoring. After watermarking, the score distributions are much more concentrated, and the gap between
ID (i.e., CIFAR-10) and OOD (i.e., SVHN [15] and Texture [16] datasets) data is notably enlarged.
We conduct extensive experiments on a wide range of OOD evaluation benchmarks, and the results
verify the effectiveness of our proposal.
The success of watermarking is rooted in the following aspects: (1) a model well trained for
classification has the potential to be reprogrammed for OOD detection, since they are two related
tasks; (2) reprogramming has been widely studied, ranging from image classification to time series
analysis [12, 13], making our proposal general across various domains; and (3) OOD detection suffers
from the lack of knowledge about real-world OOD distributions. Fortunately, with only data-level
manipulation in low dimensions, watermarking can largely mitigate this issue of limited data. Overall,
this data-level manipulation is orthogonal to existing methods, and thus provides a new direction for
OOD detection that can inspire more ways to design OOD detection methods in the future.
2 Related Works
To begin with, we briefly review related works in OOD detection and model reprogramming.
Please refer to Appendix A for a detailed discussion.