will be used as early evidence to identify a suspect.
From here, the owner can appoint authorized law
enforcement to request access to the suspect
model's internal parameters and extract the embedded
watermark (white-box), which the enforcer
then examines to deliver a final verdict.
1.1 Problem Statement
Recurrent Neural Networks (RNNs) have been widely
used in various Natural Language Processing
(NLP) applications such as text classification,
machine translation and question answering. Despite
this importance, to the best of our understanding,
IPR protection for RNNs does not yet exist. This
is somewhat surprising, as the NLP market, part
of the MLaaS industry, is anticipated to grow at a
significant CAGR of 20.2% over the forecast
period from 2021 to 2030, reaching USD 63 billion
by 2030 (Market Research Future, 2022).
1.2 Contributions
The contributions of our work are twofold:
1.
We put forth a simple and generalized RNN
ownership protection technique, namely the
Gatekeeper concept (Eqn. 1), which exploits
the native cell gates of RNN variants to
control the flow of hidden states depending
on the presented key (see Fig. 3);
2.
Extensive experimental results show that
our proposed ownership verification (in both
white-box and black-box settings) is effective
and robust against removal and ambiguity
attacks (see Table 4), while leaving the model's
overall performance on its original tasks
unaffected (see Table 2).
The proposed IPR protection framework is
illustrated in Fig. 1. In our work, the RNN's
performance is highly dependent on the availability
of the genuine key: if a counterfeit key is
presented, the model's performance deteriorates
immediately relative to the original. This
defeats the purpose of infringement, as a poorly
performing model is unprofitable in a
competitive MLaaS market.
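This deterrence property can be illustrated with a toy sketch. Note that the explicit distance check below is purely illustrative: the actual mechanism is the Gatekeeper gate of Eqn. 1, not a key comparison, and all names and numbers here are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
GENUINE_KEY = rng.normal(size=8)      # hypothetical embedded key

def model_accuracy(key: np.ndarray) -> float:
    # Toy stand-in for the protected RNN: task performance degrades
    # smoothly as the presented key diverges from the genuine one.
    # (Illustrative only; the real model degrades implicitly through
    # its gated hidden states, not via an explicit comparison.)
    mismatch = np.linalg.norm(key - GENUINE_KEY) / np.linalg.norm(GENUINE_KEY)
    return max(0.0, 0.92 * (1.0 - mismatch))  # 0.92 = toy baseline accuracy

acc_genuine = model_accuracy(GENUINE_KEY)          # full performance
acc_forged = model_accuracy(rng.normal(size=8))    # deteriorated
```

An infringer holding only a counterfeit key thus obtains a model whose accuracy is strictly below the baseline, which is the economic deterrent described above.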
2 Related Work
Uchida et al. (2017) were the first to propose white-
box protection to embed watermarks into CNN by
imposing a regularization term on the weights pa-
rameters. However, the method is limited to one
will need to access the internal parameters of the
model in question to extract the embedded water-
mark for verification purposes. Therefore, Quan
et al. (2021), Adi et al. (2018) and Le Merrer et al.
(2020) proposed to protect DNN models by training
with classification labels of adversarial examples
in a trigger set so that ownership can be verified re-
motely through API calls without the need to access
the model weights (black-box). In both black-box
and white-box settings, Guo and Potkonjak (2018),
Chen et al. (2019) and Rouhani et al. (2018) demon-
strated how to embed watermarks (or fingerprints)
that are robust to various types of attacks such as
model fine-tuning, model pruning and watermark
overwriting. Recently, Fan et al. (2022) and Jie
et al. (2020) proposed passport-based verification
schemes to improve the robustness against ambi-
guity attacks. Ong et al. (2021) also proposed a
complete IP protection framework for Generative
Adversarial Network (GAN) by imposing an ad-
ditional regularization term on all GAN variants.
Separately, Rathi et al. (2022) demonstrated
how to generate adversarial examples by adding
noise to the input of a speech-to-text RNN model in
the black-box setting. Finally, He et al. (2022) pro-
posed a protection method for language
generation APIs that performs lexical modifications
on the original inputs in the black-box setting.
To the best of our knowledge, the closest work
to ours is Lim et al. (2022), which targets the
image captioning domain and embeds a secret key
into the RNN decoder in a functionality-preserving manner.
Although it appears similar to our idea, our proposed
Gatekeeper concept is a gate-control approach
rather than an element-wise operation on the hidden
states. That is to say, the embedded key in Lim et al.
(2022) is generated by converting a string into a
vector, whereas in our work the embedded key is a
sequence of data similar to the input data. Further-
more, the key embedding operation in the method of
Lim et al. (2022) is a simple element-wise addition
or multiplication between the aforementioned fixed
vector and the RNN's hidden state; technically, this
is equivalent to applying the same shift or scale to
the hidden state at every time step. In contrast, our
proposed method uses both the RNN weights and the
embedded key to recurrently compute an activation
before performing a matrix multiplication on the
hidden states at each time step (see Sec. 3.1).
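To make the distinction concrete, the following is a minimal NumPy sketch of the two key-embedding styles. The names (`W_h`, `W_k`, `gatekeeper_step`) and the exact gate form are our own illustrative assumptions, not the paper's actual Eqn. 1.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
d = 4                               # toy hidden size
W_h = rng.normal(size=(d, d))       # recurrent weights
W_k = rng.normal(size=(d, d))       # key projection (hypothetical)

def elementwise_key(h, key_vec):
    # Lim et al. (2022)-style: one fixed vector scales (or shifts) the
    # hidden state identically at every time step.
    return h * key_vec

def gatekeeper_step(h, key_t):
    # Gatekeeper-style (sketch): the key sequence and the RNN weights
    # produce a time-varying activation, recomputed recurrently, which
    # is then applied to the hidden state via a matrix product.
    gate = sigmoid(W_k @ key_t + W_h @ h)
    return np.diag(gate) @ h

h = rng.normal(size=d)
key_seq = rng.normal(size=(3, d))   # the key is a sequence, like inputs
for key_t in key_seq:               # the gate differs at each time step
    h = gatekeeper_step(h, key_t)
```

The fixed-vector scheme applies an identical transform at every step, whereas the gate depends on both the key element and the current hidden state, so it changes as the recurrence unfolds.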