Non-intrusive Load Monitoring based on Self-
supervised Learning
Shuyi Chen, Student Member, IEEE, Bochao Zhao, Member, IEEE, Mingjun Zhong, Member, IEEE, Wenpeng
Luan*, Senior Member, IEEE, and Yixin Yu, Life Senior Member, IEEE
Abstract—Deep learning models for non-intrusive load
monitoring (NILM) tend to require a large amount of labeled data
for training. However, it is difficult to generalize the trained
models to unseen sites due to different load characteristics and
operating patterns of appliances between data sets. To address
such problems, self-supervised learning (SSL) is proposed in this
paper, requiring no labeled appliance-level data from the target
data set or house. Initially, only the aggregate power
readings from the target data set are required to pre-train a general
network via a self-supervised pretext task that maps aggregate power
sequences to derived representations. Then, supervised
downstream tasks are carried out for each appliance category to
fine-tune the pre-trained network, where the features learned in
the pretext task are transferred. Utilizing labeled source data sets
enables the downstream tasks to learn how each load is
disaggregated, by mapping the aggregate to labels. Finally, the
fine-tuned network is applied to load disaggregation for the target
sites. For validation, multiple experimental cases are designed
based on three publicly accessible REDD, UK-DALE, and REFIT
data sets. Besides, state-of-the-art neural networks are employed
to perform the NILM task in the experiments. Based on the NILM
results in various cases, SSL generally outperforms zero-shot
learning in improving load disaggregation performance without
any sub-metering data from the target data sets.
Index Terms—Non-intrusive load monitoring, deep neural
network, self-supervised learning, sequence-to-point learning.
I. INTRODUCTION
In recent years, energy shortage and environmental
pollution worldwide have become increasingly serious.
Therefore, approaches for efficient energy utilization
and carbon emission reduction are being explored [1], [2].
Meanwhile, with the global deployment of smart meters, benign
interaction between power suppliers and users has been
established for enhancing demand side management and
optimizing power grid operation [3]. As one of the energy
conservation applications, electricity consumption detail
monitoring has attracted extensive attention around the world
[4]. In general, load monitoring technology is mainly
categorized as intrusive or non-intrusive. Note that
intrusive load monitoring requires extra sensor installation for
sub-metering. Alternatively, the concept of non-intrusive load
monitoring (NILM) was proposed by Hart [5] in 1984 as
This work was supported in part by the Joint Funds of the National Natural
Science Foundation of China (No. U2066207) and the National Key Research
and Development Program of China (No. 2020YFB0905904). (Corresponding
author: W. Luan)
identifying power consumed by each individual appliance via
analyzing aggregate power readings using only software tools.
NILM offers appliance-level power consumption feedback to
both demand and supply sides economically and efficiently,
contributing to power system planning and operation [1],
energy bill savings [6], demand side management [7], energy
conservation and emission reduction [3], [6], [8], etc.
NILM is a single-channel blind source separation problem,
aiming to disaggregate the appliance-level energy consumption
from the aggregate measurements [9]. Combinatorial
optimization (CO) is initially applied to perform NILM in [5],
searching for the best combination of operational states of
individual appliances at each time instance. However, CO relies
on the power range of each operational state as prior
knowledge, making it inapplicable to newly added
appliances [10]. Benefiting from recent developments in big
data, artificial intelligence, and edge
computing, plenty of NILM approaches have been proposed
based on machine learning, mathematics, and signal processing
[8], [11]. Factorial hidden Markov model (FHMM) and its
variants [12]-[14] are popular in carrying out NILM. Given an
aggregate power signal as the observation, such FHMM-based
NILM methods estimate the hidden operational states of each
appliance considering their state continuity in time-series [15],
[16]. Thus, FHMM-based methods usually achieve good results
in disaggregating loads with periodic operation such as
refrigerators. However, their performance is limited for the
loads with short-lasting working cycles and the ones with less
frequent usage. Note that FHMM-based methods are regarded
as state-based NILM approaches, where the aggregate power
measurement at each time instance is assigned to each
operational state per appliance [17]. Alternatively, NILM
approaches can be event-based, where sudden changes in power
signals referring to turn-on, turn-off, and state transition events
are featured [17]. Such event-based NILM methods can be
carried out via subtractive clustering and the maximum
likelihood classifier [18]. Besides, graph signal processing
concepts are applied to perform NILM, mapping correlation
among samples to the underlying graph structure [19], [20].
Although such event-based NILM approaches can achieve high
load identification accuracy, they tend to suffer from
S. Chen, B. Zhao, W. Luan and Y. Yu are with the School of Electrical and
Information Engineering, Tianjin University, Tianjin 300072, China (e-mail:
wenpeng.luan@tju.edu.cn).
M. Zhong is with the Department of Computing Science, University of
Aberdeen, Aberdeen, UK (e-mail: mingjun.zhong@abdn.ac.uk).
measurement noises.
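The event-based idea above can be illustrated with a minimal edge-detection sketch. This is a deliberate simplification, not the subtractive-clustering or maximum-likelihood pipeline of [18]: an event is simply flagged wherever the step between consecutive aggregate samples exceeds a threshold (the 50 W threshold below is an illustrative assumption).

```python
import numpy as np

def detect_events(aggregate, threshold=50.0):
    """Flag indices where the aggregate power steps by more than
    `threshold` watts, a crude proxy for appliance on/off events."""
    deltas = np.diff(aggregate)
    idx = np.flatnonzero(np.abs(deltas) > threshold)
    # Report the index just after each step, with the signed step size.
    return [(int(i) + 1, float(deltas[i])) for i in idx]

# Toy aggregate: a 100 W appliance turns on at t=3 and off at t=7.
y = np.array([20.0, 20.0, 21.0, 121.0, 120.0, 121.0, 120.0, 20.0, 20.0])
events = detect_events(y, threshold=50.0)
# Two events: turn-on (+100 W) at t=3, turn-off (-100 W) at t=7.
```

Real event-based methods then classify these detected edges into appliance state transitions; the sketch only shows the detection step.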
Deep neural networks (DNN), performing well in computer
vision, speech recognition, and natural language processing,
have been employed in load disaggregation since 2015 [21].
Since then, DNN-based NILM approaches have become
increasingly popular, including long short-term memory (LSTM) [15],
[21], gated recurrent unit (GRU) [10], [22], denoising
autoencoder (dAE) [21], [23] and convolutional neural network
(CNN) [24], [25], etc., showing competitive performance
against traditional NILM methods. Although LSTM suits long
time-series tasks by avoiding the vanishing gradient problem,
it underperforms CNN in the NILM task [26], [27]. As a variant
of LSTM, GRU can also remember data patterns while containing
fewer parameters, and thus requires shorter training time,
making it suitable for online NILM applications [10], [28]. Note that the
bidirectional gated recurrent unit (Bi-GRU) is employed to
perform NILM in [10], where the network can be trained
simultaneously in positive and negative time directions.
Besides, dAE is applied to NILM by recovering the power
signal of target appliances (clean signal) from the aggregate
(noisy signal) [8], [21], where CNN layers are usually
embedded [21], [23]. A state-of-the-art CNN-based NILM
method, S2p, is proposed in [9] and claimed to outperform
benchmarks in the NILM task [9], [29], benefiting from meaningful
latent features learned from sub-metering data. Compared to
traditional NILM methods, advantages of DNN-based NILM
approaches include automatic feature extraction from power
readings and linearity between computational complexity and
appliance amount [4]. However, the promising performance of
the aforementioned DNN-based methods relies on a large
amount of sub-metering data from the target set for training [4].
Since such data collection may last for months or even years
[4], it is neither user-friendly nor economical in practice.
Alternatively, transfer learning concepts are proposed, where
transferable networks can be trained on a source (seen) data set
and applied to the load disaggregation task on a target (unseen)
data set [2]. Depending on whether network fine-tuning is
required, transfer learning can be classified as few-shot learning
(FSL) and zero-shot learning (ZSL) [30]. For fine-tuning in
FSL, a small amount of labeled data from the target set is still
required [30]. However, when labels are unable to be captured
from the target data sets, ZSL offers proper solutions. In [31],
ZSL suffers only a slight performance drop compared to the baseline
when it is employed in load disaggregation by both GRU and
CNN networks, showing transferability across data sets.
However, in ZSL, it is difficult to generalize networks between
data sets with different load characteristics and operating
patterns of appliances. Like ZSL, self-supervised
learning (SSL) requires no labeled data from the target set. SSL
is an efficient way to extract universal features from large-scale
unlabeled data, contributing to robustness enhancement [32],
thus it performs well in image processing and speech
recognition [33]. To the best of our knowledge, SSL has not
been used to solve the NILM problem.
Driven by such research gaps, in this paper, SSL is applied
to two state-of-the-art NILM algorithms based on CNN and
GRU, namely S2p [9] and Bi-GRU [10]. For performing NILM, a
self-supervised pretext task training is initially carried out for
learning features from the aggregate power readings from the
unlabeled data in the target set. Then the pre-trained network is
fine-tuned in the supervised downstream task training based on
the labeled data from the source set for transferring the pre-
learned knowledge to load disaggregation. After pre-training
and fine-tuning, the network can be applied to load
disaggregation for target sites. The proposed method is
validated on the real-world data sets at 1-min granularity, in the
scenarios designed for the same data set or across various data
sets. The contributions of this paper are clarified as follows:
• SSL is applied to deep-learning-based load disaggregation
without sub-metering on the target set, by setting a pretext
task for network pre-training on unlabeled data from the
target set, followed by fine-tuning;
• Experiments are carried out for all combinations of two
state-of-the-art DNN-based NILM methods (S2p [9] and
Bi-GRU [10]) and learning frameworks (SSL with various
fine-tuning schemes and ZSL), on three real-world data sets;
• Six cases differing in data selection are designed for
performance evaluation across houses or data sets, showing
that SSL generally outperforms in various metrics and energy
consumption estimation results, with comparable training
time cost.
The rest of this paper is organized as follows: in Section II,
the NILM formulation is clarified, followed by introducing the
preliminaries for NILM neural networks and SSL; The
methodology of SSL for NILM is explained in Section III;
Section IV contains data sets, evaluation metrics, and
experimental settings, followed by experimental results with
discussion presented in Section V; finally, the conclusion
is drawn and future work is outlined in Section VI.
II. PRELIMINARIES
In this section, we first formulate the NILM problem and
then clarify seq2seq and seq2point concepts, followed by an
introduction for two seq2point network architectures. Finally,
the overall structure of SSL is demonstrated.
A. NILM Problem Formulation
Assume the aggregate power reading measured in a household at
time index $t \in [1, T]$ is $y_t$, where $T$ refers to the total
number of samples. The simultaneous power consumed by appliance
$m$ to be disaggregated is denoted by $x_t^m$, and the measurement
noise is denoted by $e_t$, usually regarded as Gaussian
distributed [7]. Then the total load power for a household can be
expressed as:

$$y_t = \sum_{m=1}^{M} x_t^m + e_t \qquad (1)$$

where $M$ is the number of appliances. Thus, for each time index
$t$, the NILM problem is to estimate $x_t^m$, $\forall m$, given
the aggregate power $y_t$. When applying machine learning or deep
learning to NILM, it becomes a regression or classification
problem [7].
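The additive model above can be mimicked with synthetic data; the appliance traces and noise level below are illustrative toy values, not drawn from any of the data sets used in this paper:

```python
import numpy as np

rng = np.random.default_rng(0)
T, M = 8, 2  # number of samples and appliances (toy sizes)

# Per-appliance power x_t^m: a fridge-like cycle and a kettle-like burst.
x = np.zeros((M, T))
x[0] = [80, 80, 0, 0, 80, 80, 0, 0]      # periodic load
x[1] = [0, 0, 2000, 2000, 0, 0, 0, 0]    # short high-power burst

e = rng.normal(0.0, 5.0, size=T)          # Gaussian measurement noise e_t
y = x.sum(axis=0) + e                     # aggregate y_t = sum_m x_t^m + e_t

# NILM is the inverse problem: recover each x[m] given only y.
```

Only `y` is observed at the smart meter; the rows of `x` are exactly what a NILM algorithm must estimate.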
B. Sequence-to-sequence vs. Sequence-to-point NILM
Frameworks
NILM can be carried out via neural networks with a seq2seq
or seq2point framework [24]. In a seq2seq NILM solution, for
each appliance, a network learns the non-linear regression
between sequences with the same time stamps, referring to the
aggregate and appliance-level power. For an arbitrary aggregate
power sequence $\mathbf{y}$ covering time instance $t$, the power
$x_t^m$ consumed by appliance $m$ at $t$ is predicted by the
network; the final estimate is then the average of all such predictions
[9]. Unlike the seq2seq framework, the seq2point framework
predicts the appliance-level power consumed at only one point
of each sliding window iteratively. The inputs and outputs of
both seq2seq and seq2point frameworks in a NILM task are
illustrated in Fig.1.
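The seq2seq averaging step described above, where each time index is covered by several sliding windows and the final estimate is the mean of all window predictions [9], can be sketched as follows. The `halve` function is a hypothetical stand-in for a trained network, used only to exercise the averaging logic:

```python
import numpy as np

def seq2seq_disaggregate(y, predict, W):
    """Average overlapping per-window predictions into one estimate
    per time index, as in the seq2seq NILM framework."""
    T = len(y)
    acc = np.zeros(T)   # summed predictions per time index
    cnt = np.zeros(T)   # number of windows covering each index
    for t in range(T - W + 1):
        acc[t:t + W] += predict(y[t:t + W])
        cnt[t:t + W] += 1
    return acc / cnt

# Stand-in "network": predicts half the aggregate window (hypothetical).
halve = lambda win: win / 2.0
y = np.arange(10, dtype=float)
x_hat = seq2seq_disaggregate(y, halve, W=4)
```

Because `halve` is elementwise, every window agrees and the average equals `y / 2`; with a real network, the windows disagree and the averaging smooths the disaggregated signal.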
Fig. 1. Examples for seq2seq and seq2point frameworks: (a) seq2seq,
mapping each sliding window of mains power to the corresponding
appliance-level power sequence; (b) S2p in [9], mapping each window
to the midpoint element of the corresponding sequence; (c) Bi-GRU
in [10], mapping each window to the ending element of the
corresponding sequence.
Note that the seq2point framework is demonstrated in Fig. 1
(b) and (c) on two architectures, S2p proposed in [9] and Bi-GRU
proposed in [10], respectively. Compared to the seq2seq
framework, the seq2point framework emphasizes the
representational power at one element and eases the prediction
task. Then, S2p and Bi-GRU are introduced in detail.
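The only difference between the two seq2point variants is which element of each window serves as the target: the midpoint for S2p [9] versus the last element for Bi-GRU [10]. A minimal sketch of pairing windows with targets (toy signals, not real data):

```python
import numpy as np

def make_pairs(y, x, W, target="midpoint"):
    """Slice aggregate y into length-W windows and pair each with one
    element of the appliance signal x: the window midpoint ('midpoint',
    as in S2p) or the window's last element ('last', as in Bi-GRU)."""
    offset = (W - 1) // 2 if target == "midpoint" else W - 1
    n = len(y) - W + 1  # number of sliding-window positions
    windows = np.stack([y[t:t + W] for t in range(n)])
    targets = np.array([x[t + offset] for t in range(n)])
    return windows, targets

y = np.arange(7, dtype=float)       # toy aggregate signal
x = 10 * np.arange(7, dtype=float)  # toy appliance signal
Wm, tm = make_pairs(y, x, W=5, target="midpoint")  # S2p-style targets
Wl, tl = make_pairs(y, x, W=5, target="last")      # Bi-GRU-style targets
```

The "last" variant never touches samples after the target index, which is what makes the Bi-GRU framing usable online.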
1) S2p: The utilization of S2p in NILM is based on the
assumption that the midpoint of each sliding window acts as its
non-linear regression representation. Namely, S2p makes full
use of the past and future information to infer the midpoint, as
shown in Fig. 1 (b).
For a defined neural network $f^m$, the input is a power
sequence denoted by $\mathbf{y}_{t:t+W-1}$, segmented by a sliding
window from the aggregate, where $t$ is the time index and the
window size $W$ is set to an odd number. Thus, by mapping each
sequence $\mathbf{y}_{t:t+W-1}$ to the power $x_\tau^m$ consumed
by appliance $m$ at the midpoint $\tau = t + (W-1)/2$, the entire
power signal $\mathbf{x}^m$ for appliance $m$ can be predicted.
Such a model can be formulated as:

$$x_\tau^m = f^m(\mathbf{y}_{t:t+W-1}) + \epsilon \qquad (2)$$

where $\epsilon$ is Gaussian random noise. Besides, the loss
function in the network training is formulated as:

$$L_p = \sum_{t=1}^{T-W+1} \log p\left(x_\tau^m \mid \mathbf{y}_{t:t+W-1}, \theta_p\right) \qquad (3)$$

where $\theta_p$ is a set of network parameters.
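With the noise term taken as Gaussian with a fixed variance, maximizing the log-likelihood objective above reduces to minimizing the squared error between predicted and measured midpoint powers. A numeric check with toy numbers and an assumed unit variance:

```python
import numpy as np

def gaussian_nll(pred, target, sigma=1.0):
    """Negative log-likelihood of targets under N(pred, sigma^2)."""
    return np.sum(0.5 * np.log(2 * np.pi * sigma**2)
                  + (target - pred) ** 2 / (2 * sigma**2))

pred = np.array([1.0, 2.0, 3.0])    # toy midpoint predictions
target = np.array([1.5, 2.0, 2.0])  # toy measured midpoints

# Up to an additive constant, the NLL with fixed sigma equals
# 0.5 * SSE, so training with either objective gives the same optimum.
sse = np.sum((target - pred) ** 2)
const = 3 * 0.5 * np.log(2 * np.pi)
```

This is why seq2point implementations are commonly trained with a plain mean-squared-error loss.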
The CNN-based architecture of S2p is illustrated in Fig. 2 (a),
containing five convolutional layers and one dense layer. In
each iteration, the input signal is a W-length sliding
window of aggregate measurements. Then, five convolutional
layers are employed for feature extraction through the ReLU
activation function. Eventually, the feature maps are
flattened and fed to a dense layer, yielding the appliance-level
power corresponding to the midpoint of the input window.
It is claimed in [29] that S2p achieves performance
improvement against the seq2seq framework on the same network
architecture.
2) Bi-GRU: Unlike S2p in Fig. 1 (b), in Bi-GRU the power
consumed by each appliance at time index $t+W-1$ is mapped from
the sequence $\mathbf{y}_{t:t+W-1}$ of historical aggregate
measurements, as shown in Fig. 1 (c). That is, as the window
slides, the power prediction per appliance is obtained from only
past information, which is applicable to real-time load
disaggregation. Moreover, GRU benefits from lower memory
occupancy and fewer parameters than other network
architectures such as LSTM [28]. The architecture of Bi-GRU
is demonstrated in Fig. 2 (b).
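Because the Bi-GRU variant needs only past samples, it suits an online setting: keep a rolling buffer of the last W aggregate readings and predict the current appliance power from it. A stdlib-only sketch, with `mean_model` as a hypothetical placeholder for a trained network:

```python
from collections import deque

def stream_disaggregate(readings, model, W):
    """Yield (t, estimate) once a full window of past samples exists.

    `readings` is any iterable of aggregate power samples arriving in
    time order; `model` maps a length-W window to one power estimate.
    """
    buf = deque(maxlen=W)  # automatically drops the oldest sample
    for t, y_t in enumerate(readings):
        buf.append(y_t)
        if len(buf) == W:
            yield t, model(list(buf))

# Placeholder "model": mean of the window (stands in for a trained net).
mean_model = lambda win: sum(win) / len(win)
out = list(stream_disaggregate([10, 20, 30, 40, 50], mean_model, W=3))
# The first estimate appears at t=2, once three samples have arrived.
```

The `deque(maxlen=W)` buffer mirrors the sliding window of Fig. 1 (c): each new meter reading displaces the oldest one, so the estimate at time t uses exactly the last W samples.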
As shown in Fig. 2 (b), after each aggregate power sequence
is input to the network, a convolutional layer is used for feature
extraction. Then two Bi-GRU layers are applied to enhance the
memory for the data patterns based on the extracted features,
followed by a dense layer as in the S2p network. Note that
dropout is applied to these layers to prevent overfitting.
Fig. 2. The architectures for S2p and Bi-GRU: (a) S2p, with five
convolutional layers followed by a flatten operation and a dense
layer; (b) Bi-GRU, with a convolutional layer, two Bi-GRU layers
with dropout, and a dense layer.