MetaEMS: A Meta Reinforcement Learning-based Control Framework for
Building Energy Management System
Huiliang Zhang, Di Wu, Benoit Boulet
Electrical and Computer Engineering Department
McGill University
Montreal, QC H3A 0E9
huiliang.zhang2@mail.mcgill.ca; di.wu5@mail.mcgill.ca; benoit.boulet@mcgill.ca
Abstract
The building sector has been recognized as one of the pri-
mary sectors for worldwide energy consumption. Improving
the energy efficiency of the building sector can help reduce
operating costs and greenhouse gas emissions.
The energy management system (EMS) can monitor and con-
trol the operations of built-in appliances in buildings, so an
efficient EMS is of crucial importance to improve the build-
ing operation efficiency and maintain safe operations. With
the growing penetration of renewable energy and electrical
appliances, increasing attention has been paid to the devel-
opment of intelligent building EMS. Recently, reinforcement
learning (RL) has been applied for building EMS and has
shown promising potential. However, most of the current RL-
based EMS solutions would need a large amount of data to
learn a reliable control policy, which limits the applicability
of these solutions in the real world. In this work, we propose
MetaEMS, which can help achieve better energy management
performance with the benefits of RL and meta-learning. Experimental results show that our proposed MetaEMS adapts faster to environment changes and performs better in most situations than other baselines.
Introduction
The building sector accounts for about 40% of primary en-
ergy use and associated greenhouse gas emissions in the U.S.
(Zhang et al. 2022; Mariano-Hernández et al. 2021; U.S. Department of Energy 2015), and a similar situation exists in
other countries. Therefore, it is essential to reduce energy
consumption and carbon emissions in buildings to meet na-
tional energy and environmental challenges. Furthermore,
people spend more than 85% of their time in buildings (Yu
et al. 2021), so well-performing building control methods
can also help deliver a comfortable indoor environment for
people. Recently, the area of building energy management
systems (EMS) has gained a significant amount of interest
(Arroyo et al. 2022; Mariano-Hernández et al. 2021), and the
advanced control strategies for building EMS are believed to offer great potential to reduce building energy costs and improve grid energy efficiency and system reliability (Wu et al. 2017).
Copyright © 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
In EMS, there are many opportunities for saving the en-
ergy cost of smart buildings, which evolve from traditional buildings by adopting internal networks, intelligent
controls, and home automation. For example, dynamic elec-
tricity prices could be utilized to reduce energy costs by
intelligently scheduling Energy Storage Units (ESU) and
thermostatically controllable loads such as Heating, Ventila-
tion, and Air Conditioning (HVAC) systems (Sanjareh et al.
2021; Yang et al. 2021). Besides, the increasing integration
of distributed renewable energy resources also helps with
reducing energy consumption from the power grid (Mason
and Grijalva 2019; Shahrabi et al. 2021; Antoniadis et al.
2021; Hussein, Bhat, and Doppa 2022).
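The cost-saving mechanism behind such price-aware scheduling can be sketched with a toy calculation; the prices, horizon, and load values below are illustrative numbers, not data from this paper.

```python
# Toy cost calculation behind price-aware scheduling. Prices, horizon, and
# load profiles are made-up illustrative numbers, not data from the paper.

def energy_cost(grid_draw_kwh, price_per_kwh):
    """Total cost of electricity drawn from the grid over a horizon."""
    return sum(d * p for d, p in zip(grid_draw_kwh, price_per_kwh))

# Four-hour horizon: two off-peak hours at $0.10/kWh, two peak hours at $0.30/kWh.
price = [0.10, 0.10, 0.30, 0.30]

# Naive schedule: draw heavily from the grid during peak hours.
naive = [1.0, 1.0, 2.0, 2.0]
# Shifted schedule: pre-charge the ESU off-peak and discharge it at peak;
# the same 6 kWh of total demand is met either way.
shifted = [2.0, 2.0, 1.0, 1.0]

print(round(energy_cost(naive, price), 2))    # 1.4
print(round(energy_cost(shifted, price), 2))  # 1.0
```

The total energy drawn is identical in both schedules; only the timing relative to the price signal changes the cost.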
Traditional model-oriented building control methods such
as rule-based control (RBC) and model predictive control
(MPC) are useful in EMS with small-scale and simple appli-
cations (Mariano-Hern
´
andez et al. 2021; Serale et al. 2018;
Sturzenegger et al. 2014). However, they do not generalize well across different EMS environments because they require substantial expert knowledge to design control rules and rely heavily on precise domain knowledge of building dynamics. Reinforcement learning (RL) has been recently applied
in building control problems (Arroyo et al. 2022; Wang et al.
2021; Li, Wan, and He 2020; Wei, Wang, and Zhu 2017; Ru-
elens et al. 2016). RL methods can learn the building control
policies based on the interactions between the agent and envi-
ronment with no assumption on the dynamics of the building,
and have shown better performance than traditional EMS
methods in dealing with uncertainty challenges such as renewable energy resources and large numbers of energy appliances (Zhang et al. 2022; Mariano-Hernández et al. 2021).
The training mechanism of RL follows a trial-and-error
manner, so the superior performance of RL is conditioned
on a large number of training episodes. Taking recently pub-
lished results (Forootani, Rastegar, and Jooshaki 2022; Zhang
and Lam 2018) as examples, an RL agent may need 5 mil-
lion interaction steps to achieve the same performance as a
feedback controller on an HVAC system. Besides, the current
RL-based algorithms are designed based on the assumption
of static EMS environments. However, such an assumption
is impractical in real-world scenarios because buildings may be located in different regions and have different dynamics
(Abedi, Yoon, and Kwon 2022; Antoniadis et al. 2021; Chen,
Cai, and Bergés 2019). As a result, these methods may cause a mismatch when the environment changes and cannot make the right decisions rapidly in a new environment.
arXiv:2210.12590v1 [cs.AI] 23 Oct 2022
In this work, we propose to improve the efficiency of the
learning process in RL with meta-learning in EMS. Meta-
learning is the method of systematically observing how differ-
ent learning approaches perform on a wide range of learning
tasks, and then learning from this experience to learn new
tasks much faster. Successful applications have been demon-
strated in areas spanning few-shot image recognition, unsupervised learning, and data-efficient RL (Finn, Abbeel, and
Levine 2017; Nagabandi, Finn, and Levine 2018; Vanschoren
2018; Han et al. 2022; Li et al. 2022). These methods learn a
well-generalized initialization that can be quickly adapted to
a new scenario with a few gradient steps.
Moreover, we investigate an efficient energy optimization
learning problem for EMS with ESU, HVAC systems, re-
newable energies, and non-shiftable loads (e.g., televisions)
in the absence of a building dynamics model. To be spe-
cific, our objective is to quickly minimize the energy cost of the EMS over a time horizon while also shaping load profiles to improve system reliability. However,
it is very challenging to achieve the above aims by simply
applying meta-RL methods to EMS control due to the follow-
ing reasons. Firstly, it is often intractable to obtain accurate
dynamics of different load demands and buildings, which
can be affected by many factors. Secondly, it is difficult to
know the statistical distributions of all combinations of ran-
dom system parameters (e.g., renewable generation output,
power demand of non-shiftable loads, outdoor temperature,
and electricity price). Thirdly, there are temporally-coupled
operational constraints associated with ESU and HVAC sys-
tems in different environments, which means that the current
action would affect future decisions.
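The temporally-coupled constraints mentioned above can be made concrete with a minimal ESU model, sketched below; the capacity, power limit, and efficiency values are assumptions for illustration, not parameters from this paper. The state of charge left by one action bounds the feasible actions at later steps.

```python
# Minimal ESU model: the state of charge links decisions across time steps.
# Capacity, power limit, and efficiency are assumed illustrative values.

class EnergyStorageUnit:
    def __init__(self, capacity_kwh=10.0, max_power_kw=3.0, charge_eff=0.95):
        self.capacity = capacity_kwh   # usable storage capacity (kWh)
        self.max_power = max_power_kw  # charge/discharge power limit (kW)
        self.eta = charge_eff          # charging efficiency
        self.soc = 0.0                 # state of charge (kWh)

    def step(self, power_kw, dt_hours=1.0):
        """Apply a charge (>0) or discharge (<0) action, clipped to what the
        power limit and the current state of charge allow."""
        power_kw = max(-self.max_power, min(self.max_power, power_kw))
        if power_kw >= 0:   # charging is limited by the remaining headroom
            delta = min(power_kw * dt_hours * self.eta, self.capacity - self.soc)
        else:               # discharging is limited by the stored energy
            delta = max(power_kw * dt_hours, -self.soc)
        self.soc += delta
        return self.soc

esu = EnergyStorageUnit()
esu.step(3.0)    # charge at the 3 kW limit: soc rises to 2.85 kWh
esu.step(-5.0)   # request exceeds both the power limit and the stored energy
print(round(esu.soc, 2))  # 0.0 -- the earlier action bounded this one
```

This is exactly the coupling an EMS controller must reason about: the charging decision in the first step determines which discharge requests are feasible later.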
To address the above challenges, we propose a meta-RL
framework for building control in EMS (MetaEMS), which is
built upon the line of actor-critic-based meta-RL methods. To the best of
our knowledge, it is the first work to introduce the meta-RL
paradigm into building control. In MetaEMS, we learn a well-
generalized initialization from various building control tasks.
Given a new building scenario with a limited learning period,
the learned initialization can be quickly adapted with a few
generated samples without knowing the building dynamics.
We further propose two types of adaptation mechanisms to
enhance the data efficiency in MetaEMS: group-level adap-
tation and building-level adaptation. The former is a step-
by-step optimization process on each task and the latter is a
periodic synchronous updating process on a batch of sampled
tasks. Each task inherits a group-shared initialization of pa-
rameters, then performs building-level adaptation and finally
contributes to group-level adaptation. Our experimental results show that the proposed method is more robust, learns faster, and generalizes well across different building environment dynamics. In summary, this paper has the following key contributions:
• In this work, MetaEMS, a meta-RL framework consisting of group-level and building-level adaptation, is proposed to deal with building energy management control.
• Empirically, we demonstrate the effectiveness and efficiency of our proposed model on the newly released real-world CityLearn environment datasets.
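As a rough illustration of how the two adaptation levels fit together, the sketch below shrinks each building task to a scalar quadratic loss. It mirrors only the structure described above (building-level inner steps on each task, group-level periodic updates over a task batch); the learning rates, step counts, and first-order approximation are illustrative choices, not the actual MetaEMS networks, objective, or update rules.

```python
# Toy version of the two-level adaptation: each "building task" is reduced
# to a scalar target t with loss (theta - t)^2. Illustrative only.

def task_grad(theta, target):
    # gradient of the per-task toy loss (theta - target)^2
    return 2.0 * (theta - target)

def building_level_adapt(theta, target, inner_lr=0.1, steps=3):
    """Step-by-step optimization on a single building task, starting from
    the group-shared initialization."""
    for _ in range(steps):
        theta -= inner_lr * task_grad(theta, target)
    return theta

def group_level_adapt(theta, targets, outer_lr=0.05):
    """Periodic synchronous update of the shared initialization from a batch
    of tasks (first-order: meta-gradient evaluated at the adapted params)."""
    meta_grad = sum(task_grad(building_level_adapt(theta, t), t) for t in targets)
    return theta - outer_lr * meta_grad / len(targets)

theta = 0.0
for _ in range(100):
    theta = group_level_adapt(theta, targets=[1.0, 2.0, 3.0])
print(round(theta, 1))  # the shared initialization approaches the task mean, 2.0
```

Each task inherits the shared `theta`, adapts it locally, and contributes its post-adaptation gradient back to the shared initialization, matching the inheritance-then-contribution pattern described in the text.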
Related Work
Traditional Control Methods for EMS
The traditional ways of building control can be categorized into RBC and MPC methods (Zhang et al. 2022; Mariano-Hernández et al. 2021). The basic idea of RBC techniques is
that adjustments are based on manually designed set points.
For example, cooling control is applied when the measured
temperature exceeds a pre-defined temperature. The MPC
techniques merge principles of feedback control and numer-
ical optimization in EMS. The system response models of
MPC are based on physical principles to calculate the ther-
mal dynamics and energy behaviour of the whole building
(Camponogara et al. 2021; Serale et al. 2018; Sturzenegger
et al. 2014). Another trend of MPC is to combine various ma-
chine learning tools with classical MPC to design data-driven
MPC strategies that preserve the reliability of classical MPC.
In (Eini and Abdelwahed 2020), MPC combined with neu-
ral network model is used for lighting and thermal comfort
optimization. A nonlinear autoregressive exogenous model
with parallel architecture is used to train the networks that
estimate the comfort specifications, environmental conditions
and power consumption. However, there exist some limitations to such methods in solving control problems in EMS. First, they need large amounts of precise domain knowledge and building information to manually design the model, which is hard to obtain and results in limited commercial implementation (Zhao et al. 2022; Bünning et al. 2020). Second, the iterative algorithms designed by traditional optimization methods cannot make fast decisions on building control in a dynamic building environment (Drgoňa et al. 2018; Chen, Cai, and Bergés 2019), since such algorithms require iterative calculations for each building dynamic model.
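The set-point idea behind RBC can be written down in a few lines; the 26 °C and 20 °C thresholds here are illustrative, not values from any cited work.

```python
# The RBC principle above: control actions follow fixed, manually designed
# set points with no model of the building. Set points are illustrative.

def rbc_hvac_action(indoor_temp_c, cool_above=26.0, heat_below=20.0):
    """Dead-band thermostat rule: cool above one set point, heat below
    another, otherwise stay off."""
    if indoor_temp_c > cool_above:
        return "cool"
    if indoor_temp_c < heat_below:
        return "heat"
    return "off"

print(rbc_hvac_action(28.0))  # cool
print(rbc_hvac_action(22.0))  # off
```

The rule is trivially fast and interpretable, but, as noted above, every set point must be hand-tuned per building, which is exactly what limits generalization.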
Reinforcement Learning for EMS
RL-based EMS control has attracted wide attention from
both academia and industry in the last decades (Forootani,
Rastegar, and Jooshaki 2022; Yu et al. 2021). Traditional
RL methods in EMS are limited to tabular Q-learning and a
discrete state representation (Wen, O’Neill, and Maei 2015).
Recently, researchers have studied deep RL methods in EMS
(Ren et al. 2022; Wei, Wang, and Zhu 2017), which can deal
with problems with large action spaces and state spaces. The
authors in (Ren et al. 2022) adopt a forecasting-based dueling deep Q-learning method to optimize and dispatch a featured home EMS, where a generalized correntropy-assisted long short-term memory neural network is adopted to predict outdoor
temperature. (Huang et al. 2022) uses a mixed deep RL to
deal with discrete-continuous hybrid action space in EMS.
To jointly optimize the schedules of all kinds of appliances, a
deep RL approach based on trust region policy optimization
is proposed in (Li, Wan, and He 2020).
Some works also point out that it is impractical to let the
deep RL agent explore the state space fully in a real building
environment, because an unacceptably high economic cost
may be incurred during the long training process (Camponogara
Figure 1: A general diagram for RL with key components in EMS. A typical EMS architecture consists of a building, control
center, loads, renewable energy resources, ESU, power grid, and smart meter. The control center learns to take the optimal set of
actions through interaction in a dynamic building environment with the goal of maximizing a certain reward quantity.
et al. 2021; Forootani, Rastegar, and Jooshaki 2022). To
reduce the dependency on a real building environment, many
model-based deep RL control methods have been developed.
The authors in (Zhang et al. 2019) use the observed data in
EnergyPlus to develop a building energy model, and then
use the model as the environment simulator to train the deep
RL agent. (Arroyo et al. 2022) combines the MPC objective
function with the RL agent value function while using a
nonlinear controller model encoded from domain knowledge
in EMS. However, most of the aforementioned approaches
rely heavily on accurate simulator design or a large amount
of training data in EMS, and such data are hard to collect, let alone use, in reality. Furthermore, while one can train
an RL agent in simulation, it is not cost-effective to train a
model for each building from scratch. Implementing meta-
learning is a potential solution because controllers trained on
a small number of buildings could be generalized and used
for other buildings.
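The deep-RL methods above all share one interaction pattern: observe the building state, act, receive a cost-based reward, and update value estimates. The toy below reduces this to a one-step (bandit-style) Q-update in a two-price-state environment; the environment, the action set, and the assumed $0.20/kWh value of stored energy are stand-ins for a real simulator such as CityLearn, whose API differs.

```python
import random

# One-step (bandit-style) Q-update illustrating the RL interaction pattern
# shared by the EMS methods above. Environment and reward are stand-ins.

random.seed(0)
PRICES = {0: 0.10, 1: 0.30}   # state: 0 = off-peak price, 1 = peak price
CHARGE, IDLE = 0, 1           # simplistic ESU action set
STORED_KWH_VALUE = 0.20       # assumed value of one stored kWh (illustrative)
q = {(s, a): 0.0 for s in PRICES for a in (CHARGE, IDLE)}

def reward(state, action):
    # charging pays the current price but banks energy worth STORED_KWH_VALUE
    return STORED_KWH_VALUE - PRICES[state] if action == CHARGE else 0.0

alpha, eps = 0.1, 0.2
state = 0
for _ in range(2000):
    # epsilon-greedy action selection
    if random.random() < eps:
        action = random.choice((CHARGE, IDLE))
    else:
        action = max((CHARGE, IDLE), key=lambda a: q[(state, a)])
    q[(state, action)] += alpha * (reward(state, action) - q[(state, action)])
    state = random.randint(0, 1)  # price regime switches at random

# The learned values say: charge off-peak, stay idle at peak.
print(q[(0, CHARGE)] > q[(0, IDLE)], q[(1, IDLE)] > q[(1, CHARGE)])
```

Even this trivial agent needs thousands of interactions to learn a two-state rule, which hints at why full deep-RL controllers require the large sample budgets criticized in the text.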
Meta-reinforcement Learning
Meta-RL aims to solve a new RL task by leveraging the
experience learned from a set of similar tasks (Liu et al. 2021;
Mitchell et al. 2021). There are mainly two lines of meta-
RL algorithms: The first is recurrent-based meta-RL. In this
case, the parameters of the prediction model are controlled by
a learnable recurrent meta-optimizer and its corresponding
hidden state (Liu et al. 2021; Duan et al. 2016; Mishra et al.
2017). For example, (Duan et al. 2016) trains a recurrent
neural network by using the training data as input and then
outputs the parameters of a learner model. These approaches
can achieve relatively good performances, but they may lack
computational efficiency. The second is gradient-based meta-
RL. These methods learn a well-generalized initialization that
can be quickly adapted to a new scenario with a few gradient
steps (Tancik et al. 2021; Fallah, Mokhtari, and Ozdaglar
2021; Finn, Abbeel, and Levine 2017; Nagabandi, Finn, and
Levine 2018; Yoon et al. 2018). Representatively, model-
agnostic meta-learning (MAML) (Finn, Abbeel, and Levine
2017) optimizes the initial policy network parameters of the
base learner in the meta-training process, which significantly
improves the efficiency of RL on the new task. The authors of
(Zang et al. 2020) and (Zhou et al. 2020) generalize the meta-
learning framework in value-based and actor-critic-based RL
methods.
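The gradient-based idea can be seen in closed form on a scalar quadratic task loss, where the MAML meta-gradient, including the second-order term from differentiating through the inner step, is exact. This is a schematic toy, not the algorithm of any cited paper.

```python
# MAML's gradient-based idea on a scalar quadratic loss L_t(x) = (x - t)^2,
# where the meta-gradient through one inner step has a closed form.
# Schematic toy, not the algorithm of any cited paper.

def inner_step(x, t, lr):
    # one inner-loop gradient step on task t: dL_t/dx = 2 (x - t)
    return x - lr * 2.0 * (x - t)

def maml_meta_grad(x, t, lr):
    """d/dx L_t(inner_step(x, t, lr)); for this quadratic the Jacobian of
    the inner step is the constant (1 - 2 lr), the second-order term."""
    x_adapted = inner_step(x, t, lr)
    return 2.0 * (x_adapted - t) * (1.0 - 2.0 * lr)

x, lr_inner, lr_outer = 0.0, 0.1, 0.25
tasks = [1.0, 3.0]
for _ in range(200):
    g = sum(maml_meta_grad(x, t, lr_inner) for t in tasks) / len(tasks)
    x -= lr_outer * g
print(round(x, 2))  # the initialization converges to the task mean, 2.0
```

The outer loop optimizes the post-adaptation loss rather than the loss at the initialization itself, which is why the learned starting point adapts to any of the tasks in a single inner step.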
Preliminaries
Key Components in EMS
A typical EMS architecture has several important compo-
nents: building, control center, smart meter, loads, energy
storage units (ESU), renewable energy resources and power
grid, as illustrated in Fig. 1. An ESU could be a lead-acid battery, a lithium-ion battery, or a storage tank, which can reduce
net-energy demand from main grids by storing excess re-
newable energies locally. Renewable energy resources could
be solar panels or wind generators. Loads in an EMS can
be generally divided into several types, e.g., non-shiftable
loads, shiftable loads, non-interruptible loads, and control-
lable loads. To be specific, power demands of non-shiftable
loads (e.g., televisions, microwaves) must be satisfied com-
pletely without delay. As for shiftable and non-interruptible
loads (e.g., washing machines), their tasks can be scheduled
to a proper time but can not be interrupted. In contrast, con-
trollable loads (e.g., HVAC systems, heat pumps, and electric
water heaters) can be controlled to flexibly adjust their oper-
ation times and energy usage quantities by following some
operational requirements, e.g., temperature ranges. In this
paper, we mainly focus on non-shiftable loads and thermo-
statically controlled loads. As for thermostatically controlled
loads, HVAC systems are considered since they consume
about 40% of the total energy in a smart home. In each time
slot, the control center makes the decision on ESU charg-
ing/discharging power and HVAC input power according to
a set of available information (e.g., renewable generation output,