a mismatch when the environment changes and cannot make
the right decision rapidly in a new environment.
In this work, we propose to improve the efficiency of the
RL learning process in EMS with meta-learning. Meta-learning
is the method of systematically observing how different
learning approaches perform on a wide range of learning
tasks, and then exploiting this experience to learn new
tasks much faster. Successful applications have been demonstrated
in areas spanning few-shot image recognition, unsupervised
learning, and data-efficient RL (Finn, Abbeel, and
Levine 2017; Nagabandi, Finn, and Levine 2018; Vanschoren
2018; Han et al. 2022; Li et al. 2022). These methods learn a
well-generalized initialization that can be quickly adapted to
a new scenario with a few gradient steps.
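As an illustration of this idea, the inner/outer loop of such methods can be sketched on a toy regression family (a minimal sketch with hypothetical tasks of the form y = a*x; function names, learning rates, and the first-order MAML approximation are our own choices, not details from the cited works):

```python
import numpy as np

def loss_grad(theta, x, y):
    # Gradient of the mean squared error for a toy scalar model y_hat = theta * x.
    return np.mean(2 * (theta * x - y) * x)

def maml_meta_train(tasks, theta0=0.0, inner_lr=0.01, outer_lr=0.1, steps=200):
    """Meta-learn an initialization that adapts to any task in one gradient step.

    Uses the first-order MAML approximation: the outer gradient is evaluated
    at the inner-adapted parameters, ignoring second derivatives.
    """
    theta = theta0
    for _ in range(steps):
        meta_grad = 0.0
        for x, y in tasks:
            # Inner loop: one task-specific adaptation step.
            adapted = theta - inner_lr * loss_grad(theta, x, y)
            # Outer loop (first-order): accumulate the gradient at the adapted point.
            meta_grad += loss_grad(adapted, x, y)
        theta -= outer_lr * meta_grad / len(tasks)
    return theta

def adapt(theta, x, y, inner_lr=0.5, steps=5):
    # Fast adaptation to a new task: a few gradient steps from the initialization.
    for _ in range(steps):
        theta = theta - inner_lr * loss_grad(theta, x, y)
    return theta
```

Meta-trained on tasks with slopes 2 and 3, the initialization lands between them, and `adapt` reaches an unseen slope (e.g., 2.8) within a handful of gradient steps.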
Moreover, we investigate an efficient energy optimization
learning problem for EMS with ESU, HVAC systems, renewable
energy sources, and non-shiftable loads (e.g., televisions)
in the absence of a building dynamics model. To be spe-
cific, our objective is to quickly minimize the energy cost
of the EMS during a time horizon with the consideration of
shaping load profiles to improve system reliability. However,
it is very challenging to achieve the above aims by simply
applying meta-RL methods to EMS control for the following
reasons. Firstly, it is often intractable to obtain accurate
dynamics of different load demands and buildings, which
can be affected by many factors. Secondly, it is difficult to
know the statistical distributions of all combinations of ran-
dom system parameters (e.g., renewable generation output,
power demand of non-shiftable loads, outdoor temperature,
and electricity price). Thirdly, there are temporally-coupled
operational constraints associated with ESU and HVAC sys-
tems in different environments, which means that the current
action would affect future decisions.
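To make the temporal coupling concrete, consider (as an illustration in our own notation, not this paper's formulation) the state-of-charge dynamics of an ESU with charging/discharging efficiencies $\eta_c, \eta_d$:

```latex
% Illustrative ESU state-of-charge dynamics (notation ours):
% b^c_t and b^d_t are the charging and discharging power at slot t.
SoC_{t+1} = SoC_t + \eta_c\, b^c_t\, \Delta t - \frac{b^d_t\, \Delta t}{\eta_d},
\qquad \underline{SoC} \le SoC_t \le \overline{SoC},
\quad 0 \le b^c_t \le \overline{b}^c,\; 0 \le b^d_t \le \overline{b}^d .
```

Because $SoC_{t+1}$ depends on the current action, a large discharge now shrinks the set of feasible future actions, which is what makes the control problem temporally coupled.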
To address the above challenges, we propose a meta-RL
framework for building control in EMS (MetaEMS), which is
built upon actor-critic-based meta-RL methods. To the best of
our knowledge, this is the first work to introduce the meta-RL
paradigm into building control. In MetaEMS, we learn a well-
generalized initialization from various building control tasks.
Given a new building scenario with a limited learning period,
the learned initialization can be quickly adapted with a few
generated samples without knowing the building dynamics.
We further propose two types of adaptation mechanisms to
enhance the data efficiency in MetaEMS: group-level adap-
tation and building-level adaptation. The former is a step-
by-step optimization process on each task and the latter is a
periodic synchronous updating process on a batch of sampled
tasks. Each task inherits a group-shared initialization of pa-
rameters, then performs building-level adaptation and finally
contributes to group-level adaptation. Our experimental results
show that the proposed method is more robust, learns
faster, and generalizes well across different building environment
dynamics. In summary, this paper has the following
key contributions:
• In this work, MetaEMS, a meta-RL framework consisting
of group-level and building-level adaptation, is proposed
to deal with building energy management control.
• Empirically, we demonstrate the effectiveness and efficiency
of our proposed model on the newly released
real-world CityLearn environment datasets.
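The group-shared initialization and the two-level adaptation described above can be sketched as follows (a minimal sketch under our own assumptions: the per-building loss is a hypothetical stand-in for real EMS dynamics, and the group-level aggregation uses a Reptile-style update as one concrete choice; the actual MetaEMS update rules are not specified here):

```python
import numpy as np

def task_grad(theta, a):
    # Gradient of a toy per-building loss (theta - a)^2, where "a" is a
    # building-specific optimum (hypothetical stand-in for real EMS dynamics).
    return 2 * (theta - a)

def building_level_adapt(theta, a, lr=0.1, steps=5):
    # Building-level adaptation: a step-by-step gradient process on one
    # building's own experience, starting from the group-shared parameters.
    for _ in range(steps):
        theta = theta - lr * task_grad(theta, a)
    return theta

def group_level_train(building_optima, meta_lr=0.5, meta_steps=100, seed=0):
    # Group-level adaptation: each sampled building inherits the group-shared
    # initialization, adapts locally, then contributes to a periodic synchronous
    # update of the shared parameters (Reptile-style aggregation).
    rng = np.random.default_rng(seed)
    theta = 0.0
    for _ in range(meta_steps):
        batch = rng.choice(building_optima, size=2, replace=False)
        adapted = [building_level_adapt(theta, a) for a in batch]
        theta += meta_lr * (np.mean(adapted) - theta)
    return theta
```

With building optima at 1.0, 2.0, and 3.0, the shared initialization settles near their center, so any single building then needs only a few local gradient steps to specialize.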
Related Work
Traditional Control Methods for EMS
The traditional ways of building control can be sorted
into RBC and MPC methods (Zhang et al. 2022; Mariano-Hernández
et al. 2021). The basic idea of RBC techniques is
that adjustment is based on manually designed set points.
For example, cooling control is applied when the measured
temperature exceeds a pre-defined threshold. The MPC
techniques merge principles of feedback control and numerical
optimization in EMS. The system response models of
MPC are based on physical principles that capture the thermal
dynamics and energy behaviour of the whole building
(Camponogara et al. 2021; Serale et al. 2018; Sturzenegger
et al. 2014). Another trend of MPC is to combine various ma-
chine learning tools with classical MPC to design data-driven
MPC strategies that preserve the reliability of classical MPC.
In (Eini and Abdelwahed 2020), MPC combined with a neural
network model is used for lighting and thermal comfort
optimization. A nonlinear autoregressive exogenous model
with parallel architecture is used to train the networks that
estimate the comfort specifications, environmental conditions
and power consumption. However, there exist some limita-
tions to such methods in solving control problems in EMS.
First, they require substantial precise domain knowledge and
building information to manually design the model, which is
hard to obtain and results in limited commercial implementation
(Zhao et al. 2022; Bünning et al. 2020). Second, the
iterative algorithms designed by traditional optimization
methods cannot make fast decisions for building control
in a dynamic building environment (Drgoňa et al. 2018;
Chen, Cai, and Bergés 2019), since such algorithms require
iterative calculations for each building dynamics model.
Reinforcement Learning for EMS
RL-based EMS control has attracted wide attention from
both academia and industry in recent decades (Forootani,
Rastegar, and Jooshaki 2022; Yu et al. 2021). Traditional
RL methods in EMS are limited to tabular Q-learning and a
discrete state representation (Wen, O’Neill, and Maei 2015).
Recently, researchers have studied deep RL methods in EMS
(Ren et al. 2022; Wei, Wang, and Zhu 2017), which can deal
with problems with large action spaces and state spaces. The
authors in (Ren et al. 2022) adopt a forecasting-based dueling
deep Q-learning to optimize and dispatch a featured home
EMS, where a generalized correntropy-assisted long short-
term memory neural network is adopted to predict outdoor
temperature. (Huang et al. 2022) uses a mixed deep RL to
deal with discrete-continuous hybrid action space in EMS.
To jointly optimize the schedules of all kinds of appliances, a
deep RL approach based on trust region policy optimization
is proposed in (Li, Wan, and He 2020).
Some works also point out that it is impractical to let the
deep RL agent explore the state space fully in a real building
environment, because an unacceptably high economic cost
may be incurred during the long training process (Camponogara