A Contextual Bandit Approach for Value-oriented
Prediction Interval Forecasting
Yufan Zhang, Honglin Wen, Member, IEEE, and Qiuwei Wu, Senior Member, IEEE
Abstract—Prediction interval (PI) is an effective tool to quan-
tify uncertainty and usually serves as an input to downstream
robust optimization. Traditional approaches focus on improving
the quality of PIs in terms of statistical scores and assume that
the improvement in quality will lead to a higher value in power
system operation. However, this assumption does not always hold
in practice. In this paper, we propose a value-oriented PI
forecasting approach, which aims at reducing operational costs
in downstream operations. This requires issuing PIs under the
guidance of the operational costs of the robust optimization,
which is addressed here within the contextual bandit framework.
Concretely, the agent selects the optimal quantile proportion,
while the environment reveals the costs in operations as rewards
to the agent. As such, the agent can learn a quantile proportion
selection policy that minimizes the operational cost. A numerical
study on the two-timescale operation of a virtual power plant
verifies the superiority of the proposed approach in terms of
operational value, which is especially evident under extensive
penetration of wind power.
Keywords: Prediction interval; forecast value; decision-
making; uncertainty
I. INTRODUCTION
The ongoing decarbonization effort in the energy sector
places a particular emphasis on renewable energy sources
(RESs). Although RESs are clean and emission-free, their
stochastic nature poses a great challenge to power system
operation and electricity markets, as their power generation
cannot be scheduled at will. This drives the need for
forecasting RES generation at future times to support
power system operation [1], such as power dispatch, trading
[2], and reserve procurement.
Forecasts can be communicated in various forms [3], including
single-valued points [4], densities [5], [6], and prediction
regions [7], [8]. Among them, prediction regions provide a
summary of the probability distribution of random variables.
For univariate forecasting, a prediction region is communicated
as a prediction interval (PI), which is specified by two bounds
and the nominal coverage probability (NCP) \((1-\beta)\times 100\%\),
i.e., the probability that the realization falls within the
bounds. PIs have a wide range of applications in today's power
industry. For instance, a PI serves as an input to robust
optimization for quantifying the wind uncertainty and
determining the reserve quantities [9] and the wind power
offering in the day-ahead market [10], where the NCP is commonly
chosen between 90% and 95%. Also, based on the estimated PI,
the concept of an uncertainty budget is leveraged to reduce the
conservativeness of robust optimization in storage control [11],
unit commitment [12], and microgrid dispatch [13], which helps
reduce the operational cost of the robust optimization.
Yufan Zhang is with the Department of Electrical and Computer Engineering,
University of California San Diego, San Diego, California 92161, US.
Honglin Wen is with the Department of Electrical Engineering, Shanghai Jiao
Tong University, Shanghai 200240, China.
Qiuwei Wu is with Tsinghua-Berkeley Shenzhen Institute, Tsinghua Shenzhen
International Graduate School, Tsinghua University, Shenzhen 518055,
China.
Corresponding author: Qiuwei Wu (e-mail: qiuwu@sz.tsinghua.edu.cn).
A PI is desired to have good reliability and sharpness,
which means that the interval width should be minimized
subject to a given NCP. In recent decades, the forecasting
community has preferred non-parametric approaches, which mainly
develop quantile regression (QR) models to issue a pair of
quantiles as a PI. Machine learning models, such as recurrent
neural networks [14], ridge regression [15], and neural basis
expansion models [16], have been combined with QR by using the
pinball loss as the loss function, and show superiority thanks
to their strong learning ability. Usually, the quantiles in a PI
are statistically symmetric with respect to the median, i.e.,
\(q^{\beta/2}, q^{1-\beta/2}\), and the resulting PI is therefore
referred to as the central PI (CPI) in the literature.
However, the probability distribution of RES power output is
generally skewed [10], [17], so CPIs are often unnecessarily
wide [18]. For this reason, optimal PI (OPI) forecasting
approaches have arisen, which optimize over the bounds with the
objective of improving statistical quality, such as minimizing
the Winkler score. One thread of studies selects the probability
proportion according to the contextual information, instead of
setting it to a predetermined constant as in the CPI approaches.
Ref. [19] learned a proportion selection policy that seeks to
minimize the Winkler score. In another thread of studies, the
forecast model outputs the two bounds directly without
specifying a probability proportion. As quality metrics such as
the Winkler score are generally non-differentiable, the main
difficulty lies in designing a surrogate loss function [20] or a
proper optimization technique to estimate the forecast model
parameters. Ref. [21] formulated a multi-objective problem and
optimized the parameters of the extreme learning machine (ELM)
by particle swarm optimization. In [18], the parameter
estimation for the ELM was formulated as a mixed-integer linear
programming (MILP) problem, which was solved by off-the-shelf
solvers.
Although the aforementioned PI forecasting approaches
have contributed to improving forecasting quality from a
statistical point of view, they have overlooked the value of
forecasts in the downstream power system operation. The idea of
using value to evaluate the goodness of forecasting dates back
to [22], where value is defined as the economic/operational gain
from leveraging forecasts at decision-making stages. Take a
robust optimization problem as an example (such as robust
power dispatch): the input PI directly affects the operational
cost. Indeed, it has been shown that an improvement in
forecast quality does not necessarily lead to a higher value in
operation. For example, a biased prediction of the wind power
offering quantity is preferred over accurate single-point
forecasts with small mean squared errors [23], [24], [25],
as up-regulation (the case where the wind power offer determined
in the day-ahead market is larger than the wind realization) for
settling the energy deficit is more expensive than
down-regulation (the case where the wind power offer determined
in the day-ahead market is smaller than the wind realization)
for settling the energy excess. Similar results can be found in
the context of unit commitment (UC) [26].
arXiv:2210.04152v2 [eess.SY] 13 Feb 2023
Therefore, value-oriented forecasting has been advocated in
recent years [27], [28]. The key challenge lies in how to link
forecasting with decision-making. Attempts have been made
by designing decision-aware loss functions. For instance, to fill
the gap between the point forecast and decision-making, a
loss function called the Smart "Predict, then Optimize" (SPO)
loss was proposed in [29]. Ref. [30] approximated the objective
function based on historical data. However, the approximation
may introduce errors that compromise the value of forecasting.
In [10], a cost-oriented machine learning (COML) framework was
established, which performed value-oriented PI forecasting by
optimizing the probability proportion of QR models under the
decision-making objective. However, the COML framework restricts
the QR model to be linear so that the parameter estimation can
be reduced to a single-level optimization problem through the
Karush-Kuhn-Tucker (KKT) conditions. Here, special focus is
placed on multi-timescale decision-making in power systems, such
as market clearing or the centralized operation of virtual power
plants (VPPs). As the compensation cost for the decision in each
timescale differs, more strategic forecasting is called for to
reduce the cost.
In this paper, without loss of generality, we design a PI
forecasting approach for a two-timescale VPP operation task with
wind power, where the day-ahead problem is based on robust
optimization with recourse, while the real-time problem settles
the wind power deviation. Concretely, the proposed
value-oriented PI forecasting approach contains a policy
learning module, which optimally selects the probability
proportion to reduce the operational cost. To this end, the
training stage of the value-oriented PI forecasting, which
involves the estimation of model parameters, is solved by a
contextual bandit in a closed-loop manner. Specifically, the
policy learning task is modeled by an agent, whereas the
optimization over the QR model parameters is solved in the
environment. The agent and the environment are linked by the
reward, which is the negative objective value of the
decision-making problem. As such, the agent can learn the
selection policy guided by the optimal objective of the
decision-making problem. Moreover, the nature of the contextual
bandit avoids the tedious work of labelling required in
supervised learning [31]. Compared with the existing studies,
the main contributions of the paper are summarized as follows:
1) A new solution strategy for the value-oriented PI fore-
casting approach, which uses the contextual bandit framework
to link the proportion selection with the operational value of
the downstream decision-making problem.
2) An integration of the value-oriented PI forecasting ap-
proach with the complex decision-making problem, which
involves multiple decision variables and constraints.
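To make the agent-environment loop concrete, the following is a minimal
sketch of a contextual bandit that picks a probability proportion and
receives the negative cost as reward. It is not the algorithm developed
in this paper: the epsilon-greedy rule, the two discrete contexts, the
candidate proportions in ARMS, and the quadratic cost in
environment_cost are all hypothetical stand-ins chosen only for
illustration.

```python
import random

# Hypothetical candidate lower proportions (all inside (0, beta) for beta = 0.1).
ARMS = [0.01, 0.03, 0.05, 0.07, 0.09]
# Hypothetical cost-minimizing proportion per context; unknown to the agent.
BEST_ARM = {0: 0.03, 1: 0.07}


def environment_cost(context, arm, rng):
    """Toy stand-in for the operational cost of the decision-making problem."""
    noise = rng.uniform(-0.5, 0.5)
    return 1e4 * (arm - BEST_ARM[context]) ** 2 + noise


def train(rounds=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy contextual bandit with a running mean per (context, arm)."""
    rng = random.Random(seed)
    counts = {(c, a): 0 for c in BEST_ARM for a in ARMS}
    means = {(c, a): 0.0 for c in BEST_ARM for a in ARMS}
    for t in range(rounds):
        context = t % 2  # contexts simply alternate in this toy setting
        untried = [a for a in ARMS if counts[(context, a)] == 0]
        if untried:
            arm = untried[0]  # play every arm once before trusting the means
        elif rng.random() < epsilon:
            arm = rng.choice(ARMS)  # explore
        else:
            arm = max(ARMS, key=lambda a: means[(context, a)])  # exploit
        reward = -environment_cost(context, arm, rng)  # reward = negative cost
        key = (context, arm)
        counts[key] += 1
        means[key] += (reward - means[key]) / counts[key]
    # Greedy policy: the arm with the highest estimated reward in each context.
    return {c: max(ARMS, key=lambda a: means[(c, a)]) for c in BEST_ARM}


policy = train()
print(policy)
```

No labelled "correct" proportion is ever shown to the agent in this
sketch; it only observes the reward of the arm it played, which is the
property of the bandit setting referred to above.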
The remainder of this paper is organized as follows.
Illustrative examples showing the necessity of value-oriented
forecasting are presented in Section II. The preliminaries
regarding PI and the two-timescale operation are given in
Section III. Section IV formulates the problem, whereas Section V
presents the contextual bandit-based solution strategy. Results
are discussed and evaluated in Section VI, followed by the
conclusions.
Notation: The variables in the day-ahead problem have the
subscript \(D\), while the look-ahead variables in the day-ahead
problem have the subscript \(\xi, D\). The variables in the
real-time problem have the subscript \(R\).
II. MOTIVATING EXAMPLES
Under the current operation framework of power systems,
forecasts often serve as parameters in the subsequent decision-
making problems. Thus, forecasting is not just related to how
well the stochastic process of random variables is described.
Instead, forecasting acquires its own value through the ability
to influence the decisions made by the users of the forecast.
In this section, we present the point forecast for the UC
problem and PI forecast for the wind power offering in the day-
ahead market to show the necessity of value-oriented forecast.
Example 1 (The point forecast of net load for the UC
problem): Here, we consider a one-bus system with two
generators (G1 and G2) serving the net load. Let \(x_1, x_2\) and
\(c_1, c_2\) denote the power generation and cost coefficients of
the two generators, respectively. Let \(u_1, u_2\) be the binary
variables for the on/off status of the two generators and
\(d_1, d_2\) be the startup costs. Given the point forecast of the
net load \(\hat{l}\), the UC problem is formulated as,
\[ \min_{x_1, x_2, u_1, u_2} \; c_1 \cdot x_1 + c_2 \cdot x_2 + d_1 \cdot u_1 + d_2 \cdot u_2 \tag{1a} \]
\[ \text{s.t.} \quad 0 \le x_1 \le \overline{x}_1 \cdot u_1 \tag{1b} \]
\[ 0 \le x_2 \le \overline{x}_2 \cdot u_2 \tag{1c} \]
\[ x_1 + x_2 = \hat{l}, \tag{1d} \]
where \(\overline{x}_1, \overline{x}_2\) are the generation limits
of the two generators. Here we assume G1 is cheaper than G2. We
assign the cost coefficients \(c_1, c_2, d_1, d_2\) the values
0.1 $/kW, 0.2 $/kW, 10 $, 20 $, and the generation limits
\(\overline{x}_1, \overline{x}_2\) the values 50 kW, 40 kW.
Let the realization of the net load \(l\) be 49 kW. If the load
forecast \(\hat{l}\) is 47 kW, the optimal solution of the UC
problem is \(x_1^* = 47, x_2^* = 0, u_1^* = 1, u_2^* = 0\). Once the
realization of the net load \(l\) is available, G1 generates an
additional 2 kW of electricity to satisfy the demand. Therefore,
the total cost under the forecast \(\hat{l} = 47\) kW is 14.9 $
(0.1 × 49 + 10). If the load forecast \(\hat{l}\) is 51 kW, the
optimal solution of the UC problem is
\(x_1^* = 50, x_2^* = 1, u_1^* = 1, u_2^* = 1\). Once the realization
of the net load \(l\) is available, G1 and G2 each reduce their
generation by 1 kW to satisfy the demand, and the total cost
under the forecast \(\hat{l} = 51\) kW is 34.9 $
(0.1 × 49 + 10 + 20). It can be observed that although the two
forecasts have the same deviation from the realization, and
therefore the same quality under a statistical metric such as
the mean squared error, the costs they incur are different.
This case shows that value-oriented forecasting is important to
the UC problem; a similar opinion can be found in [26].
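The numbers in Example 1 can be checked with a short script. The sketch
below enumerates the four on/off combinations of problem (1) instead of
calling a MILP solver, and it assumes that, once the realization is
known, the committed units are re-dispatched at least cost with the
commitment kept fixed. The function names (dispatch, solve_uc,
realized_cost) are illustrative choices, not part of the original
formulation.

```python
from itertools import product

C = (0.1, 0.2)        # cost coefficients c1, c2 in $/kW
D = (10.0, 20.0)      # startup costs d1, d2 in $
X_MAX = (50.0, 40.0)  # generation limits in kW


def dispatch(load, u):
    """Least-cost dispatch of `load` over committed units; None if infeasible."""
    x = [0.0, 0.0]
    remaining = load
    for i in sorted(range(2), key=lambda i: C[i]):  # cheapest unit first
        x[i] = min(remaining, X_MAX[i] * u[i])
        remaining -= x[i]
    return x if remaining < 1e-9 else None


def total_cost(x, u):
    """Objective (1a): generation cost plus startup cost."""
    return sum(c * xi for c, xi in zip(C, x)) + sum(d * ui for d, ui in zip(D, u))


def solve_uc(forecast):
    """Enumerate the on/off combinations and keep the cheapest feasible one."""
    best = None
    for u in product((0, 1), repeat=2):
        x = dispatch(forecast, u)
        if x is not None and (best is None or total_cost(x, u) < total_cost(*best)):
            best = (x, u)
    return best


def realized_cost(forecast, realization):
    """Commit units for the forecast, then re-dispatch them for the realization."""
    _, u = solve_uc(forecast)
    return total_cost(dispatch(realization, u), u)


for forecast in (47, 51):
    print(forecast, solve_uc(forecast), round(realized_cost(forecast, 49), 2))
```

Both forecasts miss the realization by 2 kW, yet the second one commits
G2 and therefore pays its startup cost, which is where the gap between
the two total costs comes from.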
Example 2 (PI forecast for wind power offering in the
day-ahead market): Let \(\hat{E}\) denote the quantity the wind
power producer offers in the day-ahead market and \(E\) denote the
wind power realization. Under the day-ahead electricity price
\(\lambda_D\), the profit obtained in the day-ahead market is
\(\lambda_D \cdot \hat{E}\). In the two-price balancing market,
where \(\lambda_{UP}, \lambda_{DW}\) are the prices for up- and
down-regulation, the wind power producer has to buy
up-regulation power when its realization \(E\) is smaller than the
offer \(\hat{E}\), while down-regulation power is sold when \(E\) is
larger than \(\hat{E}\). The total profit of the wind power
producer in the day-ahead and balancing markets is therefore
formulated as,
\[ \rho = \lambda_D \cdot \hat{E} + \lambda_{UP} \cdot [E - \hat{E}]^- + \lambda_{DW} \cdot [E - \hat{E}]^+, \tag{2} \]
where \([\cdot]^- = \min(\cdot, 0)\) and \([\cdot]^+ = \max(\cdot, 0)\).
Eq. (2) can be equivalently formulated as,
\[ \rho = \lambda_D \cdot E - \left[ (\lambda_D - \lambda_{UP}) \cdot [E - \hat{E}]^- + (\lambda_D - \lambda_{DW}) \cdot [E - \hat{E}]^+ \right]. \tag{3} \]
Pricing rules entail that \(\lambda_D \le \lambda_{UP}\) and
\(\lambda_D \ge \lambda_{DW}\). Given \([E - \hat{E}]^- \le 0\) and
\([E - \hat{E}]^+ \ge 0\), both terms inside the brackets are
nonnegative. \(\lambda_{UP} - \lambda_D\) and
\(\lambda_D - \lambda_{DW}\) are the costs of opportunity loss per
energy unit under up-regulation and down-regulation,
respectively. The wind power producer aims to choose the offer
\(\hat{E}\) that maximizes the profit \(\rho\).
Let us assume the values of the day-ahead electricity price
\(\lambda_D\), the prices of up- and down-regulation
\(\lambda_{UP}, \lambda_{DW}\), and the wind power realization are
60 $/MW, 300 $/MW, 10 $/MW, and 20 MW, respectively. Denote the
lower and upper bounds of the PI for quantifying the wind power
\(E\) in the day-ahead market as \(\underline{q}, \overline{q}\).
The following PI-based robust optimization problem, which
determines the wind power offer in the worst case, is
formulated as,
\[ \max_{\hat{E} \in [\underline{q}, \overline{q}]} \; \min_{E \in [\underline{q}, \overline{q}]} \; \lambda_D \cdot E - \left[ (\lambda_D - \lambda_{UP}) \cdot [E - \hat{E}]^- + (\lambda_D - \lambda_{DW}) \cdot [E - \hat{E}]^+ \right] \tag{4} \]
Here, we consider two day-ahead PI forecasts for the
wind power \(E\), namely [16, 18] and [21, 22]. The statistical
quality of the latter is better than that of the former, since
it is narrower and lies closer to the realization of 20 MW.
For the first PI forecast [16, 18], in the worst scenario, the
wind power offer equals 16 MW, and the optimal objective of
(4) equals 960 $. Then, when the wind power production is
revealed in the real-time market, the profit from
down-regulation is 40 $. Therefore, the total profit of the wind
power producer in the day-ahead and real-time markets is 1000 $.
For the second PI forecast [21, 22], in the worst scenario, the
wind power offer equals 21 MW, and the optimal objective of
(4) equals 1260 $. Then, when the wind power production
is revealed in the real-time market, the producer has to buy
1 MW of up-regulation power, which yields a profit of -300 $.
Therefore, the total profit of the wind producer in the
day-ahead and real-time markets is 960 $, which is lower than
the profit obtained under the first PI forecast.
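The arithmetic of Example 2 can be reproduced with the sketch below. It
evaluates the profit of Eq. (2) and solves the max-min problem (4) by a
plain grid search over the interval, which is an assumption made here
for simplicity rather than the solution method used in the paper; the
function names and the grid size are illustrative.

```python
LAM_D, LAM_UP, LAM_DW = 60.0, 300.0, 10.0  # prices in $/MW from Example 2


def profit(offer, realization):
    """Total profit of Eq. (2) for a given offer and wind realization."""
    deviation = realization - offer
    return (LAM_D * offer
            + LAM_UP * min(deviation, 0.0)   # up-regulation bought (deficit)
            + LAM_DW * max(deviation, 0.0))  # down-regulation sold (excess)


def robust_offer(q_low, q_high, n=201):
    """Grid-search approximation of the max-min problem (4)."""
    grid = [q_low + (q_high - q_low) * i / (n - 1) for i in range(n)]
    return max(grid, key=lambda offer: min(profit(offer, e) for e in grid))


realization = 20.0
for interval in ((16.0, 18.0), (21.0, 22.0)):
    offer = robust_offer(*interval)
    print(interval, offer, profit(offer, offer), profit(offer, realization))
```

In both cases the worst-case optimal offer is the lower bound of the
PI, because offering more than the realization is penalized at the
up-regulation price, which is far higher than the day-ahead price.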
Therefore, the example shows that good statistical
quality cannot ensure good value for decision-making.
As such, value-oriented PI forecasting is needed to bridge
the gap between the forecast and the decision.
III. PRELIMINARIES
In this section, we first introduce the formulation of the PI
and discuss the need for value-oriented PIs in subsection A. In
subsection B, we introduce the two-timescale VPP operation,
which is the downstream decision-making problem of the PI
estimation task.
A. Preliminary of Prediction Interval
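Let \(Y_{t+k}\) denote a random variable for the target wind
power output at future time \(t+k\), \(F_{Y_{t+k}}\) be the
corresponding cumulative distribution function, and \(y_{t+k}\) be
the realization. Specifically, a PI with NCP
\((1-\beta)\times 100\%\) provides a summary of the cumulative
distribution function \(F_{Y_{t+k}}\), and can be developed as,
\[ \mathcal{Y}_{t+k} = \left[ \hat{q}^{\underline{\alpha}_{t+k}}_{t+k}, \; \hat{q}^{\overline{\alpha}_{t+k}}_{t+k} \right], \qquad \overline{\alpha}_{t+k} = \underline{\alpha}_{t+k} + 1 - \beta, \tag{5} \]
where \(\underline{\alpha}_{t+k}\) is in the range \((0, \beta)\), and
\(\hat{q}^{\underline{\alpha}_{t+k}}_{t+k}, \hat{q}^{\overline{\alpha}_{t+k}}_{t+k}\)
are the predictions of the quantiles
\(F^{-1}_{Y_{t+k}}(\underline{\alpha}_{t+k})\),
\(F^{-1}_{Y_{t+k}}(\overline{\alpha}_{t+k})\). Given the probability
proportion \(\alpha \in \{\underline{\alpha}_{t+k}, \overline{\alpha}_{t+k}\}\)
and the contextual information \(s_t\) up to time \(t\), the quantile
prediction can be obtained by training a QR model
\(f_{\alpha}(s_t; \Theta_{\alpha})\) that minimizes the pinball loss
function, where the QR model can be any of the many
off-the-shelf ones. It is described as,
\[ \hat{\Theta}_{\alpha} = \arg\min_{\Theta_{\alpha}} \; \mathbb{E}_{F_{Y_{t+k}}} \left[ \ell_{\alpha}\left( f_{\alpha}(s_t; \Theta_{\alpha}), Y_{t+k} \right) \right], \tag{6} \]
where \(\ell_{\alpha}\) is the pinball loss function, defined as,
\[ \ell_{\alpha}(x, y) = \max\{ \alpha (y - x), \; (\alpha - 1)(y - x) \}. \tag{7} \]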
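As a small illustration of why minimizing the pinball loss yields a
quantile, the sketch below implements Eq. (7) and replaces the QR model
of Eq. (6) with the simplest possible predictor, a single constant,
fitted on hypothetical data (the integers 1 to 99). The helper
best_constant and the data are assumptions made for this example only.

```python
def pinball_loss(alpha, x, y):
    """Pinball loss of Eq. (7) for prediction x and realization y."""
    return max(alpha * (y - x), (alpha - 1) * (y - x))


def best_constant(alpha, samples):
    """Constant prediction, chosen among the samples, with the lowest mean loss."""
    def mean_loss(x):
        return sum(pinball_loss(alpha, x, y) for y in samples) / len(samples)
    return min(samples, key=mean_loss)


samples = list(range(1, 100))  # hypothetical realizations: 1, 2, ..., 99
print(best_constant(0.5, samples))  # the sample median
print(best_constant(0.9, samples))  # the sample 0.9-quantile
```

Under-prediction is weighted by \(\alpha\) and over-prediction by
\(1-\alpha\), so a larger \(\alpha\) pushes the minimizer towards the
upper tail of the data.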
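After the model training process illustrated in (6), with the
estimated model parameters \(\hat{\Theta}_{\underline{\alpha}_{t+k}}\)
and \(\hat{\Theta}_{\overline{\alpha}_{t+k}}\), the predicted
quantiles are given by
\(\hat{q}^{\underline{\alpha}_{t+k}}_{t+k} = f_{\underline{\alpha}_{t+k}}(s_t; \hat{\Theta}_{\underline{\alpha}_{t+k}})\)
and
\(\hat{q}^{\overline{\alpha}_{t+k}}_{t+k} = f_{\overline{\alpha}_{t+k}}(s_t; \hat{\Theta}_{\overline{\alpha}_{t+k}})\).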
The probability proportion \(\underline{\alpha}_{t+k}\) is often
chosen as \(\beta/2\) if the distribution function \(F_{Y_{t+k}}\) is
symmetric, or chosen optimally and adaptively for a skewed
distribution. However, a PI serves as an input to the subsequent
decision-making, and optimality with respect to the probability
distribution cannot always ensure optimality with respect to the
value of the downstream decision task. To tackle this challenge,
we seek the optimal probability proportion for PI prediction,
such that the value of the downstream decision task is
maximized. Before showing how to achieve this in the context of
two-timescale VPP operation, we first introduce the general
model of its operation framework under the uncertainty of wind
power output in the next subsection.