A Contextual Bandit Approach for Value-oriented Prediction Interval Forecasting

Yufan Zhang, Honglin Wen, Member, IEEE, and Qiuwei Wu, Senior Member, IEEE

Abstract: Prediction intervals (PIs) are an effective tool for quantifying uncertainty and usually serve as inputs to downstream robust optimization. Traditional approaches focus on improving the quality of PIs in terms of statistical scores and assume that higher quality leads to higher value in power systems operation. However, this assumption does not always hold in practice. In this paper, we propose a value-oriented PI forecasting approach that aims to reduce the operational cost of downstream operations. This requires issuing PIs under the guidance of the operational cost in robust optimization, which we address within the contextual bandit framework. Concretely, the agent selects the optimal quantile proportion, while the environment reveals the operational costs as rewards to the agent. In this way, the agent learns a quantile proportion selection policy that minimizes the operational cost. A numerical study on the two-timescale operation of a virtual power plant verifies the superiority of the proposed approach in terms of operational value, which is especially evident under high wind power penetration.

Keywords: Prediction interval; forecast value; decision-making; uncertainty

I. INTRODUCTION

The ongoing decarbonization effort in the energy sector places particular emphasis on renewable energy sources (RESs). Although RESs are clean and emission-free, their stochastic nature poses a great challenge to power systems operation and electricity markets, as RES power generation cannot be scheduled at will. This drives the need to forecast RES generation at future times to support power system operation [1], such as power dispatch, trading [2], and reserve procurement.

Forecasts can be communicated in various forms [3], including single-valued points [4], densities [5], [6], and prediction regions [7], [8]. Among them, prediction regions provide a summary of the probability distribution of random variables. For univariate forecasting, a prediction region is communicated as a prediction interval (PI), which is specified by two bounds and the nominal coverage probability (NCP) \((1-\beta)\times 100\%\), i.e., the probability that the realization falls within the interval. PIs have a wide range of applications in today's power industry. For instance, a PI serves as an input to robust optimization for quantifying the wind uncertainty, determining the reserve quantities [9], and wind power offering in the day-ahead market [10], where the NCP is commonly chosen between 90% and 95%. Also, based on the estimated PI, the concept of an uncertainty budget is leveraged to reduce the conservativeness of robust optimization in storage control [11], unit commitment [12], and microgrid dispatch [13], which helps reduce the operational cost in robust optimization.

Yufan Zhang is with the Department of Electrical and Computer Engineering, University of California San Diego, San Diego, California 92161, US.

Honglin Wen is with the Department of Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.

Qiuwei Wu is with Tsinghua-Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China.

Corresponding author: Qiuwei Wu (e-mail: qiuwu@sz.tsinghua.edu.cn).

A PI is desired to have good reliability and sharpness, which means that the interval width should be minimized subject to a given NCP. In recent decades, the forecasting community has preferred non-parametric approaches, which mainly develop quantile regression (QR) models that issue a pair of quantiles as a PI. Machine learning models, such as recurrent neural networks [14], ridge regression [15], and the neural basis expansion model [16], have been combined with QR using the pinball loss as the loss function, and perform well thanks to their strong learning ability. Usually, the quantiles in a PI are statistically symmetric with respect to the median, i.e., \(q_{\beta/2}\) and \(q_{1-\beta/2}\); such a PI is therefore referred to as the central PI (CPI) in the literature.

However, the probability distribution of RES power output is generally skewed [10], [17], so CPIs are often unnecessarily wide [18]. For this reason, optimal PI (OPI) forecasting approaches have emerged, which optimize over the bounds with the objective of improving statistical quality, such as minimizing the Winkler score. One thread of studies selects the probability proportion according to contextual information, instead of fixing it as a predetermined constant as in the CPI approaches. Ref. [19] learned a proportion selection policy that seeks to minimize the Winkler score. In another thread of studies, the forecast model outputs the two bounds directly without specifying a probability proportion for them. As quality metrics such as the Winkler score are generally non-differentiable, the main difficulty lies in designing a surrogate loss function [20] or a proper optimization technique to estimate the forecast model parameters. Ref. [21] formulated a multi-objective problem and optimized the parameters of the extreme learning machine (ELM) by particle swarm optimization. In [18], the parameter estimation for the ELM was formulated as a mixed-integer linear programming (MILP) problem, which was solved by off-the-shelf solvers.

Although the aforementioned PI forecasting approaches have improved forecast quality from a statistical viewpoint, they have overlooked the value of forecasts in downstream power system operation. The idea of using value to evaluate the goodness of forecasts dates back to [22], where value is defined as the economic/operational gain from leveraging forecasts at the decision-making stage. Take a robust optimization problem as an example (such as robust power dispatch): the input PI directly affects the operational cost. Indeed, it has been shown that an improvement in forecast quality does not necessarily lead to a higher value in operation. For example, a biased prediction of the wind power offering quantity is preferred over accurate single-point forecasts with small mean squared errors [23], [24], [25], because up-regulation (the case in which the wind power offer determined in the day-ahead market is larger than the wind realization) to settle the energy deficit is more expensive than down-regulation (the case in which the offer is smaller than the wind realization) to settle the energy excess. Similar results can be found in the context of unit commitment (UC) [26].

arXiv:2210.04152v2 [eess.SY] 13 Feb 2023

Therefore, value-oriented forecasting has been advocated in recent years [27], [28]. The key challenge lies in how to link forecasting with decision-making. Attempts have been made by designing decision-aware loss functions. For instance, to fill the gap between point forecasting and decision-making, the Smart "Predict, then Optimize" (SPO) loss was proposed in [29]. Ref. [30] approximated the objective function based on historical data. However, the approximation may introduce errors that compromise the value of forecasting. In [10], a cost-oriented machine learning (COML) framework was established, which performs value-oriented PI forecasting by optimizing the probability proportion of QR models under the decision-making objective. However, the COML framework restricts the QR model to be linear so that the parameter estimation can be reduced to a single-level optimization problem through the KKT conditions. Here, special focus is placed on multi-timescale decision-making in power systems, such as market clearing or the centralized operation of virtual power plants (VPPs). As the compensation cost for the decision differs across timescales, more strategic forecasting is called for to reduce the cost.

In this paper, without loss of generality, we design a PI forecasting approach for a two-timescale VPP operation task with wind power, where the day-ahead problem is based on robust optimization with recourse, while the real-time problem settles the wind power deviation. Concretely, the proposed value-oriented PI forecasting approach contains a policy learning module, which selects the probability proportion so as to reduce the operational cost. To this end, the training stage of value-oriented PI forecasting, which involves the estimation of model parameters, is solved by a contextual bandit in a closed-loop manner. Specifically, the policy learning task is modeled by an agent, whereas the optimization over the QR model parameters is solved in the environment. The agent and the environment are linked by the reward, which is the negative objective value of the decision-making problem. As such, the agent can learn the selection policy guided by the optimal objective of the decision-making problem. Moreover, the nature of the contextual bandit avoids the tedious labeling work required in supervised learning [31]. Compared with the existing studies, the main contributions of the paper are summarized as follows:

1) A new solution strategy for the value-oriented PI forecasting approach, which uses the contextual bandit framework to link the proportion selection with the operational value of the downstream decision-making problem.

2) An integration of the value-oriented PI forecasting approach with the complex decision-making problem, which involves multiple decision variables and constraints.
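
The closed loop between the agent and the environment described above can be illustrated with a toy sketch. It is not the authors' algorithm: the epsilon-greedy rule, the discrete context labels, the candidate proportions, and the `toy_cost` function are all assumptions made for illustration. In the paper, the environment fits the QR models and solves the downstream optimization to produce the cost; here a hypothetical closed-form cost stands in for that step.

```python
import random

def train_bandit(contexts, actions, cost_fn, rounds=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy contextual bandit; the reward is the negative cost."""
    rng = random.Random(seed)
    # Estimates start at 0, which beats any negative reward, so untried
    # actions are selected first by the greedy rule.
    mean_reward = {(c, a): 0.0 for c in contexts for a in actions}
    pulls = {(c, a): 0 for c in contexts for a in actions}
    for _ in range(rounds):
        context = rng.choice(contexts)          # observe a context
        if rng.random() < epsilon:              # explore
            action = rng.choice(actions)
        else:                                   # exploit current estimates
            action = max(actions, key=lambda a: mean_reward[(context, a)])
        reward = -cost_fn(context, action)      # environment reveals the cost
        pulls[(context, action)] += 1
        n = pulls[(context, action)]
        # Incremental update of the running mean reward
        mean_reward[(context, action)] += (reward - mean_reward[(context, action)]) / n
    # Learned policy: best estimated proportion for each context
    return {c: max(actions, key=lambda a: mean_reward[(c, a)]) for c in contexts}

# Hypothetical stand-in for the environment: the cost is lowest at a
# context-specific proportion (values chosen arbitrarily for the sketch).
BEST = {"low_wind": 0.02, "high_wind": 0.08}

def toy_cost(context, proportion):
    return 1.0 + 100.0 * (proportion - BEST[context]) ** 2

policy = train_bandit(["low_wind", "high_wind"], [0.02, 0.05, 0.08], toy_cost)
print(policy)
```

Because the agent only ever sees the cost of the proportion it actually selected, no labeled "correct" proportion is needed, which is the property of the bandit setting highlighted in the text.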

The remainder of this paper is organized as follows. Illustrative examples that show the necessity of value-oriented forecasting are presented in Section II. The preliminaries regarding PIs and the two-timescale operation are given in Section III. Section IV formulates the problem, whereas Section V presents the contextual bandit-based solution strategy. Results are discussed and evaluated in Section VI, followed by the conclusions.

Notation: Variables in the day-ahead problem have the subscript \(D\), while look-ahead variables in the day-ahead problem have the subscript \(\xi, D\). Variables in the real-time problem have the subscript \(R\).

II. MOTIVATING EXAMPLES

Under the current operation framework of power systems, forecasts often serve as parameters in the subsequent decision-making problems. Thus, forecasting is not just about how well the stochastic process of random variables is described. Instead, a forecast acquires its value through its ability to influence the decisions made by its users. In this section, we present a point forecast for the UC problem and a PI forecast for wind power offering in the day-ahead market to show the necessity of value-oriented forecasting.

Example 1 (The point forecast of net load for the UC problem): Here, we consider a one-bus system with two generators (G1 and G2) serving the net load. Let \(x_1, x_2\) and \(c_1, c_2\) denote the power generation and cost coefficients of the two generators, respectively. Let \(u_1, u_2\) be the binary variables for the on/off status of the two generators and \(d_1, d_2\) be the startup costs. Given the point forecast of the net load \(\hat{l}\), the UC problem is formulated as

\[\min_{x_1, x_2, u_1, u_2} \; c_1 \cdot x_1 + c_2 \cdot x_2 + d_1 \cdot u_1 + d_2 \cdot u_2 \tag{1a}\]

\[\text{s.t.} \quad 0 \le x_1 \le \overline{x}_1 \cdot u_1 \tag{1b}\]

\[0 \le x_2 \le \overline{x}_2 \cdot u_2 \tag{1c}\]

\[x_1 + x_2 = \hat{l}, \tag{1d}\]

where \(\overline{x}_1, \overline{x}_2\) are the generation limits of the two generators. Here we assume G1 is cheaper than G2, and assign the cost coefficients \(c_1, c_2, d_1, d_2\) the values 0.1 $/kW, 0.2 $/kW, 10 $, and 20 $, and the generation limits \(\overline{x}_1, \overline{x}_2\) the values 50 kW and 40 kW.

Let the realization of the net load \(l\) be 49 kW. If the load forecast \(\hat{l}\) is 47 kW, the optimal solution of the UC problem is \(x_1^* = 47\), \(x_2^* = 0\), \(u_1^* = 1\), \(u_2^* = 0\). Once the realization of the net load \(l\) is available, G1 generates an additional 2 kW to satisfy the demand. Therefore, the total cost under the forecast \(\hat{l} = 47\) kW is 14.9 $. If the load forecast \(\hat{l}\) is 51 kW, the optimal solution of the UC problem is \(x_1^* = 50\), \(x_2^* = 1\), \(u_1^* = 1\), \(u_2^* = 1\). Once the realization of the net load \(l\) is available, G1 and G2 each reduce their generation by 1 kW to satisfy the demand, and the total cost under the forecast \(\hat{l} = 51\) kW is 34.9 $. Although the two forecasts deviate from the realization by the same amount (2 kW) and therefore have the same quality under a statistical metric such as the mean squared error, the costs they incur differ. This case shows that value-oriented forecasting is important to the UC problem; a similar observation can be found in [26].
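
The numbers in Example 1 can be checked with a short enumeration. The sketch below is illustrative only: it assumes that, once the load is revealed, the units committed day-ahead are re-dispatched in merit order at the same marginal costs, and that the startup costs are already sunk. The function names are hypothetical.

```python
from itertools import product

COST = (0.1, 0.2)       # c1, c2 in $/kW
STARTUP = (10.0, 20.0)  # d1, d2 in $
LIMIT = (50.0, 40.0)    # generation limits in kW

def dispatch(load, on):
    """Merit-order dispatch of the committed units; None if the load cannot be met."""
    output, remaining = [0.0, 0.0], load
    for i in sorted(range(2), key=lambda g: COST[g]):
        if on[i]:
            output[i] = min(LIMIT[i], remaining)
            remaining -= output[i]
    return output if remaining <= 1e-9 else None

def total_cost(output, on):
    """Generation cost plus startup cost, as in (1a)."""
    return sum(COST[i] * output[i] + STARTUP[i] * on[i] for i in range(2))

def commit(forecast):
    """Solve (1a)-(1d) by enumerating the four on/off combinations."""
    candidates = [(on, dispatch(forecast, on)) for on in product((0, 1), repeat=2)]
    feasible = [(on, x) for on, x in candidates if x is not None]
    return min(feasible, key=lambda pair: total_cost(pair[1], pair[0]))

def realized_cost(forecast, load):
    on, _ = commit(forecast)   # day-ahead commitment based on the forecast
    x = dispatch(load, on)     # assumed re-dispatch once the load is revealed
    if x is None:
        raise ValueError("committed units cannot serve the realized load")
    return total_cost(x, on)

for forecast in (47.0, 51.0):
    print(forecast, round(realized_cost(forecast, 49.0), 2))
```

Under these assumptions the two forecasts yield 14.9 $ and 34.9 $, matching the example: the over-forecast forces G2 to start up, and its startup cost dominates the difference.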

Example 2 (PI forecast for wind power offering in the day-ahead market): Let \(\hat{E}\) denote the quantity the wind power producer offers in the day-ahead market and \(E\) denote the wind power realization. Under the day-ahead electricity price \(\lambda_D\), the profit obtained in the day-ahead market is \(\lambda_D \cdot \hat{E}\). In the two-price balancing market, where \(\lambda_{UP}, \lambda_{DW}\) are the prices for up- and down-regulation, the wind power producer has to buy up-regulation power when its realization \(E\) is smaller than the offer \(\hat{E}\), while down-regulation power is sold when \(E\) is larger than \(\hat{E}\). The total profit of the wind power producer in the day-ahead and balancing markets is therefore formulated as

\[\rho = \lambda_D \cdot \hat{E} + \lambda_{UP} \cdot [E - \hat{E}]^- + \lambda_{DW} \cdot [E - \hat{E}]^+, \tag{2}\]

where \([\cdot]^- = \min(\cdot, 0)\) and \([\cdot]^+ = \max(\cdot, 0)\). Eq. (2) can be equivalently formulated as

\[\rho = \lambda_D \cdot E - \left[(\lambda_D - \lambda_{UP}) \cdot [E - \hat{E}]^- + (\lambda_D - \lambda_{DW}) \cdot [E - \hat{E}]^+\right]. \tag{3}\]

Pricing rules entail that \(\lambda_D \le \lambda_{UP}\) and \(\lambda_D \ge \lambda_{DW}\). Given \([E - \hat{E}]^- \le 0\) and \([E - \hat{E}]^+ \ge 0\), both terms inside the brackets are nonnegative. \(\lambda_{UP} - \lambda_D\) and \(\lambda_D - \lambda_{DW}\) are the opportunity-loss costs per energy unit under up-regulation and down-regulation, respectively. The wind power producer aims to choose the offer \(\hat{E}\) that maximizes the profit \(\rho\).
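
As a short check of the equivalence between Eq. (2) and Eq. (3), one can use the identity that the negative and positive parts of the deviation sum to the deviation itself. The steps below are a worked sketch added for clarity and use only the symbols defined above.

```latex
\begin{align*}
[E-\hat{E}]^- + [E-\hat{E}]^+ &= E - \hat{E}
  \quad\Longrightarrow\quad
  \lambda_D \hat{E} = \lambda_D E - \lambda_D\left([E-\hat{E}]^- + [E-\hat{E}]^+\right), \\
\rho &= \lambda_D \hat{E} + \lambda_{UP}[E-\hat{E}]^- + \lambda_{DW}[E-\hat{E}]^+ \\
     &= \lambda_D E - \lambda_D [E-\hat{E}]^- - \lambda_D [E-\hat{E}]^+
        + \lambda_{UP}[E-\hat{E}]^- + \lambda_{DW}[E-\hat{E}]^+ \\
     &= \lambda_D E - \left[(\lambda_D - \lambda_{UP})[E-\hat{E}]^-
        + (\lambda_D - \lambda_{DW})[E-\hat{E}]^+\right].
\end{align*}
```

The first term is the profit under a perfect offer, and the bracket collects the nonnegative losses caused by the deviation.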

Let us assume that the day-ahead electricity price \(\lambda_D\), the up- and down-regulation prices \(\lambda_{UP}, \lambda_{DW}\), and the wind power realization are 60 $/MW, 300 $/MW, 10 $/MW, and 20 MW, respectively. Denote the lower and upper bounds of the PI quantifying the wind power \(E\) in the day-ahead market by \(\underline{q}, \overline{q}\). The PI-based robust optimization problem that determines the wind power offer in the worst case is formulated as

\[\max_{\hat{E} \in [\underline{q}, \overline{q}]} \; \min_{E \in [\underline{q}, \overline{q}]} \; \lambda_D \cdot E - \left[(\lambda_D - \lambda_{UP}) \cdot [E - \hat{E}]^- + (\lambda_D - \lambda_{DW}) \cdot [E - \hat{E}]^+\right]. \tag{4}\]

Here, we consider two day-ahead PI forecasts for the wind power \(E\), namely [16, 18] and [21, 22]. The latter is narrower and lies closer to the realization of 20 MW, so its statistical quality is better than that of the former.

For the first PI forecast [16, 18], in the worst scenario, the wind power offer equals 16 MW, and the optimal objective of (4) equals 960 $. Then, when the wind power production is revealed in the real-time market, the profit from down-regulation is 40 $. Therefore, the total profit of the wind power producer in the day-ahead and real-time markets is 1000 $.

For the second PI forecast [21, 22], in the worst scenario, the wind power offer equals 21 MW, and the optimal objective of (4) equals 1260 $. Then, when the wind power production is revealed in the real-time market, the producer has to buy 1 MW of up-regulation power, so the profit from regulation is -300 $. Therefore, the total profit of the wind power producer in the day-ahead and real-time markets is 960 $, which is lower than the profit obtained under the first PI forecast.

Therefore, the example shows that good statistical quality does not guarantee good value for decision-making. As such, value-oriented PI forecasting is needed to bridge the gap between the forecast and the decision.
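
The figures in Example 2 can be reproduced with a few lines of code. This is a minimal sketch: the max-min problem (4) is approximated by a grid search over the interval rather than solved exactly, and the function names and grid resolution are assumptions made for illustration.

```python
LAMBDA_D, LAMBDA_UP, LAMBDA_DW = 60.0, 300.0, 10.0  # prices from Example 2

def profit(offer, realization):
    """Total profit in the day-ahead and balancing markets, Eq. (2)."""
    deviation = realization - offer
    return (LAMBDA_D * offer
            + LAMBDA_UP * min(deviation, 0.0)    # up-regulation purchase (<= 0)
            + LAMBDA_DW * max(deviation, 0.0))   # down-regulation sale (>= 0)

def robust_offer(q_low, q_high, steps=100):
    """Grid-search approximation of the max-min problem (4)."""
    grid = [q_low + (q_high - q_low) * i / steps for i in range(steps + 1)]

    def worst_case(offer):
        return min(profit(offer, e) for e in grid)

    best = max(grid, key=worst_case)
    return best, worst_case(best)

for interval in ((16.0, 18.0), (21.0, 22.0)):
    offer, guaranteed = robust_offer(*interval)
    # Columns: PI, robust offer, worst-case objective, profit at the 20 MW realization
    print(interval, offer, guaranteed, profit(offer, 20.0))
```

With these prices the worst-case profit decreases in the offer, so the robust offer is the lower bound of the PI. The PI [16, 18] leads to a realized profit of 1000 $, whereas [21, 22] leads to 960 $, in line with the example.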

III. PRELIMINARIES

In this section, we first introduce the formulation of the PI and discuss the need for value-oriented PIs in subsection A. In subsection B, we introduce the two-timescale VPP operation, which is the downstream decision-making problem of the PI estimation task.

A. Preliminary of Prediction Interval

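Let \(Y_{t+k}\) denote a random variable for the target wind power output at future time \(t+k\), \(F_{Y_{t+k}}\) be the corresponding cumulative distribution function, and \(y_{t+k}\) be the realization. Specifically, a PI with NCP \((1-\beta)\times 100\%\) provides a summary of the cumulative distribution function \(F_{Y_{t+k}}\), and can be developed as

\[\mathcal{Y}_{t+k} = \left[\hat{q}_{t+k}^{\underline{\alpha}_{t+k}}, \; \hat{q}_{t+k}^{\overline{\alpha}_{t+k}}\right], \qquad \overline{\alpha}_{t+k} = \underline{\alpha}_{t+k} + 1 - \beta, \tag{5}\]

where \(\underline{\alpha}_{t+k}\) is in the range \((0, \beta)\), and \(\hat{q}_{t+k}^{\underline{\alpha}_{t+k}}, \hat{q}_{t+k}^{\overline{\alpha}_{t+k}}\) are the predictions of the quantiles \(F_{Y_{t+k}}^{-1}(\underline{\alpha}_{t+k})\), \(F_{Y_{t+k}}^{-1}(\overline{\alpha}_{t+k})\). Given the probability proportion \(\alpha \in \{\underline{\alpha}_{t+k}, \overline{\alpha}_{t+k}\}\) and contextual information \(s_t\) up to time \(t\), the quantile prediction can be obtained by training a QR model \(f_\alpha(s_t; \Theta_\alpha)\) that minimizes the pinball loss function, where many off-the-shelf models can serve as the QR model. The training is described as

\[\hat{\Theta}_\alpha = \arg\min_{\Theta_\alpha} \; \mathbb{E}_{F_{Y_{t+k}}}\left[\ell_\alpha\left(f_\alpha(s_t; \Theta_\alpha), Y_{t+k}\right)\right], \tag{6}\]

where \(\ell_\alpha\) is the pinball loss function, defined as

\[\ell_\alpha(x, y) = \max\{\alpha (y - x), (\alpha - 1)(y - x)\}. \tag{7}\]

After the model training process in (6), with the estimated model parameters \(\hat{\Theta}_{\underline{\alpha}_{t+k}}\) and \(\hat{\Theta}_{\overline{\alpha}_{t+k}}\), the predicted quantiles are given by \(\hat{q}_{t+k}^{\underline{\alpha}_{t+k}} = f_{\underline{\alpha}_{t+k}}(s_t; \hat{\Theta}_{\underline{\alpha}_{t+k}})\) and \(\hat{q}_{t+k}^{\overline{\alpha}_{t+k}} = f_{\overline{\alpha}_{t+k}}(s_t; \hat{\Theta}_{\overline{\alpha}_{t+k}})\).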
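
To make (5)-(7) concrete, the following sketch builds a PI from a chosen lower proportion by minimizing the empirical pinball loss. It is a simplified illustration, not the paper's QR model: `fit_constant_quantile` is a hypothetical, context-free stand-in for \(f_\alpha\) that ignores \(s_t\) and searches only over the observed sample values.

```python
def pinball_loss(alpha, x, y):
    """Eq. (7): x is the predicted quantile, y the realization."""
    return max(alpha * (y - x), (alpha - 1.0) * (y - x))

def fit_constant_quantile(alpha, samples):
    """Sample value minimizing the empirical pinball loss, an empirical version of (6)."""
    return min(samples,
               key=lambda c: sum(pinball_loss(alpha, c, y) for y in samples))

def prediction_interval(alpha_low, beta, samples):
    """PI with NCP (1 - beta) built from the lower proportion, as in Eq. (5)."""
    if not 0.0 < alpha_low < beta:
        raise ValueError("the lower proportion must lie in (0, beta)")
    alpha_high = alpha_low + 1.0 - beta
    return (fit_constant_quantile(alpha_low, samples),
            fit_constant_quantile(alpha_high, samples))

samples = list(range(1, 100))                    # toy data: 1, 2, ..., 99
print(prediction_interval(0.05, 0.1, samples))   # central PI, alpha_low = beta / 2
print(prediction_interval(0.02, 0.1, samples))   # PI shifted toward lower values
```

Changing the lower proportion while keeping \(\beta\) fixed shifts the interval without changing its nominal coverage, which is the degree of freedom that a proportion selection policy exploits.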

The probability proportion \(\underline{\alpha}_{t+k}\) is often chosen as \(\beta/2\) if the distribution function \(F_{Y_{t+k}}\) is symmetric, or chosen optimally and adaptively for a skewed distribution. However, the PI serves as an input to the subsequent decision-making, and optimality with respect to the probability distribution does not always ensure optimality with respect to the value of the downstream decision task. To tackle this challenge, we seek the optimal probability proportion for PI prediction, such that the value of the downstream decision task is maximized. Before showing how to achieve this in the context of two-timescale VPP operation, we first introduce the general model of its operation framework under the uncertainty of wind power output in the next subsection.
