Optimal Event Monitoring through Internet Mashup over
Multivariate Time Series
Chun-Kit Ngan, George Mason University, USA
Alexander Brodsky, George Mason University, USA
ABSTRACT
We propose a Web-Mashup Application Service Framework for Multivariate Time Series
Analytics (MTSA) that supports the services of model definitions, querying, parameter learning,
model evaluations, data monitoring, decision recommendations, and web portals. This
framework maintains the advantage of combining the strengths of both the domain-knowledge-
based and the formal-learning-based approaches and is designed for a more general class of
problems over multivariate time series. More specifically, we identify a general-hybrid-based
model, MTSA Parameter Estimation, to solve this class of problems in which the objective
function is maximized or minimized from the optimal decision parameters regardless of
particular time points. This model also allows domain experts to include multiple types of
constraints, e.g., global constraints and monitoring constraints. We further extend the MTSA
data model and query language to support this class of problems for the services of learning,
monitoring, and recommendation. At the end, we conduct an experimental case study for a
university campus microgrid as a practical example to demonstrate our proposed framework,
models, and language.
Keywords: Web-Mashup Framework, Parameter Learning, Decision Support, Optimization
Model, Query Language
INTRODUCTION
Observing behaviors, trends, and patterns on multivariate time series (Bisgaard & Kulahci, 2011;
Chatfield, 2001) has been broadly used in various application domains, such as financial
markets, medical treatments, economic studies, and electric power management. Domain experts
utilize multiple time series to detect events and make better decisions. For example, financial
analysts predict different states of the stock market, e.g., bull or bear, more accurately based
upon monitoring daily stock prices, weekly interest rates, and monthly price indices. Physicians
monitor patients’ health conditions by measuring their diastolic and systolic blood pressures, as
well as their electrocardiogram tracings over time. Sociologists uncover hidden social problems
within a community more profoundly through studying a variety of economic, medical, and
social indicators, e.g., annual birth rates, mortality rates, accident rates, and various crime rates.
The goal of examining these event characteristics over multivariate time series is to help decision makers, e.g., financial analysts, physicians, and sociologists, better understand a problem from different perspectives within a particular domain and offer better actionable recommendations.
To support such event-based decision-making and determination over multivariate time
series, in this paper we propose a Web-Mashup Application Service Framework for Multivariate
Time Series Analytics (MTSA). This framework is an integrated tool to support the MTSA
service development, including model definitions, querying, parameter learning, data monitoring,
decision recommendations, and model evaluations. Domain experts could use the framework to
develop and implement their web-based decision-making applications on the Internet. Using a
Web Mashup function offered by the Web 2.0 technology (Vancea & Others, 2008; Gurram &
Others, 2008; Murugesan, 2007; Bradley, 2008; Alonso & Others, 2004; Altinel & Others, 2007;
Ennals & Others, 2007; Thor & Others, 2007) on our framework, domain experts could collect
and unify global information and data from different channels and media, such as web sites, data
sources, organizational information, etc., to generate a concentric view of collected time series
data from which the learning service determines optimal decision parameters. Using optimal
decision parameters, domain experts can employ the monitoring service to detect events and the
recommendation service to suggest actions.
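To make this data-collection and unification step concrete, the following is a minimal Python sketch; the feed locations, column names, and values are purely illustrative assumptions, not part of the framework. It shows how series pulled from separate web sources could be aligned on their timestamps into a single view for the learning service.

# Sketch: unify time series collected from separate (hypothetical) web sources into one view.
import pandas as pd

def load_series(csv_source, value_name):
    # csv_source may be a URL or file path exposing 'timestamp' and 'value' columns.
    df = pd.read_csv(csv_source, parse_dates=["timestamp"])
    return df.rename(columns={"value": value_name})

# In a deployment these would be web feeds, e.g., load_series("https://.../demand.csv", ...);
# here two tiny in-memory frames stand in for the fetched data.
demand = pd.DataFrame({"timestamp": pd.to_datetime(["2012-07-02 10:00", "2012-07-02 11:00"]),
                       "electricPowerDemand": [16800, 17450]})
prices = pd.DataFrame({"timestamp": pd.to_datetime(["2012-07-02 10:30"]),
                       "energyPrice": [0.12]})

# Align the series on time to form a single multivariate view for the learning service.
unified = pd.merge_asof(demand.sort_values("timestamp"),
                        prices.sort_values("timestamp"), on="timestamp")
print(unified)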
Presently, there are two key approaches that domain users utilize to identify and detect
interesting events over multivariate time series. These approaches are domain-knowledge-based
and formal-learning-based. The former approach completely relies on domain experts’
knowledge. Based on their knowledge and experience, domain experts determine monitoring
conditions that detect events of interest and trigger an appropriate action. More specifically,
domain experts, e.g., financial analysts, have identified several deterministic time series, such as
the S&P 500 percentage decline (SPD) and the Consumer Confidence Index drop (CCD), from
which they develop parametric monitoring templates, e.g., SPD < -20%, CCD < -30 (Stack,
2009), etc., according to their expertise. Once the incoming time series, i.e., SPD and CCD,
satisfy the given templates at a particular time point, the financial analysts decide that the bear market bottom is coming, which is the best opportunity to purchase stocks and earn the maximal return.
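As a minimal Python sketch (with made-up series values and the illustrative thresholds quoted above), evaluating such a template amounts to checking the conjunction of inequalities at each time point:

# Sketch: evaluate the parametric monitoring template SPD < a and CCD < b per time point.
def bear_bottom_signals(spd_series, ccd_series, spd_param=-20.0, ccd_param=-30.0):
    # The event fires whenever both inequality conditions of the template hold.
    return [t for t, (spd, ccd) in enumerate(zip(spd_series, ccd_series))
            if spd < spd_param and ccd < ccd_param]

# Toy example: the template fires at time points 2 and 3.
print(bear_bottom_signals([-5, -12, -25, -31], [-10, -28, -35, -40]))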
Consider another real-world case study of the timely event detection of certain conditions in
the electric power microgrid at George Mason University (GMU), where its energy planners
would like to regularly detect when the electric power demand (electricPowerDemand) exceeds
the pre-determined peak demand bound (peakDemandBound). The reason is that the occurrence
of this event accounts for a significant portion of the GMU electric bill under its contractual
terms, even though the event, electricPowerDemand > peakDemandBound, may occur only within a
short period of time, e.g., one minute. Thus, such an identification and detection can aid in the
task of decision-making and the determination of action plans. To make better decisions and
determinations, the energy planners have identified a set of time series that can be used to detect
the event and perform an action, e.g., to execute the electric load shedding to shut down some
electric account units on the GMU campus according to a prioritization scheme from the energy
manager. The multiple time series include the input electric power demand per hourly time
interval, the given peak demand bound per monthly pay period, etc. If these time series satisfy a
pre-defined, parameterized condition, e.g., electricPowerDemand > peakDemandBound, where
the given peakDemandBound is 17200 kWh for all the hourly time intervals within the same
monthly pay period, e.g., July, 2012, it signals the energy planners to execute the electric load
shedding in the microgrid on the campus. Often these parameters, e.g., the predetermined peak
demand bound, may reflect some realities since they are set by domain experts, e.g., the energy
planners, based on their past experiences, observations, intuitions, and domain knowledge.
However, these given thresholds, e.g., the peak demand bound, are not always accurate. In addition, the parameters are static, whereas the problem that we deal with is often dynamic in nature, so the given parameters are unlikely to be the optimal values for achieving the monitoring purpose over different periods of time, e.g., hourly, daily, monthly, quarterly, and yearly, and for minimizing the electricity expenses of the bill. Thus, this domain-knowledge-based approach lacks a formal mathematical foundation that dynamically learns optimal decision parameters to determine an event.
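For concreteness, the static-threshold monitoring just described can be sketched as follows; this is only a Python illustration with a made-up hourly demand series, and the hand-set 17,200 kWh bound is precisely the kind of parameter the framework proposed here learns instead.

# Sketch: static-threshold monitoring with the expert-chosen peak demand bound.
PEAK_DEMAND_BOUND = 17200  # kWh, set by the energy planners for the pay period

def hours_requiring_load_shedding(hourly_demand_kwh, bound=PEAK_DEMAND_BOUND):
    # Return the hourly intervals at which electricPowerDemand > peakDemandBound.
    return [hour for hour, demand in enumerate(hourly_demand_kwh) if demand > bound]

# Toy example: only hour 2 exceeds the bound.
print(hours_requiring_load_shedding([16500, 17100, 17350, 16900]))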
The latter approach utilizes a formal learning methodology, such as a non-linear logistic
regression model (Bierens, 2008; Cook & Others, 2000; Dougherty, 2007; Hansen, 2010; Heij &
Others, 2004). This regression model is used to predict the occurrence of an event (0 or 1), e.g.,
when to shed load or unshed load, by learning parametric coefficients of the logistic distribution
function of explanatory variables, i.e., the electric power demand and the peak demand bound.
More specifically, this non-linear logistic regression model focuses on modeling the data
relationship between explanatory variables and response variables. In reality, not all response variables are numeric and continuous. In many real-world cases, the responses may only take one of two possible answers, e.g., shed load or unshed load, buy or sell stocks, success or failure, etc. Each response outcome is assigned the value 1 if the probability of the event happening is above 0.5 and 0 otherwise. To learn the parametric coefficients of the logistic
distribution function of explanatory variables to determine the outcome of the binary responses,
we can apply the nonlinear logistic regression model and the Maximum Likelihood Estimation
(MLE) (Myung, 2003) over historical and projected data. However, the main challenge of using
formal learning approaches is that they do not always produce satisfactory results, as they do not
consider incorporating domain knowledge, including monitoring constraints, e.g.,
electricPowerDemand > peakDemandBound, and global constraints, e.g., utility contractual terms, into their formal learning approaches. Without domain experts’ knowledge in parameter learning, the result can be inaccurate decision-making. For instance, the energy planners might
execute the electric load shedding at an improper moment of time, particularly during the
business office hours between 9:00 a.m. and 6:00 p.m. from Monday to Friday.
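For illustration only, a logistic-regression fit over the two explanatory variables could look like the following Python sketch, which uses scikit-learn and synthetic data rather than the GMU measurements; the fitted coefficients play the role of the learned parameters, and a predicted probability above 0.5 maps to the shed-load outcome.

# Sketch: logistic regression over (electricPowerDemand, peakDemandBound) with synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
demand = rng.uniform(15000, 19000, size=200)          # hourly demand (kWh), synthetic
bound = np.full_like(demand, 17200.0)                 # peak demand bound (kWh)
X = np.column_stack([demand, bound])
y = (demand > bound).astype(int)                      # 1 = shed load, 0 = do not shed

# scikit-learn fits the coefficients by (regularized) maximum likelihood.
model = LogisticRegression().fit(X, y)
prob = model.predict_proba([[17500.0, 17200.0]])[0, 1]
print("P(shed load) =", round(prob, 3), "-> decision:", int(prob > 0.5))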
Some existing mathematical models, e.g., the Durland and McCurdy duration-dependent
Markov-switching (DDMS) models, such as DDMS-ARCH and DDMS-DD (Maheu and
McCurdy, 2000), do integrate domain knowledge, e.g., duration dependence, into their
forecasting criteria. Both models, DDMS-ARCH and DDMS-DD, extend the Markov-switching model (Bickel et al., 1998) by incorporating duration dependence, which affects a transition probability parameterized using the logistic distribution function. The transition probability is the probability of being in a particular state at a specific point in time. The value and the trend of this probability over time indicate the current state of an event. However, all of these models integrate only a single element, i.e., duration, into the model to determine the state of an event. This approach is neither flexible nor complete, as there are many other external, unknown factors that may affect the state of an event in the current environment. In addition, those models also involve parameters that need to be learned by formal mathematical computations. Without wide-ranging domain experts’ knowledge, those formal learning methods become computationally intensive and time-consuming. The whole model
building is an iterative and interactive process, including model formulation, parameter
estimation, and model evaluation. Despite enormous improvements in computer software in
recent years, fitting such a nonlinear quantitative decision model (Evans, 2010) is not a trivial task,
especially if the parameter learning process involves multiple explanatory variables, i.e., high
dimensionality. Moreover, working with high-dimensional data creates difficult challenges, a
phenomenon known as the “curse of dimensionality” (Bellman, 1957, 1961). Specifically, the
amount of observations required in order to obtain good estimates increases exponentially with
the increase of dimensionality. In addition, many learning algorithms do not scale well on high
dimensional data due to the high computational cost. The parameter computations by formal-
learning-based approaches, e.g., the logistic regression model, are complicated and costly, and they do not consider integrating various experts’ domain knowledge into the learning process, a step that could potentially reduce the dimensionality. Clearly, both approaches,
domain-knowledge-based and formal-learning-based, do not take advantage of each other to
learn optimal decision parameters, which are then used to monitor the events and to take
appropriate actions.
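As a schematic illustration of the duration dependence discussed above (not the exact DDMS-ARCH or DDMS-DD specification), the transition probability can be written as a logistic function of the duration; gamma0 and gamma1 below are illustrative coefficients that such a model would have to learn.

# Sketch: a duration-dependent transition probability via the logistic function.
import math

def transition_probability(duration, gamma0=-1.0, gamma1=0.3):
    # Probability of a state switch given how long the current state has already lasted.
    return 1.0 / (1.0 + math.exp(-(gamma0 + gamma1 * duration)))

for d in (1, 5, 10):
    print(d, round(transition_probability(d), 3))   # probability rises with duration here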
To mitigate the shortcomings of the existing approaches, we have proposed a mathematical
hybrid-based model, Expert Query Parametric Estimation (EQPE), and an SQL-based language
(Ngan, Brodsky & Lin, 2010), which combine the strengths of both domain-knowledge-based
and formal-learning-based approaches. More specifically, we take a monitoring template of
conditions in a specific form, that is, conjunctions of inequality constraints, identified by domain
experts. This template consists of inequalities of values in time sequences and then is
parameterized. The goal is to find parameters that maximize or minimize an objective function, where the objective depends on the optimal time points of a time utility function from which the parameters are learned. Because of these characteristics, however, the EQPE model can only solve a specific class of problems in which (1) the decision parameters of the objective function are learned from the optimal time points of a time utility function, (2) the monitoring template must be in the considered form, i.e., conjunctions of inequality constraints, and (3) the constraints being used serve monitoring purposes only.
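A toy illustration of the EQPE idea is sketched below in Python, with made-up series and utility values and a coarse grid search standing in for the actual EQPE computation: the threshold is chosen so that the template first fires as close as possible to the time point that maximizes the time utility function.

# Sketch: EQPE-style learning of a single threshold parameter on toy data.
def learn_threshold(series, utility, candidates):
    best_t = max(range(len(utility)), key=lambda t: utility[t])       # optimal time point
    best_theta, best_gap = None, None
    for theta in candidates:
        fire_times = [t for t, v in enumerate(series) if v < theta]   # template: value < theta
        if not fire_times:
            continue
        gap = abs(fire_times[0] - best_t)   # distance of the first firing from the optimum
        if best_gap is None or gap < best_gap:
            best_theta, best_gap = theta, gap
    return best_theta

series = [-5, -12, -25, -31, -18]
utility = [0.1, 0.2, 0.9, 0.6, 0.3]        # acting at time point 2 yields the highest utility
print(learn_threshold(series, utility, candidates=[-10, -20, -30]))   # -> -20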
To address the above weaknesses, the proposed web-mashup application service framework
for MTSA also maintains the advantage of combining the strengths of both the domain-
knowledge-based and the formal-learning-based approaches, but it is designed for a more general
class of problems over multivariate time series. This service framework supports quick
implementations of services towards decision recommendations on events. More specifically, the
MTSA Model Definition Service takes multiple templates of conditions, for example, the
monitoring template to determine the occurrence of an event identified by domain experts, the
general template for a contractual term of an electric bill required by power companies, etc. Such
templates consist of inequalities of values in time sequences, and the Learning Service then “parameterizes” them, e.g., electricPowerDemand > peakDemandBound. The goal of the learning
service is to learn parameters that optimize the objective function, e.g., minimizing the cost of
the GMU electric bill. The Monitoring and Recommendation Service continuously monitors the
data streams that satisfy the parameterized conditions of the monitoring template, in which the
parameters have been instantiated by the learning service.
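A minimal sketch of this monitoring loop (in Python, over a hypothetical stream of hourly readings and a bound assumed to have been produced by the learning service) is given below; the actual service runs continuously over the incoming data streams.

# Sketch: monitor a stream against the parameterized condition and recommend an action.
def monitor(stream, learned_peak_demand_bound):
    for timestamp, demand in stream:
        if demand > learned_peak_demand_bound:   # instantiated monitoring constraint
            yield (timestamp, "recommend: execute electric load shedding")

readings = [("10:00", 16800), ("11:00", 17450), ("12:00", 17050)]
for alert in monitor(readings, learned_peak_demand_bound=17300):
    print(alert)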
To support such services for a general class of problems, we further extend the proposed
relational database model and SQL with high-level MTSA constructs. This further extension can
support parameter learning, data monitoring, and decision recommendation over multivariate
time series for this class of problems. To this end, we identify a general-hybrid-based model,
Multivariate Time Series Analytics Parameter Estimation (MTSA-PE). This model is a
combination of both domain-knowledge-based and formal-learning-based approaches that can incorporate any global constraints, e.g., the contractual terms of the GMU electric bill,
which are applied to an entire problem, and monitoring constraints, e.g., electricPowerDemand
> peakDemandBound, which are used to detect the occurrence of an event. Both types of
inequality constraints, global and monitoring, are allowed in any possible combinations and
forms. Using the MTSA-PE model, domain experts can learn decision parameters that satisfy all
the given constraints and that optimize the objective function, which is independent of a
particular time point.
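To convey the shape of an MTSA-PE instance, the following Python sketch minimizes a deliberately simplified bill over candidate bounds, with made-up rates, a made-up contractual range as the global constraint, and electricPowerDemand > peakDemandBound as the monitoring constraint; it is a toy search, not the optimization machinery of the framework.

# Sketch: MTSA-PE-style search for the optimal peak demand bound on toy data.
DEMAND_RATE = 0.01     # $ per kWh of the contracted peak demand bound (made-up rate)
SHED_COST = 0.003      # $ per kWh of demand that has to be shed (made-up rate)
CONTRACT_MIN, CONTRACT_MAX = 16000, 18000   # global constraint from the contractual terms

def monthly_cost(bound, hourly_demand):
    # Monitoring constraint: load is shed whenever electricPowerDemand > peakDemandBound.
    shed = sum(max(d - bound, 0) for d in hourly_demand)
    return DEMAND_RATE * bound + SHED_COST * shed

def learn_bound(hourly_demand, step=100):
    candidates = range(CONTRACT_MIN, CONTRACT_MAX + 1, step)
    return min(candidates, key=lambda b: monthly_cost(b, hourly_demand))

demand = [16500, 17100, 17350, 16900, 17600]
best = learn_bound(demand)
print(best, round(monthly_cost(best, demand), 2))    # -> 16900 173.05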
To demonstrate our MTSA-PE model, we conduct an experimental case study on the Fairfax
campus microgrid at GMU. We utilize the MTSA-PE model to illustrate the GMU problem and
the further extended MTSA-query constructs to express the model. After the MTSA-query
constructs are initiated to learn the optimal peak demand bound over historical and projected
electric power demands, the occurrence of the event can be monitored and determined through
the parametric monitoring constraints, e.g., electricPowerDemand > peakDemandBound. Once
the event is detected, the electric load shedding can be executed.
The rest of the paper is organized as follows: using the GMU Fairfax campus microgrid as an
example, we describe its electric bill problem in the second section. In the third section, we
provide an overview of the web-mashup application service framework for multivariate time series analytics and describe its support for quick service implementations towards
recommendations on events over multivariate time series. In the fourth section, we use the GMU
electric bill as an example to describe the further extended MTSA data model and query
language that is used for the MTSA service implementations of the general class of problems.
We also define the Multivariate Time Series Analytics - Parameter Estimation (MTSA-PE)
model for the MTSA-query semantics and use the GMU electric bill problem to illustrate the
learning, monitoring, and recommendation services on this model in the fifth section. In the sixth
section, we present the architecture for the parameter learning process. In the seventh section, we
conduct and describe the experimental case study on that GMU problem. In the eighth section,
we conclude and briefly outline the future work.
PROBLEM DESCRIPTION OF A REAL CASE STUDY
Consider the real case study at George Mason University (GMU), where the electric power
demand across the expanding Fairfax and other campuses is expected to increase. The increase in
power consumption results in a higher electricity cost, which is composed of two main
components: (1) a total kilowatt-hour (kWh) charge, i.e., the charge for the total electricity
consumption, and (2) an Electricity Supply (ES) service charge, i.e., the charge for the peak
demand usage in any 30-minute interval over the past 12 months. The total kWh charge is priced at a higher rate during the business office hours between 09:00 a.m. and 06:00 p.m. from
Monday to Friday. This monthly ES service charge (monthlyEServiceCharge) is a proxy for the
cost of capital investment for power generation capacity, since the power company, Virginia
Electric and Power Company, needs to build generation, transmission, and distribution facilities
that are capable of supporting the peak demand, even though the average power demand could be
considerably lower. This ES service charge amounts to approximately 30% of the electric bill in
each monthly pay period (payPeriod) and is determined based upon the electricity supply
demand (payPeriodSupplyDemand). This electricity supply demand is determined as the higher of (C1) or (C2):
C1: The highest average kilowatt measured in any 30-minute interval of the current billing
month during the on-peak hours of either:
Between 10 a.m. and 10 p.m. from Monday to Friday for the billing months of June
through September or
Between 7 a.m. and 10 p.m. from Monday to Friday for all other billing months.
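For illustration, the C1 quantity can be computed mechanically from metered data; the Python sketch below uses a hypothetical list of 30-minute readings and the June-through-September on-peak schedule, and is a simplification rather than the utility's billing calculation.

# Sketch: highest average kW over any 30-minute on-peak interval (condition C1),
# simplified to the June-September schedule (10 a.m. - 10 p.m., Monday-Friday).
from datetime import datetime

def on_peak_summer(ts):
    return ts.weekday() < 5 and 10 <= ts.hour < 22

def c1_peak_demand(readings):
    # readings: list of (timestamp, average kW over the 30-minute interval ending then)
    peaks = [kw for ts, kw in readings if on_peak_summer(ts)]
    return max(peaks) if peaks else 0.0

readings = [
    (datetime(2012, 7, 2, 9, 30), 15200.0),   # before on-peak hours, ignored
    (datetime(2012, 7, 2, 14, 0), 17350.0),
    (datetime(2012, 7, 7, 15, 0), 17900.0),   # a Saturday, ignored
]
print(c1_peak_demand(readings))               # -> 17350.0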