Optimal Event Monitoring through Internet Mashup over
Multivariate Time Series
Chun-Kit Ngan, George Mason University, USA
Alexander Brodsky, George Mason University, USA
ABSTRACT
We propose a Web-Mashup Application Service Framework for Multivariate Time Series
Analytics (MTSA) that supports the services of model definitions, querying, parameter learning,
model evaluations, data monitoring, decision recommendations, and web portals. This
framework maintains the advantage of combining the strengths of both the domain-knowledge-
based and the formal-learning-based approaches and is designed for a more general class of
problems over multivariate time series. More specifically, we identify a general-hybrid-based
model, MTSA Parameter Estimation, to solve this class of problems in which the objective
function is maximized or minimized from the optimal decision parameters regardless of
particular time points. This model also allows domain experts to include multiple types of
constraints, e.g., global constraints and monitoring constraints. We further extend the MTSA
data model and query language to support this class of problems for the services of learning,
monitoring, and recommendation. At the end, we conduct an experimental case study for a
university campus microgrid as a practical example to demonstrate our proposed framework,
models, and language.
Keywords: Web-Mashup Framework, Parameter Learning, Decision Support, Optimization
Model, Query Language
INTRODUCTION
Observing behaviors, trends, and patterns on multivariate time series (Bisgaard & Kulahci, 2011;
Chatfield, 2001) has been broadly used in various application domains, such as financial
markets, medical treatments, economic studies, and electric power management. Domain experts
utilize multiple time series to detect events and make better decisions. For example, financial
analysts predict different states of the stock market, e.g., bull or bear, more accurately based
upon monitoring daily stock prices, weekly interest rates, and monthly price indices. Physicians
monitor patients’ health conditions by measuring their diastolic and systolic blood pressures, as
well as their electrocardiogram tracings over time. Sociologists uncover hidden social problems
within a community more profoundly through studying a variety of economic, medical, and
social indicators, e.g., annual birth rates, mortality rates, accident rates, and various crime rates.
The goal of examining these event characteristics over multivariate time series is to help decision makers, e.g., financial analysts, physicians, and sociologists, better understand a problem from different perspectives within a particular domain and offer better actionable recommendations.
To support such event-based decision-making and determination over multivariate time
series, in this paper we propose a Web-Mashup Application Service Framework for Multivariate
Time Series Analytics (MTSA). This framework is an integrated tool to support the MTSA
service development, including model definitions, querying, parameter learning, data monitoring,
decision recommendations, and model evaluations. Domain experts could use the framework to
develop and implement their web-based decision-making applications on the Internet. Using a
Web Mashup function offered by the Web 2.0 technology (Vancea & Others, 2008; Gurram &
Others, 2008; Murugesan, 2007; Bradley, 2008; Alonso & Others, 2004; Altinel & Others, 2007;
Ennals & Others, 2007; Thor & Others, 2007) on our framework, domain experts could collect
and unify global information and data from different channels and media, such as web sites, data
sources, organizational information, etc., to generate a concentric view of collected time series
data from which the learning service determines optimal decision parameters. Using optimal
decision parameters, domain experts can employ the monitoring service to detect events and the
recommendation service to suggest actions.
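To make this data-collection and unification step concrete, the following is a minimal Python sketch; the feed locations, column names, and values are purely illustrative assumptions, not part of the framework. It shows how series pulled from separate web sources could be aligned on their timestamps into a single view for the learning service.

# Sketch: unify time series collected from separate (hypothetical) web sources into one view.
import pandas as pd

def load_series(csv_source, value_name):
    # csv_source may be a URL or file path exposing 'timestamp' and 'value' columns.
    df = pd.read_csv(csv_source, parse_dates=["timestamp"])
    return df.rename(columns={"value": value_name})

# In a deployment these would be web feeds, e.g., load_series("https://.../demand.csv", ...);
# here two tiny in-memory frames stand in for the fetched data.
demand = pd.DataFrame({"timestamp": pd.to_datetime(["2012-07-02 10:00", "2012-07-02 11:00"]),
                       "electricPowerDemand": [16800, 17450]})
prices = pd.DataFrame({"timestamp": pd.to_datetime(["2012-07-02 10:30"]),
                       "energyPrice": [0.12]})

# Align the series on time to form a single multivariate view for the learning service.
unified = pd.merge_asof(demand.sort_values("timestamp"),
                        prices.sort_values("timestamp"), on="timestamp")
print(unified)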
Presently, there are two key approaches that domain users utilize to identify and detect
interesting events over multivariate time series. These approaches are domain-knowledge-based
and formal-learning-based. The former approach completely relies on domain experts’
knowledge. Based on their knowledge and experience, domain experts determine monitoring
conditions that detect events of interest and trigger an appropriate action. More specifically,
domain experts, e.g., financial analysts, have identified several deterministic time series, such as
the S&P 500 percentage decline (SPD) and the Consumer Confidence Index drop (CCD), from
which they develop parametric monitoring templates, e.g., SPD < -20%, CCD < -30 (Stack,
2009), etc., according to their expertise. Once the incoming time series, i.e., SPD and CCD,
satisfy the given templates at a particular time point, the financial analysts decide that the bear market bottom is coming, which is the best opportunity to purchase stocks and earn the maximal return.
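As a minimal Python sketch (with made-up series values and the illustrative thresholds quoted above), evaluating such a template amounts to checking the conjunction of inequalities at each time point:

# Sketch: evaluate the parametric monitoring template SPD < a and CCD < b per time point.
def bear_bottom_signals(spd_series, ccd_series, spd_param=-20.0, ccd_param=-30.0):
    # The event fires whenever both inequality conditions of the template hold.
    return [t for t, (spd, ccd) in enumerate(zip(spd_series, ccd_series))
            if spd < spd_param and ccd < ccd_param]

# Toy example: the template fires at time points 2 and 3.
print(bear_bottom_signals([-5, -12, -25, -31], [-10, -28, -35, -40]))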
Consider another real-world case study of the timely event detection of certain conditions in
the electric power microgrid at George Mason University (GMU), where its energy planners
would like to regularly detect when the electric power demand (electricPowerDemand) exceeds
the pre-determined peak demand bound (peakDemandBound). The reason is that the occurrence
of this event accounts for a significant portion of the GMU electric bill under its contractual
terms, even though the event, electricPowerDemand > peakDemandBound, may occur only within a
short period of time, e.g., one minute. Thus, such an identification and detection can aid in the
task of decision-making and the determination of action plans. To make better decisions and
determinations, the energy planners have identified a set of time series that can be used to detect
the event and perform an action, e.g., to execute the electric load shedding to shut down some
electric account units on the GMU campus according to a prioritization scheme from the energy
manager. The multiple time series include the input electric power demand per hourly time
interval, the given peak demand bound per monthly pay period, etc. If these time series satisfy a
pre-defined, parameterized condition, e.g., electricPowerDemand > peakDemandBound, where
the given peakDemandBound is 17200 kWh for all the hourly time intervals within the same
monthly pay period, e.g., July, 2012, it signals the energy planners to execute the electric load
shedding in the microgrid on the campus. Often these parameters, e.g., the predetermined peak
demand bound, may reflect some realities since they are set by domain experts, e.g., the energy
planners, based on their past experiences, observations, intuitions, and domain knowledge.
However, these given thresholds, e.g., the peak demand bound, are not always accurate. In addition, the parameters are static, whereas the problem that we deal with is often dynamic in nature, so the given parameters are unlikely to be the optimal values for achieving the monitoring purpose over different periods of time, e.g., hourly, daily, monthly, quarterly, and yearly, and for minimizing the electricity expenses of the bill. Thus, this domain-knowledge-based approach lacks a formal mathematical foundation that dynamically learns optimal decision parameters to determine an event.
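For concreteness, the static-threshold monitoring just described can be sketched as follows; this is only a Python illustration with a made-up hourly demand series, and the hand-set 17,200 kWh bound is precisely the kind of parameter the framework proposed here learns instead.

# Sketch: static-threshold monitoring with the expert-chosen peak demand bound.
PEAK_DEMAND_BOUND = 17200  # kWh, set by the energy planners for the pay period

def hours_requiring_load_shedding(hourly_demand_kwh, bound=PEAK_DEMAND_BOUND):
    # Return the hourly intervals at which electricPowerDemand > peakDemandBound.
    return [hour for hour, demand in enumerate(hourly_demand_kwh) if demand > bound]

# Toy example: only hour 2 exceeds the bound.
print(hours_requiring_load_shedding([16500, 17100, 17350, 16900]))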
The latter approach utilizes a formal learning methodology, such as a non-linear logistic
regression model (Bierens, 2008; Cook & Others, 2000; Dougherty, 2007; Hansen, 2010; Heij &
Others, 2004). This regression model is used to predict the occurrence of an event (0 or 1), e.g.,
when to shed load or unshed load, by learning parametric coefficients of the logistic distribution
function of explanatory variables, i.e., the electric power demand and the peak demand bound.
More specifically, this non-linear logistic regression model focuses on modeling the data
relationship between explanatory variables and response variables. In reality, not all response variables are numeric and continuous. In many real-world cases, the responses may only take one of two possible answers, e.g., shed load or unshed load, buy or sell stocks, success or failure, etc. Each response outcome is assigned the value 1 if the probability of the event happening is above 0.5 and 0 otherwise. To learn the parametric coefficients of the logistic
distribution function of explanatory variables to determine the outcome of the binary responses,
we can apply the nonlinear logistic regression model and the Maximum Likelihood Estimation
(MLE) (Myung, 2003) over historical and projected data. However, the main challenge of using
formal learning approaches is that they do not always produce satisfactory results, as they do not
consider incorporating domain knowledge, including monitoring constraints, e.g.,
electricPowerDemand > peakDemandBound, and global constraints, e.g., utility contractual terms, into their formal learning approaches. Without domain experts’ knowledge in parameter learning, the result can be inaccurate decision-making. For instance, the energy planners might
execute the electric load shedding at an improper moment of time, particularly during the
business office hours between 9:00 a.m. and 6:00 p.m. from Monday to Friday.
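For illustration only, a logistic-regression fit over the two explanatory variables could look like the following Python sketch, which uses scikit-learn and synthetic data rather than the GMU measurements; the fitted coefficients play the role of the learned parameters, and a predicted probability above 0.5 maps to the shed-load outcome.

# Sketch: logistic regression over (electricPowerDemand, peakDemandBound) with synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
demand = rng.uniform(15000, 19000, size=200)          # hourly demand (kWh), synthetic
bound = np.full_like(demand, 17200.0)                 # peak demand bound (kWh)
X = np.column_stack([demand, bound])
y = (demand > bound).astype(int)                      # 1 = shed load, 0 = do not shed

# scikit-learn fits the coefficients by (regularized) maximum likelihood.
model = LogisticRegression().fit(X, y)
prob = model.predict_proba([[17500.0, 17200.0]])[0, 1]
print("P(shed load) =", round(prob, 3), "-> decision:", int(prob > 0.5))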
Some existing mathematical models, e.g., the Durland and McCurdy duration-dependent
Markov-switching (DDMS) models, such as DDMS-ARCH and DDMS-DD (Maheu and
McCurdy, 2000), do integrate domain knowledge, e.g., duration dependence, into their
forecasting criteria. Both models, DDMS-ARCH and DDMS-DD, extend the Markov-switching model (Bickel et al., 1998) by incorporating duration dependence, which affects a transition probability parameterized using the logistic distribution function. The transition probability is the probability of being in a particular state at a specific point in time. The value and the trend of this probability over time indicate the current state of an event. However, all of these models integrate only a single element, i.e., duration, into the model to determine the state of an event. This approach is neither flexible nor complete, as there are many other external, unknown factors that may affect the state of an event in the current environment. In addition, those models also involve parameters that need to be learned by formal mathematical computations. Without wide-ranging domain experts’ knowledge, those formal learning methods become computationally intensive and time-consuming. The whole model
building is an iterative and interactive process, including model formulation, parameter
estimation, and model evaluation. Despite enormous improvements in computer software in
recent years, fitting such a nonlinear quantitative decision model (Evans, 2010) is not a trivial task,
especially if the parameter learning process involves multiple explanatory variables, i.e., high
dimensionality. Moreover, working with high-dimensional data creates difficult challenges, a
phenomenon known as the “curse of dimensionality” (Bellman, 1957, 1961). Specifically, the
amount of observations required in order to obtain good estimates increases exponentially with
the increase of dimensionality. In addition, many learning algorithms do not scale well on high
dimensional data due to the high computational cost. The parameter computations by formal-
learning-based approaches, e.g., the logistic regression model, are complicated and costly, and they do not consider integrating various experts’ domain knowledge into the learning process, a step that could potentially reduce the dimensionality. Clearly, both approaches,
domain-knowledge-based and formal-learning-based, do not take advantage of each other to
learn optimal decision parameters, which are then used to monitor the events and to take
appropriate actions.
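As a schematic illustration of the duration dependence discussed above (not the exact DDMS-ARCH or DDMS-DD specification), the transition probability can be written as a logistic function of the duration; gamma0 and gamma1 below are illustrative coefficients that such a model would have to learn.

# Sketch: a duration-dependent transition probability via the logistic function.
import math

def transition_probability(duration, gamma0=-1.0, gamma1=0.3):
    # Probability of a state switch given how long the current state has already lasted.
    return 1.0 / (1.0 + math.exp(-(gamma0 + gamma1 * duration)))

for d in (1, 5, 10):
    print(d, round(transition_probability(d), 3))   # probability rises with duration here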
To mitigate the shortcomings of the existing approaches, we have proposed a mathematical
hybrid-based model, Expert Query Parametric Estimation (EQPE), and an SQL-based language
(Ngan, Brodsky & Lin, 2010), which combine the strengths of both domain-knowledge-based
and formal-learning-based approaches. More specifically, we take a monitoring template of
conditions in a specific form, that is, conjunctions of inequality constraints, identified by domain
experts. This template consists of inequalities of values in time sequences and then is
parameterized. The goal is to find parameters that maximize or minimize an objective function, where the objective depends on the optimal time points of a time utility function from which the parameters are learned. Because of these characteristics, however, the EQPE model can only solve a specific class of problems in which (1) the decision parameters of the objective function are learned from the optimal time points of a time utility function, (2) the monitoring template must be in the considered form, i.e., conjunctions of inequality constraints, and (3) the constraints being used serve monitoring purposes only.
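A toy illustration of the EQPE idea is sketched below in Python, with made-up series and utility values and a coarse grid search standing in for the actual EQPE computation: the threshold is chosen so that the template first fires as close as possible to the time point that maximizes the time utility function.

# Sketch: EQPE-style learning of a single threshold parameter on toy data.
def learn_threshold(series, utility, candidates):
    best_t = max(range(len(utility)), key=lambda t: utility[t])       # optimal time point
    best_theta, best_gap = None, None
    for theta in candidates:
        fire_times = [t for t, v in enumerate(series) if v < theta]   # template: value < theta
        if not fire_times:
            continue
        gap = abs(fire_times[0] - best_t)   # distance of the first firing from the optimum
        if best_gap is None or gap < best_gap:
            best_theta, best_gap = theta, gap
    return best_theta

series = [-5, -12, -25, -31, -18]
utility = [0.1, 0.2, 0.9, 0.6, 0.3]        # acting at time point 2 yields the highest utility
print(learn_threshold(series, utility, candidates=[-10, -20, -30]))   # -> -20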
To address the above weaknesses, the proposed web-mashup application service framework
for MTSA also maintains the advantage of combining the strengths of both the domain-
knowledge-based and the formal-learning-based approaches, but it is designed for a more general
class of problems over multivariate time series. This service framework supports quick
implementations of services towards decision recommendations on events. More specifically, the
MTSA Model Definition Service takes multiple templates of conditions, for example, the
monitoring template to determine the occurrence of an event identified by domain experts, the
general template for a contractual term of an electric bill required by power companies, etc. Such
templates consist of inequalities of values in time sequences, and the Learning Service then “parameterizes” them, e.g., electricPowerDemand > peakDemandBound. The goal of the learning
service is to learn parameters that optimize the objective function, e.g., minimizing the cost of
the GMU electric bill. The Monitoring and Recommendation Service continuously monitors the
data streams that satisfy the parameterized conditions of the monitoring template, in which the
parameters have been instantiated by the learning service.
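A minimal sketch of this monitoring loop (in Python, over a hypothetical stream of hourly readings and a bound assumed to have been produced by the learning service) is given below; the actual service runs continuously over the incoming data streams.

# Sketch: monitor a stream against the parameterized condition and recommend an action.
def monitor(stream, learned_peak_demand_bound):
    for timestamp, demand in stream:
        if demand > learned_peak_demand_bound:   # instantiated monitoring constraint
            yield (timestamp, "recommend: execute electric load shedding")

readings = [("10:00", 16800), ("11:00", 17450), ("12:00", 17050)]
for alert in monitor(readings, learned_peak_demand_bound=17300):
    print(alert)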
To support such services for a general class of problems, we further extend the proposed
relational database model and SQL with high-level MTSA constructs. This further extension can
support parameter learning, data monitoring, and decision recommendation over multivariate
time series for this class of problems. To this end, we identify a general-hybrid-based model,
Multivariate Time Series Analytics Parameter Estimation (MTSA-PE). This model is a
combination of both domain-knowledge-based and formal-learning-based approaches that can incorporate any global constraints, e.g., the contractual terms of the GMU electric bill,
which are applied to an entire problem, and monitoring constraints, e.g., electricPowerDemand
> peakDemandBound, which are used to detect the occurrence of an event. Both types of
inequality constraints, global and monitoring, are allowed in any possible combinations and
forms. Using the MTSA-PE model, domain experts can learn decision parameters that satisfy all
the given constraints and that optimize the objective function, which is independent of a
particular time point.
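To convey the shape of an MTSA-PE instance, the following Python sketch minimizes a deliberately simplified bill over candidate bounds, with made-up rates, a made-up contractual range as the global constraint, and electricPowerDemand > peakDemandBound as the monitoring constraint; it is a toy search, not the optimization machinery of the framework.

# Sketch: MTSA-PE-style search for the optimal peak demand bound on toy data.
DEMAND_RATE = 0.01     # $ per kWh of the contracted peak demand bound (made-up rate)
SHED_COST = 0.003      # $ per kWh of demand that has to be shed (made-up rate)
CONTRACT_MIN, CONTRACT_MAX = 16000, 18000   # global constraint from the contractual terms

def monthly_cost(bound, hourly_demand):
    # Monitoring constraint: load is shed whenever electricPowerDemand > peakDemandBound.
    shed = sum(max(d - bound, 0) for d in hourly_demand)
    return DEMAND_RATE * bound + SHED_COST * shed

def learn_bound(hourly_demand, step=100):
    candidates = range(CONTRACT_MIN, CONTRACT_MAX + 1, step)
    return min(candidates, key=lambda b: monthly_cost(b, hourly_demand))

demand = [16500, 17100, 17350, 16900, 17600]
best = learn_bound(demand)
print(best, round(monthly_cost(best, demand), 2))    # -> 16900 173.05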
To demonstrate our MTSA-PE model, we conduct an experimental case study on the Fairfax
campus microgrid at GMU. We utilize the MTSA-PE model to illustrate the GMU problem and
the further extended MTSA-query constructs to express the model. After the MTSA-query
constructs are initiated to learn the optimal peak demand bound over historical and projected
electric power demands, the occurrence of the event can be monitored and determined through
the parametric monitoring constraints, e.g., electricPowerDemand > peakDemandBound. Once
the event is detected, the electric load shedding can be executed.
The rest of the paper is organized as follows: using the GMU Fairfax campus microgrid as an
example, we describe its electric bill problem in the second section. In the third section, we
provide an overview of the web-mashup application service framework for multivariate time series analytics and describe its support for quick service implementations towards
recommendations on events over multivariate time series. In the fourth section, we use the GMU
electric bill as an example to describe the further extended MTSA data model and query
language that is used for the MTSA service implementations of the general class of problems.
We also define the Multivariate Time Series Analytics - Parameter Estimation (MTSA-PE)
model for the MTSA-query semantics and use the GMU electric bill problem to illustrate the
learning, monitoring, and recommendation services on this model in the fifth section. In the sixth
section, we present the architecture for the parameter learning process. In the seventh section, we
conduct and describe the experimental case study on that GMU problem. In the eighth section,
we conclude and briefly outline the future work.
PROBLEM DESCRIPTION OF A REAL CASE STUDY
Consider the real case study at George Mason University (GMU), where the electric power
demand across the expanding Fairfax and other campuses is expected to increase. The increase in
power consumption results in a higher electricity cost, which is composed of two main
components: (1) a total kilowatt-hour (kWh) charge, i.e., the charge for the total electricity
consumption, and (2) an Electricity Supply (ES) service charge, i.e., the charge for the peak
demand usage in any 30-minute interval over the past 12 months. The total kWh charge is priced at a higher rate during the business office hours between 09:00 a.m. and 06:00 p.m. from
Monday to Friday. This monthly ES service charge (monthlyEServiceCharge) is a proxy for the
cost of capital investment for power generation capacity, since the power company, Virginia
Electric and Power Company, needs to build generation, transmission, and distribution facilities
that are capable of supporting the peak demand, even though the average power demand could be
considerably lower. This ES service charge amounts to approximately 30% of the electric bill in
each monthly pay period (payPeriod) and is determined based upon the electricity supply
demand (payPeriodSupplyDemand). This electricity supply demand is determined as the higher of (C1) or (C2):
C1: The highest average kilowatt measured in any 30-minute interval of the current billing
month during the on-peak hours of either:
Between 10 a.m. and 10 p.m. from Monday to Friday for the billing months of June
through September or
Between 7 a.m. and 10 p.m. from Monday to Friday for all other billing months.
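For illustration, the C1 quantity can be computed mechanically from metered data; the Python sketch below uses a hypothetical list of 30-minute readings and the June-through-September on-peak schedule, and is a simplification rather than the utility's billing calculation.

# Sketch: highest average kW over any 30-minute on-peak interval (condition C1),
# simplified to the June-September schedule (10 a.m. - 10 p.m., Monday-Friday).
from datetime import datetime

def on_peak_summer(ts):
    return ts.weekday() < 5 and 10 <= ts.hour < 22

def c1_peak_demand(readings):
    # readings: list of (timestamp, average kW over the 30-minute interval ending then)
    peaks = [kw for ts, kw in readings if on_peak_summer(ts)]
    return max(peaks) if peaks else 0.0

readings = [
    (datetime(2012, 7, 2, 9, 30), 15200.0),   # before on-peak hours, ignored
    (datetime(2012, 7, 2, 14, 0), 17350.0),
    (datetime(2012, 7, 7, 15, 0), 17900.0),   # a Saturday, ignored
]
print(c1_peak_demand(readings))               # -> 17350.0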