Fitting State-space Model for Long-term Prediction
of the Log-likelihood of Nonstationary Time Series Models
Genshiro Kitagawa
Mathematics and Informatics Center, The University of Tokyo
October 13, 2022
Abstract

The goodness of the long-term prediction in the state-space model was evaluated using the squared long-term prediction error. In order to estimate the model parameters suitable for long-term prediction, we devised a modified log-likelihood corresponding to the long-term prediction error variance. Trend models and seasonally adjusted models with and without an AR component are examined as examples.
1 Introduction: State-Space Modeling of Time Series
1.1 State-Space Model and State Estimation
Consider the state-space model of a univariate time series $y_n$:
$$x_n = F_n x_{n-1} + G_n v_n, \qquad \text{(system model)} \tag{1}$$
$$y_n = H_n x_n + w_n, \qquad \text{(observation model)} \tag{2}$$
where $x_n$ is a $k$-dimensional state vector, $v_n$ is an $m$-dimensional system noise that follows a white noise with mean vector zero and variance-covariance matrix $Q_n$, and $w_n$ is the observation noise that follows a one-dimensional Gaussian white noise with mean zero and variance $R_n$. $F_n$, $G_n$, and $H_n$ are $k \times k$, $k \times m$, and $1 \times k$ matrices, respectively. The initial state vector $x_0$ is assumed to follow the Gaussian distribution $N(0, V_{0|0})$. Many linear models used in time series analysis, such as the AR model and the ARMA model, and various nonstationary models, such as the trend model and the seasonal adjustment model, are expressible in terms of the state-space model (Anderson and Moore (1979), Kitagawa (2020)).
In this paper, we shall consider the problem of estimating the state $x_n$ at time $n$ based on the set of observations $Y_j = \{y_1, \dots, y_j\}$. For $j < n$, $j = n$, and $j > n$, the state estimation problem is referred to as prediction, filtering, and smoothing, respectively. This state estimation problem is important in state-space modeling, since many tasks, such as one-step-ahead and multi-step-ahead prediction, interpolation, and likelihood computation for the time series, can be solved systematically through the estimated state.
A generic approach to these state estimation problems is to obtain the conditional distribution $p(x_n | Y_j)$ of the state $x_n$ given the observations $Y_j$. Since the state-space model defined by (1) and (2) is a linear model and, moreover, the noises $v_n$ and $w_n$ and the initial state $x_0$ follow normal distributions, all of these conditional distributions are normal. Therefore, to solve the state estimation problem for the state-space model, it suffices to obtain the mean vectors $x_{n|j}$ and the variance-covariance matrices $V_{n|j}$ of the conditional distributions.
For the linear state-space model, the Kalman filter provides a computationally efficient recursive algorithm for state estimation (Kalman (1960), Anderson and Moore (1979)).
One-step-ahead prediction:
$$x_{n|n-1} = F_n x_{n-1|n-1}$$
$$V_{n|n-1} = F_n V_{n-1|n-1} F_n^T + G_n Q_n G_n^T. \tag{3}$$

Filter:
$$K_n = V_{n|n-1} H_n^T \left( H_n V_{n|n-1} H_n^T + R_n \right)^{-1}$$
$$x_{n|n} = x_{n|n-1} + K_n \left( y_n - H_n x_{n|n-1} \right) \tag{4}$$
$$V_{n|n} = \left( I - K_n H_n \right) V_{n|n-1}.$$
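For concreteness, the following is a minimal sketch of these recursions in NumPy (not code from the paper; the function name and array layout are our own choices). Since $y_n$ is univariate, the $1 \times k$ matrix $H_n$ is represented as a length-$k$ vector.

```python
import numpy as np

def kalman_step(x, V, y, F, G, H, Q, R):
    """One cycle of the one-step-ahead prediction (3) and filter (4).

    x : (k,)  filtered state mean x_{n-1|n-1}
    V : (k,k) filtered covariance V_{n-1|n-1}
    y : scalar observation y_n
    H : (k,)  observation matrix H_n (a row vector, since y_n is scalar)
    Returns x_{n|n}, V_{n|n}, the innovation e = y_n - H_n x_{n|n-1},
    and its variance d = d_{n|n-1}.
    """
    # One-step-ahead prediction, eq. (3)
    x_p = F @ x
    V_p = F @ V @ F.T + G @ Q @ G.T
    # Innovation and its variance, cf. eqs. (8)-(9) below
    e = y - H @ x_p
    d = H @ V_p @ H + R
    # Filter, eq. (4): Kalman gain, then mean and covariance update
    K = V_p @ H / d
    x_f = x_p + K * e
    V_f = V_p - np.outer(K, H @ V_p)   # (I - K_n H_n) V_{n|n-1}
    return x_f, V_f, e, d
```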
1.2 Likelihood Computation and Parameter Estimation for Time Series Models
Assume that the state-space representation of a time series model specified by a parameter $\theta$ is given. When the time series $y_1, \dots, y_N$ of length $N$ is given, the $N$-dimensional joint density function of $y_1, \dots, y_N$ specified by this time series model is denoted by $f_N(y_1, \dots, y_N | \theta)$. Then the likelihood of this model is defined by $L(\theta) = f_N(y_1, \dots, y_N | \theta)$. Using the conditional distribution of $y_n$ given the previous observations, the likelihood of the time series model can be expressed as a product of one-dimensional conditional density functions:
$$L(\theta) = \prod_{n=1}^{N} g_n(y_n | y_1, \dots, y_{n-1}, \theta) = \prod_{n=1}^{N} g_n(y_n | Y_{n-1}, \theta). \tag{5}$$
Here, if we define $Y_0 = \emptyset$ (the empty set), then $g_1(y_1 | Y_0, \theta) \equiv f_1(y_1 | \theta)$. By taking the logarithm of $L(\theta)$, the log-likelihood of the model is obtained as
$$\ell(\theta) = \log L(\theta) = \sum_{n=1}^{N} \log g_n(y_n | Y_{n-1}, \theta). \tag{6}$$
Since $g_n(y_n | Y_{n-1}, \theta)$ is the conditional distribution of $y_n$ given the observations $Y_{n-1}$, and it is in fact a normal distribution with mean $y_{n|n-1}$ and variance $d_{n|n-1}$, it can be expressed as (Kitagawa and Gersch (1996))
$$g_n(y_n | Y_{n-1}, \theta) = \left( 2\pi d_{n|n-1} \right)^{-\frac{1}{2}} \exp\left\{ -\frac{(y_n - y_{n|n-1})^2}{2 d_{n|n-1}} \right\}. \tag{7}$$
Here, from the observation model (2), $y_{n|n-1}$ and $d_{n|n-1}$ are obtained by
$$y_{n|n-1} = H_n x_{n|n-1} \tag{8}$$
$$d_{n|n-1} = H_n V_{n|n-1} H_n^T + R_n. \tag{9}$$
Therefore, by substituting this density function into (6), the log-likelihood of the state-space model is obtained as
$$\ell(\theta) = -\frac{1}{2}\left\{ N \log 2\pi + \sum_{n=1}^{N} \log d_{n|n-1} + \sum_{n=1}^{N} \frac{(y_n - y_{n|n-1})^2}{d_{n|n-1}} \right\}. \tag{10}$$
The maximum likelihood estimates of the parameters of the state-space model can be obtained by maximizing this log-likelihood function numerically. However, for a univariate time series, we can assume that $R = 1$ (Kitagawa (2020)). In fact, if $\tilde{V}_{n|n}$, $\tilde{V}_{n|n-1}$, $\tilde{Q}_n$, and $\tilde{R}$ are defined by
$$V_{n|n-1} = \sigma^2 \tilde{V}_{n|n-1}, \quad V_{n|n} = \sigma^2 \tilde{V}_{n|n}, \quad Q_n = \sigma^2 \tilde{Q}_n, \quad \tilde{R} = 1, \tag{11}$$
then the resulting Kalman gain $\tilde{K}_n$ is identical to $K_n$. Therefore, in the filtering step, we may use $\tilde{V}_{n|n}$ and $\tilde{V}_{n|n-1}$ instead of $V_{n|n}$ and $V_{n|n-1}$. Furthermore, the state mean vectors $x_{n|n-1}$ and $x_{n|n}$ do not change under these modifications. In summary, if $R_n$ is time-invariant and $R = \sigma^2$ is an unknown parameter, we may apply the Kalman filter by setting $R = 1$. Since we then have $d_{n|n-1} = \sigma^2 \tilde{d}_{n|n-1}$ from (9) and (11), this yields
$$\ell(\theta) = -\frac{1}{2}\left\{ N \log 2\pi\sigma^2 + \sum_{n=1}^{N} \log \tilde{d}_{n|n-1} + \frac{1}{\sigma^2} \sum_{n=1}^{N} \frac{(y_n - y_{n|n-1})^2}{\tilde{d}_{n|n-1}} \right\}. \tag{12}$$
From the likelihood equation
$$\frac{\partial \ell(\theta)}{\partial \sigma^2} = -\frac{1}{2}\left\{ \frac{N}{\sigma^2} - \frac{1}{(\sigma^2)^2} \sum_{n=1}^{N} \frac{(y_n - y_{n|n-1})^2}{\tilde{d}_{n|n-1}} \right\} = 0, \tag{13}$$
the maximum likelihood estimate of $\sigma^2$ is obtained as
$$\hat{\sigma}^2 = \frac{1}{N} \sum_{n=1}^{N} \frac{(y_n - y_{n|n-1})^2}{\tilde{d}_{n|n-1}}. \tag{14}$$
Furthermore, denoting the parameters in $\theta$ other than the variance $\sigma^2$ by $\theta^*$ and substituting (14) into (12), we obtain an expression for the log-likelihood
$$\ell(\theta^*) = -\frac{1}{2}\left( N \log 2\pi\hat{\sigma}^2 + \sum_{n=1}^{N} \log \tilde{d}_{n|n-1} + N \right). \tag{15}$$
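As an illustration, a sketch of this computation (our own arrangement, not the paper's code), reusing the hypothetical `kalman_step` function from Section 1.1 with $\tilde{R} = 1$: a single pass of the filter yields both $\hat{\sigma}^2$ of (14) and $\ell(\theta^*)$ of (15).

```python
import numpy as np

def concentrated_log_likelihood(y, F, G, H, Qtilde, x0, V0):
    """Compute sigma^2-hat of eq. (14) and ell(theta*) of eq. (15),
    running the Kalman filter with R~ = 1 (scaled covariances)."""
    N = len(y)
    x, V = x0, V0
    sum_log_d = 0.0
    sum_sq = 0.0
    for n in range(N):
        x, V, e, d = kalman_step(x, V, y[n], F, G, H, Qtilde, R=1.0)
        sum_log_d += np.log(d)   # accumulates log d~_{n|n-1}
        sum_sq += e**2 / d       # accumulates (y_n - y_{n|n-1})^2 / d~_{n|n-1}
    sigma2 = sum_sq / N                                              # eq. (14)
    ell = -0.5 * (N * np.log(2.0 * np.pi * sigma2) + sum_log_d + N)  # eq. (15)
    return ell, sigma2
```

Maximizing this function numerically over the remaining parameters $\theta^*$ (e.g., the elements of $\tilde{Q}_n$) then gives the maximum likelihood estimates.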
1.3 Parameter Estimation and Criterion for Increasing Horizon Prediction of the State
For the state-space model, by repeating the one-step-ahead prediction step, we can perform increasing horizon prediction; that is, we can obtain $x_{n+j|n}$ and $V_{n+j|n}$ for $j = 1, 2, \dots, p$.
The increasing horizon prediction: for $j = 1, \dots, p$, repeat
$$x_{n+j|n} = F_{n+j} x_{n+j-1|n}$$
$$V_{n+j|n} = F_{n+j} V_{n+j-1|n} F_{n+j}^T + G_{n+j} Q_{n+j} Q_{n+j}^{\,} G_{n+j}^T. \tag{16}$$
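A sketch of this recursion (ours, assuming a time-invariant model so that $F_{n+j} = F$, etc.), which also returns the predicted observations $y_{n+j|n} = H x_{n+j|n}$ and their variances $d_{n+j|n}$ used below:

```python
import numpy as np

def predict_horizon(x_f, V_f, F, G, H, Q, R, p):
    """Increasing horizon prediction, eq. (16): from the filtered
    state (x_f, V_f) at time n, return y_{n+j|n} and d_{n+j|n}
    for j = 1, ..., p without using further observations."""
    x, V = x_f, V_f
    y_pred, d_pred = [], []
    for _ in range(p):
        x = F @ x                        # x_{n+j|n}
        V = F @ V @ F.T + G @ Q @ G.T    # V_{n+j|n}
        y_pred.append(H @ x)             # y_{n+j|n} = H x_{n+j|n}
        d_pred.append(H @ V @ H + R)     # d_{n+j|n} = H V_{n+j|n} H' + R
    return np.array(y_pred), np.array(d_pred)
```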
Long-term prediction has been considered by many authors, such as Judd and Small (2000), Sorjamaa et al. (2007), and Xiong et al. (2013). In this paper, we evaluate the goodness of the long-term prediction by the difference between the predicted value and the observed value:
$$\hat{\sigma}_p^2 = \frac{1}{N-p} \sum_{n=1}^{N-p} \varepsilon_{n+p|n}^2, \tag{17}$$
where the $p$-step-ahead prediction error is defined by $\varepsilon_{n+p|n} = y_{n+p} - y_{n+p|n}$ with $y_{n+p|n} = H_{n+p} x_{n+p|n}$. We can also consider a modified log-likelihood for the long-term prediction, defined by
$$\ell_p(\theta) = -\frac{1}{N-p}\left\{ (N-p)\left( \log 2\pi\hat{\sigma}_p^2 + 1 \right) + \sum_{n=1}^{N-p} \log d_{n+p|n} \right\}, \tag{18}$$
where $d_{n+p|n}$ is obtained by $d_{n+p|n} = H_{n+p} V_{n+p|n} H_{n+p}^T + R_{n+p}$. Note that, unlike the one-step-ahead prediction errors, the long-term prediction errors $\varepsilon_{p+1|1}, \dots, \varepsilon_{N|N-p}$ are not independent.
Given the predetermined prediction horizon $p$, the optimal value of the parameter vector $\theta$ for $p$-step-ahead prediction is obtained by maximizing this modified log-likelihood.
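A sketch of how $\hat{\sigma}_p^2$ and $\ell_p(\theta)$ could be computed, reusing the hypothetical `kalman_step` and `predict_horizon` functions above (the parameter $\theta$ enters through F, G, H, Q, R; the scaled filter with $\tilde{R} = 1$ of Section 1.2 could equally be used):

```python
import numpy as np

def modified_log_likelihood(y, F, G, H, Q, R, x0, V0, p):
    """Compute sigma_p^2-hat of eq. (17) and ell_p(theta) of eq. (18)."""
    N = len(y)
    x, V = x0, V0
    errs, log_d = [], []
    for n in range(N):
        # Advance the filter with observation y_n
        x, V, _, _ = kalman_step(x, V, y[n], F, G, H, Q, R)
        if n + p < N:
            # p-step-ahead prediction of y_{n+p} from the state at time n
            y_pred, d_pred = predict_horizon(x, V, F, G, H, Q, R, p)
            errs.append(y[n + p] - y_pred[-1])   # eps_{n+p|n}
            log_d.append(np.log(d_pred[-1]))     # log d_{n+p|n}
    Np = N - p
    sigma2_p = np.sum(np.square(errs)) / Np                  # eq. (17)
    ell_p = -(Np * (np.log(2.0 * np.pi * sigma2_p) + 1.0)
              + np.sum(log_d)) / Np                          # eq. (18)
    return ell_p, sigma2_p
```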
2 Examples
2.1 Trend models
2.1.1 The second order trend model
We consider the second order trend model
$$y_n = T_n + w_n, \tag{19}$$
where $T_n$ is the trend component that follows the second order trend component model $T_n = 2T_{n-1} - T_{n-2} + v_n$, and $w_n$ and $v_n$ are Gaussian white noises, $w_n \sim N(0, \sigma^2)$ and $v_n \sim N(0, \tau^2)$, respectively. Note that this trend model can be expressed as a state-space model with
$$x_n = \begin{bmatrix} T_n \\ T_{n-1} \end{bmatrix}, \quad F = \begin{bmatrix} 2 & -1 \\ 1 & 0 \end{bmatrix}, \quad G = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad H = \begin{bmatrix} 1 & 0 \end{bmatrix}. \tag{20}$$
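In code, the matrices of (20) might be set up as follows (a sketch under the conventions of the earlier functions; the variable names and the commented usage with a series called `maxtemp` are hypothetical):

```python
import numpy as np

def second_order_trend_model(tau2):
    """State-space matrices (20) of the second order trend model."""
    F = np.array([[2.0, -1.0],
                  [1.0,  0.0]])
    G = np.array([[1.0],
                  [0.0]])
    H = np.array([1.0, 0.0])   # H = [1 0], so y_n = T_n + w_n
    Q = np.array([[tau2]])     # Var(v_n) = tau^2
    return F, G, H, Q

# Hypothetical usage: evaluate the p-step criterion for a candidate tau^2,
# with maxtemp standing in for the observed series.
# F, G, H, Q = second_order_trend_model(tau2=0.1)
# x0, V0 = np.zeros(2), 10.0 * np.eye(2)
# ell_p, s2p = modified_log_likelihood(maxtemp, F, G, H, Q, 1.0, x0, V0, p=2)
```

Maximizing $\ell_p$ over $\tau^2$ (e.g., with a scalar optimizer) would then produce the $p$-step-ahead fits compared in Figure 1 and Table 1.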
Figure 1: The estimated trend of the maximum temperature data obtained by the second order trend models estimated by the $p$-step-ahead prediction criterion, $p$ = 1, 2, 5, and 20. Each plot shows the data (black), the mean (red), and the mean ± 2SD (blue) of the estimated trend component.
Figure 1 shows the trend estimates for the maxtemp data (daily maximum temperature data in Tokyo, N = 486). The smoothed estimate (red) and the ±2SD interval are shown. The top-left plot shows the estimates obtained by the ordinary maximum likelihood method, i.e., by $p$ = 1. The three other plots show the results obtained by the modified log-likelihood criterion assuming $p$ = 2, 5, and 20, respectively. It can be seen that the trend estimate for $p$ = 1 is considerably more variable than the other three estimates, and that the three estimates obtained with $p > 1$ resemble one another.
Table 1 shows the increasing horizon prediction error variances $\hat{\sigma}_j^2$, $j = 1, \dots, 20$, for the parameter estimation criteria $\ell_p(\theta)$, $p$ = 1(1)6(2)20 (i.e., $p$ = 1, ..., 6, 8, 10, ..., 20). The results for $p$ = 1 shown in the second column are the increasing horizon prediction error variances of the model obtained by the maximum likelihood method. In general, the $(j, p)$-element of the table shows the $j$-step-ahead prediction error variance of the model whose parameters were obtained by maximizing the modified log-likelihood for the $p$-step-ahead prediction criterion (18). Naturally, the one-step-ahead prediction error variance $\hat{\sigma}_1^2$ attains its smallest value, 9.89, at $p$ = 1, but the long-term prediction error variances $\hat{\sigma}_j^2$ for $j > 1$ become the largest among $p$ = 1, ..., 20. The table also shows that the $j$-step-ahead prediction error variance is smallest when the criterion's $p$ equals $j$. For $p > 1$, the increase of the long-term prediction error variance is not so significant, and $\hat{\sigma}_j^2$ takes similar values for different $p$. The last row of the table shows the average of the long-term prediction error variances $\hat{\sigma}_j^2$ over $j = 1, \dots, 20$ for each $p$.
Figure 2 shows the increase of the long-term prediction error variance $\hat{\sigma}_j^2$, $j = 1, \dots, 20$, for $p$ = 1, 2, 3, 10, and 20. We can see that the long-term prediction error variances obtained with $p$ = 1 are significantly larger than in the other cases, and that there is almost no difference in the prediction error variances among $p$ = 2, 3, 10, and 20.
From the table and the figure, it can be concluded, at least for this data set, that the model with the maximum likelihood estimates of the parameters has the minimum one-step-ahead prediction error variance but the largest long-term prediction error variances. For this second order trend model, $p$ = 2, ..., 20 yield similar increasing horizon prediction performance.