Fitting State-space Model for Long-term Prediction
of the Log-likelihood of Nonstationary Time Series Models
Genshiro Kitagawa
Mathematics and Informatics Center, The University of Tokyo
October 13, 2022
Abstract

The goodness of the long-term prediction in the state-space model was evaluated using the squared long-term prediction error. In order to estimate the model parameters suitable for long-term prediction, we devised a modified log-likelihood corresponding to the long-term prediction error variance. Trend models and seasonally adjusted models with and without an AR component are examined as examples.
1 Introduction: State-Space Modeling of Time Series
1.1 State-Space Model and State Estimation
Consider the state-space model of a univariate time series $y_n$:
$$x_n = F_n x_{n-1} + G_n v_n, \qquad \text{(system model)} \tag{1}$$
$$y_n = H_n x_n + w_n, \qquad \text{(observation model)} \tag{2}$$
where $x_n$ is a $k$-dimensional state vector, $v_n$ is an $m$-dimensional system noise that follows a white noise with mean vector zero and variance-covariance matrix $Q_n$, and $w_n$ is the observation noise that follows a one-dimensional Gaussian white noise with mean zero and variance $R_n$. $F_n$, $G_n$, and $H_n$ are $k \times k$, $k \times m$, and $1 \times k$ matrices, respectively. The initial state vector $x_0$ is assumed to follow the Gaussian distribution $N(0, V_{0|0})$. Many linear models used in time series analysis, such as the AR model and the ARMA model, and various nonstationary models, such as the trend model and the seasonal adjustment model, are expressible in terms of the state-space model (Anderson and Moore (1979), Kitagawa (2020)).
In this paper, we shall consider the problem of estimating the state $x_n$ at time $n$ based on the set of observations $Y_j = \{y_1, \dots, y_j\}$. For $j < n$, $j = n$, and $j > n$, the state estimation problem is referred to as prediction, filtering, and smoothing, respectively. This state estimation problem is important in state-space modeling, since many tasks, such as one-step-ahead and multi-step-ahead prediction, interpolation, and likelihood computation for the time series, can be solved systematically through the estimated state.
A generic approach to these state estimation problems is to obtain the conditional distribution $p(x_n | Y_j)$ of the state $x_n$ given the observations $Y_j$. Since the state-space model defined by (1) and (2) is a linear model and, moreover, the noises $v_n$ and $w_n$ and the initial state $x_0$ follow normal distributions, all of these conditional distributions are normal. Therefore, to solve the state estimation problem for the state-space model, it suffices to obtain the mean vectors $x_{n|j}$ and the variance-covariance matrices $V_{n|j}$ of the conditional distributions.
For the linear state-space model, the Kalman filter provides a computationally efficient recursive algorithm for state estimation (Kalman (1960), Anderson and Moore (1979)).
One-step-ahead prediction:
$$x_{n|n-1} = F_n x_{n-1|n-1}$$
$$V_{n|n-1} = F_n V_{n-1|n-1} F_n^T + G_n Q_n G_n^T. \tag{3}$$

Filter:
$$K_n = V_{n|n-1} H_n^T \left( H_n V_{n|n-1} H_n^T + R_n \right)^{-1}$$
$$x_{n|n} = x_{n|n-1} + K_n \left( y_n - H_n x_{n|n-1} \right) \tag{4}$$
$$V_{n|n} = \left( I - K_n H_n \right) V_{n|n-1}.$$
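For concreteness, the following is a minimal sketch of these recursions in NumPy (not code from the paper; the function name and array layout are our own choices). Since $y_n$ is univariate, the $1 \times k$ matrix $H_n$ is represented as a length-$k$ vector.

```python
import numpy as np

def kalman_step(x, V, y, F, G, H, Q, R):
    """One cycle of the one-step-ahead prediction (3) and filter (4).

    x : (k,)  filtered state mean x_{n-1|n-1}
    V : (k,k) filtered covariance V_{n-1|n-1}
    y : scalar observation y_n
    H : (k,)  observation matrix H_n (a row vector, since y_n is scalar)
    Returns x_{n|n}, V_{n|n}, the innovation e = y_n - H_n x_{n|n-1},
    and its variance d = d_{n|n-1}.
    """
    # One-step-ahead prediction, eq. (3)
    x_p = F @ x
    V_p = F @ V @ F.T + G @ Q @ G.T
    # Innovation and its variance, cf. eqs. (8)-(9) below
    e = y - H @ x_p
    d = H @ V_p @ H + R
    # Filter, eq. (4): Kalman gain, then mean and covariance update
    K = V_p @ H / d
    x_f = x_p + K * e
    V_f = V_p - np.outer(K, H @ V_p)   # (I - K_n H_n) V_{n|n-1}
    return x_f, V_f, e, d
```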
1.2 Likelihood Computation and Parameter Estimation for Time Series Models
Assume that the state-space representation of a time series model specified by a parameter $\theta$ is given. When the time series $y_1, \dots, y_N$ of length $N$ is given, the $N$-dimensional joint density function of $y_1, \dots, y_N$ specified by this time series model is denoted by $f_N(y_1, \dots, y_N | \theta)$. Then the likelihood of this model is defined by $L(\theta) = f_N(y_1, \dots, y_N | \theta)$. Using the conditional distribution of $y_n$ given the previous observations, the likelihood of the time series model can be expressed as a product of one-dimensional conditional density functions:
$$L(\theta) = \prod_{n=1}^{N} g_n(y_n | y_1, \dots, y_{n-1}, \theta) = \prod_{n=1}^{N} g_n(y_n | Y_{n-1}, \theta). \tag{5}$$
Here, if we define $Y_0 = \emptyset$ (the empty set), then $g_1(y_1 | Y_0, \theta) \equiv f_1(y_1 | \theta)$. By taking the logarithm of $L(\theta)$, the log-likelihood of the model is obtained as
$$\ell(\theta) = \log L(\theta) = \sum_{n=1}^{N} \log g_n(y_n | Y_{n-1}, \theta). \tag{6}$$
Since $g_n(y_n | Y_{n-1}, \theta)$ is the conditional distribution of $y_n$ given the observations $Y_{n-1}$, and it is in fact a normal distribution with mean $y_{n|n-1}$ and variance $d_{n|n-1}$, it can be expressed as (Kitagawa and Gersch (1996))
$$g_n(y_n | Y_{n-1}, \theta) = \left( 2\pi d_{n|n-1} \right)^{-\frac{1}{2}} \exp\left\{ -\frac{(y_n - y_{n|n-1})^2}{2 d_{n|n-1}} \right\}. \tag{7}$$
Here, from the observation model (2), $y_{n|n-1}$ and $d_{n|n-1}$ are obtained by
$$y_{n|n-1} = H_n x_{n|n-1} \tag{8}$$
$$d_{n|n-1} = H_n V_{n|n-1} H_n^T + R_n. \tag{9}$$
Therefore, by substituting this density function into (6), the log-likelihood of the state-space model is obtained as
$$\ell(\theta) = -\frac{1}{2}\left\{ N \log 2\pi + \sum_{n=1}^{N} \log d_{n|n-1} + \sum_{n=1}^{N} \frac{(y_n - y_{n|n-1})^2}{d_{n|n-1}} \right\}. \tag{10}$$
The maximum likelihood estimates of the parameters of the state-space model can be obtained by maximizing this log-likelihood function numerically. However, for a univariate time series, we can assume that $R = 1$ (Kitagawa (2020)). In fact, if $\tilde{V}_{n|n}$, $\tilde{V}_{n|n-1}$, $\tilde{Q}_n$, and $\tilde{R}$ are defined by
$$V_{n|n-1} = \sigma^2 \tilde{V}_{n|n-1}, \quad V_{n|n} = \sigma^2 \tilde{V}_{n|n}, \quad Q_n = \sigma^2 \tilde{Q}_n, \quad \tilde{R} = 1, \tag{11}$$
then the resulting Kalman gain $\tilde{K}_n$ is identical to $K_n$. Therefore, in the filtering step, we may use $\tilde{V}_{n|n}$ and $\tilde{V}_{n|n-1}$ instead of $V_{n|n}$ and $V_{n|n-1}$. Furthermore, the state mean vectors $x_{n|n-1}$ and $x_{n|n}$ do not change under these modifications. In summary, if $R_n$ is time-invariant and $R = \sigma^2$ is an unknown parameter, we may apply the Kalman filter by setting $R = 1$. Since we then have $d_{n|n-1} = \sigma^2 \tilde{d}_{n|n-1}$ from (9) and (11), this yields
$$\ell(\theta) = -\frac{1}{2}\left\{ N \log 2\pi\sigma^2 + \sum_{n=1}^{N} \log \tilde{d}_{n|n-1} + \frac{1}{\sigma^2} \sum_{n=1}^{N} \frac{(y_n - y_{n|n-1})^2}{\tilde{d}_{n|n-1}} \right\}. \tag{12}$$
From the likelihood equation
$$\frac{\partial \ell(\theta)}{\partial \sigma^2} = -\frac{1}{2}\left\{ \frac{N}{\sigma^2} - \frac{1}{(\sigma^2)^2} \sum_{n=1}^{N} \frac{(y_n - y_{n|n-1})^2}{\tilde{d}_{n|n-1}} \right\} = 0, \tag{13}$$
the maximum likelihood estimate of $\sigma^2$ is obtained as
$$\hat{\sigma}^2 = \frac{1}{N} \sum_{n=1}^{N} \frac{(y_n - y_{n|n-1})^2}{\tilde{d}_{n|n-1}}. \tag{14}$$
Furthermore, denoting the parameters in $\theta$ other than the variance $\sigma^2$ by $\theta^*$ and substituting (14) into (12), we obtain an expression for the log-likelihood
$$\ell(\theta^*) = -\frac{1}{2}\left( N \log 2\pi\hat{\sigma}^2 + \sum_{n=1}^{N} \log \tilde{d}_{n|n-1} + N \right). \tag{15}$$
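As an illustration, a sketch of this computation (our own arrangement, not the paper's code), reusing the hypothetical `kalman_step` function from Section 1.1 with $\tilde{R} = 1$: a single pass of the filter yields both $\hat{\sigma}^2$ of (14) and $\ell(\theta^*)$ of (15).

```python
import numpy as np

def concentrated_log_likelihood(y, F, G, H, Qtilde, x0, V0):
    """Compute sigma^2-hat of eq. (14) and ell(theta*) of eq. (15),
    running the Kalman filter with R~ = 1 (scaled covariances)."""
    N = len(y)
    x, V = x0, V0
    sum_log_d = 0.0
    sum_sq = 0.0
    for n in range(N):
        x, V, e, d = kalman_step(x, V, y[n], F, G, H, Qtilde, R=1.0)
        sum_log_d += np.log(d)   # accumulates log d~_{n|n-1}
        sum_sq += e**2 / d       # accumulates (y_n - y_{n|n-1})^2 / d~_{n|n-1}
    sigma2 = sum_sq / N                                              # eq. (14)
    ell = -0.5 * (N * np.log(2.0 * np.pi * sigma2) + sum_log_d + N)  # eq. (15)
    return ell, sigma2
```

Maximizing this function numerically over the remaining parameters $\theta^*$ (e.g., the elements of $\tilde{Q}_n$) then gives the maximum likelihood estimates.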
1.3 Parameter Estimation and Criterion for Increasing Horizon Prediction of the State
For the state-space model, by repeating the one-step-ahead prediction step, we can perform increasing horizon prediction; that is, we can obtain $x_{n+j|n}$ and $V_{n+j|n}$ for $j = 1, 2, \dots, p$.
The increasing horizon prediction: for $j = 1, \dots, p$, repeat
$$x_{n+j|n} = F_{n+j} x_{n+j-1|n}$$
$$V_{n+j|n} = F_{n+j} V_{n+j-1|n} F_{n+j}^T + G_{n+j} Q_{n+j} Q_{n+j}^{\,} G_{n+j}^T. \tag{16}$$
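A sketch of this recursion (ours, assuming a time-invariant model so that $F_{n+j} = F$, etc.), which also returns the predicted observations $y_{n+j|n} = H x_{n+j|n}$ and their variances $d_{n+j|n}$ used below:

```python
import numpy as np

def predict_horizon(x_f, V_f, F, G, H, Q, R, p):
    """Increasing horizon prediction, eq. (16): from the filtered
    state (x_f, V_f) at time n, return y_{n+j|n} and d_{n+j|n}
    for j = 1, ..., p without using further observations."""
    x, V = x_f, V_f
    y_pred, d_pred = [], []
    for _ in range(p):
        x = F @ x                        # x_{n+j|n}
        V = F @ V @ F.T + G @ Q @ G.T    # V_{n+j|n}
        y_pred.append(H @ x)             # y_{n+j|n} = H x_{n+j|n}
        d_pred.append(H @ V @ H + R)     # d_{n+j|n} = H V_{n+j|n} H' + R
    return np.array(y_pred), np.array(d_pred)
```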
Long-term prediction has been considered by many authors, such as Judd and Small (2000), Sorjamaa et al. (2007), and Xiong et al. (2013). In this paper, we evaluate the goodness of the long-term prediction by the difference between the predicted value and the observed value:
$$\hat{\sigma}_p^2 = \frac{1}{N-p} \sum_{n=1}^{N-p} \varepsilon_{n+p|n}^2, \tag{17}$$
where the $p$-step-ahead prediction error is defined by $\varepsilon_{n+p|n} = y_{n+p} - y_{n+p|n}$ with $y_{n+p|n} = H_{n+p} x_{n+p|n}$. We can also consider a modified log-likelihood for the long-term prediction, defined by
$$\ell_p(\theta) = -\frac{1}{N-p}\left\{ (N-p)\left( \log 2\pi\hat{\sigma}_p^2 + 1 \right) + \sum_{n=1}^{N-p} \log d_{n+p|n} \right\}, \tag{18}$$
where $d_{n+p|n}$ is obtained by $d_{n+p|n} = H_{n+p} V_{n+p|n} H_{n+p}^T + R_{n+p}$. Note that, unlike the one-step-ahead prediction errors, the long-term prediction errors $\varepsilon_{p+1|1}, \dots, \varepsilon_{N|N-p}$ are not independent.
Given the predetermined prediction horizon $p$, the optimal value of the parameter vector $\theta$ for $p$-step-ahead prediction is obtained by maximizing this modified log-likelihood.
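A sketch of how $\hat{\sigma}_p^2$ and $\ell_p(\theta)$ could be computed, reusing the hypothetical `kalman_step` and `predict_horizon` functions above (the parameter $\theta$ enters through F, G, H, Q, R; the scaled filter with $\tilde{R} = 1$ of Section 1.2 could equally be used):

```python
import numpy as np

def modified_log_likelihood(y, F, G, H, Q, R, x0, V0, p):
    """Compute sigma_p^2-hat of eq. (17) and ell_p(theta) of eq. (18)."""
    N = len(y)
    x, V = x0, V0
    errs, log_d = [], []
    for n in range(N):
        # Advance the filter with observation y_n
        x, V, _, _ = kalman_step(x, V, y[n], F, G, H, Q, R)
        if n + p < N:
            # p-step-ahead prediction of y_{n+p} from the state at time n
            y_pred, d_pred = predict_horizon(x, V, F, G, H, Q, R, p)
            errs.append(y[n + p] - y_pred[-1])   # eps_{n+p|n}
            log_d.append(np.log(d_pred[-1]))     # log d_{n+p|n}
    Np = N - p
    sigma2_p = np.sum(np.square(errs)) / Np                  # eq. (17)
    ell_p = -(Np * (np.log(2.0 * np.pi * sigma2_p) + 1.0)
              + np.sum(log_d)) / Np                          # eq. (18)
    return ell_p, sigma2_p
```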
2 Examples
2.1 Trend models
2.1.1 The second order trend model
We consider the second order trend model
$$y_n = T_n + w_n, \tag{19}$$
where $T_n$ is the trend component that follows the second order trend component model $T_n = 2T_{n-1} - T_{n-2} + v_n$, and $w_n$ and $v_n$ are Gaussian white noises, $w_n \sim N(0, \sigma^2)$ and $v_n \sim N(0, \tau^2)$, respectively. Note that this trend model can be expressed as a state-space model with
$$x_n = \begin{bmatrix} T_n \\ T_{n-1} \end{bmatrix}, \quad F = \begin{bmatrix} 2 & -1 \\ 1 & 0 \end{bmatrix}, \quad G = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad H = \begin{bmatrix} 1 & 0 \end{bmatrix}. \tag{20}$$
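In code, the matrices of (20) might be set up as follows (a sketch under the conventions of the earlier functions; the variable names and the commented usage with a series called `maxtemp` are hypothetical):

```python
import numpy as np

def second_order_trend_model(tau2):
    """State-space matrices (20) of the second order trend model."""
    F = np.array([[2.0, -1.0],
                  [1.0,  0.0]])
    G = np.array([[1.0],
                  [0.0]])
    H = np.array([1.0, 0.0])   # H = [1 0], so y_n = T_n + w_n
    Q = np.array([[tau2]])     # Var(v_n) = tau^2
    return F, G, H, Q

# Hypothetical usage: evaluate the p-step criterion for a candidate tau^2,
# with maxtemp standing in for the observed series.
# F, G, H, Q = second_order_trend_model(tau2=0.1)
# x0, V0 = np.zeros(2), 10.0 * np.eye(2)
# ell_p, s2p = modified_log_likelihood(maxtemp, F, G, H, Q, 1.0, x0, V0, p=2)
```

Maximizing $\ell_p$ over $\tau^2$ (e.g., with a scalar optimizer) would then produce the $p$-step-ahead fits compared in Figure 1 and Table 1.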
Figure 1: The estimated trend of the maximum temperature data obtained by the second order trend models estimated by the $p$-step-ahead prediction criterion, $p$ = 1, 2, 5, and 20. Each plot shows the data (black), the mean (red), and the mean ± 2SD (blue) of the estimated trend component.
Figure 1 shows the trend estimates for the maxtemp data (daily maximum temperature data in Tokyo, N = 486). The smoothed estimate (red) and the ±2SD interval are shown. The top-left plot shows the estimates obtained by the ordinary maximum likelihood method, i.e., by $p$ = 1. The three other plots show the results obtained by the modified log-likelihood criterion assuming $p$ = 2, 5, and 20, respectively. It can be seen that the trend estimate for $p$ = 1 is considerably more variable than the other three estimates, and that the three estimates obtained with $p > 1$ resemble one another.
Table 1 shows the increasing horizon prediction error variances $\hat{\sigma}_j^2$, $j = 1, \dots, 20$, for the parameter estimation criteria $\ell_p(\theta)$, $p$ = 1(1)6(2)20 (i.e., $p$ = 1, ..., 6, 8, 10, ..., 20). The results for $p$ = 1 shown in the second column are the increasing horizon prediction error variances of the model obtained by the maximum likelihood method. In general, the $(j, p)$-element of the table shows the $j$-step-ahead prediction error variance of the model whose parameters were obtained by maximizing the modified log-likelihood for the $p$-step-ahead prediction criterion (18). Naturally, the one-step-ahead prediction error variance $\hat{\sigma}_1^2$ attains its smallest value, 9.89, at $p$ = 1, but the long-term prediction error variances $\hat{\sigma}_j^2$ for $j > 1$ become the largest among $p$ = 1, ..., 20. The table also shows that the $j$-step-ahead prediction error variance is smallest when the criterion's $p$ equals $j$. For $p > 1$, the increase of the long-term prediction error variance is not so significant, and $\hat{\sigma}_j^2$ takes similar values for different $p$. The last row of the table shows the average of the long-term prediction error variances $\hat{\sigma}_j^2$ over $j = 1, \dots, 20$ for each $p$.
Figure 2 shows the increase of the long-term prediction error variance $\hat{\sigma}_j^2$, $j = 1, \dots, 20$, for $p$ = 1, 2, 3, 10, and 20. We can see that the long-term prediction error variances obtained with $p$ = 1 are significantly larger than in the other cases, and that there is almost no difference in the prediction error variances among $p$ = 2, 3, 10, and 20.
From the table and the figure, it can be concluded, at least for this data set, that the model with the maximum likelihood estimates of the parameters has the minimum one-step-ahead prediction error variance but the largest long-term prediction error variances. For this second order trend model, $p$ = 2, ..., 20 yield similar increasing horizon prediction performance.