Time Series Analysis: The SARIMA Process

Author: nex3z 2019-07-25

Contents

1. The SARMA Process
2. The SARIMA Process
- 2.1. Definition
- 2.2. Autocorrelation Function

1. The SARMA Process

The $\mathrm{ARIMA}(p, d, q)$ model described in the previous post has the form

\begin{equation}
\phi(B) \nabla^d X_t = \theta(B)e_t
\end{equation}

that is,

\begin{equation}
\phi(B)(1 - B)^d X_t = \theta(B)e_t \tag{1}
\end{equation}

where

\begin{equation}
\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \tag{2}
\end{equation}

\begin{equation}
\theta(B) = 1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q \tag{3}
\end{equation}

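Time series encountered in practice often exhibit seasonality, meaning that the pattern repeats every $s$ observations. Monthly business data, for example, typically has a period of $12$ months, in which case $X_t$ is strongly correlated with $X_{t-12}, X_{t-24}, \cdots$.

To model the seasonality in a series, we first introduce the seasonal $\mathrm{ARMA}(P, Q)_s$ process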

\begin{equation}
\Phi_P(B^s) X_t = \Theta_Q(B^s)e_t \tag{4}
\end{equation}

where

\begin{equation}
\Phi_P(B^s) = 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \cdots - \Phi_P B^{Ps} \tag{5}
\end{equation}

\begin{equation}
\Theta_Q(B^s) = 1 + \Theta_1 B^s + \Theta_2 B^{2s} + \cdots + \Theta_Q B^{Qs} \tag{6}
\end{equation}

and $s$ is the seasonal period. A seasonal $\mathrm{ARMA}$ process is stationary when all roots of $\Phi_P(B^s) = 0$ lie outside the unit circle, and invertible when all roots of $\Theta_Q(B^s) = 0$ lie outside the unit circle.
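As a quick numerical check of the root condition, the sketch below applies R's `polyroot` to a seasonal AR polynomial $1 - \Phi_1 z^{12}$. The value $\Phi_1 = 0.5$ and the names `Phi1`, `s`, `coefs`, and `roots` are illustrative assumptions, not taken from the text.

```r
# Seasonal AR(1)_12 polynomial 1 - Phi1 * z^12, with an assumed Phi1 = 0.5
Phi1 <- 0.5
s <- 12

# polyroot() expects coefficients in increasing order of power: z^0, z^1, ..., z^12
coefs <- c(1, rep(0, s - 1), -Phi1)
roots <- polyroot(coefs)

# The process is stationary when every root has modulus greater than 1
stationary <- all(Mod(roots) > 1)
print(Mod(roots))
print(stationary)
```

For this polynomial every root has modulus $(1/|\Phi_1|)^{1/12}$, so the condition reduces to $|\Phi_1| < 1$.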

For example, the seasonal $\mathrm{ARMA}(1, 1)_{12}$ process has the form

\begin{equation}
(1 - \Phi_1 B^{12}) X_t = (1 + \Theta_1 B^{12})e_t
\end{equation}

that is,

\begin{equation}
X_t = \Phi_1 X_{t-12} + e_t + \Theta_1 e_{t-12}
\end{equation}

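2. The SARIMA Process

2.1. Definition

When a series is not stationary but its $d$-th order difference is a stationary $\mathrm{ARMA}(p, q)$ process, it can be modeled with an $\mathrm{ARIMA}(p,d,q)$ model. Similarly, for a seasonal process the seasonal component can have its own differencing order $D$, which gives the seasonal $\mathrm{ARIMA}$ process, $\mathrm{SARIMA}(p, d, q, P, D, Q)_s$. Here $p, d, q$ are the non-seasonal autoregressive, differencing, and moving-average orders, $P, D, Q$ are the seasonal autoregressive, differencing, and moving-average orders, and $s$ is the seasonal period. The process is written as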

\begin{equation}
\Phi_P(B^s) \phi_p(B) \nabla_s^D \nabla^d X_t = \Theta_Q(B^s) \theta_q(B)e_t
\end{equation}

that is,

\begin{equation}
\Phi_P(B^s) \phi_p(B) (1 - B^s)^D (1 - B)^d X_t = \Theta_Q(B^s) \theta_q(B)e_t \tag{7}
\end{equation}

where

\begin{equation}
\begin{aligned}
\phi_p(B) &= 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \\
\theta_q(B) &= 1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q \\
\Phi_P(B^s) &= 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \cdots - \Phi_P B^{Ps} \\
\Theta_Q(B^s) &= 1 + \Theta_1 B^s + \Theta_2 B^{2s} + \cdots + \Theta_Q B^{Qs}
\end{aligned}
\end{equation}

In practice the seasonal differencing order $D$ is small, usually no more than $1$ or $2$. When $D = 1$,

\begin{equation}
\nabla_s X_t = (1 - B^s) X_t = X_t - X_{t-s}
\end{equation}

When $D = 2$,

\begin{equation}
\nabla_s^2 X_t = (1 - B^s)^2 X_t = (1 - 2B^s + B^{2s}) X_t = X_t - 2X_{t-s} + X_{t-2s}
\end{equation}
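Seasonal differencing can be tried out with R's built-in `diff`, whose `lag` argument plays the role of $s$ and whose `differences` argument plays the role of $D$. The following is a minimal sketch on an arbitrary toy series; the series `x`, the period `s = 12`, and the helper names are assumptions made for illustration.

```r
# Toy series with an assumed seasonal period s = 12 (values are arbitrary)
s <- 12
x <- (1:48)^2

# D = 1: diff() with lag = s computes X_t - X_{t-s}
d1 <- diff(x, lag = s)
manual1 <- x[(s + 1):length(x)] - x[1:(length(x) - s)]

# D = 2: applying the seasonal difference twice gives X_t - 2 X_{t-s} + X_{t-2s}
d2 <- diff(x, lag = s, differences = 2)
t_idx <- (2 * s + 1):length(x)
manual2 <- x[t_idx] - 2 * x[t_idx - s] + x[t_idx - 2 * s]

print(all.equal(d1, manual1))  # compares diff() with the formula for D = 1
print(all.equal(d2, manual2))  # compares diff() with the formula for D = 2
```

Each seasonal difference shortens the series by $s$ observations, so `d1` has $36$ values and `d2` has $24$.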

For example, $\mathrm{SARIMA}(1, 0, 0, 1, 0, 1)_{12}$ has the form

\begin{equation}
(1 - \phi_1 B)(1 - \Phi_1 B^{12}) X_t = (1 + \Theta_1 B^{12}) e_t
\end{equation}

Expanding both sides gives

\begin{equation}
(1 - \phi_1 B - \Phi_1 B^{12} + \phi_1\Phi_1 B^{13}) X_t = e_t + \Theta_1 e_{t-12}
\end{equation}

that is,

\begin{equation}
X_t = \phi_1 X_{t-1} + \Phi_1 X_{t-12} - \phi_1\Phi_1 X_{t-13} + e_t + \Theta_1 e_{t-12}
\end{equation}

As another example, $\mathrm{SARIMA}(0, 1, 1, 0, 0, 1)_{4}$ has the form

\begin{equation}
(1 - B)X_t = (1 + \Theta_1 B^4)(1 + \theta_1 B) e_t
\end{equation}

that is,

\begin{equation}
X_t = X_{t-1} + e_t + \theta_1 e_{t-1} + \Theta_1 e_{t-4} + \theta_1 \Theta_1 e_{t-5}
\end{equation}
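The intermediate step can be written out as a worked expansion that uses only the operators already defined. The left side is $(1 - B)X_t = X_t - X_{t-1}$, and the right side multiplies out as

\begin{align}
(1 + \Theta_1 B^4)(1 + \theta_1 B) e_t &= (1 + \theta_1 B + \Theta_1 B^4 + \theta_1\Theta_1 B^5) e_t \\
&= e_t + \theta_1 e_{t-1} + \Theta_1 e_{t-4} + \theta_1\Theta_1 e_{t-5}
\end{align}

Moving $X_{t-1}$ to the right side then gives the expression for $X_t$ above.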

2.2. Autocorrelation Function

We now analyze the autocorrelation function, taking the $\mathrm{SARIMA}(0, 0, 1, 0, 0, 1)_{12}$ process as an example. For

\begin{equation}
X_t = (1 + \Theta_1 B^{12})(1 + \theta_1 B)e_t \tag{8}
\end{equation}

that is,

\begin{equation}
X_t = e_t + \theta_1 e_{t-1} + \Theta_1 e_{t-12} + \theta_1\Theta_1 e_{t-13}
\end{equation}

we have

\begin{align}
\gamma_0 &= \mathrm{Var}(X_t) = \mathrm{Var}(e_t + \theta_1 e_{t-1} + \Theta_1 e_{t-12} + \theta_1\Theta_1 e_{t-13}) \\
&= \sigma_e^2 + \theta_1^2 \sigma_e^2 + \Theta_1^2 \sigma_e^2 + \theta_1^2\Theta_1^2 \sigma_e^2 \\
&= (1 + \theta_1^2)(1 + \Theta_1^2)\sigma_e^2
\end{align}

Since $X_{t-1} = e_{t-1} + \theta_1 e_{t-2} + \Theta_1 e_{t-13} + \theta_1\Theta_1 e_{t-14}$, we obtain

\begin{align}
\gamma_1 &= \mathrm{Cov}(X_t, X_{t-1}) \\
&= \mathrm{Cov}(e_t + \theta_1 e_{t-1} + \Theta_1 e_{t-12} + \theta_1\Theta_1 e_{t-13}, e_{t-1} + \theta_1 e_{t-2} + \Theta_1 e_{t-13} + \theta_1\Theta_1 e_{t-14}) \\
&= \theta_1 \mathrm{Cov}(e_{t-1}, e_{t-1}) + \theta_1\Theta_1 \cdot \Theta_1 \mathrm{Cov}(e_{t-13}, e_{t-13}) \\
&= \theta_1\sigma_e^2 + \theta_1\Theta_1^2\sigma_e^2 \\
&= \theta_1(1 + \Theta_1^2)\sigma_e^2
\end{align}

Moreover, since $(1 - \theta_1)^2 = 1 - 2\theta_1 + \theta_1^2 \geq 0$, we have $1 + \theta_1^2 \geq 2 \theta_1$, and therefore

\begin{equation}
\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{\theta_1}{1 + \theta_1^2} \leq \frac{1}{2}
\end{equation}

Since $X_{t-2} = e_{t-2} + \theta_1 e_{t-3} + \Theta_1 e_{t-14} + \theta_1\Theta_1 e_{t-15}$ shares no noise terms with $X_t$, we have $\gamma_2 = \mathrm{Cov}(X_t, X_{t-2}) = 0$, and hence $\rho_2 = 0$.

Similarly, for the process in equation $(8)$, $\rho_k = 0$ for $k = 2, 3, \cdots, 10$.

Since $X_{t-11} = e_{t-11} + \theta_1 e_{t-12} + \Theta_1 e_{t-23} + \theta_1\Theta_1 e_{t-24}$ shares only the term $e_{t-12}$ with $X_t$, we obtain

\begin{equation}
\gamma_{11} = \mathrm{Cov}(X_t, X_{t-11}) = \theta_1\Theta_1\sigma_e^2
\end{equation}

\begin{equation}
\rho_{11} = \frac{\gamma_{11}}{\gamma_{0}} = \frac{\theta_1\Theta_1}{(1 + \theta_1^2)(1 + \Theta_1^2)} \leq \frac{1}{4}
\end{equation}

Thus, although equation $(8)$ contains no explicit term at lag $11$, $\rho_{11} \neq 0$ whenever $\theta_1\Theta_1 \neq 0$.
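These values can be cross-checked numerically. The sketch below treats the expanded form of equation $(8)$ as an MA model with coefficients at lags $1$, $12$, and $13$, and asks R's `ARMAacf` for its theoretical autocorrelations. The values $\theta_1 = 0.7$ and $\Theta_1 = 0.6$ and the variable names are illustrative assumptions.

```r
# Assumed illustrative parameters
theta1 <- 0.7   # non-seasonal MA coefficient
Theta1 <- 0.6   # seasonal MA coefficient

# MA coefficients at lags 1..13: only lags 1, 12 and 13 are nonzero
ma <- c(theta1, rep(0, 10), Theta1, theta1 * Theta1)

# Theoretical ACF at lags 0..13 (element k + 1 holds lag k)
rho <- ARMAacf(ma = ma, lag.max = 13)

# Closed-form values derived in the text
rho1_formula  <- theta1 / (1 + theta1^2)
rho11_formula <- theta1 * Theta1 / ((1 + theta1^2) * (1 + Theta1^2))

print(round(unname(rho), 4))
print(c(rho1_formula, rho11_formula))
```

The entries at lags $2$ through $10$ come out as zero, and the entries at lags $1$ and $11$ agree with the closed-form expressions for $\rho_1$ and $\rho_{11}$.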

The following code simulates a $\mathrm{SARIMA}(0, 0, 1, 0, 0, 1)_{12}$ process and plots the series and its ACF, shown in Figure 1 and Figure 2:

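n <- 10000
e <- rnorm(n)            # white noise e_t ~ N(0, 1)
x <- numeric(n)
x[1:13] <- 1             # start-up values; the recursion needs lags up to 13
for (i in 14:n) {
  # theta_1 = 0.7, Theta_1 = 0.6, theta_1 * Theta_1 = 0.42
  x[i] <- e[i] + 0.7 * e[i - 1] + 0.6 * e[i - 12] + 0.42 * e[i - 13]
}
x <- ts(x)

plot(x[12:96], type = 'l')  # Figure 1: a segment of the simulated series
acf(x)                      # Figure 2: sample ACF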

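Figure 1

Figure 2

Figure 2 shows significant autocorrelation at lag $1$, consistent with the analysis above. The same analysis predicts a nonzero autocorrelation at lag $11$ as well.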
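For a comparison that does not rely on reading the plot, the sketch below simulates the same MA structure with `arima.sim` (swapped in here for the explicit loop) and tabulates sample against theoretical autocorrelations. The seed, the use of `arima.sim` and `ARMAacf`, and the variable names are assumptions added for illustration.

```r
set.seed(42)                                 # assumed seed, for reproducibility only
ma <- c(0.7, rep(0, 10), 0.6, 0.42)          # MA coefficients at lags 1..13
x <- arima.sim(model = list(ma = ma), n = 10000)

# Sample ACF (no plot) and theoretical ACF, both at lags 0..13
sample_rho <- as.numeric(acf(x, lag.max = 13, plot = FALSE)$acf)
theory_rho <- as.numeric(ARMAacf(ma = ma, lag.max = 13))

# Side-by-side table for the lags where the theory predicts nonzero values
lags <- c(1, 11, 12, 13)
comparison <- data.frame(lag = lags,
                         sample = round(sample_rho[lags + 1], 3),
                         theory = round(theory_rho[lags + 1], 3))
print(comparison)
```

With $10000$ observations the sample values should sit close to the theoretical ones, though they vary from run to run when the seed is changed.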
