Time Series Analysis: Moving Average Processes

Author: nex3z 2019-07-13

Contents

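1. Definition
2. First-order moving average process
- 2.1. Definition
- 2.2. $\theta$ and autocorrelation
- 2.3. Simulating $MA(1)$
3. Second-order moving average process
- 3.1. Definition
- 3.2. Simulation
4. The general $MA(q)$ process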

1. Definition

Let $\{e_t\}$ be white noise with mean zero and variance $\sigma_e^2$. Then

\begin{equation}
X_t = e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q} \tag{1}
\end{equation}

is called a moving average process of order $q$, denoted $MA(q)$. The model can also be written as

\begin{equation}
X_t = \theta(B)e_t \tag{2}
\end{equation}

where $B$ is the backshift operator ($B e_t = e_{t-1}$) and $\theta(B)$ is the moving average operator, defined as

\begin{equation}
\theta(B) = 1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q \tag{3}
\end{equation}

Treating $B$ as a complex variable, $\theta(B) = 0$ is called the characteristic equation of the model.
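As a small illustration of equation (3), the sketch below solves the characteristic equation $\theta(B) = 0$ numerically with base R's `polyroot`. The coefficients $\theta_1 = 0.6$, $\theta_2 = 0.4$ are assumed example values (the same ones used for the simulation in Section 3.2), and `theta_B` is a hypothetical helper introduced only for this check.

```r
# Example coefficients (assumed for illustration): theta(B) = 1 + 0.6 B + 0.4 B^2
theta <- c(0.6, 0.4)

# polyroot() expects coefficients in increasing order of power: 1, theta_1, theta_2
coefs <- c(1, theta)
roots <- polyroot(coefs)
print(roots)

# Evaluate theta(B) at each root; the moduli should be numerically zero
theta_B <- function(B) sum(coefs * B^(0:length(theta)))
check <- sapply(roots, theta_B)
print(Mod(check))
```

For these coefficients the two roots form a complex-conjugate pair, which is why $B$ has to be treated as a complex variable.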

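2. First-order moving average process

2.1. Definition

The first-order moving average process is $X_t = e_t + \theta_1 e_{t-1}$. Because it has only the single parameter $\theta_1$, we drop the subscript $1$ for convenience and write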

\begin{equation}
X_t = e_t + \theta e_{t-1} \tag{4}
\end{equation}

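For the first-order moving average process in equation $(4)$, since the $e_t$ have mean zero and are mutually uncorrelated, it follows directly that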

\begin{equation}
E(X_t) = E(e_t + \theta e_{t-1}) = 0
\end{equation}

\begin{equation}
\gamma_0 = \mathrm{Var}(X_t) = \mathrm{Var}(e_t + \theta e_{t-1}) = (1 + \theta^2) \sigma_e^2
\end{equation}

The autocovariances and autocorrelations then follow:

\begin{equation}
\gamma_1 = \mathrm{Cov}(X_t, X_{t-1}) = \mathrm{Cov}(e_t + \theta e_{t-1}, e_{t-1} + \theta e_{t-2}) = \mathrm{Cov}(\theta e_{t-1}, e_{t-1}) = \theta \sigma_e^2 \tag{5}
\end{equation}

\begin{equation}
\gamma_2 = \mathrm{Cov}(X_t, X_{t-2}) = \mathrm{Cov}(e_t + \theta e_{t-1}, e_{t-2} + \theta e_{t-3}) = 0
\end{equation}

\begin{equation}
\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{\theta}{1 + \theta^2} \tag{6}
\end{equation}

\begin{equation}
\rho_2 = \frac{\gamma_2}{\gamma_0} = 0
\end{equation}

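More generally, for $k \geq 2$, the expressions for $X_t$ and $X_{t-k}$ share no $e$ term with the same subscript, so $\gamma_k = \mathrm{Cov}(X_t, X_{t-k}) = 0$ and hence $\rho_k = 0$. In other words, an $MA(1)$ process has no autocorrelation at lags greater than $1$.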
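The result above can be checked numerically. The sketch below encodes equation $(6)$ in a hypothetical helper `ma1_acf` and compares it with `ARMAacf` from R's `stats` package, which uses the same sign convention for the MA coefficients. The value $\theta = 0.7$ is an assumed example.

```r
# Theoretical autocorrelations of an MA(1) process X_t = e_t + theta * e_{t-1}
ma1_acf <- function(theta, max_lag = 4) {
  rho <- numeric(max_lag)          # rho_k = 0 for k >= 2
  rho[1] <- theta / (1 + theta^2)  # equation (6)
  names(rho) <- 1:max_lag
  rho
}

theta <- 0.7   # assumed example value
rho_formula <- ma1_acf(theta)
print(rho_formula)

# Cross-check with stats::ARMAacf, which returns lags 0..lag.max
rho_builtin <- ARMAacf(ma = theta, lag.max = 4)[-1]
print(all.equal(unname(rho_formula), unname(rho_builtin)))
```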

2.2. $\theta$ and autocorrelation

In equation $(6)$, letting $\theta$ take values between $0$ and $1$ gives the curve of $\rho_1$ against $\theta$ shown in Figure 1.

theta <- seq(0, 1, by=0.01)
corr <- theta / (1 + theta^2)
plot(theta, corr, type = 'l', xlab = expression(theta), ylab = expression(rho[1]))

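Figure 1

Over the plotted range, $\rho_1$ increases from $0$ at $\theta = 0$ to its maximum value $1/2$ at $\theta = 1$. Because $\rho_1$ is an odd function of $\theta$, it attains its minimum value $-1/2$ at $\theta = -1$, which lies outside the plotted range.

Note also that $\theta = 1/2$ and $\theta = 1/(1/2) = 2$ both give $\rho_1 = 0.4$. More generally, $\theta$ and $1/\theta$ yield the same $\rho_1$, so knowing only that an $MA(1)$ process has $\rho_1 = 0.4$ is not enough to determine $\theta$.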
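To make the non-uniqueness concrete, equation $(6)$ can be rearranged into the quadratic $\rho_1 \theta^2 - \theta + \rho_1 = 0$ and solved for $\theta$. The sketch below does this in a hypothetical helper `theta_from_rho1`; it assumes $\rho_1 \neq 0$ and $|\rho_1| \leq 1/2$ so that the solutions are real.

```r
# Solve rho1 = theta / (1 + theta^2) for theta,
# i.e. rho1 * theta^2 - theta + rho1 = 0
theta_from_rho1 <- function(rho1) {
  stopifnot(rho1 != 0, abs(rho1) <= 0.5)   # real solutions require |rho1| <= 1/2
  disc <- sqrt(1 - 4 * rho1^2)
  sort(c((1 - disc) / (2 * rho1), (1 + disc) / (2 * rho1)))
}

sol <- theta_from_rho1(0.4)
print(sol)         # the two candidate values of theta
print(prod(sol))   # the two solutions are reciprocals of each other
```

For $\rho_1 = 0.4$ the discriminant is $\sqrt{1 - 0.64} = 0.6$, which gives $\theta = (1 \mp 0.6)/0.8$, that is $0.5$ and $2$.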

2.3. Simulating $MA(1)$

Figures 2 and 3 show a simulated $MA(1)$ process with $\theta = 0.7$ and its sample autocorrelation function.

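set.seed(42)                          # fix the seed so the simulation is reproducible
e <- rnorm(200)                       # white noise e_t with sigma_e = 1
ma.1 <- numeric(200)                  # preallocate; ma.1[1] is never filled in
for (i in 2:200) {
  ma.1[i] <- e[i] + 0.7 * e[i - 1]    # X_t = e_t + theta * e_{t-1}, theta = 0.7
}
ma.1 <- ts(ma.1[2:200])               # drop the undefined first value
plot(ma.1, main = "MA(1)")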

Figure 2

acf(ma.1)

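Figure 3

With $\theta = 0.7$ we have $\rho_1 = 0.7 / (1 + 0.7^2) \approx 0.4698$, which agrees with Figure 3 and indicates a moderate positive correlation at lag 1. In Figure 2, consecutive observations tend to be positively related: if an observation lies above the mean level of the series, the next observation tends to lie above it as well.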
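The comparison between the sample and theoretical lag-1 autocorrelation can also be done numerically. The sketch below rebuilds the same series in two ways, with the explicit loop and with a one-sided `stats::filter` as a vectorized alternative (a substitution made here for illustration, not part of the original simulation), and then prints the sample value next to the value from equation $(6)$. No particular sample value is claimed; with only 199 observations it will differ somewhat from the theoretical one.

```r
set.seed(42)
theta <- 0.7
e <- rnorm(200)

# Loop version, as in the simulation above
x_loop <- numeric(200)
for (i in 2:200) x_loop[i] <- e[i] + theta * e[i - 1]
x_loop <- x_loop[2:200]

# Vectorized version: a one-sided convolution filter with weights (1, theta)
x_filter <- as.numeric(stats::filter(e, c(1, theta), sides = 1))[2:200]
print(all.equal(x_loop, x_filter))

# Sample lag-1 autocorrelation versus the theoretical value from equation (6)
r1_sample <- acf(x_loop, plot = FALSE)$acf[2]   # element 1 is lag 0
r1_theory <- theta / (1 + theta^2)
print(c(sample = r1_sample, theory = r1_theory))
```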

3. Second-order moving average process

3.1. Definition

Consider the second-order moving average process

\begin{equation}
X_t = e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} \tag{7}
\end{equation}

It is easy to see that

\begin{equation}
E(X_t) = E(e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2}) = 0
\end{equation}

\begin{equation}
\gamma_0 = \mathrm{Var}(e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2}) = (1 + \theta_1^2 + \theta_2^2) \sigma_e^2
\end{equation}

The autocovariances and autocorrelations are

\begin{align}
\gamma_1 &= \mathrm{Cov}(X_t, X_{t-1}) = \mathrm{Cov}(e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2}, e_{t-1} + \theta_1 e_{t-2} + \theta_2 e_{t-3}) \\
&= \mathrm{Cov}(\theta_1 e_{t-1}, e_{t-1}) + \mathrm{Cov}(\theta_2 e_{t-2}, \theta_1 e_{t-2}) \\
&= (\theta_1 + \theta_1\theta_2) \sigma_e^2
\end{align}

\begin{align}
\gamma_2 &= \mathrm{Cov}(X_t, X_{t-2}) = \mathrm{Cov}(e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2}, e_{t-2} + \theta_1 e_{t-3} + \theta_2 e_{t-4}) \\
&= \mathrm{Cov}(\theta_2 e_{t-2}, e_{t-2}) \\
&= \theta_2 \sigma_e^2
\end{align}

\begin{equation}
\gamma_k = 0, \qquad k = 3, 4, \cdots
\end{equation}

\begin{equation}
\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{\theta_1 + \theta_1\theta_2}{1 + \theta_1^2 + \theta_2^2}
\end{equation}

\begin{equation}
\rho_2 = \frac{\gamma_2}{\gamma_0} = \frac{\theta_2}{1 + \theta_1^2 + \theta_2^2}
\end{equation}

\begin{equation}
\rho_k = 0, \qquad k = 3, 4, \cdots
\end{equation}

3.2. Simulation

Figures 4 and 5 show a simulated $MA(2)$ process with $\theta_1 = 0.6$, $\theta_2 = 0.4$ and its sample autocorrelation function.

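set.seed(42)                          # fix the seed so the simulation is reproducible
e <- rnorm(200)                       # white noise e_t with sigma_e = 1
ma.2 <- numeric(200)                  # preallocate; ma.2[1:2] are never filled in
for (i in 3:200) {
  # X_t = e_t + theta_1 * e_{t-1} + theta_2 * e_{t-2}
  ma.2[i] <- e[i] + 0.6 * e[i - 1] + 0.4 * e[i - 2]
}
ma.2 <- ts(ma.2[3:200])               # drop the two undefined leading values
plot(ma.2, main = "MA(2)")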

Figure 4

acf(ma.2)

Figure 5
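For reference when reading the sample autocorrelation plot, the formulas of Section 3.1 give $\rho_1 = 0.84 / 1.52 \approx 0.5526$ and $\rho_2 = 0.4 / 1.52 \approx 0.2632$ for $\theta_1 = 0.6$, $\theta_2 = 0.4$. The sketch below computes these values and prints the sample autocorrelations of the same simulated series next to them. It rebuilds the series with `stats::filter` instead of the loop, which is an equivalent substitution made here for brevity.

```r
theta1 <- 0.6
theta2 <- 0.4

# Theoretical autocorrelations from the formulas in Section 3.1
gamma0 <- 1 + theta1^2 + theta2^2          # gamma_0 in units of sigma_e^2
rho1 <- (theta1 + theta1 * theta2) / gamma0
rho2 <- theta2 / gamma0
print(round(c(rho1 = rho1, rho2 = rho2), 4))

# Re-create the simulated series and compute its sample autocorrelations
set.seed(42)
e <- rnorm(200)
x <- as.numeric(stats::filter(e, c(1, theta1, theta2), sides = 1))[3:200]
r <- acf(x, lag.max = 4, plot = FALSE)$acf[-1]   # drop lag 0, keep lags 1..4
print(round(r, 4))
```

With 198 observations the sample values at lags 3 and 4 will generally not be exactly zero, even though the theoretical autocorrelations are.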

4. The general $MA(q)$ process

For the general $MA(q)$ process

\begin{equation}
X_t = e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q}
\end{equation}

we obtain, in the same way,

\begin{equation}
E(X_t) = E(e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q}) = 0
\end{equation}

\begin{equation}
\gamma_0 = \mathrm{Var}(e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q}) = (1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2) \sigma_e^2
\end{equation}

\begin{align}
\gamma_k &= \mathrm{Cov}(X_t, X_{t-k}) \\
&= \mathrm{Cov}(e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q}, e_{t-k} + \theta_1 e_{t-k-1} + \theta_2 e_{t-k-2} + \cdots + \theta_q e_{t-k-q}) \\
&= \mathrm{Cov}(\theta_k e_{t-k}, e_{t-k}) + \mathrm{Cov}(\theta_{k+1} e_{t-k-1}, \theta_1 e_{t-k-1}) + \cdots + \mathrm{Cov}(\theta_q e_{t-q}, \theta_{q-k} e_{t-q}) \\
&= (\theta_k + \theta_1\theta_{k+1} + \theta_2\theta_{k+2} + \cdots + \theta_{q-k}\theta_q) \sigma_e^2 \\
&= (\theta_k + \sum_{i=1}^{q-k} \theta_i \theta_{i + k}) \sigma_e^2 , \qquad k = 1, 2, \cdots, q
\end{align}

\begin{equation}
\gamma_k = 0, \qquad k > q
\end{equation}

\begin{equation}
\rho_k = \begin{cases}
\frac{\theta_k + \sum\limits_{i=1}^{q-k} \theta_i \theta_{i + k}}{1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2} & k = 1, 2, \cdots, q \\
0 & k > q
\end{cases}
\end{equation}
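The general formula for $\rho_k$ translates directly into code. The sketch below is one possible implementation in a hypothetical helper `ma_acf`; it sets $\theta_0 = 1$ so that the leading term $\theta_k$ becomes part of the sum, and cross-checks the result against `stats::ARMAacf`. The $MA(3)$ coefficients are assumed example values.

```r
# Theoretical ACF of X_t = e_t + theta_1 e_{t-1} + ... + theta_q e_{t-q}
ma_acf <- function(theta, max_lag = length(theta) + 2) {
  q <- length(theta)
  psi <- c(1, theta)                 # psi[1] is the coefficient of e_t (theta_0 = 1)
  gamma0 <- sum(psi^2)               # gamma_0 / sigma_e^2
  rho <- numeric(max_lag)            # rho_k stays 0 for k > q
  for (k in seq_len(min(q, max_lag))) {
    # theta_k + sum_{i=1}^{q-k} theta_i * theta_{i+k}, written with theta_0 = 1
    rho[k] <- sum(psi[1:(q - k + 1)] * psi[(k + 1):(q + 1)]) / gamma0
  }
  names(rho) <- seq_len(max_lag)
  rho
}

theta <- c(0.5, -0.3, 0.2)           # assumed example coefficients for an MA(3)
print(ma_acf(theta))
print(all.equal(unname(ma_acf(theta)),
                unname(ARMAacf(ma = theta, lag.max = 5)[-1])))
```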

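Since the mean and the autocovariances above are finite and do not depend on $t$, the $MA(q)$ process is always stationary, whatever values the coefficients $\{\theta_i\}$ take.

Note that when $k = q$ the numerator of $\rho_k$ reduces to the single term $\theta_q$, and when $k > q$ we have $\rho_k = 0$. The autocorrelation function therefore cuts off after lag $q$ (the autocorrelations beyond it are zero), so the order of a moving average process can be judged from a plot of its autocorrelation function. For $k < q$, the autocorrelations follow no fixed pattern; their shape depends on the coefficients.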
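The cut-off can be seen in a simulation. The sketch below uses `arima.sim` to generate a long $MA(2)$ series and prints the sample autocorrelations next to the theoretical ones from `ARMAacf`. The seed, the series length, and the coefficients are assumptions chosen only for illustration.

```r
set.seed(1)                          # assumed seed, for reproducibility only
theta <- c(0.6, 0.4)                 # true order q = 2
n <- 5000

x <- arima.sim(model = list(ma = theta), n = n)
r <- acf(x, lag.max = 6, plot = FALSE)$acf[-1]   # sample ACF at lags 1..6
rho <- ARMAacf(ma = theta, lag.max = 6)[-1]      # theoretical ACF at lags 1..6

print(round(rbind(sample = r, theory = rho), 3))
```

In a run like this the sample autocorrelations at lags 1 and 2 should be clearly away from zero, while those at lags 3 and beyond should be small, which is the pattern used to read off the order $q$.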
