The trend model in Equation 1 assumes a deterministic linear trend and independent measurement errors. Independent measurement errors are reasonable, as instrument carry-over from one measurement to the next is unlikely. However, the progression of the anthropometric signal $T_t = \alpha + \beta t$ is often not purely deterministic but also affected by stochastic perturbations $r_t$ that lead to persistent, slow-moving deviations from the deterministic linear trend, analogous to a slow-moving wave. Persistence implies that a signal at time $t$ above the trend line tends to be followed by signals that are above the trend line as well. In other words, signals tend to stay above (or below) the trend line for several periods in a row. Such persistence can be modeled with a first-order autoregressive model, $r_t = (1/(1 - \phi B))\xi_t = \xi_t + \phi \xi_{t-1} + \phi^2 \xi_{t-2} + \cdots$. Here, $B$ is the backshift operator, $\phi$ is the autoregressive parameter (which, for statistical stationarity, has to be between $-1$ and $1$), and the $\xi_t$ are independent mean-zero random variables with variance $\sigma_\xi^2$. The first-order autoregressive model for $r_t$ implies autocorrelations $\mathrm{Cor}(r_t, r_{t-k}) = \phi^k$ and variance $\sigma_r^2 = \sigma_\xi^2/(1 - \phi^2)$. Persistence is achieved when the autoregressive parameter $\phi$ is positive and close to 1. The autoregressive model becomes the (nonstationary) random walk when $\phi = 1$. A random walk can take very long persistent excursions from the deterministic trend line. For a detailed discussion of time series models (including the backshift operator notation, stationarity and nonstationarity, and autoregressive and moving average models) we refer the reader to Abraham and Ledolter^4 and Box et al.^5
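To make persistence concrete, the following minimal sketch (with illustrative parameter values that are not taken from the paper) simulates a linear trend with an AR(1) disturbance $r_t$ and independent measurement noise; with $\phi$ close to 1 the simulated $r_t$ stays above or below the trend line for long stretches, and its sample lag-1 autocorrelation is close to $\phi$.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Illustrative (hypothetical) parameter values: linear trend alpha + beta*t,
# AR(1) disturbance r_t with parameter phi, and independent measurement error.
alpha, beta = 100.0, 1.0
phi = 0.8          # autoregressive parameter; 0 < phi < 1 gives stationary persistence
sigma_xi = 0.2     # standard deviation of the AR(1) innovations xi_t
sigma_eps = 0.4    # standard deviation of the measurement errors eps_t
n = 200

t = np.arange(n)
r = np.zeros(n)
for i in range(1, n):
    r[i] = phi * r[i - 1] + rng.normal(0.0, sigma_xi)   # r_t = phi*r_{t-1} + xi_t

y = alpha + beta * t + r + rng.normal(0.0, sigma_eps, size=n)

# Persistence: the sample lag-1 autocorrelation of r_t should be close to phi.
rc = r - r.mean()
lag1 = np.sum(rc[1:] * rc[:-1]) / np.sum(rc**2)
print(f"sample lag-1 autocorrelation of r_t: {lag1:.2f}")
```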
Incorporating anthropometric persistence into the trend model leads to the following more realistic model of change, $Y_t = \alpha + \beta t + r_t + \varepsilon_t$. Subtracting the deterministic linear trend from the measurements leads to the trend deviations $\tilde{Y}_t = Y_t - (\alpha + \beta t) = r_t + \varepsilon_t$. The model for the trend deviations can be written as $(1 - \phi B)\tilde{Y}_t = \xi_t + (1 - \phi B)\varepsilon_t$, and is known as the autoregressive-moving average, or ARMA(1,1), model: there is just one lagged autoregressive term, and the autocorrelations of the moving average component on the right-hand side of the model are zero after lag 1. It is straightforward to show that the standard deviation and the autocorrelations of the deviations from the linear trend model $\tilde{Y}_t = Y_t - (\alpha + \beta t)$ are $\sigma_{\tilde{Y}} = \sigma = \sqrt{\sigma_\varepsilon^2 + \sigma_\xi^2/(1 - \phi^2)}$, $\rho_1 = \phi\,\sigma_r^2/(\sigma_\varepsilon^2 + \sigma_r^2) = \phi/(1 + \sigma_\varepsilon^2/\sigma_r^2)$, and $\rho_k = \rho_1 \phi^{k-1}$ for $k \geq 1$. For $\sigma_\varepsilon^2 = 0$ (when there is no measurement error), the ARMA(1,1) model simplifies to the first-order autoregressive model with variance $\sigma_\xi^2/(1 - \phi^2)$ and autocorrelations $\rho_k = \phi^k$.
Persistence is modeled through the autoregressive parameter; let us assume $\phi = 0.8$. The ratio $\sigma_\varepsilon^2/\sigma_r^2$ compares the variance of the independent measurement errors with the variance of the persistent stochastic trend movements. We assume a variance ratio of $\sigma_\varepsilon^2/\sigma_r^2 = 3$, as the stochastic trend component should not deviate too much from the deterministic linear trend and most of the variability should come from the measurement noise. With these choices of parameters, the autocorrelations of $\tilde{Y}_t = Y_t - (\alpha + \beta t)$ are $\rho_1 = 0.8/(1 + 3) = 0.2$ and $\rho_k = (0.2)(0.8)^{k-1}$ for $k \geq 1$. While the lag 1 autocorrelation is moderate in size ($\rho_1 = 0.2$), there is a persistent, slow decay in the autocorrelations from lag 1 onward.
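As a quick numerical check of these expressions, the sketch below evaluates $\rho_1 = \phi/(1 + \sigma_\varepsilon^2/\sigma_r^2)$ and $\rho_k = \rho_1\phi^{k-1}$ for the values used in the text ($\phi = 0.8$, variance ratio 3), reproducing $\rho_1 = 0.2$ and the slow geometric decay of the higher-lag autocorrelations.

```python
import numpy as np

def arma11_autocorr(phi: float, variance_ratio: float, max_lag: int) -> np.ndarray:
    """Autocorrelations rho_1, ..., rho_max_lag of the ARMA(1,1) trend deviations.

    phi            -- autoregressive parameter of the stochastic trend r_t
    variance_ratio -- sigma_eps^2 / sigma_r^2, measurement-error variance relative
                      to the variance of the persistent AR(1) component
    """
    rho1 = phi / (1.0 + variance_ratio)
    lags = np.arange(1, max_lag + 1)
    return rho1 * phi ** (lags - 1)

rho = arma11_autocorr(phi=0.8, variance_ratio=3.0, max_lag=10)
print(rho[0])            # 0.2, matching rho_1 = 0.8 / (1 + 3)
print(np.round(rho, 3))  # 0.2, 0.16, 0.128, ...: persistent, slow decay
```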
We have provided motivation for why the ARMA(1,1) model is a useful error model for trend regressions. There is also evidence in the literature^{6,7} that errors in regressions of anthropometric time series data on deterministic functions of age follow ARMA(1,1) models. Carrico et al.^7 show that, in a regression of young-adult blood pressure on linear and quadratic functions of age, body mass index, and height, ARMA(1,1) errors are preferable to AR(1) errors and errors with compound symmetry.
Our new model (Equation 5) assumes that the errors $\varepsilon_t$ follow an ARMA(1,1) model, implying an $n \times n$ error covariance matrix $V$ with elements $v_{ij} = \sigma^2$ for $i = j$ and $v_{ij} = \sigma^2 \rho_1 \phi^{|i-j|-1}$ for $i \neq j$. The generalized least squares (GLS) estimator of $\beta$ in the model in Equation 5 is given by $\hat{\beta}_{GLS} = (X^T V^{-1} X)^{-1} X^T V^{-1} (Y - \bar{Y})$. Here $V$ is the $n \times n$ covariance matrix specified above, $X = (t_1 - \bar{t}, t_2 - \bar{t}, \ldots, t_n - \bar{t})^T$ is the $n \times 1$ column vector of mean-corrected observation times, and $Y - \bar{Y} = (Y_1 - \bar{Y}, Y_2 - \bar{Y}, \ldots, Y_n - \bar{Y})^T$ is the $n \times 1$ column vector of mean-corrected observations. The superscript $T$ denotes the transpose. The GLS estimator is the most efficient estimator among all linear unbiased estimators, with the smallest sampling variance $\mathrm{Var}(\hat{\beta}_{GLS}) = (X^T V^{-1} X)^{-1}$.^3 Substituting the corresponding standard error into Equations 2 and 3 leads to the power given in Equation 6.
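A minimal sketch of this covariance and GLS step is given below; it builds $V$ from the elements $v_{ii} = \sigma^2$ and $v_{ij} = \sigma^2\rho_1\phi^{|i-j|-1}$ stated above and evaluates $(X^TV^{-1}X)^{-1}$ for the mean-corrected times. (The conversion of this variance to power is sketched further below; Equations 2, 3, 5, and 6 themselves are not reproduced in this excerpt.)

```python
import numpy as np

def gls_slope_variance(times: np.ndarray, sigma: float, rho1: float, phi: float) -> float:
    """Variance (X^T V^{-1} X)^{-1} of the GLS slope estimator.

    V is the ARMA(1,1) error covariance matrix with v_ii = sigma^2 and
    v_ij = sigma^2 * rho1 * phi**(|i - j| - 1) for i != j.
    """
    n = len(times)
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    V = np.where(lags == 0, sigma**2, sigma**2 * rho1 * phi ** (lags - 1.0))
    x = np.asarray(times) - np.mean(times)      # mean-corrected observation times
    return 1.0 / float(x @ np.linalg.solve(V, x))

# Example: 7 observations equally spaced on the unit-time interval [0, 1],
# with essentially uncorrelated errors (rho1 near 0) and sigma = 0.4.
var_slope = gls_slope_variance(np.linspace(0.0, 1.0, 7), sigma=0.4, rho1=0.001, phi=0.003)
print(var_slope**0.5)   # standard error of the slope estimator, about 0.45
```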
We return to our example with $z_{0.05} = -1.645$, $\beta^* = 1$, $\sigma = 0.4$, and an observation interval that is reduced from the original 3 years ($P < 1$), but now assume that the error is characterized by the ARMA(1,1) model with weekly autoregressive coefficient $\phi_W = 0.8$ and variance ratio $\sigma_\varepsilon^2/\sigma_r^2 = 3$. The $n$ observations on the reduced unit-time interval $[0, P]$, $P < 1$, are spaced $156P/(n - 1)$ weeks apart. Hence, the autoregressive coefficient between successive observations is $(\phi_W)^{156P/(n-1)}$. This value, $\sigma = 0.4$, and the variance ratio $\sigma_\varepsilon^2/\sigma_r^2 = 3$ are used for the calculation of the covariance matrix $V$ of the $n$ observations equally spaced over the interval $[0, P]$.
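The adjustment of the weekly coefficient to the actual spacing is a one-line computation. The short sketch below shows, for a few $(P, n)$ combinations that appear in the discussion, how quickly the effective coefficient $(\phi_W)^{156P/(n-1)}$ grows as observations are packed more closely together.

```python
phi_weekly = 0.8

# Effective autoregressive coefficient between successive observations when
# n observations are spread evenly over a fraction P of the 156-week interval.
for P, n in [(1.0, 7), (0.7, 20), (0.5, 159), (0.3, 100)]:
    spacing = 156 * P / (n - 1)
    print(f"P = {P:.1f}, n = {n:3d}: spacing = {spacing:5.2f} weeks, "
          f"effective phi = {phi_weekly ** spacing:.3f}")
```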
With the original 3-year observation interval ($P = 1$) and $n = 7$ observations, the power calculated from Equation 6 with $\phi_W = 0.8$ is still 0.712, the same power we obtain when there is independence. This is because observations are 26 weeks apart ($156P/(n - 1) = 156(1)/6 = 26$), $(\phi_W)^{26} = (0.8)^{26} \approx 0$, and $V$ is essentially a diagonal matrix with autocorrelations of zero. The power is affected only for a much larger weekly autoregressive coefficient, one very close to 1.
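A quick check of these numbers, again under the assumption (since Equation 6 is not reproduced in this excerpt) that the one-sided power takes the form $\Phi(\beta^*/\mathrm{se}(\hat{\beta}) + z_{0.05})$: at 26-week spacing the effective autoregressive coefficient is essentially zero, so GLS reduces to ordinary least squares, and seven equally spaced observations give the quoted power of about 0.712.

```python
import numpy as np
from scipy.stats import norm

phi_weekly = 0.8
spacing_weeks = 156 * 1.0 / (7 - 1)          # 26 weeks between the n = 7 observations
phi_effective = phi_weekly ** spacing_weeks  # (0.8)**26, roughly 0.003
print(f"effective autoregressive coefficient: {phi_effective:.4f}")

# With near-zero autocorrelation, V is essentially diagonal and GLS reduces to OLS.
t = np.linspace(0.0, 1.0, 7)                 # unit-time interval, P = 1
sigma, beta_star, z_alpha = 0.4, 1.0, -1.645
se_slope = sigma / np.sqrt(np.sum((t - t.mean())**2))
power = norm.cdf(beta_star / se_slope + z_alpha)   # assumed one-sided power form
print(f"power: {power:.3f}")                 # about 0.712
```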
Figure 2 shows results for the ARMA(1,1) model with weekly autoregressive coefficient $\phi_W = 0.8$, $\sigma = 0.4$, and variance ratio $\sigma_\varepsilon^2/\sigma_r^2 = 3$. In order to obtain the same power that is achieved with seven observations over the full 36-month time interval (0.712), 20 observations are now needed if the sampling period is reduced to 70% of the full 36 months. This is larger than the 17 observations needed in the independent case. Thirty-seven observations are needed if sampling is reduced to 60% of the full 36 months, which is larger than the 24 observations needed in the independent case. For an observation window that is cut in half, it takes even more (correlated) observations to compensate for the shortened observation interval: 159 observations are now needed to attain the same power, instead of the 36 observations required under independence.
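The sketch below reproduces the kind of search that underlies Figure 2: for each reduced interval $[0, P]$ it increases $n$ until the GLS power matches the baseline attained with seven observations on the full interval. The covariance matrix follows the $v_{ij}$ specification given earlier; the power expression $\Phi(\beta^*/\mathrm{se}(\hat{\beta}) + z_{0.05})$ is an assumption (Equation 6 is not reproduced here), so the resulting counts should be read as approximate reconstructions of the 20, 37, and 159 observations quoted above.

```python
import numpy as np
from scipy.stats import norm

def power_arma(n: int, P: float, phi_weekly: float = 0.8, ratio: float = 3.0,
               sigma: float = 0.4, beta_star: float = 1.0, z_alpha: float = -1.645) -> float:
    """Power of the one-sided slope test with n equally spaced observations on [0, P].

    The ARMA(1,1) covariance uses v_ii = sigma^2 and v_ij = sigma^2*rho1*phi**(|i-j|-1),
    with phi = phi_weekly**(156 P/(n-1)) and rho1 = phi/(1 + ratio). The power form
    Phi(beta*/se + z_alpha) is an assumption; it matches the quoted 0.712 baseline.
    """
    t = np.linspace(0.0, P, n)
    phi = phi_weekly ** (156.0 * P / (n - 1))
    rho1 = phi / (1.0 + ratio)
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    V = np.where(lags == 0, sigma**2, sigma**2 * rho1 * phi ** (lags - 1.0))
    x = t - t.mean()
    slope_var = 1.0 / float(x @ np.linalg.solve(V, x))
    return float(norm.cdf(beta_star / np.sqrt(slope_var) + z_alpha))

target = power_arma(n=7, P=1.0)   # baseline power with the full interval, about 0.712
for P in (0.7, 0.6, 0.5):
    n = 7
    while power_arma(n, P) < target and n < 2000:
        n += 1
    print(f"P = {P}: about {n} observations needed to reach power {target:.3f}")
```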
When the time interval is reduced to 40% (or even less) of the original, the increase in the number of correlated observations that are needed becomes very large and cannot compensate for the shortened observation interval, as adjacent observations over the reduced interval are now so close together that the autoregressive coefficient linking successive observations approaches 1. This implies that there is no benefit to taking such extra observations. For an observation interval reduced to 30%, we limit ourselves to $(156)(0.3) = 46.8$ weeks. With $n = 100$, for illustration, adjacent observations are 0.46 weeks apart and $(\phi_W)^{0.46} = (0.8)^{0.46} = 0.90$. Off-diagonal elements in the covariance matrix $V$ are large, which indicates that there is little benefit to collecting observations that are so close together in time.
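To see why adding observations stops paying off, the short sketch below applies the $v_{ij}$ specification at this 30% interval with $n = 100$: the effective coefficient is about 0.90, the lag-1 correlation is about $0.90/(1+3) \approx 0.23$, and the correlations decay only slowly across lags, so each additional observation carries little new information about the slope.

```python
import numpy as np

phi_weekly, ratio, P, n = 0.8, 3.0, 0.3, 100
spacing = 156 * P / (n - 1)        # roughly half a week between observations
phi = phi_weekly ** spacing        # effective autoregressive coefficient, about 0.90
rho1 = phi / (1.0 + ratio)         # lag-1 correlation between adjacent observations

lags = np.arange(1, 11)
rho = rho1 * phi ** (lags - 1)
print(f"effective phi = {phi:.2f}, rho_1 = {rho1:.2f}")
print(np.round(rho, 3))            # correlations fall off only slowly with lag
```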