%% Time Series Regression VI: Residual Diagnostics
%
% This example shows how to evaluate model assumptions and investigate
% respecification opportunities by examining the series of residuals. It is
% the sixth in a series of examples on time series regression, following
% the presentation in previous examples.

% Copyright 2012-2014 The MathWorks, Inc.

%% Introduction
%
% The analysis of the credit default data in previous examples in this
% series has suggested a number of distinct models, using various
% transformations of the data and various subsets of the predictors.
% Residual analysis is an essential step for reducing the number of models
% considered, evaluating options, and suggesting paths back toward
% respecification. Multiple linear regression (MLR) models with residuals
% that depart markedly from classical linear model (CLM) assumptions
% (discussed in the example on "Linear Models") are unlikely to perform
% well, either in explaining variable relationships or in predicting new
% responses. Many statistical tests have been developed to assess CLM
% assumptions about the innovations process, as manifested in the residual
% series. We examine some of those tests here.
%
% We begin by loading relevant data from the previous example on "Predictor
% Selection":

load Data_TSReg5

%% Residual Plots
%
% The following produces residual plots for each model identified in the
% previous example, in each of the two model categories (undifferenced and
% differenced data):

map = cool(3); % Model color map

% Undifferenced data:

res0 = M0.Residuals.Raw;
res0SW = M0SW.Residuals.Raw;
res0SWAC = M0SWAC.Residuals.Raw;
model0Res = [res0,res0SW,res0SWAC];

figure
hold on
ax = gca;
ax.ColorOrder = map;
plot(dates,model0Res,'.-','LineWidth',2,'MarkerSize',20)
plot(dates,zeros(size(dates)),'k-','LineWidth',2)
hold off
legend({'M0','M0SW','M0SWAC'},'Location','N')
xlabel('Year')
ylabel('Residual')
title('{\bf Model Residuals (Undifferenced Data)}')
axis tight
grid on

% Differenced data:

resD1 = MD1.Residuals.Raw;
resD1SW = MD1SW.Residuals.Raw;
resD1SWA = MD1SWA.Residuals.Raw;
modelD1Res = NaN(length(dates),3);
modelD1Res(2:end,:) = [resD1,resD1SW,resD1SWA];

figure
hold on
ax = gca;
ax.ColorOrder = map;
plot(dates,modelD1Res,'.-','LineWidth',2,'MarkerSize',20)
plot(dates,zeros(size(dates)),'k-','LineWidth',2)
hold off
legend({'MD1','MD1SW','MD1SWA'},'Location','N')
xlabel('Year')
ylabel('Residual')
title('{\bf Model Residuals (Differenced Data)}')
axis tight
grid on

%%
% For each model, the residuals scatter around a mean near zero, as they
% should, with no obvious trends or patterns indicating misspecification.
% The scale of the residuals is several orders of magnitude less than the
% scale of the original data (see the example on "Linear Models"), which is
% a sign that the models have captured a significant portion of the
% data-generating process (DGP). There appears to be some evidence of
% autocorrelation in several of the persistently positive or negative
% departures from the mean, particularly in the undifferenced data. A small
% amount of heteroscedasticity is also apparent, though it is difficult for
% a visual assessment to separate this from random variation in such a
% small sample.

%% Autocorrelation
%
% In the presence of autocorrelation, OLS estimates remain unbiased, but
% they no longer have minimum variance among unbiased estimators. This is a
% significant problem in small samples, where confidence intervals will be
% relatively large. Compounding the problem, autocorrelation introduces
% bias into the standard variance estimates, even asymptotically. Because
% autocorrelations in economic data are likely to be positive, reflecting
% similar random factors and omitted variables carried over from one time
% period to the next, variance estimates tend to be biased downward,
% leading to _t_-tests with overly optimistic claims of accuracy. The
% result is that interval estimation and hypothesis testing become
% unreliable. More conservative significance levels for the _t_-tests are
% advised. Robustness of the estimates depends on the extent, or
% persistence, of the autocorrelations affecting current observations.
%
% The |autocorr| function, without output arguments, produces an
% autocorrelogram of the residuals, and gives a quick visual take on the
% residual autocorrelation structure:

figure
autocorr(res0)
title('{\bf M0 Residual Autocorrelations}')

%%
% There is no evidence of autocorrelation outside of the Bartlett
% two-standard-error bands for white noise, given by the blue lines.
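%%
% The bands themselves are straightforward to reproduce. Under the
% white-noise null hypothesis, Bartlett's approximation gives a standard
% error of $1/\sqrt{T}$ for the sample autocorrelations, so the plotted
% limits are approximately plus or minus two standard errors (a sketch,
% assuming the default |autocorr| settings):

T = length(res0);         % Sample size
bands0 = [-2 2]/sqrt(T)   % Approximate 95% white-noise bands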
%%
% The Durbin-Watson statistic [3] is the autocorrelation measure most
% frequently reported in econometric analyses. One reason is that it is
% easy to compute. For the |M0| model:

diffRes0 = diff(res0);
SSE0 = res0'*res0;
DW0 = (diffRes0'*diffRes0)/SSE0 % Durbin-Watson statistic

%%
% Under assumptions of stationary, normally distributed innovations, the
% statistic is approximately $2(1 - \rho_1)$, where $\rho_1$ is the
% first-order (single lag) autocorrelation estimated by |autocorr|:

rho0 = autocorr(res0,1); % Sample autocorrelations at lags 0, 1
DW0Normal = 2*(1-rho0(2))

%%
% A statistic near 2 provides no evidence of first-order autocorrelation.
% Appropriate _p_-values for the statistic are computed by the |dwtest|
% method of the |LinearModel| class:

[pValueDW0,DW0] = dwtest(M0)

%%
% The _p_-value for a null of no first-order autocorrelation is well
% above the standard 5% critical value.
%
% Econometricians have traditionally relied on a rule of thumb that a
% Durbin-Watson statistic below about 1.5 is reason to suspect positive
% first-order autocorrelation. This ad-hoc critical value ignores the
% dependence on sample size, but it is meant to be a conservative
% guideline, given the serious consequences of ignoring autocorrelation.
%
% The Durbin-Watson test, though traditionally very popular, has a number
% of shortcomings. Beyond its assumption of stationary, normally
% distributed innovations and its ability to detect only first-order
% autocorrelation, it is sensitive to other model misspecifications. That
% is, it is powerful against many alternatives for which the test is not
% designed. It is also invalid in the presence of lagged response variables
% (see the example on "Lagged Variables and Estimator Bias").
%
% The Ljung-Box _Q_-test [5], implemented by the function |lbqtest|, tests
% for "overall" or "portmanteau" lack of autocorrelation. It considers lags
% up to a specified order _L_, and so is a natural extension of the
% first-order Durbin-Watson test. The following tests the |M0| residuals
% for autocorrelation at _L_ = 5, 10, and 15:

[hLBQ0,pValueLBQ0] = lbqtest(res0,'Lags',[5,10,15])

%%
% At the default 5% significance level, the test fails to reject the null
% of no autocorrelation in each of the extended lag structures. The results
% are similar for the |MD1| model, but much higher _p_-values indicate even
% less evidence for rejecting the null:

[hLBQD1,pValueLBQD1] = lbqtest(resD1,'Lags',[5,10,15])

%%
% The _Q_-test also has its shortcomings. If _L_ is too small, the test
% will not detect higher-order autocorrelations. If it is too large, the
% test will lose power, since a significant correlation at any lag may be
% washed out by insignificant correlations at other lags. Also, the test is
% powerful against serial dependencies other than autocorrelation.
%
% Another shortcoming of the _Q_-test is that the default chi-square
% distributions used by the test are asymptotic, and can produce unreliable
% results in small samples. For ARMA(_p_, _q_) models, for which the test
% was developed, more accurate distributions are obtained if the number of
% degrees of freedom is reduced by the number of estimated coefficients,
% _p_ + _q_. This limits the test, however, to values of _L_ greater than
% _p_ + _q_, since the degrees of freedom must be positive. Similar
% adjustments can be made for general regression models, but |lbqtest| does
% not do so by default.
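%%
% To illustrate the mechanics of the adjustment, suppose the residuals had
% come from an estimated ARMA(2,1) model. Passing the 'DoF' parameter to
% |lbqtest| reduces the degrees of freedom at each lag by _p_ + _q_ = 3 (a
% sketch of the mechanics only; the hypothetical ARMA orders here are not
% suggested by the regression models in this series):

pqHyp = 3;          % Hypothetical number of estimated ARMA coefficients
LBQLags = [5,10,15];
[hLBQAdj,pValueLBQAdj] = lbqtest(res0,'Lags',LBQLags,'DoF',LBQLags-pqHyp)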
%%
% Another test for "overall" lack of autocorrelation is a _runs test_,
% implemented by the function |runstest|, which determines if the signs of
% the residuals deviate systematically from zero. The test looks for long
% runs of either the same sign (positive autocorrelation) or alternating
% signs (negative autocorrelation):

[hRT0,pValueRT0] = runstest(res0)

%%
% The test fails to reject the null of randomness in the residuals of the
% |M0| model.
%
% Autocorrelated residuals may be a sign of a significant specification
% error, in which omitted, autocorrelated variables have become implicit
% components of the innovations process. Absent any theoretical suggestions
% of what those variables might be, the typical remedy is to include lagged
% values of the response variable among the predictors, at lags up to the
% order of autocorrelation, as sketched below. Introducing this kind of
% dynamic dependence into the model, however, is a significant departure
% from the static MLR specification. Dynamic models present a new set of
% considerations relative to the CLM assumptions, and are considered in the
% example on "Lagged Variables and Estimator Bias."
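%%
% For illustration, a minimal first-order dynamic respecification of |M0|
% would augment the predictor matrix with the lagged response (a sketch,
% assuming |X0| is the numeric predictor matrix loaded above; as noted,
% estimation in such models raises new issues, taken up in the example on
% "Lagged Variables and Estimator Bias"):

y0Lag1 = y0(1:end-1);                          % Response at time t-1
M0Dyn = fitlm([X0(2:end,:),y0Lag1],y0(2:end)); % First-order dynamic model
disp(M0Dyn.Coefficients)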
%% Heteroscedasticity
%
% Heteroscedasticity occurs when the variance of the predictors and the
% innovations process produce, in aggregate, a conditional variance in the
% response. The phenomenon is commonly associated with cross-sectional
% data, where systematic variations in measurement error can occur across
% observations. In time series data, heteroscedasticity is more often the
% result of interactions between model predictors and omitted variables,
% and so is another sign of a fundamental misspecification. OLS estimates
% in the presence of heteroscedasticity exhibit virtually identical
% problems to those associated with autocorrelation: they are unbiased, but
% no longer have minimum variance among unbiased estimators, and standard
% formulas for estimator variance become biased. However, Monte Carlo
% studies suggest that the effects on interval estimation are usually quite
% minor [1]. Unless heteroscedasticity is pronounced, distortion of the
% standard errors is small, and significance tests are largely unaffected.
% With most economic data, the effects of heteroscedasticity will be minor
% compared to the effects of autocorrelation.
%
% Engle's ARCH test [4], implemented by the |archtest| function, is an
% example of a test used to identify residual heteroscedasticity. It
% assesses the null hypothesis that a series of residuals $r_t$ exhibits no
% conditional heteroscedasticity (ARCH effects), against the alternative
% that an ARCH(_L_) model
%
% $$r_t^2 = a_0 + a_1 r_{t-1}^2 + ... + a_L r_{t-L}^2 + \zeta_t$$
%
% describes the series with at least one nonzero $a_k$ for $k = 1, ..., L$.
% Here $\zeta_t$ is an independent innovations process. The residuals in an
% ARCH process are dependent, but not correlated, so the test is for
% heteroscedasticity without autocorrelation.
%
% Applying the test to the |M0| residual series with lags _L_ = 5, 10, and
% 15 gives:

[hARCH0,pARCH0] = archtest(res0,'Lags',[5,10,15])

%%
% The test finds no evidence of heteroscedasticity in the residuals.
% For the |MD1| model the evidence is even weaker:

[hARCHD1,pARCHD1] = archtest(resD1,'Lags',[5,10,15])

%% Distribution
%
% The assumption that the innovations process is normally distributed is
% not required by the Gauss-Markov theorem, but it is necessary for
% confidence intervals to be constructed using standard techniques, and for
% _t_ and _F_ tests to provide accurate assessments of predictor
% significance. The assumption is especially important in small samples,
% where the Central Limit theorem cannot be relied upon to provide
% approximately normal distributions of estimates, independent of the
% distribution of innovations.
%
% The usual justification for the normality assumption is that innovations
% are the sum of inherent stochasticity plus all of the variables omitted
% from the regression. The Central Limit theorem says that this sum will
% approach normality as the number of omitted variables increases. However,
% this conclusion depends on the omitted variables being independent of
% each other, and this is often unjustified in practice. Thus, for small
% samples, regardless of the results on autocorrelation and
% heteroscedasticity, checking the normality assumption is an important
% component of accurate specification.
%
% A normal probability plot of the residual series gives a quick
% assessment:

figure
hNPlot0 = normplot(model0Res);
legend({'M0','M0SW','M0SWAC'},'Location','Best')
title('{\bf Model Residuals (Undifferenced Data)}')
set(hNPlot0,'Marker','.')
set(hNPlot0([1 4 7]),'Color',map(1,:))
set(hNPlot0([2 5 8]),'Color',map(2,:))
set(hNPlot0([3 6 9]),'Color',map(3,:))
set(hNPlot0,'LineWidth',2)
set(hNPlot0,'MarkerSize',20)

figure
hNPlotD1 = normplot(modelD1Res);
legend({'MD1','MD1SW','MD1SWA'},'Location','Best')
title('{\bf Model Residuals (Differenced Data)}')
set(hNPlotD1,'Marker','.')
set(hNPlotD1([1 4 7]),'Color',map(1,:))
set(hNPlotD1([2 5 8]),'Color',map(2,:))
set(hNPlotD1([3 6 9]),'Color',map(3,:))
set(hNPlotD1,'LineWidth',2)
set(hNPlotD1,'MarkerSize',20)

%%
% The plots show empirical probability versus residual value. Solid lines
% connect the 25th and 75th percentiles in the data, then are extended by
% dashed lines. The vertical scale is nonlinear, with the distance between
% tick marks equal to the distance between normal quantiles. If the data
% fall near the line, the normality assumption is reasonable. Here, we see
% an apparent departure from normality for data with large residuals
% (again, especially in the undifferenced data), indicating that the
% distributions may be skewed. Clearly, removal of the most influential
% observations, considered in the example on "Influential Observations,"
% would improve the normality of the residuals.
%
% It is a good idea to confirm any visual analysis with an appropriate
% test. There are many statistical tests for distributional assumptions,
% but the Lilliefors test, implemented by the |lillietest| function, is a
% normality test designed specifically for small samples:

[hNorm0,pNorm0] = lillietest(res0)

%%
% At the default 5% significance level, the test rejects normality in the
% |M0| series, but just barely. The test finds no reason to reject
% normality in the |MD1| data:

s = warning('off','stats:lillietest:OutOfRangePHigh'); % Turn off small-statistic warning
[hNormD1,pNormD1] = lillietest(resD1)
warning(s) % Restore warning state

%%
% The statistic is at the edge of the table of critical values tabulated by
% |lillietest|, and the largest tabulated _p_-value is reported.
%
% A common remedy for nonnormality is to apply a Box-Cox transformation to
% the response variable [2]. Unlike log and power transformations of
% predictors, which are primarily used to produce linearity and facilitate
% trend removal, Box-Cox transformations are designed to produce normality
% in the residuals. They often have the beneficial side effect of
% regularizing the residual variance.
%
% Collectively, Box-Cox transformations form a parameterized family with
% |log| and standardized power transformations as special cases. The
% transformation with parameter $\lambda$ replaces the response variable
% $y_t$ with the variable
%
% $$y_t^{(\lambda)} = \frac{y_t^{\lambda}-1}{\lambda}$$
%
% for $\lambda \neq 0$. For $\lambda = 0$, the transformation is given by
% its limiting value, $\log(y_t)$.
%
% The |boxcox| function in Financial Toolbox finds the parameter
% $\lambda_0$ that maximizes the normal loglikelihood of the transformed
% data. To apply the function to the |IGD| data in |y0|, it is necessary to
% perturb the zero default rates to make them positive:

alpha = 0.01;
y0(y0 == 0) = alpha;
% y0BC = boxcox(y0); % In Financial Toolbox
y0BC = [-3.5159; -1.6942; -3.5159; -3.5159; -1.7306; -1.7565; -1.4580; ...
        -3.5159; -3.5159; -2.4760; -2.5537; -3.5159; -2.1858; -1.7071; ...
        -1.7277; -1.5625; -1.4405; -0.7422; -2.0047; -3.5159; -2.8346];

%%
% The transformation is sensitive to the value of |alpha|, which adds a
% level of complication to the analysis. A Lilliefors test, however,
% confirms that the transform has the desired effect:

M0BC = fitlm(X0,y0BC);
res0BC = M0BC.Residuals.Raw;
s = warning('off','stats:lillietest:OutOfRangePHigh'); % Turn off small-statistic warning
[hNorm0BC,pNorm0BC] = lillietest(res0BC)
warning(s) % Restore warning state

%%
% Since the evidence for nonnormality in the original residual series is
% slight, we do not pursue a fine-tuning of the Box-Cox transformation.
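%%
% For reference, the transformation itself is simple to compute for any
% fixed $\lambda$ (a sketch, using a hypothetical value |lambdaHyp| = 0.5
% rather than the MLE $\lambda_0$ found by |boxcox|):

boxcoxT = @(y,lambda) (y.^lambda-1)./lambda; % Box-Cox transform, lambda ~= 0
lambdaHyp = 0.5;                             % Hypothetical parameter value
y0BCHyp = boxcoxT(y0,lambdaHyp);
% As lambda approaches 0, the transform approaches log(y):
max(abs(boxcoxT(y0,1e-6)-log(y0)))           % Should be small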
%% Summary
%
% The fundamental purpose of residual analysis is to check CLM assumptions
% and look for evidence of model misspecification. Patterns in the
% residuals suggest opportunities for respecification to obtain a model
% with more accurate OLS coefficient estimates, enhanced explanatory power,
% and better forecast performance.
%
% Different models can exhibit similar residual characteristics. If so,
% alternative models may need to be retained and further assessed at the
% forecasting stage. From a forecasting perspective, if a model has
% successfully represented all of the systematic information in the data,
% then the residuals should be white noise. That is, if the innovations are
% white noise, and the model mimics the DGP, then the one-step-ahead
% forecast errors should be white noise. Model residuals are in-sample
% measures of these out-of-sample forecast errors. Forecast performance is
% discussed in the example on "Forecasting."
%
% The problems of OLS estimation associated with non-white innovations,
% coupled with the limited options for respecifying many economic models,
% have led to the consideration of more robust _heteroscedasticity and
% autocorrelation consistent_ (HAC) estimators of variance, such as the
% Hansen-White and Newey-West estimators, which eliminate asymptotic
% (though not small-sample) bias. Revised estimation techniques, such as
% _generalized least squares_ (GLS), have also been developed for
% estimating coefficients in these cases. GLS is designed to give lower
% weight to influential observations with large residuals. The GLS
% estimator is BLUE (see the example on "Linear Models"), and equivalent to
% the maximum likelihood estimator (MLE) when the innovations are normal.
% These techniques are considered in the example on "Generalized Least
% Squares & HAC Estimators."
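%%
% As a preview of those techniques, the |hac| function in Econometrics
% Toolbox estimates a HAC coefficient covariance directly from a fitted
% model (a minimal sketch, using the default Newey-West estimator):

EstCoeffCov = hac(M0);           % HAC coefficient covariance estimate
seHAC = sqrt(diag(EstCoeffCov))  % HAC standard errors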
%% References
%
% [1] Bohrnstedt, G. W., and T. M. Carter. "Robustness in Regression
% Analysis." In _Sociological Methodology_, H. L. Costner, editor, pp.
% 118-146. San Francisco: Jossey-Bass, 1971.
%
% [2] Box, G. E. P., and D. R. Cox. "An Analysis of Transformations."
% _Journal of the Royal Statistical Society_. Series B, Vol. 26, 1964, pp.
% 211-252.
%
% [3] Durbin, J., and G. S. Watson. "Testing for Serial Correlation in
% Least Squares Regression." _Biometrika_. Vol. 37, 1950, pp. 409-428.
%
% [4] Engle, R. F. "Autoregressive Conditional Heteroscedasticity with
% Estimates of the Variance of United Kingdom Inflation." _Econometrica_.
% Vol. 50, 1982, pp. 987-1007.
%
% [5] Ljung, G. M., and G. E. P. Box. "On a Measure of Lack of Fit in Time
% Series Models." _Biometrika_. Vol. 65, 1978, pp. 297-303.