www.gusucode.com > econ 案例源码程序 matlab代码 > econ/Demo_TSReg9.m
%% Time Series Regression IX: Lag Order Selection % % This example shows how to select statistically significant predictor % histories for multiple linear regression models. It is the ninth in a % series of examples on time series regression, following the presentation % in previous examples. % Copyright 2012-2014 The MathWorks, Inc. %% Introduction % % Predictors in dynamic regression models may include lagged values of % exogenous explanatory variables (distributed lag, or DL, terms), lagged % values of endogenous response variables (autoregressive, or AR, terms), % or both. Lagged values of an innovations process (moving average, or MA, % terms) may have economic significance, representing the persistence of % shocks, but they are most often included to offset the need for % additional DL or AR terms. (See the example on "Lagged Variables and OLS % Estimator Bias.") % % Ideally, economic theory suggests which lags to include in a model. % Often, however, the delay between predictor changes and corresponding % response changes must be discovered through data analysis. A common % modeling approach is to include the history of a predictor at times _t_ - % 1, _t_ - 2, _t_ - 3, ..., _t - p_, with the assumption that significant % effects on the current response are only produced by recent changes in % the predictor. Specification analysis then considers extending or % restricting the lag structure, and finally selecting an appropriate _lag % order p_. % % This example investigates strategies for lag order selection. Although % details depend on the data and the modeling context, a common goal is to % identify a concise, easy-to-interpret description of the data-generating % process (DGP) that leads to accurate estimation and reliable forecasting. % % We begin by loading relevant data from previous examples in this series: load Data_TSReg8 %% Basic Tests % % The classical, normal linear model (CNLM), introduced in the example on % "Linear Models," filters data to generate white noise residuals. % Econometric models do not always aspire to such a thorough statistical % description of the DGP, especially when predictors are dictated by % theory or policy, and modeling goals are focused on specific effects. % Nevertheless, departures from the CNLM, and their degree, are common % measures of model misspecification. % % At any point in the model specification process, residuals may display % nonnormality, autocorrelation, heteroscedasticity, and other violations % of CNLM assumptions. As predictors are added or removed, models can be % evaluated by the relative improvement in the quality of the residuals. % Tests for model fit via residual analysis are described in the example on % "Residual Diagnostics." % % Model specification must also consider the statistical significance of % predictors, to avoid over-fitting in the service of residual whitening, % and to produce a parsimonius representation of the DGP. Basic tests % include the _t_-test, which evaluates the significance of individual % predictors, and the _F_-test, which is used to evaluate the joint % significance of, say, an entire lag structure. These tests are usually % used together, since a predictor with an insignificant individual effect % may still contribute to a significant joint effect. % % Many lag order selection procedures use these basic tests to evaluate % extensions and restrictions of an initial lag specification. Good % econometric practice suggests careful evaluation of each step in the % process. Econometricians must judge statistical decisions in the context % of economic theory and model assumptions. Automated procedures are % discussed in the example on "Predictor Selection," but it is often % difficult to completely automate the identification of a useful lag % structure. We take a more "manual" approach in this example. Of course, % the reliability of any such procedure depends critically on the % reliability of the underlying tests. % % Consider the basic model of credit defaults introduced in the example on % "Linear Models": M0 %% % Based on the _p_-values of the _t_-statistics, |AGE| is the most % significant individual risk factor (positive coefficient) for the default % rates measured by the response |IGD|. |AGE| represents the percentage of % investment-grade bond issuers first rated 3 years ago. Defaults often % occur after this period, when capital from an initial issue is expended, % but they may occur sooner or later. It seems reasonable to consider % models that include lags or leads of |AGE|. % % The fit of |M0| is based on only 21 observations, and the 5 coefficients % already estimated leave only 16 degrees of freedom for further fitting. % Extended lag structures, and the corresponding reduction in sample size, % would bring into question the validity of diagnostic statistics. % % For reference, we create tables and fit models with |AGE| lag orders of % 1, 2, 3, 4, and 5: % Lagged data: AGE = DataTable.AGE; maxLag = 5; lags = 1:maxLag; AGELags = lagmatrix(AGE,lags); lagNames = strcat({'AGELag'},num2str(lags','%-d')); AGELags = array2table(AGELags,'VariableNames',lagNames); % Preallocate tables and models: DTAL = cell(maxLag,1); MAL = cell(maxLag,1); % Fit models: AL = AGELags; DT = DataTable; for lagOrder = lags lagRange = 1:lagOrder; % Trim next presample row: AL(1,:) = []; DT(1,:) = []; % Fit model: DTAL{lagOrder} = [AL(:,lagRange),DT]; MAL{lagOrder} = fitlm(DTAL{lagOrder}); MALName{lagOrder} = strcat('MAL',num2str(lagRange,'%u')); end %% % We begin by considering the model with |AGE| lag order 2, that is, with % |AGE| data on issuers first rated 3 years ago, and lagged |AGE| data on % issuers first rated 4 and 5 years ago: MAL12 = MAL{2} %% % The lagged variables reduce the sample size by two. Together with two new % estimated coefficients, the degrees of freedom are reduced by 4, to 12. % % The model fit, as measured by the root-mean-squared error and the % adjusted $R^2$ statistic (which accounts for the additional predictors), is % slightly improved relative to |M0|. The significance of the predictors % already in |M0|, as measured by the _p_ values of their _t_-statistics, % is reduced. This is typical when predictors are added, unless the new % predictors are completely insignificant. The overall _F_-statistic shows % that the extended model has a slightly reduced significance in describing % the variation of the response. % % Of the two new predictors, |AGELag2| appears to be much more significant % than |AGELag1|. This is difficult to interpret in economic terms, and it % brings into question the accuracy of the significance measures. Are the % _p_-values an artifact of the small-sample size? Are they affected by % violations of the CNLM assumptions that underly ordinary least squares % (OLS) estimation? In short, do they provide a legitimate reason to modify % the lag structure? Obtaining reliable answers to these questions in % realistically-sized economic data samples is often a challenge. % % The distribution of any diagnostic statistic depends on the distribution % of process innovations, as exhibited in the model residuals. For _t_ and % _F_ tests, normal innovations are sufficinet to produce test statistics % with _t_ and _F_ distributions in finite samples. If the innovations % depart from normality, however, statistics may not follow these expected % distributions. Tests suffer from size distortions, where nominal % significance levels misrepresent the actual frequency of rejecting the % null hypothesis. When this happens, inferences about predictor % significance become unreliable. % % This is a fundamental problem in specification analysis, since at any % stage in the process, candidate models are likely to be misspecified, and % data incompletely filtered. Tests results must be considered in the % context of the residual series. A normal probability plot of the |MAL12| % residuals shows some reason to suspect the reported _p_-values: resAL12 = MAL12.Residuals.Raw; normplot(resAL12) xlabel('Residual') title('{\bf MAL12 Residuals}') %% % A common belief is that _t_ and _F_ tests are robust to nonnormal % innovations. To some degree, this is true. Innovations from _elliptically % symmetric distributions_, such as _t_ distributions, which generalize the % multivariate normal distribution, produce _t_ and _F_ statistics that % follow _t_ and _F_ distributions in finite samples [12]. This result, % however, assumes a diagonal covariance structure. When innovations % exhibit heteroscedasticity and autocorrelation, standard _t_ and _F_ % tests become much less robust [5], [16]. Size distortions can be % substantial in finite samples. Practically, however, the nature of the % innovations distribution, and the degree of distortion, can be difficult % to measure. %% Robust Tests % % _t_ and _F_ statistics involve both coefficient estimates and their % standard errors. In the presence of heteroscedasticity or % autocorrelation, OLS coefficient estimates remain unbiased, provided the % predictors are exogenous. Standard errors, however, when estimated with % the usual CNLM formulas, are biased. % % One response is to form statistics using standard error estimates that % are robust to nonspherical innovations [2], [3], as in the example on % "Generalized Least Squares and HAC Estimators." We illustrate this % strategy here. % % The _p_-value of a _t_ statistic is usually computed using Student's _t_ % distribution. For example, for |AGELag2| in |MAL12|: AGELag2Idx = find(strcmp(MAL12.CoefficientNames,'AGELag2')); coeff_AGELag2 = MAL12.Coefficients.Estimate(AGELag2Idx); se_AGELag2 = MAL12.Coefficients.SE(AGELag2Idx); t_AGELag2 = coeff_AGELag2/se_AGELag2; dfeAL12 = MAL12.DFE; p_AGELag2 = 2*(1-tcdf(t_AGELag2,dfeAL12)) %% % This is the _p_-value reported in the previous display of |MAL12|. % % Using heteroscedasticity-consistent (HC), or more general % heteroscedasticity-and-autocorrelation-consistent (HAC), estimates of the % standard error, while continuing to assume a Student's _t_ distribution % for the resulting statistics, leads to very different _p_-values: % HC estimate: [~,seHC] = hac(MAL12,'type','HC','weights','HC3','display','off'); se_AGELag2HC = seHC(AGELag2Idx); t_AGELag2HC = coeff_AGELag2/se_AGELag2HC; p_AGELag2HC = 2*(1-tcdf(t_AGELag2HC,dfeAL12)) %% % HAC estimate: [~,seHAC] = hac(MAL12,'type','HAC','weights','BT','display','off'); se_AGELag2HAC = seHAC(AGELag2Idx); t_AGELag2HAC = coeff_AGELag2/se_AGELag2HAC; p_AGELag2HAC = 2*(1-tcdf(t_AGELag2HAC,dfeAL12)) %% % The HC _p_-value shows |AGELag2| to be relatively insignificant, while % the HAC _p_-value shows it to be potentially quite significant. The CNLM % _p_-value lies in-between. % % There are a number of problems with drawing reliable inferences from % these results. First, without a more thorough analysis of the residual % series (as in the example on "Residual Diagnostics"), there is no reason % to choose one standard error estimator over the other. Second, the % standard error estimates are only consistent asymptotically, and the % sample here is small, even by econometric standards. Third, the % estimators require several, sometimes arbitrarily selected, _nuisance % parameters_ (|'weights'|, |'bandwidth'|, |'whiten'|) that can % significantly change the results, especially in small samples. Finally, % the newly-constituted _t_ and _F_ statistics, formed with the robust % standard errors, do _not_ follow _t_ and _F_ distributions in finite % samples. % % In short, the significance estimates here may be no better than the % traditional ones, based on CNLM assumptions. Modifications of HAC-based % tests, such as _KVB tests_ [11], are effective in addressing issues with % individual nuisance parameters, but they do not address the larger % problem of applying asymptotic techniques to finite samples. %% Bootstrapped Tests % % Another response to size distortions in traditional specification tests % is _bootstrapping_. The test statistic computed from the original data is % maintained, but its distribution is re-evaluated through simulated % resampling, with the goal of producing a more accurate _p_-value. % % Resampling data from a population is a standard technique for evaluating % the distribution of a statistic. The nature of economic time series, % however, usually makes this impractical. Economies have fixed histories. % Generating realistic alternative paths, with statistical properties % similar to empirical data, requires additional assumptions. % % In a bootstrapped test, a null model is fit to the available data, and % the null distribution of residuals is used to approximate the population % distribution of innovations. The residuals are then resampled, with % replacement, to generate new residual series. Corresponding bootstrap % responses are computed using the fixed predictor histories. Finally, the % new response data is used to refit the alternative (original) model and % recompute the test statistic. This process is repeated, to generate a % bootstrapped distribution. % % For comparison, we bootstrap the the _t_ statistic of |AGELag2| under the % null hypothesis that its coefficient is zero. The null model is: MAL1 = MAL{1} %% % |AGELag1|, highly insignificant in the model |MAL12| that includes both % |AGELag1| and |AGELag2|, becomes more significant in the absence of % |AGELag2|, but its role is still minor relative to the predictors in % |M0|. Its coefficient becomes negative, contrary to our understanding of % it as a predictor of default risk. The inference may be that |AGELag1| is % irrelevant. Nevertheless, we maintain it to evaluate the specific % restriction of |MAL12| to |MAL1|, reducing the lag order by 1: DTAL1 = DTAL{1}; % Lag order 1 table DTAL12 = DTAL{2}; % Lag order 2 table (one less observation) numBoot = 1e3; % Number of statistics res0 = MAL1.Residuals.Raw; % Bootstrap "population" [~,IdxBoot] = bootstrp(numBoot,[],res0); % Bootstrap indices ResBoot = res0(IdxBoot); % Bootstrap residuals IGD0 = DTAL1.IGD - res0; % Residual-free response IGDB = zeros(size(DTAL12,1),numBoot); % Bootstrap responses DTBoot = DTAL12; tBoot = zeros(numBoot,1); % Bootstrap t statistics for boot = 1:numBoot IGDBoot = IGD0 + ResBoot(:,boot); IGDBoot(1) = []; % Trim to size of DTBoot IGDBoot(IGDBoot < 0) = 0; % Set negative default rates to 0 DTBoot.IGD = IGDBoot; MBoot = fitlm(DTBoot); tBoot(boot) = MBoot.Coefficients.tStat(AGELag2Idx); IGDB(:,boot) = IGDBoot; end %% % The procedure generates |numBoot| bootstrap responses, which replace the % original response to the fixed predictor data: figure hold on bootDates = dates(3:end); hIGD = plot(bootDates,IGDB,'b.'); hIGDEnd = plot(bootDates,IGDB(:,end),'b-'); hIGD0 = plot(bootDates,DTAL12.IGD,'ro-','LineWidth',2); hold off xlabel('Date') ylabel('Default Rate') title('{\bf Bootstrap Responses}') legend([hIGD(end),hIGDEnd,hIGD0],'Bootstrap Responses',... 'Typical Bootstrap Response',... 'Empirical Response',... 'Location','NW') %% % The bootstrapped _p_-value is not very different from the original % _p_-value, |p_AGELag2|, found using Student's _t_ distribution: p_AGELag2 p_AGELag2Boot = sum(tBoot > t_AGELag2)/length(tBoot) %% % However, a histogram shows that the distribution of bootstrapped _t_ % statistics has shifted: figure hold on numBins = 50; hHist = histogram(tBoot,numBins,'Normalization','Probability',... 'FaceColor',[.8 .8 1]); x = hHist.BinLimits(1):0.001:hHist.BinLimits(end); y = tpdf(x,dfeAL12); hPDF = plot(x,y*hHist.BinWidth,'m','LineWidth',2); hStat = plot(t_AGELag2,0,'ro','MarkerFaceColor','r'); line([t_AGELag2 t_AGELag2],1.2*[0 max(hHist.Values)],'Color','r') axis tight legend([hHist,hPDF,hStat],'Bootstrap {\it t} Distribution',... 'Student''s {\it t} Distribution',... 'Original {\it t} Statistic',... 'Location','NE') xlabel('{\it t}') title('{\bf Bootstrap {\it t} Statistics}') hold off %% % The _t_ statistic is less significant in the bootstrap distribution, % suggesting the possibile influence of nonspherical innovations on the % original test. % % The bootstrapped test has its own difficulties. To enforce nonnegativity % of default rates, it is necessary to trim bootstrapped responses with % negative values. The consequences for inference are unclear. Moreover, % the bootstrapped test relies, fundamentally, on the assumption that the % empirical distribution of residuals faithfully represents the relevant % characteristics of the distribution of innovations in the DGP. In smaller % samples, this is hard to justify. % % There are many variations of the bootstrap. For example, the _wild % bootstrap_ [7], which combines robust estimation with residual % resampling, seems to perform well with smaller samples in the presence of % heteroscedasticity. %% Likelihood-Based Tests % % Versions of _t_ and _F_ tests, formulated using CNLM assumptions, can % provide reliable inferences in a variety of situations where the % innovations distribution departs from specification. Likelihood-based % tests, by contrast, require a formal model of the innovations to operate % at all. Data likelihoods are typically computed under the assumption of % independent and normally distributed innovations with fixed variance. % This underlying model of the DGP can be adjusted to accomodate different % innovation patterns, including higher probabilities for extreme events, % but the strong _distributional assumption_ remains. % % Like _F_ statistics, data likelihoods (or, in practice, loglikelihoods) % measure the fit of an entire model, or lag structure, rather than the % significance of individual model terms. Likelihoods are based on the % joint probability of the data under the distributional assumption, rather % than on residual sums of squares. Larger likelihoods indicate a better % fit, but to evaluate the relative quality of models, the statistical % significance of likelihood _differences_ must be evaluated. % % Consider the normal loglikelihoods of the OLS estimates of |MAL12| and % its restrictions. We begin by constructing |MAL2|, with only AGELag2, to % complete the set of restrictions considered: DTAL2 = [AGELags(:,2),DataTable]; MAL2 = fitlm(DTAL2) %% % Unrestricted loglikelihood of MAL12: uLLOLS = MAL12.LogLikelihood %% % Restricted loglikelihoods of M0, MAL1, and MAL2: rLLOLS = [M0.LogLikelihood;MAL1.LogLikelihood;MAL2.LogLikelihood] %% % These are OLS loglikelihoods, based on the residual series. For example, % the loglikelihood of the data with respect to |M0| is computed using: resM0 = M0.Residuals.Raw; MSEM0 = M0.MSE; muM0 = mean(resM0); LLM0 = sum(log(normpdf(resM0,muM0,sqrt(MSEM0)))) %% % Since OLS does not, necessarily, maximize likelihoods, unless CNLM % assumptions are satisfied, it is possible for restrictions of the model % parameter space to _increase_ data likelihoods. We see this for the % restrictions |M0| and |MAL2|. This suggests, once again, a nonnormal % innovations process. % % For comparison, consider measures based on _maximum_ likelihood estimates % (MLEs) of model coefficients, using the |arima| function. We fit ARMAX % models with zero-order AR and MA specifications (that is, pure regression % models): % Prepare data: LLOLS = [uLLOLS;rLLOLS]; DataAL12 = table2array(DTAL12); y = DataAL12(:,7); X = DataAL12(:,1:6); PredCols = {1:6,3:6,[1,3:6],2:6}; ModelNames = {'MAL12','M0','MAL1','MAL2'}; % Compute MLEs: LLMLE = zeros(4,1); Mdl = arima(0,0,0); options = optimset('fmincon'); options = optimset(options,'Display','off','Diagnostics','off',... 'Algorithm','sqp','TolCon',1e-7); for model = 1:4 [~,~,LL] = estimate(Mdl,y,'X',X(:,PredCols{model}),... 'print',false,'options',options); LLMLE(model) = LL; end % Display results: fprintf('\nLoglikelihoods\n') fprintf('\n%-8s%-9s%-9s','Model |','OLSLL','MLELL') fprintf(['\n',repmat('=',1,24)]) for model = 1:4 fprintf(['\n%-6s','| ','%-9.4f%-9.4f'],... ModelNames{model},LLOLS(model),LLMLE(model)) end fprintf('\n') %% % In the case of MLEs, all of the restricted models have a reduced % likelihood of describing the data, as expected. The OLS and MLE measures % disagree on the model of largest likelihood, with OLS choosing |M0|, and % MLE choosing |MAL12|. % % Are the differences in likelihood significant? For MLEs, this question % has traditionally been addressed by some version of the likelihood ratio % test (implemented by |lratiotest|), the Wald test (implemented by % |waldtest|), or the Lagrange multiplier test (implemented by |lmtest|). % These are discussed in the example on "The Classical Model % Misspecification Tests" (CMM tests). The geometry of comparison for CMM % tests is based, fundamentally, on the optimality of the model % coefficients. They should not be used with OLS likelihoods unless there % is evidence that CNLM assumptions are satisfied. % % Like _F_ tests, CMM tests are only applicable to comparisons of _nested % models_, which are restrictions or extensions of each other. This is the % typical situation when evaluating lag structures. Unlike _F_ tests, CMM % tests are applicable to comparisons involving nonlinear models, nonlinear % restrictions, and nonnormal (but fully specified) innovation % distributions. This is important in certain econometric settings, but % rarely in lag order selection. A disadvantage of CMM tests is that they % assign significance to model differences only asymptotically, and so must % be used with caution in finite samples. % % For example, the likelihood ratio test, the most direct assesment of MLE % likelihood differences, can be used to evaluate the adequacy of various % restrictions: % Restrictions of |MAL12| to |M0|, |MAL1|, and |MAL2|: dof = [2;1;1]; % Number of restrictions [hHist,pValue] = lratiotest(LLMLE(1),LLMLE(2:4),dof) %% % Restrictions of |MAL1| and |MAL2| to |M0|: dof = [1;1]; % Number of restrictions [hHist,pValue] = lratiotest(LLMLE(3:4),LLMLE(2),dof) %% % At the default 5% significance level, the test fails to reject the null, % restricted model in favor of the alternative, unrestricted model in all % cases. That is, the statistical case for including any lag structure is % weak. The original model, |M0|, might be chosen solely for reasons of % model parsimony. % % An alternative to CMM tests are the various forms of _information % criteria_ (IC). IC also consider goodness-of-fit, measured by % likelihoods, but penalize for lack of parsimony, measured by the number % of model coefficients. As with pure likelihoods, adjusted IC likelihoods % provide a relative, but not an absolute, measure of model adequacy. % However, there are no commonly-used hypothesis tests, corresponding to % CMM tests, for evaluating the significance of IC differences. The main % advantage, in practice, is that IC can be used to compare non-nested % models, though this is often irrelevant when comparing lag structures. % % The following table compares two common IC, AIC and BIC, together with % the OLS equivalent, the adjusted $R^2$: AR2 = [MAL12.Rsquared.Adjusted; M0.Rsquared.Adjusted; ... MAL1.Rsquared.Adjusted; MAL2.Rsquared.Adjusted]; AIC = [MAL12.ModelCriterion.AIC; M0.ModelCriterion.AIC; ... MAL1.ModelCriterion.AIC; MAL2.ModelCriterion.AIC]; BIC = [MAL12.ModelCriterion.BIC; M0.ModelCriterion.BIC; ... MAL1.ModelCriterion.BIC; MAL2.ModelCriterion.BIC]; fprintf('\nSize-Adjusted Fit\n') fprintf('\n%-8s%-7s%-9s%-9s','Model |','AR2','AIC','BIC') fprintf(['\n',repmat('=',1,32)]) for model = 1:4 fprintf(['\n%-6s','| ','%-7.4f','%-9.4f','%-9.4f'],... ModelNames{model},AR2(model),AIC(model),BIC(model)) end fprintf('\n') %% % When comparing models, higher adjusted $R^2$, and lower IC, indicate a % better trade-off between the fit and the reduced degrees of freedom. The % results show a preferance for including a lag structure when % evaluated by adjusted $R^2$, but not when evaluated by IC. This kind of % disagreement is not uncommon, especially with small samples, and further % suggests the comparative use of multiple testing methods. % % BIC, with its typically heavier penalties on additional coefficients, % tends to choose simpler models, though often not as simple as those % selected by sequential _t_ and _F_ testing with standard settings. BIC % has some superior large-sample properties, such as being asymptotically % consistent, but Monte Carlo studies have shown AIC can outperform BIC in % correctly identifying the DGP in small data samples [6]. An alternative % version of AIC, AICc, corrects for small samples, and is especially % useful in these situations. %% Testing Up, Testing Down % % When attempting to specify significant but parsimonious lag structures in % econometric models, two general strategies are often used. The first is % to start with a small model, then test additional lags until their % individual significance, or the joint significance of the entire lag % structure, drops below a set level. This is called _testing up_. % Alternatively, a generous initial lag structure is systematically trimmed % until the largest lag, or the entire lag structure, becomes significant. % This is called _testing down_. % % Testing up starts with a parsimonious description of the data, such as a % static model with contemporaneous values of relevant predictors but no % dynamic terms. It then proceeds from the specific to the general. Each % step in the process evaluates the effect of adding a new lag, usually % using some combination of _t_ tests, _F_ tests, CMM tests, or IC. It % stops when adding a new lag becomes insignificant at some predetermined % level. It guarantees, in this way, that the initial model parsimony is % maintained to some degree. % % Acknowledging Occam's razor and principles of scientific method, testing % up offers a number of advantages. Simple models are less costly to % compute, easier to interpret and detect misspecification, work better % with small samples, and are more amenable to generalization. Moreover, % they often produce better forecasts [10]. % % However, testing up is often discouraged for lag order selection, and % economic modeling in general. There is a common scenario where % significant lags lie beyond the first insignificant one, such as with % seasonal lags. Automatic testing up will not detect these. Moreover, % sequential testing in the presence of omitted variables, which have not % yet been added to the model, creates estimator bias and distortions of % test sizes and power, ultimately leading to incorrect inferences. % Omitted variable bias is discussed in the examples on "Spurious % Regression" and "Lagged Variables and Estimator Bias." % % As a consequence, testing down is often recommended [9]. This strategy % begins with a model that includes all potential explanatory variables. % That is, it includes a mix of predictors with more or less significance % in explaining the variation of the response. It then proceeds from the % general to the specific (sometimes called GETS). Each step in the process % evaluates the effect of removing a predictor, using the same sorts of % tests used for testing up. It stops when the restricted model achieves % some predetermined level of significance. % % There are several advantages to this approach. If the initial model and % lag structure is comprehensive enough, then all of the testing is done, % at least in principle, in the absence of ommitted variable bias. % Localized tests, such as tests on the largest lag, can lead to models % that continue to contain a mix of significant and insignificant lags, but % since they are all present in the model, they can be examined for joint % significance. A disadvantage to this approach is the lack of theoretical % guidance, or even good heuristic tips, when choosing the initial lag % order in different modeling situations. % % The following table shows _p_ values of _t_ statistics on coefficients of % lagged |AGE| in lag structures of order 1 to 5: fprintf('\nt Statistic p Values\n') fprintf('\n%-11s%-5s%-5s%-5s%-5s%-5s','Model |',... 'AL1','AL2','AL3','AL4','AL5') fprintf(['\n',repmat('=',1,35)]) for lag = 1:5 pVals = MAL{lag}.Coefficients.pValue(2:lag+1); fprintf(['\n%-9s','| ',repmat('%-5.2f',1,lag)],... MALName{lag},pVals(1:lag)) end fprintf('\n') %% % At a 15% significance level, testing up from |M0| adds no lags to the % model, since it fails to reject a zero coefficient for |AgeLag1| in the % first model tested, |MAL1|. At the same level, testing down from the % largest model, sequentially evaluating the significance of the largest % lag, |MAL12| is selected, adding two lags to |M0|. The relative % significance of specific lags across different lag structures highlights % the risk of automating these localized evaluations. % % _F_ statistics add useful information about joint significance. _F_ tests % on additional lags, relative to the model with all previous lags, are % equivalent to _t_ tests, with the same _p_ values. However, _F_ tests of % an entire lag structure, relative to a static specification, can provide % hints of significant lags prior to the largest lag. _F_ ratios are % computed by the |coefTest| method of the |LinearModel| class: fprintf('\nF Statistic p Values\n') fprintf('\n%-11s%-5s%-5s','Model |','Last','All') fprintf(['\n',repmat('=',1,20)]) for lag = 1:5 % Sequential F test (last lag = 0): HSq = [zeros(1,lag),1,zeros(1,4)]; pSq = coefTest(MAL{lag},HSq); % Static F test (all lags = 0): HSt = [zeros(lag,1),eye(lag),zeros(lag,4)]; pSt = coefTest(MAL{lag},HSt); fprintf(['\n%-9s','| ','%-5.2f','%-5.2f'],MALName{lag},pSq,pSt) end fprintf('\n') %% % The _F_ statistics cannot jump-start a stalled testing-up strategy, but % when testing down, they do give reason to reconsider the largest model, % with its relatively lower _p_ value. The increased significance of % |AgeLag3| and |AgeLag4| in |MAL12345|, indicated by the _t_ statistic _p_ % values, raise the joint significance of that lag structure. Still, the % most significant overall lag structure is in |MAL12|, consistent with % testing-down via _t_ statistics. % % Predictably, Monte Carlo studies show that automated testing up % strategies will frequently under-fit when the DGP is a supermodel, and % testing down will likewise over-fit when the DGP is a submodel [14]. % Performance, in either case, is improved by systematically adjusting % significance levels to account for varying degrees of freedom. In % general, however, the statistical consequences of excluding relevant lags % is usually considered more serious than including irrelevant lags, and % rejection tolerances must be set accordingly. % % In practice, hybrid strategies are usually preferred, moving predictors % in and out of the model until some measure of fit is optimized, and an % economically sensible model is obtained. Stepwise regression (described % in the example on "Predictor Selection") is one way to automate this % approach. With modern computing power, there is also the possibility, in % some cases, to exhaustively evaluate all models of relevance. As this % example illustrates, however, automation of model selection procedures % must be viewed with some skepticism. The process is necessarily dynamic, % tests have varying degrees of relevance, and decisions ultimately require % some consideration of economic theory and modeling goals. %% Special Cases % % Because of the difficulties of using standard testing procedures in % modeling contexts where CNLM assumptions are violated, a number of % specialized procedures have been developed for use with specific model % types. In some cases, appropriate lag orders can be determined solely % through data analysis. Other cases require sequential estimation and % evaluation of a range of candidate models. % % $\bullet$ *ARMA Models*. Stationary time series are often represented, % theoretically, by infinite-order MA processes [18]. ARMA models translate % these representations into finite, rational form. Lag orders for the AR % and MA components of a model must be chosen together, to achieve a % balance between accuracy and model parsimony. The standard identification % method [4], described in the example "Box-Jenkins Model Selection," % examines patterns in sample autocorrelation functions to determine the % relative significance of candidate lag structures. % % $\bullet$ *ARDL Models*. Many economic variables are influenced by % exogenous driving processes that can impart persistent shocks to a DGP. % Theoretically, these are represented by infinite-order DL models, but as % with ARMA models, finite, rational forms are required for practical % estimation. Standard techniques, such as those proposed by Almon [1] and % Koyck [13], assign weights to the lag structure in such a way that the % model can be transformed into AR, ARMA, or ARMAX form. The methods are % more ad hoc than data-driven, and subject to the problems of collinearity % that come from working with many lags of a predictor at proximate times. % (See the example on "Collinearity and Estimator Variance".) % % $\bullet$ *GARCH Models*. GARCH models are commonly used to model % patterns of hetroscedasticity in an innovations process, especially in % financial applications. Like ARMA and ARDL models, they combine two types % of lags, with orders that must be balanced appropriately. In practice, % there are methods for tranforming GARCH models to ARMA form [8], where % Box-Jenkins methods can be applied, but this is rarely done in practice. % For most economic and financial series, lag orders of 1 and 1 seem to % serve well. % % $\bullet$ *Unit Root Tests*. Unit root and stationarity tests, such as % |adftest| and |lmctest|, use dynamic models of the test process, and % require users to choose a lag order. This nuisance parameter can have % significant effect on the test results. The potential presence of % nonstationary data in this setting calls into question the use of % standard _t_ and _F_ tests. However, Sims, Stock, and Watson [17] have % shown that they are justified when the regression includes all % deterministic components of the DGP. % % $\bullet$ *VAR Models*. VAR models are a generic, widely-used form for % representing systems of interacting economic variables. They require a % lag order that captures the relevant past history of _all_ variables in % the model. Since the models are multivariate, estimation costs grow % quickly with increasing lag order, so a parsimonious selection procedure % is essential. Lütkepohl [15] discusses a variety of strategies, most of % which are multivariate generalizations of the techniques presented in % this example. %% Summary % % This example surveys common strategies for lag order selection, and makes % the case for adapting strategies to individual data sets and models. The % data considered here are few, and far from the idealizations of % asymptotic analysis. It is also quite possible the model under study may % be misspecified, confounding its own evaluation. However, these obstacles % are fairly typical in econometric practice. Without considered, % "hands-on" application, guided by some sense of economic reality, lag % order selection presents many opportunities for distorted inference that % can lead to poorly performing models. Familiarity with the common % difficulties, however, can help guide the path to sharper specifications. % % Of course, it is not always necessary to choose a "best" model or lag % order. Often, given the statistical uncertainty, it is enough to % eliminate a large subset of highly unlikely candidates, leaving a smaller % subset for further analysis and data collection. The strategies of this % example serve that purpose well. % %% References % % [1] Almon, S. "The Distributed Lag Between Capital Appropriations and % Expenditures." _Econometrica._ Vol. 33, 1965, pp. 178-196. % % [2] Andrews, D. W. K. "Heteroskedasticity and Autocorrelation Consistent % Covariance Matrix Estimation." _Econometrica._ Vol. 59, 1991, pp. % 817-858. % % [3] Andrews, D. W. K., and J. C. Monohan. "An Improved Heteroskedasticity % and Autocorrelation Consistent Covariance Matrix Estimator." % _Econometrica._ Vol. 60, 1992, pp. 953-966. % % [4] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. _Time Series % Analysis: Forecasting and Control._ Upper Saddle River, NJ: % Prentice-Hall, 1994. % % [5] Banerjee, A. N., and J. R. Magnus. "On the Sensitivity of the Usual % _t_- and _F_-Tests to Covariance Misspecification." _Journal of % Econometrics._ Vol. 95, 2000, pp. 157-176. % % [6] Burnham, K. P., and D. R. Anderson. _Model Selection and Multi-Model % Inference: A Practical Information-Theoretic Approach_. New York: % Springer, 2010. % % [7] Davidson, R., and E. Flachaire. "The Wild Bootstrap, Tamed at Last." % _Journal of Econometrics._ Vol. 146, 2008, pp. 162-169. % % [8] Hamilton, J.D. _Time Series Analysis._ Princeton, NJ: Princeton % University Press, 1994. % % [9] Hendry, D. F. _Econometrics: Alchemy or Science?_ Oxford: Oxford % University Press, 2001. % % [10] Keuzenkamp, H. A., and M. McAleer. "Simplicity, Scientific Inference % and Economic Modeling." _Economic Journal._ Vol. 105, 1995, pp. 1-21. % % [11] Kiefer, N. M., T. J. Vogelsang, and H. Bunzel. "Simple Robust % Testing of Regression Hypotheses." _Econometrica._ Vol. 68, 2000, pp. % 695-714. % % [12] King, M. L. "Robust Tests for Spherical Symmetry and Their % Application to Least Squares Regression." _Annals of Statistics._ Vol. 8, % 1980, pp. 1265-1271. % % [13] Koyck, L. M. _Distributed Lags Models and Investment Analysis._ % Amsterdam: North-Holland, 1954. % % [14] Krolzig, H. M., and Hendry, D.F. "Computer Automation of % General-To-Specific Model Selection Procedures." _Journal of Economic % Dynamics & Control_. Vol. 25, 2001, pp. 831-866. % % [15] Lütkepohl, H. _New Introduction to Multiple Time Series Analysis._ % New York: Springer, 2007. % % [16] Qin, H., and A. T. K. Wan. "On the Properties of the _t_- and % _F_-Ratios in Linear Regressions with Nonnormal Errors." _Econometric % Theory._ Vol. 20, No. 4, 2004, pp. 690-700. % % [17] Sims, C. A., J. H. Stock, and M. W. Watson. "Inference in Linear % Time Series Models with some Unit Roots." _Econometrica._ Vol. 58, 1990, % pp. 113-144. % % [18] Wold, H. _A Study in the Analysis of Stationary Time Series._ % Uppsala, Sweden: Almqvist & Wiksell, 1938.