%% Examine Model Quality Using Residuals
% This example shows how to use residuals to help you discover errors,
% outliers, or correlations in the model or data.
%%
% Load the sample data.

% Copyright 2015 The MathWorks, Inc.

load carsmall
%%
% Fit a model of |MPG| as a function of |Cylinders| (treated as a
% categorical predictor) and |Weight|.
ds = dataset(Weight,MPG,Cylinders);
ds.Cylinders = ordinal(ds.Cylinders);
mdl = fitlm(ds,'MPG ~ Cylinders*Weight + Weight^2');
%%
% Examine the residuals.
plotResiduals(mdl)
%%
% The default plot, a histogram, shows the range of the residuals and
% their frequencies. The observations with residuals above 12 are
% potential outliers.
%%
% Display the probability plot.
plotResiduals(mdl,'probability')
%%
% The probability plot shows how the distribution of the residuals
% compares to a normal distribution with matched variance.
%%
% The two potential outliers appear on this plot as well. Otherwise, the
% probability plot is reasonably straight, indicating that the residuals
% are approximately normally distributed.
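%%
% As a numerical complement to the probability plot, the following is a
% minimal sketch (not part of the original example) that applies the
% Lilliefors normality test to the raw residuals. The variable names
% |hNorm| and |pNorm| are illustrative. The two potential outliers are
% still included here, so treat the result as a rough indication only.
[hNorm,pNorm] = lillietest(mdl.Residuals.Raw)  % hNorm = 1 rejects normality at the 5% level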
%%
% Find the indices of the two potential outliers so that you can exclude
% them from the fit.
outl = find(mdl.Residuals.Raw > 12)
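%%
% Before excluding anything, it can help to look at the flagged rows. This
% is a small sketch, assuming |outl| holds the row indices found above; the
% names |outlResiduals| and |outlRows| are illustrative.
outlResiduals = mdl.Residuals.Raw(outl)  % raw residuals of the flagged observations
outlRows = ds(outl,:)                    % corresponding rows of the input data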
%%
% To remove the outliers, use the |'Exclude'| name-value pair argument.
mdl2 = fitlm(ds,'MPG ~ Cylinders*Weight + Weight^2',...
    'Exclude',outl);
%%
% Examine a residuals plot of |mdl2|.
plotResiduals(mdl2)
%%
% The new residuals plot looks fairly symmetric, without obvious problems.
% However, there might be some serial correlation among the residuals.
% Create a new plot to see if such an effect exists.
plotResiduals(mdl2,'lagged')
%%
% The scatter plot shows many more crosses in the upper-right and
% lower-left quadrants than in the other two quadrants, indicating positive
% serial correlation among the residuals.
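%%
% One way to back up this visual impression with a number is the
% Durbin-Watson test, sketched here as an optional check rather than as
% part of the original example. The names |pDW| and |dwStat| are
% illustrative. A statistic well below 2 together with a small p-value
% would be consistent with positive serial correlation. The result depends
% on the order of the rows in the data.
[pDW,dwStat] = dwtest(mdl2)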
%%
% Another potential issue is residuals that grow in magnitude as the
% response grows. See if the current model has this issue.
plotResiduals(mdl2,'fitted')
%%
% There is some tendency for larger fitted values to have larger residuals.
% Perhaps the model errors are proportional to the measured values.
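%%
% A rough numerical check of this tendency, offered as a sketch under the
% assumption that a rank correlation is an adequate summary, is the
% Spearman correlation between the fitted values and the absolute raw
% residuals. The names |fittedVals|, |absRes|, and |rho| are illustrative.
% Rows with missing or excluded residuals are skipped.
fittedVals = mdl2.Fitted;              % fitted response values
absRes = abs(mdl2.Residuals.Raw);      % residual magnitudes (NaN where not used in the fit)
rho = corr(fittedVals,absRes,'Type','Spearman','Rows','complete')
%%
% A clearly positive |rho| would agree with the pattern in the plot; a
% value near zero would suggest that the spread is roughly constant.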