%% Determine Outliers Using Cook's Distance
% This example shows how to use Cook's Distance to determine the outliers
% in the data.

% Copyright 2015 The MathWorks, Inc.

%%
% Load the sample data and define the independent and response variables.
load hospital
X = double(hospital(:,2:5));
y = hospital.BloodPressure(:,1);

%%
% Fit the linear regression model.
mdl = fitlm(X,y);

%%
% Plot the Cook's distance values.
plotDiagnostics(mdl,'cookd')

%%
% The dashed line in the figure corresponds to the recommended threshold
% value, |3*mean(mdl.Diagnostics.CooksDistance)|. The plot has some observations
% with Cook's distance values greater than the threshold value, which for
% this example is 3*(0.0108) = 0.0324. In particular, there are two Cook's
% distance values that are relatively higher than the others, which exceed
% the threshold value. You might want to find and omit these from your data
% and rebuild your model.

%%
% Find the observations with Cook's distance values that exceed the threshold
% value.
find((mdl.Diagnostics.CooksDistance)>3*mean(mdl.Diagnostics.CooksDistance))

%%
% Find the observations with Cook's distance values that are relatively
% larger than the other observations with Cook's distances exceeding the
% threshold value.
find((mdl.Diagnostics.CooksDistance)>5*mean(mdl.Diagnostics.CooksDistance))
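
%%
% The suggestion above to omit the flagged observations and rebuild the
% model can be sketched as follows. This is a minimal sketch and not part of
% the original example: the names |threshold|, |outlierIdx|, and
% |mdlReduced| are introduced here for illustration only, and the code
% assumes that |mdl|, |X|, and |y| from the preceding steps are still in the
% workspace. Whether excluding these observations is appropriate depends on
% the data; a large Cook's distance alone does not prove a point is invalid.
threshold = 3*mean(mdl.Diagnostics.CooksDistance);   % same rule as the dashed line
outlierIdx = find(mdl.Diagnostics.CooksDistance > threshold);

% Refit on the same data, leaving out the flagged rows through the
% 'Exclude' name-value argument of fitlm.
mdlReduced = fitlm(X,y,'Exclude',outlierIdx);

% Compare ordinary R-squared and RMSE before and after the exclusion.
disp([mdl.Rsquared.Ordinary, mdlReduced.Rsquared.Ordinary])
disp([mdl.RMSE, mdlReduced.RMSE])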