www.gusucode.com > stats 源码程序 matlab案例代码 > stats/GeneralizedLinearModelWorkflowExample.m
%% Generalized Linear Model Workflow % This example shows how to fit a generalized linear model and analyze the % results. A typical workflow involves the following: import data, fit a % generalized linear model, test its quality, modify it to improve the quality, % and make predictions based on the model. It computes the probability that % a flower is in one of two classes, based on the Fisher iris data. %% Step 1. Load the data. % Load the Fisher iris data. Extract the rows that have classification versicolor % or virginica. These are rows 51 to 150. Create logical response variables % that are |true| for versicolor flowers. load fisheriris X = meas(51:end,:); % versicolor and virginica y = strcmp('versicolor',species(51:end)); %% Step 2. Fit a generalized linear model. % Fit a binomial generalized linear model to the data. mdl = fitglm(X,y,'linear',... 'distr','binomial') %% Step 3. Examine the result, consider alternative models. % Some $p$-values in the |pValue| column are not very small. Perhaps the % model can be simplified. % % See if some 95% confidence intervals for the coefficients include 0. If % so, perhaps these model terms could be removed. confint = coefCI(mdl) %% % Only two of the predictors have coefficients whose confidence intervals % do not include 0. %% % The coefficients of |'x1'| and |'x2'| have the largest $p$-values. Test % whether both coefficients could be zero. M = [0 1 0 0 0 % picks out coefficient for column 1 0 0 1 0 0]; % picks out coefficient for column 2 p = coefTest(mdl,M) %% % The $p$-value of about 0.14 is not very small. Drop those terms from the % model. mdl1 = removeTerms(mdl,'x1 + x2') %% % Perhaps it would have been better to have |stepwiseglm| identify the model % initially. mdl2 = stepwiseglm(X,y,... 'constant','Distribution','binomial','upper','linear') %% % |stepwiseglm| included |'x2'| in the model, because it neither adds nor % removes terms with $p$-values between 0.05 and 0.10. %% Step 4. Look for outliers and exclude them. % Examine a leverage plot to look for influential outliers. plotDiagnostics(mdl2,'leverage') %% % There is one observation with a leverage close to one. Using the Data % Cursor, click the point, and find it has index 69. %% % See if the model coefficients change when you fit a model excluding this % point. oldCoeffs = mdl2.Coefficients.Estimate; mdl3 = fitglm(X,y,'linear',... 'distr','binomial','pred',2:4,'exclude',69); newCoeffs = mdl3.Coefficients.Estimate; disp([oldCoeffs newCoeffs]) %% % The model coefficients do not change, suggesting that the response at % the high-leverage point is consistent with the predicted value from the % reduced model. %% Step 5. Predict the probability that a new flower is versicolor. % Use |mdl2| to predict the probability that a flower with average measurements % is versicolor. Generate confidence intervals for your prediction. [newf,newc] = predict(mdl2,mean(X)) %% % The model gives almost a 50% probability that the average flower is versicolor, % with a wide confidence interval about this estimate.