%% Train Classification Ensemble
% Create a predictive classification ensemble using all available predictor
% variables in the data. Then, train another ensemble using fewer
% predictors. Compare the in-sample predictive accuracies of the two
% ensembles.

%%
% Load the |census1994| data set.
load census1994

%%
% Train an ensemble of classification models using the entire data set and
% default options.
Mdl1 = fitcensemble(adultdata,'salary')

%%
% |Mdl1| is a |ClassificationEnsemble| model. Some notable
% characteristics of |Mdl1| are:
%
% * Because two classes are represented in the data, LogitBoost is
% the ensemble-aggregation algorithm.
% * Because the ensemble-aggregation method is a boosting algorithm,
% classification trees that allow a maximum of 10 splits compose the
% ensemble.
% * One hundred trees compose the ensemble.
%

%%
% Use the classification ensemble to predict the labels of a random set of
% five observations from the data. Compare the predicted labels with their
% true values.
rng(1) % For reproducibility
[pX,pIdx] = datasample(adultdata,5);
label = predict(Mdl1,pX);
table(label,adultdata.salary(pIdx),'VariableNames',{'Predicted','Truth'})

%%
% Train a new ensemble using |age| and |education| only.
Mdl2 = fitcensemble(adultdata,'salary ~ age + education');

%%
% Compare the resubstitution losses of |Mdl1| and |Mdl2|.
rsLoss1 = resubLoss(Mdl1)
rsLoss2 = resubLoss(Mdl2)

%%
% The in-sample misclassification rate for the ensemble that uses all
% predictors (|Mdl1|) is lower than that of the ensemble that uses only
% |age| and |education| (|Mdl2|).
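
%%
% A minimal sketch, not part of the original example. Resubstitution loss
% is computed on the training data, so it can be optimistic. The lines
% below rest on two assumptions: (1) the default |LossFun| for a
% classification ensemble is |'classiferror'| with equal observation
% weights, so the loss is the fraction of misclassified training rows, and
% (2) |census1994| also provides a held-out table named |adulttest|.
predTrain = resubPredict(Mdl1);            % predicted labels for the training rows
manualLoss1 = mean(predTrain ~= Mdl1.Y)    % should agree with rsLoss1 if assumption (1) holds

%%
% If |adulttest| is available (assumption 2), the same comparison can be
% repeated on data that neither ensemble saw during training.
testLoss1 = loss(Mdl1,adulttest,'salary')  % held-out misclassification rate, all predictors
testLoss2 = loss(Mdl2,adulttest,'salary')  % held-out misclassification rate, age and education only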