% stats/EstimateGeneralizationErrorOfBoostingEnsemblesExample.m
%% Estimate Generalization Error of Boosting Ensemble
% Estimate the generalization error of an ensemble of boosted
% classification trees.
%%
% Load the |ionosphere| data set.
load ionosphere
%%
% Cross-validate an ensemble of classification trees using AdaBoostM1 and
% 10-fold cross-validation. Specify that each tree should be split a
% maximum of five times using a decision tree template.
rng(5); % For reproducibility
t = templateTree('MaxNumSplits',5);
Mdl = fitcensemble(X,Y,'Method','AdaBoostM1','Learners',t,'CrossVal','on');
%%
% |Mdl| is a |ClassificationPartitionedEnsemble| model.
%%
% Plot the cumulative, 10-fold cross-validated misclassification rate.
% Display the estimated generalization error of the ensemble.
kflc = kfoldLoss(Mdl,'Mode','cumulative');
figure;
plot(kflc);
ylabel('10-fold Misclassification rate');
xlabel('Learning cycle');
estGenError = kflc(end)
%%
% |kfoldLoss| returns the generalization error by default. However,
% plotting the cumulative loss lets you monitor how the loss changes
% as weak learners accumulate in the ensemble.
%%
% The ensemble achieves a misclassification rate of around 0.06 after
% accumulating about 50 weak learners. The misclassification rate then
% increases slightly as more weak learners enter the ensemble.
%%
% If you are satisfied with the generalization error of the ensemble,
% then, to create a predictive model, train the ensemble again using all
% of the same settings except cross-validation. However, it is good
% practice to tune hyperparameters, such as the maximum number of
% decision splits per tree and the number of learning cycles.
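%%
% As a sketch of that final step (assuming the same workspace variables
% |X|, |Y|, and template |t| from above), retraining without
% cross-validation might look like the following. The choice of 50
% learning cycles is illustrative only, motivated by where the
% cross-validated loss appeared to level off, and is not a tuned value.
%
%   % Sketch: retrain on all the data with the same settings, minus 'CrossVal'.
%   t = templateTree('MaxNumSplits',5);
%   FinalMdl = fitcensemble(X,Y,'Method','AdaBoostM1','Learners',t, ...
%       'NumLearningCycles',50);
%   label = predict(FinalMdl,X(1,:)); % classify one observation
%
% Because |FinalMdl| is trained on all of the data, assess it with the
% cross-validated estimate above rather than with its resubstitution loss.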