%% Estimate Generalization Error of Boosting Ensemble
% Estimate the generalization error of an ensemble of boosted
% regression trees.
%%
% Load the |carsmall| data set. Choose the number of cylinders, volume
% displaced by the cylinders, horsepower, and weight as predictors of fuel
% economy.
load carsmall
X = [Cylinders Displacement Horsepower Weight];
%%
% Cross-validate an ensemble of regression trees using 10-fold
% cross-validation. Using a decision tree template, specify that each tree
% should be split only once.
rng(1); % For reproducibility
t = templateTree('MaxNumSplits',1);
Mdl = fitrensemble(X,MPG,'Learners',t,'CrossVal','on');
%%
% |Mdl| is a |RegressionPartitionedEnsemble| model.
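%%
% As a quick check (not part of the original example), you can display the
% class of the cross-validated model.
class(Mdl)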
%%
% Plot the cumulative, 10-fold cross-validated, mean-squared error (MSE).
% Display the estimated generalization error of the ensemble.
kflc = kfoldLoss(Mdl,'Mode','cumulative');
figure;
plot(kflc);
ylabel('10-fold cross-validated MSE');
xlabel('Learning cycle');

estGenError = kflc(end)
%%
% |kfoldLoss| returns the generalization error by default.  However,
% plotting the cumulative loss allows you to monitor how the loss changes
% as weak learners accumulate in the ensemble.
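%%
% For example (an illustrative sketch; the variable name |scalarGenError| is
% not part of the original example), calling |kfoldLoss| without the
% |'Mode'| name-value pair returns the scalar cross-validated MSE, which
% matches the last element of the cumulative loss.
scalarGenError = kfoldLoss(Mdl)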
%%
% The ensemble achieves an MSE of around 23.5 after accumulating about
% 30 weak learners.
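%%
% As a sketch (the variable names |minMSE| and |bestCycle| are illustrative,
% not part of the original example), you can locate the smallest cumulative
% MSE and the learning cycle at which it occurs.
[minMSE,bestCycle] = min(kflc)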
%%
% If you are satisfied with the generalization error of the ensemble, then,
% to create a predictive model, train the ensemble again using all of the
% settings except the cross-validation setting. However, it is good practice
% to tune hyperparameters, such as the maximum number of decision splits per
% tree and the number of learning cycles.
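%%
% For example, the following sketch retrains the ensemble on all of the data
% using the same tree template and 30 learning cycles, a value suggested by
% the cumulative-loss plot. The variable name |finalMdl| and the choice of 30
% cycles are illustrative assumptions, not part of the original example.
finalMdl = fitrensemble(X,MPG,'Learners',t,'NumLearningCycles',30);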