%% Compare Classification Tree Predictor-Selection Algorithms
% At each node, |fitctree| chooses the best predictor to split using an
% exhaustive search by default. Alternatively, you can choose to split on the
% predictor that shows the most evidence of dependence with the response by
% conducting curvature tests. This example statistically compares
% classification trees grown via exhaustive search for the best splits and
% grown by conducting curvature tests with interaction.
%%
% Load the |census1994| data set.
load census1994.mat
rng(1) % For reproducibility
%%
% Grow a default classification tree using the training set, |adultdata|,
% which is a table. The response-variable name is |'salary'|.
C1 = fitctree(adultdata,'salary')
%%
% |C1| is a full |ClassificationTree| model. Its |ResponseName| property
% is |'salary'|. |C1| uses an exhaustive search to find the best predictor
% to split on based on maximal splitting gain.
%%
% Grow another classification tree using the same data set, but specify
% finding the best predictor to split on using the curvature test with
% interaction.
C2 = fitctree(adultdata,'salary','PredictorSelection','interaction-curvature')
%%
% |C2| also is a full |ClassificationTree| model with |ResponseName|
% equal to |'salary'|.
%%
% Conduct a 5-by-2 paired _F_ test to compare the accuracies of the two
% models using the training set. Because the response-variable names in
% the data sets and the |ResponseName| properties are all equal, and the
% response data in both sets are equal, you can omit supplying the response
% data.
h = testckfold(C1,C2,adultdata,adultdata)
%%
% |h = 0| indicates that the test fails to reject the null hypothesis that
% |C1| and |C2| have the same accuracies at the 5% significance level.
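%%
% As a further check (not part of the original example), you can request the
% p-value of the 5-by-2 paired _F_ test, which |testckfold| returns as its
% second output, and compare the in-sample misclassification rates of the two
% trees using |resubLoss|. A large p-value is consistent with |h = 0|; similar
% resubstitution losses suggest the two predictor-selection algorithms yield
% comparably accurate trees on the training data.
[h,p] = testckfold(C1,C2,adultdata,adultdata)
lossExhaustive = resubLoss(C1) % Resubstitution loss, exhaustive-search tree
lossCurvature = resubLoss(C2)  % Resubstitution loss, curvature-test tree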