%% Statistically Compare Performance of Two ECOC Classification Models
% One way to select predictors or features is to train two models, where
% one model uses a subset of the predictors used to train the other, and
% then statistically compare their predictive performances.  If there is
% sufficient evidence that the model trained on fewer predictors performs
% better than the model trained on more predictors, then you can proceed
% with the more efficient model.
%%
% Load Fisher's iris data set.  Plot all pairwise combinations of the predictors.

% Copyright 2015 The MathWorks, Inc.

load fisheriris
d = size(meas,2); % Number of predictors
pairs = combnk(1:d,2);

figure;
for j = 1:size(pairs,1)
    subplot(3,2,j);
    gscatter(meas(:,pairs(j,1)),meas(:,pairs(j,2)),species);
    xlabel(sprintf('meas(:,%d)',pairs(j,1)));
    ylabel(sprintf('meas(:,%d)',pairs(j,2)));
    legend off;
end
%%
% Based on the scatter plots, |meas(:,3)| and |meas(:,4)| appear to
% separate the groups well.
%%
% Create an ECOC template. Specify a one-versus-all coding design.
t = templateECOC('Coding','onevsall');
%%
% By default, the ECOC model uses linear SVM binary learners.  You can
% choose other supported algorithms by specifying them with the
% |'Learners'| name-value pair argument.
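%%
% As a minimal sketch of that option, the following creates a second
% template that uses decision-tree binary learners instead.  The variable
% name |tTree| is introduced here for illustration only; the rest of this
% example continues to use |t|.
tTree = templateECOC('Coding','onevsall','Learners','tree'); % Illustrative alternative template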
%%
% Test whether an ECOC model trained using only predictors 3 and 4
% performs at most as well as an ECOC model trained using all
% predictors. Rejecting this null hypothesis means that there is
% sufficient evidence that the ECOC model trained using predictors 3 and 4
% performs better than the ECOC model trained using all predictors. Let
% $C_1$ represent the classification error of the ECOC model trained using
% predictors 3 and 4, and let $C_2$ represent the classification error of
% the ECOC model trained using all predictors. Then the test is:
%
% $$\begin{array}{l}
% {H_0}:{C_1} \ge {C_2}\\
% {H_1}:{C_1} < {C_2}
% \end{array}$$
% 
% By default, |testckfold| conducts a 5-by-2 _k_-fold _F_ test, which is not
% appropriate as a one-tailed test.  Specify a 5-by-2 _k_-fold _t_ test
% instead.
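%
% As a rough sketch of what this test computes, following the usual
% formulation of the 5-by-2 cross-validated paired _t_ test (the symbols
% below are introduced here for illustration, and the sign convention of
% the difference is an assumption): let $\hat\delta_{rk}$ be the
% difference between the two models' classification errors on fold $k$ of
% replication $r$, let $\bar\delta_r = (\hat\delta_{r1} + \hat\delta_{r2})/2$,
% and let $s_r^2 = (\hat\delta_{r1} - \bar\delta_r)^2 + (\hat\delta_{r2} - \bar\delta_r)^2$.
% The test statistic is
%
% $$t = \frac{\hat\delta_{11}}{\sqrt{\frac{1}{5}\sum_{r=1}^{5} s_r^2}},$$
%
% which is compared against a _t_ distribution with 5 degrees of freedom.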
rng(1); % For reproducibility
% Index predictors 3 and 4 explicitly so that the call does not depend on
% the row order of |pairs|.
[h,pValue] = testckfold(t,t,meas(:,[3 4]),meas,species,...
    'Alternative','greater','Test','5x2t')
%%
% |h = 0| indicates that, at the default 5% significance level, there is
% not enough evidence to conclude that the model trained using predictors
% 3 and 4 is more accurate than the model trained using all predictors.
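%%
% To see the size of the difference, and not just the test decision, you
% can also request the classification losses for every replication and
% fold.  This is a minimal sketch that assumes the four-output form of
% |testckfold|; the names |e1|, |e2|, and |meanLoss| are introduced here
% for illustration.
rng(1); % Same seed as the test above
[~,~,e1,e2] = testckfold(t,t,meas(:,[3 4]),meas,species,...
    'Alternative','greater','Test','5x2t');
% Average loss over all runs and folds: [predictors 3 and 4, all predictors]
meanLoss = [mean(e1(:)) mean(e2(:))]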