%% Predict Resubstitution Labels of ECOC Models Using a Custom Binary Loss Function
%%
% Load Fisher's iris data set.

% Copyright 2015 The MathWorks, Inc.

load fisheriris
X = meas;
Y = categorical(species);
n = numel(Y);           % Sample size
classOrder = unique(Y); % Class order
K = numel(classOrder);  % Number of classes
%%
% Train an ECOC model using SVM binary classifiers. It is good practice to
% standardize the predictors and define the class order. Use an SVM
% template to specify that the predictors be standardized.
t = templateSVM('Standardize',true);
Mdl = fitcecoc(X,Y,'Learners',t,'ClassNames',classOrder);
%%
% |t| is an SVM template object. The software uses default values for empty
% options in |t| during training. |Mdl| is a |ClassificationECOC| model.
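%%
% As a quick check on the coding design matrix that a custom binary loss
% receives, the sketch below inspects the trained model. It assumes the
% default one-versus-one coding design (not stated in this example), which
% gives K(K-1)/2 binary learners. The names |codingM| and |numLearners| are
% introduced here for illustration only.
codingM = Mdl.CodingMatrix;                % K-by-L matrix with entries -1, 0, or 1
numLearners = numel(Mdl.BinaryLearners);   % Number of trained binary learners
disp(codingM)
fprintf('%d classes, %d binary learners\n',size(codingM,1),numLearners)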
%%
% SVM scores are signed distances from the observation to the decision
% boundary, so they range over $(-\infty,\infty)$.  Create a custom
% binary loss function that:
%
% * Maps the coding design matrix (_M_) and positive-class classification
% scores (_s_) for each learner to the binary loss for each observation 
% * Uses linear loss
% * Aggregates the binary learner loss using the median  
% 
% You can create a separate function for the binary loss function, and then
% save it on the MATLAB(R) path.  Or, you can specify an anonymous binary
% loss function.
customBL = @(M,s) median(1 - M.*s,2,'omitnan')/2;
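%%
% To see what the anonymous function computes, the following sketch applies
% |customBL| to a hand-made example. The matrix |Mdemo| (a one-versus-one
% design for three classes) and the score vector |sdemo| are made-up
% illustration values, not output from |Mdl|. For class _k_, the function
% returns half the median, over learners _j_, of $1 - m_{kj}s_j$.
Mdemo = [ 1  1  0
         -1  0  1
          0 -1 -1];                 % Rows are classes, columns are learners
sdemo = [2 1.5 -1];                 % Hypothetical positive-class scores, one per learner
lossDemo = customBL(Mdemo,sdemo)    % Expected: [-0.25; 1; 0.5]
[~,kMin] = min(lossDemo);           % Class with the smallest loss (here, class 1)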
%%
% Predict resubstitution labels and compute the negated median binary loss
% per class.  Display the negated losses per class for a random sample of
% 10 observations.
[label,NegLoss] = resubPredict(Mdl,'BinaryLoss',customBL);

rng(1); % For reproducibility
idx = randsample(n,10);
classOrder
table(Y(idx),label(idx),NegLoss(idx,:),'VariableNames',...
    {'TrueLabel','PredictedLabel','NegLoss'})
%%
% The columns of |NegLoss| correspond to the elements of |classOrder|.
% The software assigns each observation to the class with the maximum
% negated loss.  The results suggest that the median of the linear losses
% might not perform as well as other binary losses.
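%%
% One way to probe that impression is to compare resubstitution error
% rates. The sketch below is an illustrative check, not part of the
% original example: it compares the labels from the custom loss with those
% from the default binary loss that |resubPredict| selects, and measures how
% often the class with the largest negated loss matches the predicted label.
% The names |labelDefault|, |errCustom|, |errDefault|, and |agree| are
% introduced here. Resubstitution error is optimistic, so treat the numbers
% as a rough comparison only.
labelDefault = resubPredict(Mdl);               % Labels under the default binary loss
errCustom  = mean(label ~= Y);                  % Misclassification rate, custom loss
errDefault = mean(labelDefault ~= Y);           % Misclassification rate, default loss
[~,maxIdx] = max(NegLoss,[],2);                 % Column with the largest negated loss
agree = mean(Mdl.ClassNames(maxIdx) == label);  % Fraction of rows matching |label|
fprintf('Custom loss error: %.3f, default loss error: %.3f, agreement: %.3f\n',...
    errCustom,errDefault,agree)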