%% C Code Generation for Image Classifier
% This example shows how to generate C code from a MATLAB function that
% classifies images of digits using a trained classification model. This
% example demonstrates an alternative workflow to
% <docid:vision_examples.example-HOGDigitClassificationExample Digit
% Classification Using HOG Features>.  To support code generation in that
% example, you can follow the code generation steps in this example.
%%
% In addition to a Statistics and Machine Learning Toolbox(TM) license,
% this example requires a MATLAB(R) Coder(TM) license.
%%
% Automated image classification is a ubiquitous tool.  For example, a
% trained classifier can be deployed to a drone to identify anomalies on
% land, or to a machine that scans handwritten zip codes on letters.  In
% the latter example, after the machine finds the zip code and stores
% individual images of digits, the machine must guess which digits are in
% the images to reconstruct the zip code.
%%
% This example shows how to train and optimize a multiclass
% error-correcting output codes (ECOC) classification model to classify
% digits based on the pixel intensities in raster images. The ECOC model is
% composed of binary support vector machine (SVM) learners. Then, this
% example shows how to generate C code that uses the trained model to
% classify new images.  The data are synthetic images of warped digits in
% various fonts, which simulate handwritten digits.

%% Generating Code for Statistics and Machine Learning Toolbox(TM) Functions
% To generate C code, MATLAB Coder:
%
% * Requires a properly configured compiler
% * Requires supported functions to be in a MATLAB function that you
% declare. For the basic workflow, see <docid:stats_ug.bvbgk_a-1 Code
% Generation for Statistics and Machine Learning Toolbox(TM) Functions>.
% * Forbids objects as input arguments of the declared function.  
%
%%
% For the last limitation, you should note that:
%
% * Trained classification models are objects.
% * MATLAB Coder supports |predict| to classify observations using trained
% models, but does not support fitting the model.
%
%%
% To work around the code generation limitations for classification, train
% the classification model using MATLAB, then pass the resulting model
% object to <docid:stats_ug.bvclu99 |saveCompactModel|>. |saveCompactModel|
% reduces the memory footprint of the model (that is, makes it compact) if
% necessary, and then saves the trained model to disk as a structure array.
% Like the compact model, the structure array contains only the information
% required to classify new observations.
%%
% After saving the model to disk, load the model in the MATLAB function by
% using <docid:stats_ug.bvcl05n-1 |loadCompactModel|>. |loadCompactModel|
% loads the saved structure array, and then reconstructs the original
% compact model object.  In the MATLAB function, you can pass the model and
% predictor data set, which can be an input argument of the function, to
% |predict| to classify the observations.

%% Code Generation for Classification Workflow
% Before deploying an image classifier onto a device:
% 
% # Obtain a sufficient number of labeled images.
% # Decide which features to extract from the images.
% # Train and optimize a classification model. This step includes choosing
% an appropriate algorithm and tuning hyperparameters.
% # Save the model to disk by using |saveCompactModel|.
% # Declare a function for classifying new images.  The function must load
% the model by using |loadCompactModel|, and can return labels, among other
% things such as classification scores.
% # Set up your C compiler.
% # Decide in which environment to execute the generated code.
% # Generate C code for the function.
%

%% Load and Preprocess Data
% Load the |digitimages| data set from the |matlabroot/examples/stats|
% directory.
load digitimages
%%
% |images| is a 28-by-28-by-3000 array of |uint16| integers.  Each page is
% a raster image of a digit.  Each element is a pixel intensity.
% Corresponding labels are in |Y|.  For more details, enter |Description|
% at the command line.
%%
% Store the number of observations and number of predictor variables.
% Create a data partition that specifies to hold out 20% of the data.
% Extract training and test set indices from the data partition.
rng(1); % For reproducibility
n = size(images,3);
p = numel(images(:,:,1));
cvp = cvpartition(n,'Holdout',0.20);
idxTrn = training(cvp);
idxTest = test(cvp);
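%%
% As a quick sanity check, display the sizes of the two sets.  This sketch
% relies only on the |TrainSize| and |TestSize| properties of the
% partition object |cvp| and does not draw any random numbers.
fprintf('%d training and %d test observations\n',cvp.TrainSize,cvp.TestSize)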
%%
% Display nine random images from the data.
figure;
for j = 1:9
    subplot(3,3,j);
    selectImage = datasample(images,1,3);
    imshow(selectImage,[]);
end
%%
% Because raw pixel intensities vary widely, you should normalize their
% values before training a classification model.  Rescale the pixel
% intensities so that they lie in the interval [0,1].  That is, suppose
% $p_{ij}$ is pixel intensity $j$ within image $i$.  For image $i$, rescale
% all of its pixel intensities using this formula.
%
% $$\hat p_{ij} = \frac{p_{ij} - \min_j(p_{ij})}{\max_j(p_{ij}) - \min_j(p_{ij})}.$$
%
X = double(images);

for i = 1:n
    minX = min(min(X(:,:,i)));
    maxX = max(max(X(:,:,i)));
    X(:,:,i) = (X(:,:,i) - minX)/(maxX - minX);
end
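%%
% As a small worked instance of the rescaling formula, the following
% sketch applies it to a toy 2-by-2 "image".  The names |toyImage| and
% |toyScaled| and the intensity values are illustrative assumptions, not
% part of the data set.
toyImage = [10 20; 30 50];                    % Hypothetical pixel intensities
toyScaled = (toyImage - min(toyImage(:)))/...
    (max(toyImage(:)) - min(toyImage(:)))     % Minimum maps to 0, maximum to 1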
%%
% If you have an Image Processing Toolbox(TM) license, then, to efficiently
% rescale pixel intensities of images to [0,1], use
% <docid:images_ref.f6-116811 |mat2gray|>.
%%
% For code generation, the predictor data for training must be in a table
% of numeric variables or a numeric matrix.  
%%
% Reshape the data to a matrix such that predictor variables (pixel
% intensities) correspond to columns and images (observations) to rows.
% Because |reshape| takes elements columnwise, you must transpose its
% result.
X = reshape(X,[p,n])';
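%%
% To see why the transpose is needed, consider a toy stack of two 2-by-2
% images.  The names |toyStack| and |toyRows| are illustrative assumptions
% and do not appear elsewhere in this example.
toyStack = cat(3,[1 3; 2 4],[5 7; 6 8]);   % Two hypothetical 2-by-2 images
toyRows = reshape(toyStack,[4,2])'         % Row k holds image k, read columnwise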
%%
% To verify that preprocessing the data preserves the image, plot the
% first observation in |X|.
figure;
imshow(reshape(X(1,:),sqrt(p)*[1 1]),[],'InitialMagnification','fit')
%% Extract Features
% Computer Vision System Toolbox(TM) offers several feature-extraction
% techniques for images.  One such technique is the extraction of histogram
% of oriented gradient (HOG) features.  For an example that trains an ECOC
% model using HOG features, see
% <docid:vision_examples.example-HOGDigitClassificationExample Digit
% Classification Using HOG Features>.  For details on other supported
% techniques, see <docid:vision_ug.buk94l0-1 Local Feature Detection and
% Extraction>. This example proceeds using the rescaled pixel intensities
% as predictor variables.

%% Train and Optimize Classification Model
% Linear SVM models are often applied to image data sets for
% classification.  However, SVMs are binary classifiers, and there are 10
% possible classes in the data set.
%%
% You can create a multiclass model of multiple binary SVM learners using
% <docid:stats_ug.bue3oc9 |fitcecoc|>.  |fitcecoc| combines multiple binary
% learners using a coding design.  By default, |fitcecoc| applies the
% one-versus-one design, which specifies to train binary learners using
% observations from all combinations of pairs of classes. For example, in a
% problem with 10 classes, |fitcecoc| must train 45 binary SVM models.
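%%
% The count of 45 is the number of distinct pairs of classes.  As a small
% check (the variable names are illustrative assumptions):
numClasses = 10;
numOneVsOne = nchoosek(numClasses,2)   % 10*9/2 pairwise learners
numOneVsAll = numClasses               % One-versus-all uses one learner per class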
%%
% In general, when you train a classification model, you should tune the
% hyperparameters (parameters not fit during training) until you achieve a
% satisfactory generalization error.  That is, you should cross-validate
% models for particular sets of hyperparameters, and then compare the
% out-of-fold misclassification rates.  
%%
% You can choose your own sets of hyperparameter values, or you can specify
% to implement Bayesian optimization. (For general details on Bayesian
% optimization, see <docid:stats_ug.bvan2t_-1 Bayesian Optimization
% Workflow>.)  This example proceeds by cross-validation over a chosen
% grid of values.
%%
% Cross-validate an ECOC model of SVM binary learners on the training
% observations using 5-fold cross-validation. Although the values of the
% predictors have the same range, to avoid numerical difficulties during
% training, standardize the predictors. Also, optimize the ECOC coding
% design and the SVM box constraint.  To illustrate, use all combinations
% of these values:
%
% * For the ECOC coding design, use one-versus-one and one-versus-all.
% * For the SVM box constraint, use three logarithmically spaced values
% from 0.1 to 100.
%
% For all models, store the 5-fold cross-validated misclassification rates.
coding = {'onevsone' 'onevsall'};
boxconstraint = logspace(-1,2,3);
cvLoss = nan(numel(coding),numel(boxconstraint)); % For preallocation

for i = 1:numel(coding)
    for j = 1:numel(boxconstraint)
        t = templateSVM('BoxConstraint',boxconstraint(j),'Standardize',true);
        CVMdl = fitcecoc(X(idxTrn,:),Y(idxTrn),'Learners',t,'KFold',5,...
            'Coding',coding{i});
        cvLoss(i,j) = kfoldLoss(CVMdl);
        fprintf('cvLoss = %f for model using %s coding and box constraint=%f\n',...
            cvLoss(i,j),coding{i},boxconstraint(j))
    end
end
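%%
% For reference, the box constraint grid used in the loop can be listed
% explicitly.  This sketch only displays values; |toyGrid| and
% |numCandidates| are illustrative names, not part of the original
% example.
toyGrid = logspace(-1,2,3)                % 10.^[-1 0.5 2], so the middle value is sqrt(10)
numCandidates = 2*numel(toyGrid)          % Two coding designs times three box constraints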
%%
% Determine the hyperparameter indices that yield the minimal
% misclassification rate. Train an ECOC model using the training data.
% Standardize the training data and supply the observed, optimal
% hyperparameter combination.
minCVLoss = min(cvLoss(:))
linIdx = find(cvLoss == minCVLoss,1); % Take the first minimizer in case of ties
[bestI,bestJ] = ind2sub(size(cvLoss),linIdx);
bestCoding = coding{bestI}
bestBoxConstraint = boxconstraint(bestJ)

t = templateSVM('BoxConstraint',bestBoxConstraint,'Standardize',true);
Mdl = fitcecoc(X(idxTrn,:),Y(idxTrn),'Learners',t,'Coding',bestCoding);
%%
% Construct a confusion matrix for the test set images.
testImages = X(idxTest,:);
testLabels = predict(Mdl,testImages);
confusionMatrix = confusionmat(Y(idxTest),testLabels,'Order',Mdl.ClassNames)
%%
% Rows of |confusionMatrix| correspond to true labels, and columns
% correspond to predicted labels.  The order of the rows and columns
% corresponds to the order of the classes in |Mdl.ClassNames|.
% |confusionMatrix(i,j)| is the number of test set images that actually
% contain the digit |Mdl.ClassNames(i)| and that |Mdl| classified as the
% digit |Mdl.ClassNames(j)|. Therefore, diagonal elements indicate correct
% classification. |Mdl| seems to correctly classify most images.
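%%
% Because diagonal elements count correct classifications, overall accuracy
% is the sum of the diagonal divided by the sum of all elements.  The
% following sketch illustrates the arithmetic on a toy 2-class confusion
% matrix.  The counts and the names |toyConfusion| and |toyAccuracy| are
% hypothetical and are not results from this example.
toyConfusion = [8 2; 1 9];                                   % Hypothetical counts
toyAccuracy = sum(diag(toyConfusion))/sum(toyConfusion(:))   % (8 + 9)/20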
%%
% If you are satisfied with the performance of |Mdl|, then you can proceed
% to generate code for prediction. Otherwise, you can continue adjusting
% hyperparameters.  For example, you can try training the SVM learners
% using different kernel functions.

%% Save Classification Model to Disk
% |Mdl| is a predictive classification model, but you must prepare it for
% code generation.  Save |Mdl| to your present working directory using
% |saveCompactModel|.
saveCompactModel(Mdl,'DigitImagesECOC');
%%
% |saveCompactModel| compacts |Mdl|, converts it to a structure array, and
% saves it in the MAT-file |DigitImagesECOC.mat|.

%% Declare Prediction Function for Code Generation
% Declare the MATLAB function |predictDigitECOC.m|.  The function should:
%
% * Include the code generation directive |%#codegen| somewhere in the
% function.
% * Accept image data commensurate with |X|
% * Load |DigitImagesECOC.mat| using |loadCompactModel|
% * Return predicted labels
%
% <include> predictDigitECOC.m </include>
%
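%%
% The included file is not reproduced in this script.  A minimal sketch
% that satisfies the four requirements listed above might look like the
% following.  The variable names |CompactMdl| and |label| are assumptions,
% and the code is shown as a comment because a function definition cannot
% appear in the middle of a script.
%
%   function label = predictDigitECOC(X) %#codegen
%   % Classify the images stored in the rows of X
%   CompactMdl = loadCompactModel('DigitImagesECOC.mat'); % Rebuild the model
%   label = predict(CompactMdl,X);                        % Predicted labels
%   end
%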
%%
% Verify that the prediction function returns the same test set labels as
% |predict|.
pfLabels = predictDigitECOC(testImages);
verifyPF = sum(pfLabels == testLabels) == numel(testLabels)
%%
% The number of matching labels equals the test-set size, and so
% |predictDigitECOC| yields the expected results.
%% Set Up Your C Compiler 
% To generate C code, you must have access to a C compiler, and the
% compiler must be configured properly. For more details, see
% <docid:coder_gs.bsc70tk-2 Setting Up Your C Compiler>.
%%
% Select a C compiler using <docid:matlab_ref.btw17rw-1 |mex|>.
mex -setup

%% Decide Which Environment to Execute Generated Code
% Generated code can run:
%
% * Inside the MATLAB environment as a C-MEX file
% * Outside the MATLAB environment as a standalone executable 
% * Outside the MATLAB environment as a shared utility linked to another
% standalone executable
%
%%
% This example generates a MEX file to be run in the MATLAB environment.
% Generating such a MEX file allows you to analyze and verify the input and
% output arguments of the MEX function using MATLAB tools before deploying
% the function outside the MATLAB environment.  In the MEX function, you
% can include code for verification, but not for code generation, by
% declaring the commands as extrinsic using <docid:coder_ref.brdjlmz-1
% |coder.extrinsic|>. Extrinsic commands can include functions that do not
% have code generation support.  All extrinsic commands in the MEX function
% run in MATLAB, but |codegen| does not generate code for them.
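%%
% As a sketch of that idea, a prediction function could call a verification
% helper extrinsically.  The helper name |plotLabelHistogram| is
% hypothetical and is not part of this example, and the variable names are
% assumptions.  The code is shown as a comment only.
%
%   function label = predictDigitECOC(X) %#codegen
%   coder.extrinsic('plotLabelHistogram'); % Hypothetical helper, declared extrinsic
%   CompactMdl = loadCompactModel('DigitImagesECOC.mat');
%   label = predict(CompactMdl,X);
%   plotLabelHistogram(label);             % Runs in MATLAB when the MEX function executes
%   end
%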
%%
% If you plan to deploy the code outside the MATLAB environment, then you
% must generate a standalone executable.  One way to specify the build type
% is by using the |-config| option of |codegen|.  For example, to generate
% a standalone C executable, specify |-config:exe| when you call
% |codegen|. For more details on setting code generation options, see the
% |-config| option of <docid:coder_ref.br46oyi-1 |codegen|>.
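%%
% The |-config| option also accepts other build types.  For instance,
% assuming the same entry-point function and example input that the next
% section uses, a static library build might be requested as follows.  The
% command is shown as a comment so that this script does not run it.
%
%   codegen -config:lib predictDigitECOC -args testImages
%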

%% Compile MATLAB Function to MEX File
% Compile |predictDigitECOC.m| to a MEX file using |codegen|.  Specify
% these options:
%
% * |-report| -- Generates a compilation report that identifies the
% original MATLAB code and the associated files that |codegen| creates
% during code generation.
% * |-args| -- MATLAB Coder requires that you specify the properties of all
% the input arguments of the function. One way to do this is to provide
% |codegen| with an example of input values.  Consequently, MATLAB Coder
% infers the properties from the example values.  Specify the test set
% images commensurate with |X|.
%
codegen predictDigitECOC -report -args testImages
%%
% |codegen| successfully generated the code for the prediction function.
% You can view the report by clicking the link at the command line.  If
% code generation was unsuccessful, then the report can help you debug.
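%%
% Passing |testImages| as the example input fixes the input size of the
% generated function to the size of the test set.  As a sketch of an
% alternative, |coder.typeof| can describe a double matrix with a variable
% number of rows and |p| columns.  The name |varRowsType| is an
% illustrative assumption, and the corresponding |codegen| call is shown as
% a comment so that this script does not regenerate the MEX file.
varRowsType = coder.typeof(0,[Inf,p],[1,0]);   % Unbounded rows, exactly p columns
% codegen predictDigitECOC -report -args {varRowsType}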
%%
% |codegen| creates the directory |pwd/codegen/mex/predictDigitECOC|, where
% |pwd| is your present working directory.  In the child directory,
% |codegen| generates, among other things, the MEX-file
% |predictDigitECOC_mex.mexw64|.  The file extension is platform dependent;
% |.mexw64| corresponds to 64-bit Windows.
%%
% Verify that the MEX file returns the same labels as |predict|.
mexLabels = predictDigitECOC_mex(testImages);
verifyMEX = sum(mexLabels == testLabels) == numel(testLabels)
%%
% The number of matching labels equals the test-set size, and so the
% MEX-file yields the expected results.