%% Specify Prior Probabilities When Training Naive Bayes Classifiers
% Construct a naive Bayes classifier for Fisher's iris data set. Also, specify
% prior probabilities during training.
%%
% Load Fisher's iris data set.
load fisheriris
X = meas;
Y = species;
classNames = {'setosa','versicolor','virginica'}; % Class order 
%%
% |X| is a numeric matrix that contains four flower measurements (sepal
% length, sepal width, petal length, and petal width) for 150 irises.  |Y| is
% a cell array of character vectors that contains the corresponding iris
% species.
%%
% By default, the prior class probability distribution is the relative
% frequency distribution of the classes in the data set, which in this case
% is 33% for each species.  However, suppose you know that in the
% population 50% of the irises are setosa, 20% are versicolor, and 30% are
% virginica.  You can incorporate this information by specifying this
% distribution as a prior probability during training.
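%%
% As a quick check of the default described above, the following sketch
% computes the relative class frequencies directly from |Y|.  The variable
% names |grp|, |idx|, and |relFreq| are illustrative and are not used
% elsewhere in this example.
[grp,~,idx] = unique(Y);                  % Sorted class names and an index per observation
relFreq = accumarray(idx,1)'/numel(Y)     % Fraction of observations in each class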
%%
% Train a naive Bayes classifier.  Specify the class order and
% prior class probability distribution.
prior = [0.5 0.2 0.3];
Mdl = fitcnb(X,Y,'ClassNames',classNames,'Prior',prior)
%%
% |Mdl| is a trained |ClassificationNaiveBayes| classifier, and some of its
% properties appear in the Command Window.  The software treats the
% predictors as independent given a class, and, by default, fits them using
% normal distributions.
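%%
% To make the independence assumption concrete, the following sketch
% recomputes the posterior class probabilities for one observation by hand
% and compares them with the output of |predict|.  It assumes that every
% predictor was fit with a normal distribution (the default), so that each
% cell of |Mdl.DistributionParameters| holds a mean and a standard
% deviation.  The observation index 51 and the variable names are
% arbitrary choices made for illustration.
x = X(51,:);                              % One observation (a versicolor iris)
K = numel(Mdl.ClassNames);                % Number of classes
D = size(X,2);                            % Number of predictors
logJoint = log(Mdl.Prior);                % Start from the log prior of each class
for k = 1:K
    for d = 1:D
        p = Mdl.DistributionParameters{k,d};   % [mean; standard deviation]
        logJoint(k) = logJoint(k) + log(normpdf(x(d),p(1),p(2)));
    end
end
manualPosterior = exp(logJoint - max(logJoint));            % Subtract the maximum for numerical stability
manualPosterior = manualPosterior/sum(manualPosterior);     % Normalize to sum to 1
[~,modelPosterior] = predict(Mdl,x);
maxAbsDiff = max(abs(manualPosterior - modelPosterior))     % Expected to be close to zero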
%%
% The naive Bayes algorithm does not use the prior class probabilities
% when it estimates the class-conditional predictor distributions during
% training; the prior enters only when the model computes posterior
% probabilities.  Therefore, you can change the prior class probabilities
% after training using dot notation.  For example, suppose that you want to
% compare the performance of a model that uses the default prior class
% probabilities with that of a model that uses |prior|.
%%
% Create a new naive Bayes model based on |Mdl|, and set its prior class
% probability distribution to the empirical class distribution.
defaultPriorMdl = Mdl;
FreqDist = cell2table(tabulate(Y));
defaultPriorMdl.Prior = FreqDist{:,3};
%%
% |tabulate| returns the class frequencies as percentages.  The software
% normalizes the prior class probabilities so that they sum to |1|.
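%%
% The following sketch checks two points made above: that the stored prior
% sums to |1| after the assignment, and that changing the prior leaves the
% fitted predictor distributions untouched.  The names |priorSum| and
% |sameParams| are illustrative.
priorSum = sum(defaultPriorMdl.Prior)     % Stored prior after normalization
sameParams = isequal(Mdl.DistributionParameters, ...
    defaultPriorMdl.DistributionParameters)   % True if only the prior differs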
%%
% Estimate the cross-validation error for both models using 10-fold
% cross-validation.
rng(1); % For reproducibility
defaultCVMdl = crossval(defaultPriorMdl);
defaultLoss = kfoldLoss(defaultCVMdl)
CVMdl = crossval(Mdl);
Loss = kfoldLoss(CVMdl)
%%
% |Mdl| performs better than |defaultPriorMdl|: its estimated
% cross-validation loss is lower.
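%%
% To see where the two priors actually lead to different decisions, the
% following sketch counts the training observations that the two models
% label differently.  This is a resubstitution comparison, so it only
% illustrates the effect of the prior and is not an estimate of
% out-of-sample performance.  The variable names are illustrative.
labelsPrior = predict(Mdl,X);                 % Labels under the specified prior
labelsDefault = predict(defaultPriorMdl,X);   % Labels under the empirical prior
numChanged = sum(~strcmp(labelsPrior,labelsDefault))   % Observations whose label differs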