
%% Fitting Custom Univariate Distributions
% This example shows how to use the Statistics and Machine Learning Toolbox(TM)
% function |mle| to fit custom distributions to univariate data.
%
% Using |mle|, you can compute maximum likelihood parameter estimates, 
% and estimate their precision, for many kinds of distributions beyond 
% those for which the Toolbox provides specific fitting functions.
%
% To do this, you need to define the distribution using one or more
% MATLAB functions.  In the simplest cases, you can write code to compute the
% probability density function (PDF) for the distribution that you want to
% fit, and |mle| will do most of the remaining work for you.  This example
% covers those cases.  In problems with censored data, you must also write
% code to compute the cumulative distribution function (CDF) or the
% survival function (SF).  In some other problems, it may be advantageous
% to define the log-likelihood function (LLF) itself.  The second part of
% this example, <docid:stats_examples.example-ex99562812>, covers both of those latter cases.

%   Copyright 2004-2014 The MathWorks, Inc.



%% Fitting Custom Distributions:  A Zero-Truncated Poisson Example
% Count data are often modeled using a Poisson distribution, and you can
% use the Statistics and Machine Learning Toolbox function |poissfit| to
% fit a Poisson model.  However, in some situations, counts that are zero
% do not get recorded in the data, and so fitting a Poisson distribution is
% not straightforward because of those "missing zeros".  This example will
% show how to fit a Poisson distribution to zero-truncated data, using the
% function |mle|.

%%
% For this example, we'll use simulated data from a zero-truncated Poisson
% distribution. First, we generate some random Poisson data.
rng(18,'twister');
n = 75;
lambda = 1.75;
x = poissrnd(lambda,n,1);

%%
% Next, we remove all the zeros from the data to simulate the truncation.
x = x(x > 0);
length(x)

%%
% Here's a histogram of these simulated data.  Notice that the data look
% reasonably like a Poisson distribution, except that there are no zeros.
% We will fit them with a distribution that is identical to a Poisson on
% the positive integers, but that has no probability at zero.  In this
% way, we can estimate the Poisson parameter lambda while accounting for
% the "missing zeros".
hist(x,0:max(x)+1);

%%
% The first step is to define the zero-truncated Poisson distribution by
% its probability function (PF).  We will create a function to compute the
% probability for each point in x, given a value for the Poisson
% distribution's mean parameter lambda.  The PF for a zero-truncated
% Poisson is just the usual Poisson PF, renormalized so that it sums to
% one.  With zero truncation, the renormalization is just 1-Pr{0}.  The
% easiest way to create a function for the PF is to use an anonymous
% function.
pf_truncpoiss = @(x,lambda) poisspdf(x,lambda) ./ (1-poisscdf(0,lambda));
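
%%
% As a quick sanity check, the truncated PF should sum very nearly to one
% over the positive integers.  Summing out to 100 is far enough that the
% neglected tail is negligible for this lambda.
sum(pf_truncpoiss(1:100,lambda))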

%%
% For simplicity, we have assumed that all the x values given to this
% function will be positive integers, with no checks.  Error checking, or a
% more complicated distribution, would probably take more than a single line
% of code, suggesting that the function should be defined in a separate file.
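%
% For example, a file-based version with basic input checking might look
% something like the following sketch (the function name
% |pf_truncpoiss_checked| is hypothetical, not a toolbox function):
%
%    function y = pf_truncpoiss_checked(x,lambda)
%    if any(x < 1 | x ~= round(x))
%        error('All data values must be positive integers.');
%    end
%    y = poisspdf(x,lambda) ./ (1 - poisscdf(0,lambda));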

%%
% The next step is to provide a reasonable rough first guess for the
% parameter lambda.  In this case, we'll just use the sample mean.
start = mean(x)

%%
% We provide |mle| with the data, and with the anonymous function, using
% the 'pdf' parameter. (The Poisson is discrete, so this is really a
% probability function, not a PDF.)  Because the mean parameter of the
% Poisson distribution must be positive, we also specify a lower bound for
% lambda.  |mle| returns the maximum likelihood estimate of lambda, and,
% optionally, approximate 95% confidence intervals for the parameters.
[lambdaHat,lambdaCI] = mle(x, 'pdf',pf_truncpoiss, 'start',start, 'lower',0)

%%
% Notice that the parameter estimate is smaller than the sample mean.
% That's just as it should be, because the maximum likelihood estimate
% accounts for the zeros that are missing from the data.

%%
% We can also compute a standard error estimate for lambda, using the
% large-sample variance approximation returned by |mlecov|.
avar = mlecov(lambdaHat, x, 'pdf',pf_truncpoiss);
stderr = sqrt(avar)
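
%%
% As a rough cross-check, a Wald-type 95% interval built from this standard
% error should be close to the confidence interval returned by |mle| above,
% since both rest on the same large-sample normal approximation.
[lambdaHat - 1.96*stderr, lambdaHat + 1.96*stderr]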


%% Supplying Additional Values to the Distribution Function: A Truncated Normal Example
% It sometimes also happens that continuous data are truncated.  For
% example, observations larger than some fixed value might not be recorded
% because of limitations in the way the data are collected.  This example
% will show how to fit a normal distribution to truncated data, using the
% function |mle|.

%%
% For this example, we simulate data from a truncated normal distribution.
% First, we generate some random normal data.
n = 75;
mu = 1;
sigma = 3;
x = normrnd(mu,sigma,n,1);

%%
% Next, we remove any observations that fall beyond the truncation
% point, xTrunc.  Throughout this example, we'll assume that xTrunc is
% known, and does not need to be estimated.
xTrunc = 4;
x = x(x < xTrunc);
length(x)

%%
% Here's a histogram of these simulated data.  We will fit them with a
% distribution that is identical to a normal for x < xTrunc, but that has
% zero probability above xTrunc.  In this way, we can estimate the normal
% parameters mu and sigma while accounting for the "missing tail".
hist(x,(-10:.5:4));

%%
% As in the previous example, we will define the truncated normal
% distribution by its PDF, and create a function to compute the probability
% density for each point in x, given values for the parameters mu and
% sigma.  With the truncation point fixed and known, the PDF for a
% truncated normal is just the usual normal PDF, truncated, and then
% renormalized so that it integrates to one.  The renormalization is just
% the CDF evaluated at xTrunc.  For simplicity, we'll assume that all x
% values are less than xTrunc, without checking.  We'll use an anonymous
% function to define the PDF.
pdf_truncnorm = @(x,mu,sigma) normpdf(x,mu,sigma) ./ normcdf(xTrunc,mu,sigma);

%%
% The truncation point, xTrunc, is not being estimated, and so it is not
% among the distribution parameters in the PDF function's input argument list.
% xTrunc is also not part of the data vector input argument.  With an
% anonymous function, we can simply refer to the variable xTrunc that has
% already been defined in the workspace, and there is no need to worry
% about passing it in as an additional argument.
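
%%
% Note that an anonymous function captures the value of xTrunc at the
% moment the function is created.  Reassigning xTrunc in the workspace
% afterwards does not change the density that pdf_truncnorm computes; to
% change the truncation point, we would recreate the anonymous function.
% A minimal illustration:
xTruncSaved = xTrunc;
xTrunc = 100;                % reassigning the workspace variable ...
pdf_truncnorm(0,0,1)         % ... does not affect the captured value of 4
xTrunc = xTruncSaved;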

%%
% We also need to provide a rough starting guess for the parameter
% estimates.  In this case, because the truncation is not too extreme, the
% sample mean and standard deviation will probably work well.
start = [mean(x),std(x)]

%%
% We provide |mle| with the data, and with the anonymous function, using the
% 'pdf' parameter.  Because sigma must be positive, we also specify lower
% parameter bounds.  |mle| returns the maximum likelihood estimates of mu and
% sigma as a single vector, as well as a matrix of approximate 95%
% confidence intervals for the two parameters.
[paramEsts,paramCIs] = mle(x, 'pdf',pdf_truncnorm, 'start',start, 'lower',[-Inf 0])

%%
% Notice that the estimates of mu and sigma are quite a bit larger than the
% sample mean and standard deviation.  This is because the model fit has
% accounted for the "missing" upper tail of the distribution.
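
%%
% A direct comparison makes the point: the naive sample statistics are
% biased downward by the truncation.
[paramEsts; mean(x) std(x)]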

%%
% We can compute an approximate covariance matrix for the parameter
% estimates using |mlecov|.  The approximation is usually reasonable in large
% samples, and the standard errors of the estimates can be approximated by
% the square roots of the diagonal elements.
acov = mlecov(paramEsts, x, 'pdf',pdf_truncnorm)
stderr = sqrt(diag(acov))
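
%%
% Dividing by the standard errors converts the covariance matrix into a
% correlation matrix, showing how strongly the estimation errors for mu and
% sigma are correlated.
acov ./ (stderr*stderr')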


%% Fitting a More Complicated Distribution: A Mixture of Two Normals
% Some datasets exhibit bimodality, or even multimodality, and fitting a
% standard distribution to such data is usually not appropriate. However, a
% mixture of simple unimodal distributions can often model such data very
% well.  In fact, it may even be possible to give an interpretation to the
% source of each component in the mixture, based on application-specific
% knowledge.
%
% In this example, we will fit a mixture of two normal distributions to
% some simulated data.  This mixture might be described with the following
% constructive definition for generating a random value:
%
%    First, flip a biased coin.  If it lands heads, pick a value at random
%    from a normal distribution with mean mu_1 and standard deviation
%    sigma_1. If the coin lands tails, pick a value at random from a normal
%    distribution with mean mu_2 and standard deviation sigma_2.

%%
% For this example, we'll generate data from a mixture of Student's t
% distributions rather than using the same model as we are fitting.  This
% is the sort of thing you might do in a Monte Carlo simulation to test how
% robust a fitting method is to departures from the assumptions of the
% model being fit.  Here, however, we'll fit just one simulated data set.
x = [trnd(20,1,50) trnd(4,1,100)+3];
hist(x,-2.25:.5:7.25);

%%
% As in the previous examples, we'll define the model to fit by creating a
% function that computes the probability density.  The PDF for a mixture of
% two normals is just a weighted sum of the PDFs of the two normal
% components, weighted by the mixture probability.  This PDF is simple
% enough to create using an anonymous function.  The function takes six
% inputs:  a vector of data at which to evaluate the PDF, and the
% distribution's five parameters.  Each component has parameters for its
% mean and standard deviation; the mixture probability makes a total of five.
pdf_normmixture = @(x,p,mu1,mu2,sigma1,sigma2) ...
                         p*normpdf(x,mu1,sigma1) + (1-p)*normpdf(x,mu2,sigma2);
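
%%
% As a quick check that this defines a valid density, we can verify
% numerically that it integrates to one for an arbitrary illustrative set
% of parameter values.
integral(@(x) pdf_normmixture(x,.4,0,3,1,2), -Inf, Inf)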

%%
% We'll also need an initial guess for the parameters.  The more parameters
% a model has, the more a reasonable starting point matters.  For this
% example, we'll start with an equal mixture (p = 0.5) of normals, centered
% at the two quartiles of the data, with equal standard deviations.  The
% starting value for the standard deviation comes from the variance of a
% mixture whose components have equal standard deviations,
%
%    var(x) = sigma^2 + p.*(1-p).*(mu_2 - mu_1).^2,
%
% solved for sigma with p = 0.5.
pStart = .5;
muStart = quantile(x,[.25 .75])
sigmaStart = sqrt(var(x) - .25*diff(muStart).^2)
start = [pStart muStart sigmaStart sigmaStart];

%%
% Finally, we need to specify bounds of zero and one for the mixing
% probability, and lower bounds of zero for the standard deviations.  The
% remaining elements of the bounds vectors are set to +Inf or -Inf, to
% indicate no restrictions.
lb = [0 -Inf -Inf 0 0];
ub = [1 Inf Inf Inf Inf];
paramEsts = mle(x, 'pdf',pdf_normmixture, 'start',start, ...
                          'lower',lb, 'upper',ub)
%%
% By default, |mle| allows at most 200 iterations when fitting custom
% distributions, as the |statset| defaults show.
statset('mlecustom')

%%
% We can override that default using an options structure created with the
% |statset| function, and also increase the function evaluation limit.
options = statset('MaxIter',300, 'MaxFunEvals',600);
paramEsts = mle(x, 'pdf',pdf_normmixture, 'start',start, ...
                          'lower',lb, 'upper',ub, 'options',options)

%%
% It appears that the final iterations to convergence mattered only in the
% last few digits of the result.  Nonetheless, it is always a good idea to
% make sure that convergence has been reached.
%
% Finally, we can plot the fitted density against a probability histogram
% of the raw data, to check the fit visually.
bins = -2.5:.5:7.5;
h = bar(bins,histc(x,bins)/(length(x)*.5),'histc');
h.FaceColor = [.9 .9 .9];
xgrid = linspace(1.1*min(x),1.1*max(x),200);
pdfgrid = pdf_normmixture(xgrid,paramEsts(1),paramEsts(2),paramEsts(3),paramEsts(4),paramEsts(5));
hold on
plot(xgrid,pdfgrid,'-')
hold off
xlabel('x')
ylabel('Probability Density')


%% Using Nested Functions: A Normal Example with Unequal Precisions
% It sometimes happens when data are collected that each observation was
% made with a different precision or reliability.  For example, if several
% experimenters each make a number of independent measurements of the same
% quantity, but each reports only the average of their measurements, the
% reliability of each reported data point will depend on the number of raw
% observations that went into it.  If the original raw data are not
% available, an estimate of their distribution must account for the fact
% that the data that are available, the averages, each have different
% variances.  This model actually has an explicit solution for maximum
% likelihood parameter estimates.  However, for the purposes of
% illustration, we will use |mle| to estimate the parameters.

%%
% Assume that we have 10 data points, where each one is actually the
% average of anywhere from 1 to 8 observations.  Those original
% observations are not available, but we know how many there were for each
% of our data points.  We need to estimate the mean and standard deviation
% of the raw data.
x = [0.25 -1.24 1.38 1.39 -1.43 2.79 3.52 0.92 1.44 1.26];
m = [   8     2    1    3     8    4    2    5    2    4];

%%
% The variance of each data point is inversely proportional to the number
% of observations that went into it, so we will use 1/m to weight the
% variance of each data point in a maximum likelihood fit.
w = 1 ./ m

%%
% In the model we're fitting here, we could define the distribution by its
% PDF, but using a log PDF is somewhat more natural, because the normal PDF
% is of the form
%
%    c .* exp(-0.5 .* z.^2),
%
% and |mle| would have to take the log of the PDF anyway, to compute the
% log-likelihood.  So instead, we will create a function that computes the
% log PDF directly.

%%
% The log PDF function has to compute the log of the probability density
% for each point in x, given values for mu and sigma.  It will also need to
% account for the different variance weights.  Unlike the previous
% examples, this distribution function is a little more complicated than a
% one-liner, and is most clearly written as a separate function in its own
% file.  Because the log PDF function needs the observation counts as
% additional data, the most straightforward way to accomplish this fit is
% to use nested functions.
%
% We've created a separate file for a function called
% <matlab:edit('wgtnormfit.m') |wgtnormfit.m|>.  This function contains
% data initialization, a nested function for the log PDF in the weighted
% normal model, and a call to the |mle| function to actually fit the model.
% Because sigma must be positive, we must specify lower parameter bounds.
% The call to |mle| returns the maximum likelihood estimates of mu and
% sigma in a single vector.
type wgtnormfit.m
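
%%
% In case |wgtnormfit.m| is not on your path, the following sketch is
% consistent with the description above, though the shipped file may differ
% in its details:
%
%    function paramEsts = wgtnormfit
%    x = [0.25 -1.24 1.38 1.39 -1.43 2.79 3.52 0.92 1.44 1.26];
%    m = [   8     2    1    3     8    4    2    5    2    4];
%    paramEsts = mle(x, 'logpdf',@logpdf_wn, 'start',[mean(x),std(x)], ...
%                    'lower',[-Inf 0]);
%        function logy = logpdf_wn(x,mu,sigma)
%        % Each x(i) averages m(i) raw values, so its variance is
%        % sigma^2/m(i); the nested function sees m from its parent.
%        v = sigma.^2 ./ m;
%        logy = -(x-mu).^2 ./ (2.*v) - .5.*log(2.*pi.*v);
%        end
%    end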

%%
% In |wgtnormfit.m|, we pass |mle| a handle to the nested function |logpdf_wn|,
% using the 'logpdf' parameter.  That nested function refers to the observation
% counts, m, in the computation of the weighted log PDF.  Because the vector
% m is defined in its parent function, |logpdf_wn| has access to it, and there
% is no need to worry about passing m in as an explicit input argument.

%%
% We need to provide a rough first guess for the parameter estimates.  In
% this case, the unweighted sample mean and standard deviation should be
% adequate, and that's what |wgtnormfit.m| uses.
start = [mean(x),std(x)]

%%
% To fit the model, we run the fitting function.
paramEsts = wgtnormfit

%%
% Notice that the estimate of mu is less than two-thirds that of the sample
% mean.  That's just as it should be:  the estimate is influenced most
% by the most reliable data points, i.e., the ones that were based on the
% largest number of raw observations.  In this dataset, those points tend
% to pull the estimate down from the unweighted sample mean.


%% Using a Parameter Transformation: The Normal Example (continued)
% In maximum likelihood estimation, confidence intervals for the parameters
% are usually computed using a large-sample normal approximation for the
% distribution of the estimators.  This is often a reasonable assumption,
% but with small sample sizes, it is sometimes advantageous to improve that
% normal approximation by transforming one or more parameters.  In this
% example, we have a location parameter and a scale parameter.  Scale
% parameters are often transformed to their log, and we will do that
% here with sigma.  First, we'll create a new log PDF function, and
% then recompute the estimates using that parameterization.
%
% The new log PDF function is created as a nested function within the
% function <matlab:edit('wgtnormfit2.m') |wgtnormfit2.m|>.  As in the
% first fit, this file contains data initialization, a nested function
% for the log PDF in the weighted normal model, and a call to the |mle|
% function to actually fit the model.  Because sigma can be any positive
% value, log(sigma) is unbounded, and we no longer need to specify lower or
% upper bounds.  Also, the call to |mle| in this case returns both the
% parameter estimates and confidence intervals.
type wgtnormfit2.m
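
%%
% Again, if |wgtnormfit2.m| is not on your path, a sketch consistent with
% the description (parameterized by log(sigma), so no bounds are needed)
% might look like:
%
%    function [paramEsts,paramCIs] = wgtnormfit2
%    x = [0.25 -1.24 1.38 1.39 -1.43 2.79 3.52 0.92 1.44 1.26];
%    m = [   8     2    1    3     8    4    2    5    2    4];
%    [paramEsts,paramCIs] = mle(x, 'logpdf',@logpdf_wn2, ...
%                               'start',[mean(x),log(std(x))]);
%        function logy = logpdf_wn2(x,mu,logsigma)
%        v = exp(2.*logsigma) ./ m;  % sigma^2/m, with sigma = exp(logsigma)
%        logy = -(x-mu).^2 ./ (2.*v) - .5.*log(2.*pi.*v);
%        end
%    end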

%%
% Notice that |wgtnormfit2.m| uses the same starting point, transformed to
% the new parameterization, i.e., the log of the sample standard deviation.
start = [mean(x),log(std(x))]

%%
[paramEsts,paramCIs] = wgtnormfit2

%%
% Since the parameterization uses log(sigma), we have to transform back to
% the original scale to get an estimate and confidence interval for sigma.
% Notice that the estimates for both mu and sigma are the same as in the
% first fit, because maximum likelihood estimates are invariant to
% parameterization.
muHat = paramEsts(1)
sigmaHat = exp(paramEsts(2))
%%
muCI = paramCIs(:,1)
sigmaCI = exp(paramCIs(:,2))