%% Bootstrap Resampling
%
% Copyright 2015 The MathWorks, Inc.

%%
% The bootstrap procedure involves choosing random samples with replacement
% from a data set and analyzing each sample the same way. Sampling with
% replacement means that each observation is selected separately at random
% from the original data set, so a particular data point from the original
% data set could appear multiple times in a given bootstrap sample. The
% number of elements in each bootstrap sample equals the number of elements
% in the original data set. The range of sample estimates you obtain
% enables you to establish the uncertainty of the quantity you are
% estimating.

%%
% This example from Efron and Tibshirani compares Law School Admission Test
% (LSAT) scores and subsequent law school grade point average (GPA) for a
% sample of 15 law schools.
load lawdata
plot(lsat,gpa,'+')
lsline

%%
% The least-squares fit line indicates that higher LSAT scores go with
% higher law school GPAs. But how certain is this conclusion? The plot
% provides some intuition, but nothing quantitative.

%%
% You can calculate the correlation coefficient of the variables using the
% |corr| function.
rhohat = corr(lsat,gpa)

%%
% Now you have a number describing the positive connection between LSAT and
% GPA. Though it may seem large, you still do not know whether it is
% statistically significant.

%%
% Using the |bootstrp| function, you can resample the |lsat| and |gpa|
% vectors as many times as you like and consider the variation in the
% resulting correlation coefficients.
rng default  % For reproducibility
rhos1000 = bootstrp(1000,'corr',lsat,gpa);

%%
% This resamples the |lsat| and |gpa| vectors 1000 times and computes the
% |corr| function on each sample. You can then plot the result in a
% histogram.
histogram(rhos1000,30,'FaceColor',[.8 .8 1])

%%
% Nearly all the estimates lie on the interval [0.4 1.0].
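
%%
% To make the resampling step concrete, here is a minimal sketch of roughly
% what the |bootstrp| call above does, written as an explicit loop. It is
% an illustration only, not the function's actual implementation, and its
% values need not match |rhos1000| exactly. The names |n|, |nBoot|, |idx|,
% |rhosManual|, |seBoot|, and |savedState| are introduced here just for
% this sketch. Row indices are drawn with replacement so that each
% (LSAT, GPA) pair stays together.
load lawdata
savedState = rng;         % Remember the generator state
rng default               % For reproducibility of this sketch
n = numel(lsat);          % Number of observations (15 law schools)
nBoot = 1000;             % Number of bootstrap samples (assumed value)
rhosManual = zeros(nBoot,1);
for b = 1:nBoot
    idx = randi(n,n,1);                        % n row indices, with replacement
    rhosManual(b) = corr(lsat(idx),gpa(idx));  % Statistic on the resampled pairs
end
seBoot = std(rhosManual)  % Bootstrap estimate of the standard error
rng(savedState)           % Restore the state so later sections are unaffected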

%%
% It is often desirable to construct a confidence interval for a parameter
% estimate in statistical inference. Using the |bootci| function, you can
% use bootstrapping to obtain a confidence interval for the correlation
% between the |lsat| and |gpa| data.
ci = bootci(5000,@corr,lsat,gpa)

%%
% Therefore, a 95% confidence interval for the correlation coefficient
% between LSAT and GPA is [0.33 0.94]. This is strong quantitative evidence
% that LSAT and subsequent GPA are positively correlated. Moreover, this
% evidence does not require any strong assumptions about the probability
% distribution of the correlation coefficient.

%%
% Although the |bootci| function computes the bias-corrected and
% accelerated (BCa) interval as the default type, it can also compute
% various other types of bootstrap confidence intervals, such as the
% studentized bootstrap confidence interval.
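
%%
% As a sketch of how another interval type might be requested, the calls
% below use the |'Type'| name-value argument of |bootci|. When name-value
% arguments are given, the function handle and the data are passed together
% in a cell array. The names |nBootCI|, |ciPer|, and |ciStud| and the
% resample count of 2000 are assumptions made for this illustration. The
% studentized interval runs an inner bootstrap on each resample to estimate
% the standard error, so it takes noticeably longer than the other types.
load lawdata
rng default  % For reproducibility
nBootCI = 2000;  % Assumed resample count, kept modest for run time
ciPer  = bootci(nBootCI,{@corr,lsat,gpa},'Type','per')      % Percentile interval
ciStud = bootci(nBootCI,{@corr,lsat,gpa},'Type','student')  % Studentized interval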