% FindGoodLassoPenaltyUsingEdgeExample.m
%% Find Good Lasso Penalty Using Edge
% To determine a good lasso-penalty strength for a linear classification
% model that uses a logistic regression learner, compare test-sample edges.

%%
% Load the NLP data set. Preprocess the data as in
% <docid:stats_ug.bu6xpem-1>.

load nlpdata
Ystats = Y == 'stats';
X = X';

Partition = cvpartition(Ystats,'Holdout',0.30);
testIdx = test(Partition);
XTest = X(:,testIdx);
YTest = Ystats(testIdx);
%%
% Create a set of 11 logarithmically spaced regularization strengths from
% $10^{-8}$ through $10^{1}$.

Lambda = logspace(-8,1,11);
%%
% Train binary, linear classification models that use each of the
% regularization strengths. Solve the objective function using SpaRSA.
% Lower the tolerance on the gradient of the objective function to |1e-8|.

rng(10); % For reproducibility
CVMdl = fitclinear(X,Ystats,'ObservationsIn','columns',...
    'CVPartition',Partition,'Learner','logistic','Solver','sparsa',...
    'Regularization','lasso','Lambda',Lambda,'GradientTolerance',1e-8)
%%
% Extract the trained linear classification model.

Mdl = CVMdl.Trained{1}
%%
% |Mdl| is a |ClassificationLinear| model object. Because |Lambda| is a
% sequence of regularization strengths, you can think of |Mdl| as 11
% models, one for each regularization strength in |Lambda|.
%%
% Estimate the test-sample edges.

e = edge(Mdl,X(:,testIdx),Ystats(testIdx),'ObservationsIn','columns')
%%
% Because there are 11 regularization strengths, |e| is a 1-by-11 vector of
% edges.
%%
% Plot the test-sample edges for each regularization strength. Identify
% the regularization strength that maximizes the edges over the grid.

figure;
plot(log10(Lambda),log10(e),'-o')
[~, maxEIdx] = max(e);
maxLambda = Lambda(maxEIdx);
hold on
plot(log10(maxLambda),log10(e(maxEIdx)),'ro');
ylabel('log_{10} test-sample edge')
xlabel('log_{10} Lambda')
legend('Edge','Max edge')
hold off
%%
% Several values of |Lambda| yield similarly high edges.
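%%
% One way to see how sparsity varies with the regularization strength is to
% count the nonzero coefficients in each of the 11 trained models. (This
% check is a sketch, not part of the original example; it assumes the
% coefficient matrix |Mdl.Beta| is p-by-11, one column per value of
% |Lambda|.)

numNZCoeff = sum(Mdl.Beta~=0); % 1-by-11 vector of nonzero-coefficient counts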
% Higher values of lambda lead to predictor variable sparsity, which is a
% good quality of a classifier.
%%
% Choose the regularization strength that occurs just before the edge
% starts decreasing.

LambdaFinal = Lambda(5);
%%
% Train a linear classification model using the entire data set and specify
% the selected regularization strength.

MdlFinal = fitclinear(X,Ystats,'ObservationsIn','columns',...
    'Learner','logistic','Solver','sparsa','Regularization','lasso',...
    'Lambda',LambdaFinal);
%%
% To estimate labels for new observations, pass |MdlFinal| and the new data
% to |predict|.
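%%
% For example, using the held-out observations as stand-in new data (a
% sketch, not part of the original example; in practice |XNew| would hold
% genuinely new observations, arranged in columns as in the training data):

XNew = X(:,testIdx); % hypothetical new data, columns are observations
[labels,scores] = predict(MdlFinal,XNew,'ObservationsIn','columns');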