%% Compute Mean Value with MapReduce
% This example shows how to compute the mean of a single variable in a
% data set using |mapreduce|. It demonstrates a simple use of |mapreduce|
% with one key, minimal computation, and an intermediate state
% (accumulating intermediate sum and count).

% Copyright 1984-2014 The MathWorks, Inc.

%% Prepare Data
% Create a datastore using the |airlinesmall.csv| data set. This
% 12-megabyte data set contains 29 columns of flight information for
% several airline carriers, including arrival and departure times. In this
% example, select |ArrDelay| (flight arrival delay) as the variable of
% interest.
ds = datastore('airlinesmall.csv', 'TreatAsMissing', 'NA');
ds.SelectedVariableNames = 'ArrDelay';

%%
% The datastore treats |'NA'| values as missing, and replaces the missing
% values with |NaN| values by default. Additionally, the
% |SelectedVariableNames| property allows you to work with only the
% selected variable of interest, which you can verify using |preview|.
preview(ds)

%% Run MapReduce
% The |mapreduce| function requires a map function and a reduce function as
% inputs. The mapper receives chunks of data and outputs intermediate
% results. The reducer reads the intermediate results and produces a final
% result.

%%
% In this example, the mapper finds the count and sum of the arrival delays
% in each chunk of data. The mapper then stores these values as the
% intermediate values associated with the key |'PartialCountSumDelay'|.

%%
% Display the map function file.
%
% <include>meanArrivalDelayMapper.m</include>
%

%%
% The reducer accepts the count and sum for each chunk stored by the
% mapper. It sums up the values to obtain the total count and total sum.
% The overall mean arrival delay is a simple division of the values.
% |mapreduce| only calls this reducer once, since the mapper only adds a
% single unique key. The reducer uses |add| to add a single key-value pair
% to the output.

%%
% Display the reduce function file.
%
% <include>meanArrivalDelayReducer.m</include>
%

%%
% Use |mapreduce| to apply the map and reduce functions to the datastore,
% |ds|.
meanDelay = mapreduce(ds, @meanArrivalDelayMapper, @meanArrivalDelayReducer);

%%
% |mapreduce| returns a datastore, |meanDelay|, with files in the
% current folder.

%%
% Read the final result from the output datastore, |meanDelay|.
readall(meanDelay)
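
%% Sketch of the Map and Reduce Functions
% The two function files included above are not reproduced in this
% listing. The local functions below are a minimal sketch of what such a
% mapper and reducer could look like, based only on the description in
% this example: each chunk contributes one [count, sum] pair under the key
% |'PartialCountSumDelay'|, and the reducer divides the total sum by the
% total count. They are an illustration, not the contents of
% |meanArrivalDelayMapper.m| or |meanArrivalDelayReducer.m|. The function
% names |sketchMeanDelayMapper| and |sketchMeanDelayReducer| and the output
% key |'MeanArrivalDelay'| are hypothetical. Local functions in a script
% assume R2016b or later; in earlier releases, save each function in its
% own file.
%
% To try the sketch, pass its handles to |mapreduce| in place of the
% handles used above:
%
%   sketchResult = mapreduce(ds, @sketchMeanDelayMapper, @sketchMeanDelayReducer);
%   readall(sketchResult)

function sketchMeanDelayMapper(data, ~, intermKVStore)
% data is the table chunk read from the datastore. With
% SelectedVariableNames set to 'ArrDelay', it holds only that variable.
delays = data.ArrDelay;
delays = delays(~isnan(delays));   % drop missing values (read as NaN)

% Emit one [count, sum] pair per chunk under a single shared key.
add(intermKVStore, 'PartialCountSumDelay', [numel(delays), sum(delays)]);
end

function sketchMeanDelayReducer(~, intermValIter, outKVStore)
% Accumulate the per-chunk counts and sums stored by the mapper.
totalCount = 0;
totalSum = 0;
while hasnext(intermValIter)
    countSum = getnext(intermValIter);
    totalCount = totalCount + countSum(1);
    totalSum = totalSum + countSum(2);
end

% The overall mean is the total sum divided by the total count.
% 'MeanArrivalDelay' is a hypothetical output key chosen for this sketch.
add(outKVStore, 'MeanArrivalDelay', totalSum / totalCount);
end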