%% Analyze Large Data in Database Using MapReduce
% This example shows how to analyze large data sets that are stored in a
% database. You can access large data sets using a
% |<docid:database_ug.bufomil DatabaseDatastore>| object with Database
% Toolbox(TM). After creating a |DatabaseDatastore| object, you can run
% algorithms on large data sets using a tall array. For an example of using
% a |DatabaseDatastore| object with tall arrays, see
% <docid:database_examples.example-ex44807149>. Alternatively, you can
% write a MapReduce algorithm that defines the chunking and reduction of
% the data.
%
% This example uses MapReduce to determine the mean arrival delay of a
% large set of flight data that is stored in a database. This example
% modifies the <docid:import_export.bujibs2> example to use a
% |DatabaseDatastore| instead of a <docid:matlab_ref.budsjo2-1>. You can
% use MapReduce to modify other MATLAB(R) examples that analyze data, as
% described in <docid:import_export.buhnu4_>.
%
% The |DatabaseDatastore| object does not support using a parallel pool
% with Parallel Computing Toolbox(TM) installed. To analyze data using tall
% arrays or run MapReduce algorithms, set the global execution environment
% to be the local MATLAB(R) session.

%% Create |DatabaseDatastore| Object
% Set the global execution environment to be the local MATLAB(R) session.
mapreducer(0);

%%
% The file |airlinesmall.csv| contains the large set of flight data. Load
% this file into a Microsoft(R) SQL Server(R) database table
% |airlinesmall|. This table contains 123,523 records.

%%
% Using a JDBC driver, create a database connection |conn| to a
% Microsoft(R) SQL Server(R) database with Windows(R) authentication.
% Specify a blank user name and password. Here, the code assumes that you
% are connecting to a database |toy_store|, a database server |dbtb04|, and
% port number |54317|.
conn = database('toy_store','','','Vendor','Microsoft SQL Server', ...
    'Server','dbtb04','PortNumber',54317,'AuthType','Windows');

%%
% Create a |DatabaseDatastore| object |dbds| using the database connection
% |conn| and SQL query |sqlquery|. This SQL query retrieves arrival-delay
% data from the table |airlinesmall|.
sqlquery = 'select ArrDelay from airlinesmall';
dbds = databaseDatastore(conn,sqlquery);

%% Define Mapper and Reducer Functions
% To process large data sets in chunks, you can write your own mapper
% function. This example uses the mapper function
% |meanArrivalDelayMapper.m|. This function reads arrival-delay data from
% the |DatabaseDatastore| object, determines the number of delays and the
% total delay in the chunk, and stores both values in
% <docid:matlab_ref.buikx8i-1>. Display the mapper function file.
%
% <include>meanArrivalDelayMapper.m</include>
%

%%
% To process large data sets in chunks, you can write your own reducer
% function. This example uses the reducer function
% |meanArrivalDelayReducer.m|. This reducer function reads intermediate
% values for the number of delays and the total arrival delay. Then, this
% function determines the overall mean arrival delay. |mapreduce| calls
% this reducer function once since the mapper function adds only one key to
% |KeyValueStore|. Display the reducer function file.
%
% <include>meanArrivalDelayReducer.m</include>
%

%% Run MapReduce Using Mapper and Reducer Functions
% To determine the mean arrival delay in the flight data, run MapReduce
% with the |DatabaseDatastore| object |dbds|, mapper function
% |meanArrivalDelayMapper|, and reducer function |meanArrivalDelayReducer|.
outds = mapreduce(dbds,@meanArrivalDelayMapper,@meanArrivalDelayReducer);

%% Display Output from MapReduce
% Read the table |outtab| from the output datastore |outds| using
% |readall|.
outtab = readall(outds)

%%
% The table has only one row containing one key-value pair.

%%
% Display the mean arrival delay |meanArrDelay| from the table |outtab|.
meanArrDelay = outtab.Value{1}

%% Close |DatabaseDatastore| Object and Database Connection
close(dbds)
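
%% Sketch: Possible Mapper and Reducer Implementations
% The files |meanArrivalDelayMapper.m| and |meanArrivalDelayReducer.m| are
% included by reference in this example and are not reproduced in this
% script. The local functions at the end of this section are a minimal
% sketch of what such functions could look like, assuming the usual
% |mapreduce| signatures (|data,info,intermKVStore| for the mapper and
% |intermKey,intermValIter,outKVStore| for the reducer). The function names
% |sketchMeanArrivalDelayMapper|, |sketchMeanArrivalDelayReducer|, and
% |partialCountSum|, and both key names, are hypothetical and chosen only
% for illustration; the shipped files may differ. The sketch also assumes
% that |ArrDelay| arrives as a numeric column in which missing delays are
% |NaN|.
%
% The check below uses made-up delay values, not data from the database. It
% illustrates why storing only a count and a total for each chunk is enough
% to recover the overall mean.
chunk1 = table([10; NaN; -5],'VariableNames',{'ArrDelay'}); % hypothetical chunk
chunk2 = table([20; 15],'VariableNames',{'ArrDelay'});      % hypothetical chunk
partials = [partialCountSum(chunk1); partialCountSum(chunk2)];
sketchMean = sum(partials(:,2))/sum(partials(:,1));

%%
% To try the sketch against the database, one option is to pass
% |@sketchMeanArrivalDelayMapper| and |@sketchMeanArrivalDelayReducer| to
% |mapreduce| in place of the function handles used earlier, before the
% |DatabaseDatastore| object is closed.

function countSum = partialCountSum(data)
% Return [count, total] of the non-missing ArrDelay values in one chunk.
delays = data.ArrDelay(~isnan(data.ArrDelay));
countSum = [numel(delays), sum(delays)];
end

function sketchMeanArrivalDelayMapper(data,~,intermKVStore)
% Mapper sketch: add one [count, total] pair per chunk under a single key.
% The key name 'PartialCountSumDelay' is an assumption.
add(intermKVStore,'PartialCountSumDelay',partialCountSum(data));
end

function sketchMeanArrivalDelayReducer(~,intermValIter,outKVStore)
% Reducer sketch: accumulate the partial pairs, then store the overall mean.
% The key name 'MeanArrivalDelay' is an assumption.
totalCount = 0;
totalDelay = 0;
while hasnext(intermValIter)
    countSum = getnext(intermValIter);
    totalCount = totalCount + countSum(1);
    totalDelay = totalDelay + countSum(2);
end
add(outKVStore,'MeanArrivalDelay',totalDelay/totalCount);
end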