%% Clean Timetable with Missing, Duplicate, or Nonuniform Times
% This example shows how to create a _regular_ timetable from one that has
% missing, duplicate, or nonuniform times. A timetable is a type of table
% that associates a time-stamp, or _row time_, with each row of data. In a
% regular timetable, the row times are sorted and unique, and differ by the
% same regular time step. The example also shows how to export the data from
% a timetable for use with other functions.
%
% Timetables can be irregular. They can contain rows that are not sorted by
% their row times. Timetables can contain multiple rows with the same row
% time, though the rows can have different data values. Even when row times
% are sorted and unique, they can differ by time steps of different sizes.
% Timetables can even contain |NaT| or |NaN| values to indicate missing row
% times.
%
% Timetables provide a number of different ways to resolve missing,
% duplicate, or nonuniform times, and to resample or aggregate data to
% regular row times.
%
% * To find missing row times, use *|ismissing|*.
% * To remove missing times and data, use *|rmmissing|*.
% * To sort a timetable by its row times, use *|sortrows|*.
% * To make a timetable with unique and sorted row times, use *|unique|* and *|retime|*.
% * To remove duplicate times, specify a unique time vector and use *|retime|*.
% * To make a regular timetable, specify a regular time vector and use *|retime|*.
%
% <<../timetable_cleaning.png>>
%

%% Load Timetable
% Load a sample timetable from the MAT-file |badTimes| that contains weather
% measurements taken over several hours on June 9, 2016. The timetable
% includes temperature, rainfall, and wind speed measurements taken at
% irregular times throughout that day.
load(fullfile(matlabroot,'examples','matlab','badTimes'));
TT

%% Remove Rows with Missing Times
% Remove rows that have |NaT|, or a missing value, as the row time. To find
% missing values in the vector of row times, use the |ismissing| function.
% |ismissing| returns a logical vector that contains |1| wherever |TT.Time|
% has a missing value. Index back into the timetable to keep only those rows
% that do not have missing values as row times. Assign those rows to |TT2|.
TF = ismissing(TT.Time);
TT2 = TT(~TF,:);
TT2

%%
% This method removes only the rows that have missing row times. The table
% variables might still have missing data values. For example, the last row
% of |TT2| has |NaN| values for the |Rain| and |Windspeed| variables.
%

%% Remove Rows with Missing Times or Missing Data
% You can remove missing row times and missing data values using the
% |rmmissing| function. |rmmissing| removes any timetable row that has a
% missing row time, missing data values, or both.
%
% Display the missing row time and missing data values of |TT|. Then remove
% all missing values from |TT|.
TT

%%
TT = rmmissing(TT)

%% Sort Timetable and Determine Whether It Is Regular
% Determine whether |TT| is sorted. Then, sort the timetable on its row
% times using the |sortrows| function.
TF = issorted(TT)

%%
TT = sortrows(TT)

%%
% Determine whether |TT| is regular. A regular timetable has the same time
% interval between consecutive row times. Even a sorted timetable can have
% time steps that are not uniform.
TF = isregular(TT)

%%
% Display the differences between row times.
diff(TT.Time)

%% Remove Duplicate Rows
% Timetables can have duplicate rows. Timetable rows are duplicates if they
% have the same row times and the same data values. In this example, the
% last two rows of |TT| are duplicates.
%
% To remove the duplicate rows, use the |unique| function. |unique| returns
% the unique rows and sorts them by their row times.
TT = unique(TT)

%% Find Rows with Duplicate Times and Different Data
% Timetables can have rows with duplicate row times but different data
% values. In this example, |TT| has several rows with the same row times
% but different values.
%
% Find the rows that have duplicate row times. First, sort the row times and
% find consecutive times that have no difference between them. Times with
% no difference between them are the duplicates. Index back into the vector
% of row times and return a unique set of times that identify the duplicate
% row times in |TT|.
dupTimes = sort(TT.Time);
TF = (diff(dupTimes) == 0);
dupTimes = dupTimes(TF);
dupTimes = unique(dupTimes)

%%
% Index into the timetable to display the rows with duplicate row times.
% When you index on times, the output timetable contains all rows with
% matching row times.
TT(dupTimes,:)

%% Select First and Last Rows with Duplicate Times
% Select either the first or the last of the rows with duplicate row
% times using the |unique| and |retime| functions.
%
% First, create a vector of unique row times from |TT| using the
% |unique| function.
uniqueTimes = unique(TT.Time);

%%
% Select the first row from each set of rows that have duplicate times.
TT2 = retime(TT,uniqueTimes)

%%
% Select the last row from each set of rows that have duplicate times.
% Specify the |'previous'| method of |retime| to copy data from the last row.
% When you specify |'previous'|, |retime| starts at the end of the
% vector of row times and stops when it encounters a duplicate row time.
% Then it copies the data from that row.
TT2 = retime(TT,uniqueTimes,'previous')

%% Aggregate Data from All Rows with Duplicate Times
% Aggregate data from rows that have duplicate row times. For example, you
% can calculate the means of several measurements of the same quantity taken
% at the same time.
%
% Calculate the mean temperature, rainfall, and wind speed for rows with
% duplicate row times using the |retime| function.
TT = retime(TT,uniqueTimes,'mean')

%% Make Timetable Regular
% Create a regular timetable using |retime|. Interpolate the data onto a
% regular hourly time vector. To use linear interpolation, specify |'linear'|.
% Each row time in |TT| begins on the hour, and there is a one-hour interval
% between consecutive row times.
TT = retime(TT,'hourly','linear')

%% Extract Regular Timetable Data
% You can export the timetable data for use with functions to analyze data
% that is regularly spaced in time. For example, the Econometrics
% Toolbox(TM) and the Signal Processing Toolbox(TM) have functions you can
% use for further analysis on regularly spaced data.
%
% Extract the timetable data as an array. You can use the |Variables|
% property to return the data as an array when the table variables can be
% concatenated.
A = TT.Variables;
A(1:5,:)

%%
% |TT.Variables| is equivalent to using curly braces to access all
% variables.
A2 = TT{:,:};
A2(1:5,:)
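
%% Sketch: Apply the Same Steps to a Small Synthetic Timetable
% The following is a minimal sketch rather than part of the original example.
% It chains the cleaning steps described above on a small timetable that does
% not depend on the |badTimes| MAT-file. The names |demoTime|, |demoTemp|,
% |demoTT|, |demoUnique|, and |demoIsRegular|, and all data values, are
% hypothetical and chosen only for illustration.

% Hypothetical row times: unsorted, with one duplicated time (08:00).
demoTime = datetime(2016,6,9,[6;5;7;8;8;9],[0;0;0;0;0;30],0);
demoTime(3) = NaT;                          % mark one row time as missing
demoTemp = [58.1; 57.0; 59.4; 61.2; 61.8; 63.0];
demoTT = timetable(demoTime,demoTemp);

% Remove the row with the missing row time, then sort by row time.
demoTT = rmmissing(demoTT);
demoTT = sortrows(demoTT);

% Average the rows that share a row time, using the unique row times.
demoUnique = unique(demoTT.Properties.RowTimes);
demoTT = retime(demoTT,demoUnique,'mean');

% Interpolate linearly onto an hourly time vector and confirm regularity.
demoTT = retime(demoTT,'hourly','linear')
demoIsRegular = isregular(demoTT)

%%
% The sketch reads row times through |demoTT.Properties.RowTimes| instead of
% a named dimension such as |TT.Time|, so it does not rely on how the row
% times dimension is named in the synthetic timetable.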