%% Clean Timetable with Missing, Duplicate, or Nonuniform Times
% This example shows how to create a _regular_ timetable from one that has
% missing, duplicate, or nonuniform times. A timetable is a type of table 
% that associates a time-stamp, or _row time_, with each row of data. In a 
% regular timetable, the row times are sorted and unique, and consecutive
% row times differ by the same time step. The example also shows how to
% export the data from a timetable for use with other functions.
% 
% Timetables can be irregular. They can contain rows that are not sorted by
% their row times. Timetables can contain multiple rows with the same row
% time, though the rows can have different data values. Even when row times
% are sorted and unique, they can differ by time steps of different sizes.
% Timetables can even contain |NaT| or |NaN| values to indicate missing row
% times.
%
% Timetables provide a number of different ways to resolve missing,
% duplicate, or nonuniform times, and to resample or aggregate data to
% regular row times.
%
% * To find missing row times, use *|ismissing|*.
% * To remove missing times and data, use *|rmmissing|*.
% * To sort a timetable by its row times, use *|sortrows|*.
% * To make a timetable with unique and sorted row times, use *|unique|* and *|retime|*.
% * To remove duplicate times, specify a unique time vector and use *|retime|*.
% * To make a regular timetable, specify a regular time vector and use *|retime|*.
%
% <<../timetable_cleaning.png>>
%

%% Load Timetable
% Load a sample timetable from the MAT-file |badTimes| that contains weather
% measurements taken over several hours on June 9, 2016. The timetable
% includes temperature, rainfall, and wind speed measurements taken at
% irregular times throughout that day.
load(fullfile(matlabroot,'examples','matlab','badTimes'));
TT

%% Remove Rows with Missing Times
% Remove rows that have |NaT|, or a missing value, as the row time. To find 
% missing values in the vector of row times, use the |ismissing| function. 
% |ismissing| returns a logical vector that contains |1| wherever |TT.Time| 
% has a missing value. Index back into the timetable to keep only those rows 
% that do not have missing values as row times. Assign those rows to |TT2|.
TF = ismissing(TT.Time);
TT2 = TT(~TF,:);
TT2

%%
% This method removes only the rows that have missing row times. The table
% variables might still have missing data values. For example, the last row
% of |TT2| has |NaN| values for the |Rain| and |Windspeed| variables.
%
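
%%
% The same indexing pattern can be tried on a small, made-up timetable. This
% is only a sketch: |demoTimes|, |demoTT|, and |demoMissingTime| are
% hypothetical names, and the values are invented for illustration rather
% than taken from |badTimes|.
demoTimes = [datetime(2016,6,9,6,0,0); NaT; datetime(2016,6,9,8,0,0)];
demoTT = timetable(demoTimes,[70.1; 71.3; NaN],'VariableNames',{'Temp'});
% Logical mask of the rows whose row time is missing (here, only row 2).
demoMissingTime = ismissing(demoTT.Properties.RowTimes);
% Keep the rows with valid row times. The NaN in Temp is left in place.
demoTT = demoTT(~demoMissingTime,:)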
%% Remove Rows with Missing Times or Missing Data
% You can remove missing row times and missing data values using the
% |rmmissing| function. |rmmissing| removes any timetable row that has a
% missing row time, missing data values, or both.
%
% Display the missing row time and missing data values of |TT|. Then remove
% all missing values from |TT|.
TT

%%
TT = rmmissing(TT)
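
%%
% As a rough sketch on invented data, the following shows |rmmissing|
% dropping one row because of its row time and another because of its data.
% The names |demoTimes2|, |demoTT2|, and |demoClean| are hypothetical and are
% not part of |badTimes|.
demoTimes2 = [datetime(2016,6,9,6,0,0); NaT; ...
              datetime(2016,6,9,8,0,0); datetime(2016,6,9,9,0,0)];
demoTT2 = timetable(demoTimes2,[70.1; 71.3; NaN; 72.4],'VariableNames',{'Temp'});
% Row 2 has a missing row time and row 3 has a missing Temp value, so only
% the first and last rows should remain.
demoClean = rmmissing(demoTT2)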


%% Sort Timetable and Determine Whether It Is Regular
% Determine whether |TT| is sorted. Then, sort the timetable on its row
% times using the |sortrows| function.
TF = issorted(TT)

%%
TT = sortrows(TT)

%%
% Determine whether |TT| is regular. A regular timetable has the same time
% interval between consecutive row times. Even a sorted timetable can have
% time steps that are not uniform.
TF = isregular(TT)

%% 
% Display the differences between row times.
diff(TT.Time)
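
%%
% To see why a sorted timetable can still fail the |isregular| test, here is
% a minimal sketch on invented data. The names |demoTimes3|, |demoTT3|,
% |demoSorted|, |demoRegular|, and |demoSteps| are hypothetical.
demoTimes3 = datetime(2016,6,9,[6;7;9],0,0);
demoTT3 = timetable(demoTimes3,[70.1; 71.3; 72.4],'VariableNames',{'Temp'});
% The row times are in ascending order.
demoSorted = issorted(demoTT3)
% The steps between them are one hour and two hours, so the timetable is
% not regular.
demoRegular = isregular(demoTT3)
% A regular timetable would have exactly one distinct time step.
demoSteps = unique(diff(demoTT3.Properties.RowTimes))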

%% Remove Duplicate Rows
% Timetables can have duplicate rows. Timetable rows are duplicates if they
% have the same row times and the same data values. In this example, the
% last two rows of |TT| are duplicates.
%
% To remove the duplicate rows, use the |unique| function. |unique| returns
% the unique rows and sorts them by their row times.
TT = unique(TT)
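
%%
% The sketch below, on invented data, illustrates what |unique| does and does
% not remove. The names |demoTimes4|, |demoTT4|, and |demoUnique| are
% hypothetical.
demoTimes4 = datetime(2016,6,9,[6;7;7;7],0,0);
demoTT4 = timetable(demoTimes4,[70.1; 71.3; 71.3; 75.0],'VariableNames',{'Temp'});
% Rows 2 and 3 match in both row time and data, so one of them is dropped.
% Row 4 shares the row time but has different data, so it is kept.
demoUnique = unique(demoTT4)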

%% Find Rows with Duplicate Times and Different Data
% Timetables can have rows with duplicate row times but different data
% values. In this example, |TT| has several rows with the same row times
% but different values.
%
% Find the rows that have duplicate row times. First, sort the row times and
% find consecutive times that have no difference between them. Times with
% no difference between them are the duplicates. Index back into the vector of
% row times and return a unique set of times that identify the duplicate
% row times in |TT|.
dupTimes = sort(TT.Time);
TF = (diff(dupTimes) == 0);
dupTimes = dupTimes(TF);
dupTimes = unique(dupTimes)

%%
% Index into the timetable to display the rows with duplicate row times.
% When you index on times, the output timetable contains all rows with
% matching row times.
TT(dupTimes,:)
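
%%
% An alternative way to find duplicate times, not used elsewhere in this
% example, is to count how often each time occurs. The sketch below uses the
% index output of |unique| together with |accumarray| on an invented time
% vector. The names |demoTimes5|, |demoU|, |demoIdx|, |demoCounts|,
% |demoDupTimes|, and |demoDupCounts| are hypothetical.
demoTimes5 = datetime(2016,6,9,[6;7;7;8;8;8],0,0);
% demoIdx maps every element of demoTimes5 to its position in demoU.
[demoU,~,demoIdx] = unique(demoTimes5);
% Count the occurrences of each unique time.
demoCounts = accumarray(demoIdx,1);
% Times that occur more than once, and how many times each one occurs.
demoDupTimes = demoU(demoCounts > 1)
demoDupCounts = demoCounts(demoCounts > 1)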

%% Select First and Last Rows with Duplicate Times
% Select either the first or the last of the rows with duplicate row
% times using the |unique| and |retime| functions.
%
% First, create a vector of unique row times from |TT| using the
% |unique| function. 
uniqueTimes = unique(TT.Time);

%%
% Select the first row from each set of rows that have duplicate times.
TT2 = retime(TT,uniqueTimes)

%%
% Select the last row from each set of rows that have duplicate times.
% Specify the |'previous'| method of |retime| to copy data from the last row.
% When you specify |'previous'|, then |retime| starts at the end of the
% vector of row times and stops when it encounters a duplicate row time.
% Then it copies the data from that row.
TT2 = retime(TT,uniqueTimes,'previous')
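
%%
% As a possible alternative, the |'firstvalue'| and |'lastvalue'| aggregation
% methods of |retime| state the intent more directly. The sketch below runs
% them on invented data and is not the approach this example uses. The names
% |demoTimes6|, |demoTT6|, |demoUniqueTimes|, |demoFirst|, and |demoLast| are
% hypothetical.
demoTimes6 = datetime(2016,6,9,[6;7;7;8],0,0);
demoTT6 = timetable(demoTimes6,[1; 2; 4; 5],'VariableNames',{'Temp'});
demoUniqueTimes = unique(demoTT6.Properties.RowTimes);
% Of the two rows at 07:00, 'firstvalue' should keep the one with Temp = 2.
demoFirst = retime(demoTT6,demoUniqueTimes,'firstvalue')
% Of the two rows at 07:00, 'lastvalue' should keep the one with Temp = 4.
demoLast = retime(demoTT6,demoUniqueTimes,'lastvalue')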

%% Aggregate Data from All Rows with Duplicate Times
% Aggregate data from rows that have duplicate row times. For example, you
% can calculate the means of several measurements of the same quantity taken 
% at the same time.
%
% Calculate the mean temperature, rainfall, and wind speed for rows with
% duplicate row times using the |retime| function.
TT = retime(TT,uniqueTimes,'mean')
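
%%
% When averaging, it can help to know how many rows went into each mean. The
% following sketch, on invented data, pairs the |'mean'| method with the
% |'count'| method of |retime|. The names |demoTimes7|, |demoTT7|,
% |demoUniqueTimes7|, |demoMean|, and |demoCount| are hypothetical.
demoTimes7 = datetime(2016,6,9,[6;7;7;8],0,0);
demoTT7 = timetable(demoTimes7,[1; 2; 4; 5],'VariableNames',{'Temp'});
demoUniqueTimes7 = unique(demoTT7.Properties.RowTimes);
% The two rows at 07:00 (Temp = 2 and 4) are averaged into one row.
demoMean = retime(demoTT7,demoUniqueTimes7,'mean')
% Number of values that contributed to each output row.
demoCount = retime(demoTT7,demoUniqueTimes7,'count')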

%% Make Timetable Regular
% Create a regular timetable using |retime|. Interpolate the data onto a
% regular hourly time vector. To use linear interpolation, specify |'linear'|.
% After the call, each row time in |TT| falls on the hour, and there is a
% one-hour interval between consecutive row times.
TT = retime(TT,'hourly','linear')
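
%%
% To make the interpolation step concrete, here is a minimal sketch on
% invented data with row times at 06:00, 06:30, and 08:00. The names
% |demoTimes8|, |demoTT8|, and |demoHourly| are hypothetical.
demoTimes8 = datetime(2016,6,9,[6;6;8],[0;30;0],0);
demoTT8 = timetable(demoTimes8,[10; 12; 18],'VariableNames',{'Temp'});
% The value at 07:00 is interpolated between 06:30 (12) and 08:00 (18):
% 12 + 0.5 hr * (6 / 1.5 hr) = 14. The sample at 06:30 is not on the hourly
% grid, so it does not appear in the output.
demoHourly = retime(demoTT8,'hourly','linear')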

%% Extract Regular Timetable Data
% You can export the timetable data for use with functions to analyze data
% that is regularly spaced in time. For example, the Econometrics Toolbox(TM)
% and the Signal Processing Toolbox(TM) have functions you can use for
% further analysis on regularly spaced data.
%
% Extract the timetable data as an array. You can use the |Variables|
% property to return the data as an array when the table variables can be
% concatenated.
A = TT.Variables;
A(1:5,:)

%%
% |TT.Variables| is equivalent to using curly braces to access all
% variables.
A2 = TT{:,:};
A2(1:5,:)
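
%%
% The extracted array does not carry the row times. If a downstream function
% also needs a numeric time axis, one option is to convert the row times to
% elapsed hours and place them next to the data. The sketch below does this
% on invented data. The names |demoTimes9|, |demoTT9|, |demoData|,
% |demoElapsed|, and |demoMatrix| are hypothetical.
demoTimes9 = datetime(2016,6,9,[6;7;8],0,0);
demoTT9 = timetable(demoTimes9,[10; 14; 18],[0.1; 0.0; 0.3], ...
    'VariableNames',{'Temp','Rain'});
% Numeric data only, one column per timetable variable.
demoData = demoTT9.Variables;
% Hours elapsed since the first row time, as a numeric column.
demoRowTimes = demoTT9.Properties.RowTimes;
demoElapsed = hours(demoRowTimes - demoRowTimes(1));
% Elapsed time in the first column, followed by the data columns.
demoMatrix = [demoElapsed, demoData]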