How to subset data based on time range

1 view (last 30 days)
Hi,
I am wondering how to subset data based on time range ? For example I would like to subset the maximum number of rows that fall within an 18 hours time range in the table below:
data = {"2017-01-12 09:50:46" "2017-01-15 13:50:46" "2017-01-15 14:50:50" "2017-01-15 17:52:06" "2017-01-15 18:52:22" "2017-01-16 11:52:22" "2017-01-19 09:52:22"}';
t = cell2table(data);
t.data = datetime(t.data, 'InputFormat', 'yyyy-MM-dd HH:mm:ss')
The thing is, I don’t know what is the start point or the end point. All I know is that the correct answer is the most inclusive one (i.e the one with the largest number of rows). In the example here, the correct answer is t.data(2:5) because all the points fall within 18 hours of each other and because it is the answer with the largest number of rows (4). The data is set up in a way where there can only be one correct answer (only one combination has the largest number of rows). I hope I am being clear.
How would I do that ?
Thank you,

Accepted Answer

Star Strider
Star Strider on 5 Mar 2020
Edited: Star Strider on 5 Mar 2020
I am not certain what result you want.
This computes all values of ‘t.data’ that are within 18 hours of a particular row value:
data = {"2017-01-12 09:50:46" "2017-01-15 13:50:46" "2017-01-15 14:50:50" "2017-01-15 17:52:06" "2017-01-15 18:52:22" "2017-01-19 09:52:22"}';
t = cell2table(data);
t.data = datetime(t.data, 'InputFormat', 'yyyy-MM-dd HH:mm:ss');
h18 = t.data + hours(18);
for k = 1:size(t.data,1)
within18{k,:} = t.data((t.data >= t.data(k)) & (t.data <= h18(k)));
end
For example:
Within18_3 = [within18{3}]
contains:
Within18_3 =
3×1 datetime array
15-Jan-2017 14:50:50
15-Jan-2017 17:52:06
15-Jan-2017 18:52:22
The first row in ‘Within18_3’ are the third row value and the next two rows are those within 18 hours of it.
EDIT — (5 Mar 2020 at 17:36)
If you want ±9 hours instead, this works:
h18 = [t.data + hours(-9), t.data + hours(9)];
for k = 1:size(t.data,1)
within18{k,:} = t.data((t.data >= h18(k,1)) & (t.data <= h18(k,2)));
end
For example:
Within18_3 = [within18{3}]
now contains:
Within18_3 =
4×1 datetime array
15-Jan-2017 13:50:46
15-Jan-2017 14:50:50
15-Jan-2017 17:52:06
15-Jan-2017 18:52:22
  2 Comments
012786534
012786534 on 5 Mar 2020
Hi Star Strider,
Indeed, I should have been clearer. My apologies. I have clarified my question above. Thank you.
Star Strider
Star Strider on 5 Mar 2020
To get the set with the greatest number of rows:
[~,idx] = max(cellfun(@(x)size(x,1), within18))
Out = [within18{idx}]
That works however you want to define it, however it will return only the first set if there are several with the same maximum number of rows.
To return all that have the maximum number of rows:
rowsizes = cellfun(@(x)size(x,1), within18);
maxrows = max(rowsizes)
idx = find(rowsizes == maxrows)
Out = [within18{idx}]

Sign in to comment.

More Answers (1)

Guillaume
Guillaume on 5 Mar 2020
It's not too clear what you mean by subsetting.If you want to bin the dates in ranges of 18 hours then use discretize:
discretize(t.data, 'hours(18)') %split into 18 hours bins. Returns the bin indices

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!