slow for and while loops

Hi all, I'm running the lines below, which are supposed to go through two tables (T2 and T1) and check whether the difference in the first column is less than coinc_win. The time columns in both tables come from the same clock, and I want the code to match each line from table T2 to the corresponding lines in table T1 and write the matched lines to T3. This is what I've written so far, but it runs pretty slowly. Does anyone have suggestions on how to improve it?
coinc_win = 500;
correction = 150;
j = 1;
r = 1;
tic
for i = 1: height(T2(:,1))
    while j <= height(T1(:,1))
        while (T2.time(i)+correction) - T1.time(j) > coinc_win || T2.time(i)+correction > T1.time(j)
            j = j+1;
            break;
        end
        while T2.time(i) - T1.time(j) < coinc_win && (T2.time(i)+correction) > T1.time(j)
            T3.deltaT(r) = T2.time(i) - T1.time(j);
            T3.energy_G(r) = T2.E(i);
            T3.energy_S(r) = T1.E(j);
            r= r+1;
            j = j+1;
        end
        if T2.time(i) < T1.time(j)
            break;
        end
    end
end
toc

12 Comments

No real thoughts right off the bat. You could try the "Run and Time" profiler, see which functions eat up most of your time, and then look to improve those.
There are a few things I don't understand about your code. Maybe you can clarify these:
coinc_win = 500;
correction = 150;
j = 1;
r = 1;
tic
for i = 1: height(T2(:,1))
    while j <= height(T1(:,1))
        while (T2.time(i)+correction) - T1.time(j) > coinc_win || T2.time(i)+correction > T1.time(j)
            % What is the purpose of this loop? It always breaks after the first iteration, making it more like
            % an if statement. If that is intended, use an if statement instead.
            % Also, part of the condition is redundant.
            % If the first part (T2.time(i)+correction) - T1.time(j) > coinc_win evaluates true, the second
            % part must always be true anyway. The first part therefore has no effect, since the second part already
            % includes this case. Check the condition!
            j = j+1;
            break;
        end
        while T2.time(i) - T1.time(j) < coinc_win && (T2.time(i)+correction) > T1.time(j)
            T3.deltaT(r) = T2.time(i) - T1.time(j);
            T3.energy_G(r) = T2.E(i);
            T3.energy_S(r) = T1.E(j);
            r= r+1;
            j = j+1;
        end
        if T2.time(i) < T1.time(j)
            break;
        end
    end
end
toc
sani on 21 Jan 2020
Hi, thanks for your help.
So the general purpose is to point at one T2 row at a time and search for the matching row in T1. I'd like the index of T1 to advance only when the T1(j) value is over T2(i); otherwise I'd like to advance i and rescan from where j stopped. If the condition is met, I'd like to keep the data from both tables in a separate table.
The while loop is supposed to prevent i from advancing before j arrives at values close to T2(i). Is if much faster than while or for loops (generally speaking)?
And about the redundant condition, you are right. I didn't notice that, thanks!
So, first of all, you should be aware that MATLAB isn't really made for iterating through arrays with this sort of loop; instead it excels at using vectorization to speed things up. But if your tables aren't too huge, your code shouldn't be too slow. Maybe you can publish some example data. It would help to see the code in action and get a feeling for what you mean by "pretty slow".
if is not a loop; it just checks a condition. In your case the while loop (the one I commented on above) checks a condition, increments j and then breaks immediately. That is nothing more than checking a condition. This is more about the readability of your code than about performance.
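To illustrate the point, here is a minimal sketch with made-up values (a, b, j1 and j2 are placeholders, not taken from the tables). A while loop that always breaks on its first pass should behave exactly like an if:

```matlab
% Hypothetical values, chosen only for illustration
a = 10;
b = 3;

% Version 1: a while loop that always breaks after one pass
j1 = 1;
while a > b
    j1 = j1 + 1;
    break;
end

% Version 2: the equivalent if statement
j2 = 1;
if a > b
    j2 = j2 + 1;
end

% Both versions increment the counter at most once
disp(isequal(j1, j2))
```

The if version states the intent directly, which is why it is the more readable choice here.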
But I have to add that your code has some issues:
You increment j in your while loop
while T2.time(i) - T1.time(j) < coinc_win && (T2.time(i)+correction) > T1.time(j)
    T3.deltaT(r) = T2.time(i) - T1.time(j);
    T3.energy_G(r) = T2.E(i);
    T3.energy_S(r) = T1.E(j);
    r= r+1;
    j = j+1;
end
but you never check whether j is still within the bounds of T1.time. This can lead to an "index exceeds matrix dimensions" error when you reach the bottom of your table.
Another potential problem is that j is only incremented inside the while loops. This makes it difficult to ensure that your outer while loop ( while j <= height(T1(:,1)) ) does not get stuck in an infinite loop. I think in your case you can get away with it because of the structure of your data, but it really is a dangerous issue.
Another issue you might want to think about is that some of the values from T2 are skipped if they are too close to each other (closer than the correction variable). Did you check whether the values in T3 are what you would expect?
Maybe you can write out some of the data you expect for a small example?
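As a rough sketch of how the bounds and termination concerns could be handled, the loop below sweeps two sorted time vectors and tests the index before every read. It is only an assumption about the intended matching rule (every T1 time within coinc_win of the corrected T2 time counts as a match), and t1, t2 and pairs are made-up names and values, not the real tables:

```matlab
% Made-up, sorted time stamps in nanoseconds (illustration only)
t1 = [100; 900; 2000; 5000];     % stands in for T1.time
t2 = [700; 1900; 9000];          % stands in for T2.time
coinc_win  = 500;
correction = 150;

n1 = numel(t1);
pairs = zeros(0, 2);             % rows of [i, j] index pairs that matched
j = 1;
for i = 1:numel(t2)
    t = t2(i) + correction;
    % Skip T1 rows that are too early to match this or any later T2 row.
    % The bounds test comes first, so t1(j) is never read past the end.
    while j <= n1 && t1(j) < t - coinc_win
        j = j + 1;
    end
    % Collect every T1 row inside the window without moving j itself,
    % so that the next T2 row can rescan from the same position.
    k = j;
    while k <= n1 && abs(t1(k) - t) <= coinc_win
        pairs(end+1, :) = [i, k]; %#ok<AGROW>
        k = k + 1;
    end
end
disp(pairs)
```

Both inner loops either advance an index or stop, and both indices are bounded by n1, so the sweep cannot run forever or index past the end of the data.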
@sani, can you explain what you want to do, not in terms of code but in terms of the goal? I.e., don't tell us you want to iterate, but tell us something like "I want to extract xxx from T2 where its time matches that of T1 and calculate the mean of yyy".
It is most likely that whatever you want to do can be achieved more easily without any loop.
Also, note that height(T2(:,1)) is a convoluted (and marginally slower) way of writing height(T2).
sani on 21 Jan 2020
Hi, firstly thanks for everyone's help!
I'm trying to match events that occur in two different files simultaneously, within a time window (coinc_win) and with a time correction for tolerance, in case the data from the "slower file" arrived faster than the "faster file".
Basically, by analogy, there is a machine gun and a pistol shooting while I'm running a stopwatch, and I want to copy the lines of both of them to a different array/table when they shoot within a defined time window.
I hope this explanation helps.
These are the data sets for T1 and T2; the first column is the time.
And again, thanks!
I assume you are trying to match the two tables by timestamps.
If that is true, you can create two timetables instead of tables.
Then use the synchronize function (R2016b onwards) to join the two timetables.
The function allows you to control exactly how the two timetables are synchronised.
After that, it should be simpler to compare the data from the two tables without any loops.
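A minimal sketch of that idea, using made-up data rather than the real files (TT1, TT2, E1 and E2 are hypothetical names, and the times are in whole seconds for simplicity):

```matlab
% Made-up data; row times are durations in seconds
TT1 = timetable(seconds([1; 2; 3]), [10; 20; 30], 'VariableNames', {'E1'});
TT2 = timetable(seconds([2; 3; 4]), [200; 300; 400], 'VariableNames', {'E2'});

% Keep only the row times that are present in both timetables
TT = synchronize(TT1, TT2, 'intersection');
disp(TT)
```

Note that 'intersection' only pairs rows whose times are exactly equal, so matching within a tolerance window would need something more.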
Yes, as Mohammad stated, timetables and synchronize should be able to help. I assume that the time is the first column in your two files, but it's not clear what encoding it uses. Can you explain?
Otherwise, ismembertol would be a lot faster than your double while loops. But in the first place, I'd recommend going with timetables, as they have other benefits that may be of use to you.
sani on 24 Jan 2020
Thanks, I think this can be a proper solution!
The first column is the time in nanoseconds from the beginning of the measurement. It seems that using a timetable is not straightforward in my case because of the time format. Is there a way to specify a float as the time format?
Timetables can accept either datetime format or duration format. You can convert to duration as follows.
H = 0;
MI = 0;
S = 0;
MS = NS ./ 1000; % NS can be an array of nanoseconds
D = duration(H,MI,S,MS)
sani on 24 Jan 2020
When I try to apply that, the times don't match.
The first line in T2 should have a timestamp of 0.084241514 sec, and instead it is 00:01:24.
A much simpler way to convert a numeric array in nanoseconds to a duration type is with:
dur = seconds(NS * 1e-9);
See example code in my answer.
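For reference, a small sketch of the conversion using the timestamp quoted above. The fourth input of duration(H,MI,S,MS) is in milliseconds, so dividing nanoseconds by 1000 gives a value 1000 times too large, which would explain the 00:01:24 result. The variable names d1 and d2 are only illustrative:

```matlab
NS = 84241514;   % first T2 time stamp quoted above, in nanoseconds

% duration(H,MI,S,MS) expects MS in milliseconds, so ns are divided by 1e6
d1 = duration(0, 0, 0, NS ./ 1e6);

% Equivalent, shorter conversion via seconds
d2 = seconds(NS * 1e-9);

% Show the fractional digits down to nanoseconds
d2.Format = 'hh:mm:ss.SSSSSSSSS';
disp(d2)
```

Both d1 and d2 should represent the same 0.084241514 s; the seconds form simply avoids having to remember which unit each duration input uses.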

 Accepted Answer

Guillaume on 24 Jan 2020
Edited: Guillaume on 24 Jan 2020
After looking at it a bit more closely, what you're trying to do is a bit too advanced for synchronize. I would still recommend that you convert to timetable at the end, but for your synchronisation, you'll have to do it manually. Easiest is with ismembertol:
matchwindow = 500; % matching window within which two events are considered equal, in nanoseconds
S_offset = 150; % offset for the S table, in nanoseconds
G = readtable('T1.txt'); % I would recommend a better name than T1 or G: a name that clearly describes what the table represents
G.Properties.VariableNames = {'Time_G', 'Energy_G'};
S = readtable('T2.txt'); % likewise, I would recommend a better name than T2 or S
S.Properties.VariableNames = {'Time_S', 'Energy_S'}; % using different variable names so that the tables can be horizontally concatenated
% find the intersection of times within the matchwindow and merge the matching rows into a new table
[ismatch, matches] = ismembertol(S.Time_S + S_offset, G.Time_G, matchwindow, 'DataScale', 1); % DataScale is 1 to make the tolerance absolute
merged = [G(matches(ismatch), :), S(ismatch, :)];
merged.Delta = merged.Time_S - merged.Time_G;
% convert to timetable
merged = convertvars(merged, {'Time_G', 'Time_S', 'Delta'}, @(var) seconds(var * 1e-9));
merged = table2timetable(merged, 'RowTimes', 'Time_G')
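To see what ismembertol returns in the call above, here is a toy run on made-up time stamps (timeG and timeS are hypothetical vectors standing in for G.Time_G and S.Time_S):

```matlab
% Made-up time stamps in nanoseconds (illustration only)
timeG = [1000; 5000; 9000];
timeS = [4400; 20000; 8950];
matchwindow = 500;
S_offset    = 150;

% With 'DataScale' set to 1, the tolerance is an absolute 500 ns
[ismatch, matches] = ismembertol(timeS + S_offset, timeG, matchwindow, 'DataScale', 1);

% ismatch flags the S rows that found a partner in G;
% matches holds the index of that G row, or 0 where there is none
disp([ismatch, matches])
```

Here the first and third S rows land within 500 ns of the second and third G rows, and the second S row has no partner. When several G rows fall inside the window, only one index is returned per S row unless the 'OutputAllIndices' option is used.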
edit: spelling

2 Comments

sani on 24 Jan 2020
Great, it seems to be working!
Is there a way to display the time in the format HH:mm:sssssssss?
HH:mm:sssssssss is not a valid duration format. But you can certainly set the format of the durations to anything valid, e.g.:
merged.Time_G.Format = 'hh:mm:ss.SSSSSSSSS'

Asked: 21 Jan 2020
Edited: 24 Jan 2020
