Processing array where the elements are sometimes min/sec and sometimes hour/min/sec
I have some race data of the form
raceTime = {'28:44','54:08','1:02:34','1:58:33'};
Because some times are less than an hour, and some are more, the inputs are in both hh:mm:ss and mm:ss format. It will never be the case that ##:## represents hh:mm.
I'm trying to get the duration (in, say, minutes) of these race times. Thoughts on the most elegant way to process this? (I can think of a few inelegant ways.)
1 Comment
dpb
on 15 Oct 2018
Yeah, the inflexibility of some of the input forms is maddening at times like these, agreed...
Answers (6)
dpb
on 16 Oct 2018
Edited: dpb
on 16 Oct 2018
I won't claim it's elegant and certainly not "most" so, but...
function et=raceDuration(tstring)
% return durations for [hh:]mm:ss input cell string array
hms=cellfun(@(s) split(s,":"),tstring,'uni',0); % get pieces (not all complete)
for i=1:length(hms)
try
et(i)=duration(str2double(hms{i}).');
catch
et(i)=duration([0 str2double(hms{i}).']);
end
end
This works for your example data.
Every other way of fixing up the missing hours that has come to me so far seemed more painful than the loop; unfortunately there is no way to put a try...catch...end construct inside a cellfun anonymous function to deal with the missing field.
ALTERNATE
(W/ attribution to Stephen for (again) reminding me sscanf will handle unusual cases more gracefully than I always think it will...)
hms=cellfun(@(s) sscanf(s,"%d:"),raceTime,'uni',0); % numeric pieces, one column vector per time
te=duration(cell2mat(cellfun(@(x) [zeros(1,3-length(x)) x.'],hms,'uni',0).')); % left-pad missing hours with 0, stack as N-by-3
>> te
te =
4×1 duration array
00:28:44
00:54:08
01:02:34
01:58:33
>>
And, the above "trick" cleans up the original function quite a bit, too...
function et=raceDuration(tstring)
% return durations for [hh:]mm:ss input cell string array
hms=cellfun(@(s) sscanf(s,"%d:"),tstring,'uni',0); % get pieces (not all complete)
N=numel(hms);
et=duration(zeros(N,3)); % preallocate N-by-1 duration array
for i=1:N
try
et(i)=duration(hms{i}.'); % three pieces: [h m s]
catch
et(i)=duration([0 hms{i}.']); % two pieces: prepend zero hours
end
end
ADDENDUM
And, of course, you can change the Format property...
>> te.Format='m'
te =
4×1 duration array
28.733 min
54.133 min
62.567 min
118.55 min
>> te.Format='s'
te =
4×1 duration array
1724 sec
3248 sec
3754 sec
7113 sec
>>
depending on how you want the result to look...
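Note that Format only changes how the duration array is displayed. If plain numeric minutes are wanted, as the question asks, the minutes and seconds functions convert a duration array to doubles. A minimal sketch, with te rebuilt here from the example data so the snippet stands alone:

```matlab
% Rebuild the example durations from [h m s] rows (same values as te above)
te = duration([0 28 44; 0 54 8; 1 2 34; 1 58 33]);

% Convert the duration array to plain doubles
m = minutes(te);   % elapsed time in minutes, as a 4x1 double
s = seconds(te);   % elapsed time in seconds, as a 4x1 double
```

The numeric values can then go straight into arithmetic or plotting without carrying the duration type along.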
Stephen23
on 16 Oct 2018
Edited: Stephen23
on 16 Oct 2018
As long as the last unit is always the same then you could use this:
>> C = {'28:44','54:08','1:02:34','1:58:33'};
>> V = [60,1,1/60]; % [H,M,S]
>> F = @(s)V(end-nnz(s==':'):end)*sscanf(s,'%d:');
>> M = cellfun(F,C)
M =
28.733 54.133 62.567 118.550
It will be reasonably efficient as it does not change/duplicate the input data, and uses efficient sscanf and matrix multiplication. For maximum speed replace cellfun with a preallocated loop.
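The preallocated loop mentioned above might look something like this. It is only a sketch of the same idea, not a timed or tuned version; the variable names p and k are illustrative, and the weights are selected by the number of parsed fields rather than by counting colons, which is equivalent for this data.

```matlab
C = {'28:44','54:08','1:02:34','1:58:33'};
V = [60,1,1/60];                 % minutes per [H,M,S]
M = zeros(size(C));              % preallocate the numeric result
for k = 1:numel(C)
    p = sscanf(C{k},'%d:');      % column vector with 2 or 3 fields
    M(k) = V(end-numel(p)+1:end)*p;   % use the trailing weights that match
end
```

As with the cellfun version, this relies on the last field always being seconds.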
the cyclist
on 16 Oct 2018
4 Comments
dpb
on 17 Oct 2018
I hadn't seen this until after I added the alternate solution triggered by Peter's, but I made exactly the same comment there: it seems as though that would be a relatively easy option to include in the function design, and a reasonable, if not highly important, enhancement.
Peter Perkins
on 17 Oct 2018
cyclist, if you know that the text is a mixture of those two formats, can't you convert using one format, and then go back and convert the things that failed, using the second format? Maybe someone else already suggested that.
>> raceTime = {'28:44','54:08','1:02:34','1:58:33'};
>> t = duration(raceTime,'InputFormat','mm:ss')
t =
1×4 duration array
00:28:44 00:54:08 NaN NaN
>> i = isnan(t)
i =
1×4 logical array
0 0 1 1
>> t(i) = duration(raceTime(i),'Format','hh:mm:ss')
t =
1×4 duration array
00:28:44 00:54:08 01:02:34 01:58:33
Or just tack on a leading hours field where needed?
>> raceTime(i) = strcat('0:',raceTime(i))
raceTime =
1×4 cell array
{'0:28:44'} {'0:54:08'} {'1:02:34'} {'1:58:33'}
>> t = duration(raceTime,'Format','hh:mm:ss')
t =
1×4 duration array
00:28:44 00:54:08 01:02:34 01:58:33
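If the first conversion pass is not wanted just to find which entries lack hours, the same padding can be driven by counting colons instead. This is a sketch under two assumptions: a release where duration accepts text input (as in the calls above), and the question's guarantee that a two-field entry is always mm:ss. The names needsHours and padded are illustrative.

```matlab
raceTime = {'28:44','54:08','1:02:34','1:58:33'};

% Entries with a single colon are mm:ss and need a leading hours field
needsHours = count(raceTime,':') == 1;

padded = raceTime;                                   % leave the input untouched
padded(needsHours) = strcat('0:',padded(needsHours));

t = duration(padded,'Format','hh:mm:ss');            % every entry is now h:mm:ss
```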
1 Comment
dpb
on 17 Oct 2018
" can't you convert using one format, and then go back and convert the things that failed,"
That was my first approach, although I put it in a try...catch block. The logical addressing is good...if it could be folded into an anonymous function somehow; I'll have to mull that over.
dpb
on 17 Oct 2018
Edited: dpb
on 17 Oct 2018
OK, thanks to Peter for triggering the idea on how to add the missing hour substring dynamically! :)
>> et=cellfun(@(s) duration(sscanf([repmat('00:',sum(s==':')==1) s],'%d:').','Format','hh:mm:ss'),raceTime)
et =
1×4 duration array
00:28:44 00:54:08 01:02:34 01:58:33
>>
I am using R2017b, so duration is still limited to the three numeric inputs; it doesn't accept the time string form. I'm not sure which release added that enhancement, but if one has it then one can remove the call to sscanf and parse the augmented string directly--
cellfun(@(s) duration([repmat('00:',sum(s==':')==1) s],'Format','hh:mm:ss'),raceTime,'uni',0)
Seems like it wouldn't be too much of a stretch to let leading missing field(s) be implied zeros automagically...
5 Comments
dpb
on 17 Oct 2018
That's a good catch to pull duration out of the cellfun, cyclist.
Interesting, the relatively poor showing of sscanf; as is so often the case, what we think is a bottleneck may turn out not to be, or vice versa... By the rank of Peter's, I'd guess the try...catch loop would probably fare pretty well too, although I hadn't tried any timings; I was mostly just playing "golf" to see if I could get it down to a one-liner as entertainment! :)