Compare dates between a matrix and a given range and read the values
I have a text file in which the values (more than a thousand rows) are written like this:
Dates Time Values1 Values2 Values3
31/03/2021 15:01:34 56.45 89.85324 1000.98
31/03/2021 15:06:34 78.34 90.75836 1000.99
1/04/2021 9:01:34 60.29 72.89434 1001.50
2/03/2021 13:01:34 72.56 60.35986 1001.68
..... up to thousands of values
I want to check this file against a date range such as (31/03/2021 15:01:34 to 31/03/2021 15:06:34) or (1/04/2021 to 2/03/2021),
and if this range is present, read the corresponding values from the table.
Is there a short method to do this?
Please help me with this.
Accepted Answer
More Answers (1)
This is not a real answer, but my preference for very large files. I occasionally have to deal with csv files that are several million lines long. Reading all that into memory and trying to parse date strings is expensive. I find that it's much faster to split the file externally and then only deal with the chunk that's necessary.
You'll need to be familiar with how your file is formatted, but this is an example. My lines start with a date, so:
startdate='02/28/2021'; % include this date
stopdate='03/01/2021'; % exclude this date; set to '' to read up to last line
logfile='/path/to/my/giant/logfile.log';
tempfile='/dev/shm/tempms.log'; % don't need to touch the disk
delimiter=',';
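% prepare temporary file
% total number of lines in the log
[~,b]=system(['wc -l ' logfile ' | cut -d '' '' -f 1']);
totallc=str2double(b);
% line number of the first line containing startdate
[~,b]=system(['grep -n ' startdate ' ' logfile ' 2>/dev/null | head -n 1 | cut -d '':'' -f 1']);
startline=str2double(b)-1;
if isnan(startline)
    error('startdate %s not found in %s',startdate,logfile);
end
% copy everything from the first startdate line onward
system(['tail -n ' num2str(totallc-startline) ' ' logfile ' > ' tempfile]);
if ~isempty(stopdate)
    % line number of the first line containing stopdate
    [~,b]=system(['grep -n ' stopdate ' ' tempfile ' 2>/dev/null | head -n 1 | cut -d '':'' -f 1']);
    stopline=str2double(b)-1;
    if ~isnan(stopline)
        % keep only the lines before stopdate (sponge comes from moreutils)
        system(['head -n ' num2str(stopline) ' ' tempfile ' | sponge ' tempfile]);
    end
end
% now you can read the temp file instead of the whole log.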
Of course, this is in bash, but the idea is the same in other environments. I could have made this neater by writing a bash script to do the job and just calling it from within my m-file. You don't have to do everything in Matlab.
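For illustration, the standalone bash script mentioned above might look roughly like the sketch below. It is not the code from this answer: it swaps the grep/tail/head chain for a single awk pass, and the function name extract_range is made up for the example. It assumes, as above, that every line starts with its date string.

```shell
#!/usr/bin/env bash
# extract_range START STOP FILE  (hypothetical helper, not part of the original answer)
# Prints lines from the first one that starts with START up to, but not
# including, the first later line that starts with STOP.
# An empty STOP means "read up to the last line".
extract_range() {
    local start="$1" stop="$2" file="$3"
    awk -v start="$start" -v stop="$stop" '
        !on && index($0, start) == 1 { on = 1 }            # first line starting with START
        on && stop != "" && index($0, stop) == 1 { exit }  # first line starting with STOP
        on { print }
    ' "$file"
}
```

From MATLAB this could then be called with a single system() call, for example after sourcing the script: extract_range '02/28/2021' '03/01/2021' /path/to/my/giant/logfile.log > /dev/shm/tempms.log. Like the original, it only matches literal date strings and does not parse them.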
4 Comments
acun67 acu
on 31 Mar 2021
DGM
on 31 Mar 2021
Matching a time string might be more problematic, as this expects the test string to actually exist in the file. It doesn't parse anything and compare to see if the time entries are within a range. If the entry interval is small, you might be reasonably assured to find an hour or minute match, but that's otherwise a limitation.
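One possible way around that limitation, sketched here as an assumption rather than something tested on real logs, is to let awk parse each timestamp into a sortable key and compare it against the range bounds. The helper name filter_by_time is hypothetical. It assumes the layout from the question: whitespace-separated columns, date as d/m/yyyy in column 1, time as H:M:S in column 2. Both bounds must be given with a date and a time.

```shell
#!/usr/bin/env bash
# filter_by_time FROM TO FILE  (hypothetical helper, for illustration only)
# FROM and TO look like "31/03/2021 15:01:34"; both bounds are inclusive.
# Lines whose first two columns do not parse as date and time (such as the
# header line) are skipped.
filter_by_time() {
    local from="$1" to="$2" file="$3"
    awk -v from="$from" -v to="$to" '
        # Build a fixed-width yyyymmddHHMMSS key from "d/m/yyyy" and "H:M:S"
        function key(date, time,    d, t) {
            if (split(date, d, "/") != 3 || split(time, t, ":") != 3) return ""
            return sprintf("%04d%02d%02d%02d%02d%02d", d[3], d[2], d[1], t[1], t[2], t[3])
        }
        BEGIN {
            split(from, f, " "); lo = key(f[1], f[2])
            split(to, g, " ");   hi = key(g[1], g[2])
        }
        {
            k = key($1, $2)
            if (k != "" && k >= lo && k <= hi) print
        }
    ' "$file"
}
```

Because the keys are zero-padded, entries such as 1/04/2021 9:01:34 compare correctly even though the day and hour are not padded in the file. The bounds do not need to appear literally in the file.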
Walter Roberson
on 31 Mar 2021
wc implies unix / linux, and if you have that then split can often be useful.
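As a rough illustration of this suggestion (not code from the thread), split can cut a large log into fixed-size pieces so that only the relevant piece has to be read. The helper name split_log and the chunk_ prefix are made up for the example. Note that split cuts by line count, not by date, so finding the piece that holds a given date is still a separate step.

```shell
#!/usr/bin/env bash
# split_log LINES FILE OUTDIR  (hypothetical helper, for illustration only)
# Splits FILE into pieces of at most LINES lines each, written to OUTDIR
# as chunk_aa, chunk_ab, ... (the default alphabetic suffixes of split).
split_log() {
    local lines="$1" file="$2" outdir="$3"
    mkdir -p "$outdir"
    split -l "$lines" "$file" "$outdir/chunk_"
}
```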
DGM
on 31 Mar 2021
oof. Yeah, that would probably simplify it.