Problem using regexp to extract certain lines
I am trying to extract lines that begin with /keylog/midi from a file that looks like:
/keylog/midi 144 60 72 1.001300
/keylog/oscp 144 60 0.006736 1.030209
/keylog/oscp 144 60 0.000000 2.852801
/keylog/oscp 144 60 0.000000 2.869148
/keylog/midi 144 60 0 2.870843
And I need to separate the two lines from each other. The code I have right now is:
Fid=fopen('keyData.txt');
myLines=fgetl(Fid);
while ~feof(Fid)
myLines=fgetl(Fid);
on=regexp(myLines,'(/keylog/midi)(\s\d+\s\d+\s[^0]\s\d+(.)\d+)', 'match'); % on must equal the line whose fourth field is greater than 0
off=regexp(myLines, '(/keylog/midi)\s\d+\s\d+\s[0]+\s\d+(.)\d+', 'match'); % off must equal the line whose fourth field is 0
end
When I run the code, I get:
off='/keylog/midi 144 60 0 2.870843'
on={}
What is wrong with my regexp for on?
Accepted Answer
Simon
on 22 Nov 2013
Hi!
I would suggest another approach. It is (in my opinion) easier to debug, because you can track your steps easily.
% open and read file
fid = fopen(FileName);
FC = textscan(fid, '%s', 'delimiter', '\n', 'whitespace', '');
fclose(fid);
FC = FC{1};
% remove blanks at the start and end of each line
FC = strtrim(FC);
% find all lines with '/keylog/midi'
FCmidi = FC(strncmp('/keylog/midi', FC, 12));
% read each remaining line, skipping the string '/keylog/midi'
C = cellfun(@(x) sscanf(x, '%*s %d %d %d %f'), FCmidi, 'UniformOutput', false);
% format C: each column is one log entry
C = [C{:}];
% on/off flag is in row 3
onoff = C(3, :);
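Continuing from C above, one way to split the entries into "on" and "off" events is logical indexing on the flag row. This is only a sketch: a small hand-written C matrix stands in for the parsed file (values copied from the sample lines in the question), and the names onEntries and offEntries are illustrative, not part of the original answer.

```matlab
% Stand-in for the parsed data: each column is one /keylog/midi entry,
% with the four numeric fields of that line in rows 1 to 4.
C = [144     144;
     60      60;
     72      0;
     1.0013  2.870843];

onoff = C(3, :);                 % on/off flag is in row 3

onEntries  = C(:, onoff > 0);    % columns whose flag is greater than 0
offEntries = C(:, onoff == 0);   % columns whose flag is exactly 0
```

Each result keeps the full column, so the timestamp in row 4 stays attached to its flag.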
More Answers (2)
Walter Roberson
on 22 Nov 2013
[^0]\s should be [^0]\d*\s in order to eat the digits after the first non-zero one (e.g., [^0] will match the 7, and then the \d* will match the 2).
In the off expression, [0]+ will match one or more 0's. Will there ever be multiple 0's there, such as 00? If not, then it would make more sense to get rid of the + and change the [0] to just 0.
3 Comments
Walter Roberson
on 22 Nov 2013
Edited: Walter Roberson
on 22 Nov 2013
Note: use \. to indicate a literal period.
Could you show your modified regular expressions?
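For reference, the suggestions above might be combined as in the sketch below. It is not the asker's final code. The ^ and $ anchors and the use of [1-9] (a slightly stricter stand-in for [^0], which would also accept non-digit characters) are assumptions added here, and the sample lines are copied from the question. Note also that the loop in the question overwrites on and off on every pass and discards the first line read before the loop, so only the last line's result survives; matching all lines at once avoids that.

```matlab
% Sample lines copied from the question
sampleLines = {'/keylog/midi 144 60 72 1.001300', ...
               '/keylog/oscp 144 60 0.006736 1.030209', ...
               '/keylog/midi 144 60 0 2.870843'};

% on: third number starts with a non-zero digit, then any further digits
onExpr  = '^/keylog/midi\s\d+\s\d+\s[1-9]\d*\s\d+\.\d+$';
% off: third number is exactly 0; \. matches a literal period
offExpr = '^/keylog/midi\s\d+\s\d+\s0\s\d+\.\d+$';

% With a cell array input and 'once', regexp returns one result per line
on  = regexp(sampleLines, onExpr,  'match', 'once');
off = regexp(sampleLines, offExpr, 'match', 'once');

% Keep only the lines that actually matched
on  = on(~cellfun(@isempty, on));
off = off(~cellfun(@isempty, off));
```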
Yamoussa SANOGO
on 15 Oct 2019
Edited: Yamoussa SANOGO
on 15 Oct 2019
Hi there, I know this question has been around for a while, but I would like to add my suggestion in case somebody else has the same problem. My approach would be a simple lookbehind like this:
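% build the multi-line sample text (a quoted string cannot span several lines)
text = sprintf(['/keylog/midi 144 60 72 1.001300\n' ...
                '/keylog/oscp 144 60 0.006736 1.030209\n' ...
                '/keylog/oscp 144 60 0.000000 2.852801\n' ...
                '/keylog/oscp 144 60 0.000000 2.869148\n' ...
                '/keylog/midi 144 60 0 2.870843']);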
rule = '(?<=\/keylog\/midi)(\s*\d*\s*\d*\s*\d*\.?\d*\s*\d*\.?\d*)' ;
matched_data = regexp(text,rule, 'match');
Then concatenate the matched data into a single string:
matched_data = [matched_data{:}];
This approach can be generalized by making the prefix ('/oscp' or '/midi') a string variable and concatenating it with the rest of the matching rule.
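A minimal sketch of that generalization, assuming the prefix is held in a variable named prefix (a hypothetical name) and reusing the number pattern from the rule above. regexptranslate with the 'escape' option is used so that any special characters in the prefix are treated literally.

```matlab
% Shortened sample text, copied from the question
text = sprintf(['/keylog/midi 144 60 72 1.001300\n' ...
                '/keylog/oscp 144 60 0.006736 1.030209\n' ...
                '/keylog/midi 144 60 0 2.870843']);

prefix  = '/keylog/oscp';   % hypothetical variable; use '/keylog/midi' for the other kind
numbers = '(\s*\d*\s*\d*\s*\d*\.?\d*\s*\d*\.?\d*)';   % number pattern from the rule above

% Build the lookbehind rule from the escaped prefix and the number pattern
rule = ['(?<=' regexptranslate('escape', prefix) ')' numbers];

matched_data = regexp(text, rule, 'match');
```

Changing only prefix then selects the other line type without touching the rest of the rule.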