How can I filter out specific lines of text with textscan?

Hello, I have a large log file containing both words and numbers that I'm pulling data out of. The log file follows a similar format each time it's generated but contains some occasional lines that can appear randomly throughout and always contain the same string of text. I imported the file into MATLAB using textscan to get each line into a separate cell. To make searching and pulling out data easier, I want to remove the lines that contain the same words, i.e. "Real time = " and "(Orbit) Eclipse Factor at". Is there a way to filter out lines containing either of these strings when I'm initially running textscan, or will I need to create a loop that removes instances of each of these lines?
An example of what the file looks like:
(System) Subsystem:: compute_data_value: Voltage 10
Real time = #/#/#### ##:##:##.#####
(System) Subsystem:: compute_data_value: Voltage2 20
(System) Subsystem:: compute_data_value: NumCharging 8
(Orbit) Eclipse Factor at #/##/#### ##:##:##.###
(Orbit) Eclipse Factor at #/##/#### ##:##:##.###
(System) Subsystem:: compute_data_value: NumCharging2 4
Real time = #/#/#### ##:##:##.#####
Real time = #/#/#### ##:##:##.#####
Real time = #/#/#### ##:##:##.#####
(System) Subsystem:: compute_data_value: Current
(Orbit) Eclipse Factor at #/##/#### ##:##:##.### 4
(System) Subsystem:: compute_data_value: Current2 6
*Note that the "Real time = ..." and "(Orbit) Eclipse Factor at ..." lines don't appear at the same positions from one log file to the next. That messes up pulling data by fixed offsets: I can't rely on Current2 being 4 lines below Voltage, as it is in other log files, when the two strings I'm looking to remove turn up in other places.
Currently my setup looks like this:
filename = 'filename';
RT = fopen(filename);
termRemove1 = '(Orbit) Eclipse Factor at';
termRemove2 = 'Real time = ';
TxtSearch = textscan(RT,'%s','delimiter','\n');
TR1 = find(contains(TxtSearch{1,1}, termRemove1));
TR2 = find(contains(TxtSearch{1,1}, termRemove2));
(At this point I was thinking of running a loop to delete the lines listed in TR1 and TR2, which are the lines containing the strings to be removed. But there are over 9000 entries, so I wanted to see whether there is a way to remove the lines containing these strings when I textscan the file. There is also the issue of the array lines being replaced with [ ] rather than being removed, which leaves the total line count unchanged.)

Accepted Answer

dpb on 18 Jul 2022
Edited: dpb on 20 Jul 2022
Piece o' cake, but using more recent tools than textscan will simplify the coding somewhat...
filename = 'filename';
termsRemove = ["(Orbit) Eclipse Factor at"; "Real time = "];
text = readlines(filename);                  % one string per line of the file
text = text(~startsWith(text, termsRemove)); % keep lines that start with neither term
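If you would rather stay with the textscan cell array, the same logical-mask idea should work there too, with no loop. The snippet below is only a sketch: the cell array C is made-up sample data standing in for TxtSearch{1,1}, and it assumes the unwanted lines always begin with one of the two terms (swap in contains if they can appear mid-line).

```matlab
% Sample lines standing in for TxtSearch{1,1} (hypothetical data,
% a cell array of char vectors like textscan's '%s' output)
C = {'(System) Subsystem:: compute_data_value: Voltage 10'
     'Real time = #/#/#### ##:##:##.#####'
     '(System) Subsystem:: compute_data_value: Voltage2 20'
     '(Orbit) Eclipse Factor at #/##/#### ##:##:##.###'
     '(System) Subsystem:: compute_data_value: NumCharging 8'};

termsRemove = ["(Orbit) Eclipse Factor at"; "Real time = "];

% Logical mask: true for every line that begins with either term
mask = startsWith(C, termsRemove);

% Indexing with parentheses deletes the cells themselves, so the
% line count drops. Assigning with braces, C{k} = [], would only
% empty the cell contents and leave a [ ] entry behind.
C(mask) = [];

disp(numel(C))   % 3 lines remain
```

The parentheses-versus-braces difference is most likely where the [ ] entries came from: C{k} = [] replaces what is inside cell k, while C(k) = [] removes the cell.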

Release

R2020b
