How to extract data from a text file starting at a specific point in a string

Hello everyone,
I'm currently trying to extract data from a text file that is not formatted in columns; the data I need is in the middle of a string of text on each line (see file attached). Essentially I need to store the first number that appears in each line into a matrix, but when I use textscan or fscanf, the result is an empty cell array. I have also tried the extractBetween function, which has given me the closest result to what I am looking for, but it still extracts unwanted text. The code I tried is below.
fid=fileread('input.txt');
startPat = ": ";
endPat = "[";
str=extractBetween(fid,startPat,endPat);
Ideally, the final result after extracting the data would be to put the 9 values into a matrix (i.e., A = [1.2 20 150 20000 60 2.05 8 398 0.045]), or to have each value set as its own variable.
How would I go about doing this?

Accepted Answer

Walter Roberson on 11 Nov 2022
fid = readlines('input.txt');
A = rmmissing(double(regexp(fid,'[+-]?\d+(\.\d*)?', 'once', 'match')));
Note: this particular version of the code does not support exponent notation (for example, 1e-3), but it does support decimal fractions and negative and positive numbers. It also does not support a decimal point without a leading digit in front of it: for example, it would match 0.5 but not .5
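The pattern above requires a leading digit and has no exponent part. A minimal sketch of an extended pattern that also accepts values such as .5 and -1.5e3 is shown here. The sample lines and the variable names `sampleLines` and `pat` are hypothetical stand-ins, since the contents of input.txt are not reproduced in this thread; with the real file you would pass the output of readlines instead.

```matlab
% Hypothetical sample lines standing in for readlines('input.txt').
sampleLines = ["Gain: .5 [V/V]"; "Rate: -1.5e3 [Hz]"; "no number here"; "Mass: 20 [kg]"];

% Optional sign, then either digits with an optional fraction or a bare
% fraction such as .5, then an optional exponent such as e3 or E-3.
pat = '[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?';

% First match per line; lines without a number convert to NaN and are dropped.
A = rmmissing(double(regexp(sampleLines, pat, 'match', 'once')));
```

As with the original pattern, this takes the first number on each line, so a label that itself contains digits before the value would be picked up instead.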
  2 Comments
Image Analyst on 11 Nov 2022
Seems like it does support floating point numbers.
format long g
fid = readlines('input.txt');
A = rmmissing(double(regexp(fid,'[+-]?\d+(\.\d*)?', 'once', 'match')))
A = 9×1
    1.2
    20
    150
    20000
    60
    2.05
    8
    398
    0.045


More Answers (1)

Image Analyst on 11 Nov 2022
Edited: Image Analyst on 11 Nov 2022
Try this. It's one way:
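format long g
textLines = readlines('input.txt');
numbers = []; % Grows by one value for each line that has a match.
index = 1;
for k = 1 : numel(textLines)
    % Skip lines with no '[' because they have no value to extract.
    if ~contains(textLines{k}, '[')
        continue;
    end
    % Convert the text between ':' and '[' into a number.
    numbers(index) = str2double(extractBetween(textLines{k}, ':', '['));
    index = index + 1;
end
numbers' % Show in command window.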
ans = 9×1
    1.2
    20
    150
    20000
    60
    2.05
    8
    398
    0.045
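The same extractBetween idea can probably be written without a loop, because contains, extractBetween, and str2double all accept string arrays. The sketch below assumes every kept line has exactly one ':' ... '[' pair. The sample lines and the names `hasValue`, `firstValue`, and `secondValue` are hypothetical, since the contents of input.txt are not shown in this thread.

```matlab
% Hypothetical sample lines standing in for readlines('input.txt').
textLines = ["Width: 1.2 [m]"; "Comment line"; "Count: 20 [-]"];

% Keep only the lines that contain both delimiters.
hasValue = contains(textLines, ':') & contains(textLines, '[');

% Extract the text between ':' and '[' on each kept line and convert it.
% str2double ignores the surrounding spaces.
numbers = str2double(extractBetween(textLines(hasValue), ':', '['));

% If separate variables are preferred, unpack the vector through a cell array.
vals = num2cell(numbers);
[firstValue, secondValue] = vals{:};
```

Keeping the values in one vector is usually easier to work with than separate variables, but the last two lines show one way to get the latter.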

Release

R2022b
