Text file data segregation

MANISH R on 20 Sep 2022
Commented: Chunru on 28 Sep 2022
I want to segregate a text file that contains data. The segregation should be based on the asterisks (*) in the file: the data should be read into one variable, and when a line starting with asterisks appears, reading should continue into another variable.
E.g.
**DATA1
.........................................
.........................................
.........................................
.........................................
**DATA2
........................................
..........................................
.........................................
..........................................
**DATA3
.............................................
...........................................
............................................
...................................
......................................
The contents below **DATA1 should be stored in DATA1 and be visible in the workspace; similarly, the contents under **DATA2 should be visible in DATA2. The code should automatically detect the double asterisk (**) and start reading the data that follows into the next variable.
  2 Comments
Rik on 20 Sep 2022
You should not use numbered variables. Use a cell array instead.
If you read your file as text it is easy to loop through the lines. What have you tried so far?
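The line-by-line loop suggested here could look roughly like the sketch below. It is an illustration only: the sample file, its contents, and the names fname, names and blocks are assumptions, and it assumes that every section header starts with ** and that each data line holds a single number.

```matlab
% Hypothetical sample file, written here only so the sketch runs on its own.
fname = fullfile(tempdir, 'segregation_demo.txt');
fid = fopen(fname, 'wt');
fprintf(fid, '**DATA1\n10\n20\n30\n**DATA2\n40\n50\n**DATA3\n60\n');
fclose(fid);

lines  = readlines(fname);      % string array, one element per line (R2020b+)
names  = strings(0, 1);         % section names, in file order
blocks = {};                    % blocks{k} holds the numbers of section k

for k = 1:numel(lines)
    ln = strtrim(lines(k));
    if startsWith(ln, "**")
        % Header line: start a new, empty section.
        names(end+1, 1)  = extractAfter(ln, "**");
        blocks{end+1, 1} = zeros(0, 1);
    elseif strlength(ln) > 0 && ~isempty(blocks)
        % Data line: append its value to the current section.
        % Assumes one number per line, as in the sample file.
        blocks{end}(end+1, 1) = str2double(ln);
    end
end
```

With this layout, blocks{2} holds the values found under the second header and names(2) holds that header's name, however many lines each section has.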
MANISH R on 20 Sep 2022
I have tried segregating it by line count, using the textscan command with header lines. That works, but the number of lines is not the same in all files, so I need a generic solution.


Answers (2)

Chunru on 20 Sep 2022
type test.txt
**DATA1 10 20 30 .**DATA2 40 50 **DATA3 60
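result = {};
fid = fopen("test.txt", "rt");
while ~feof(fid)
    i = fscanf(fid, "**DATA%d");       % section number taken from the header
    if isempty(i), break; end          % stop if no further header is found
    result{i} = fscanf(fid, "%f")';    % numbers up to the next header, as a row vector
end
fclose(fid);
result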
result = 1×3 cell array
{[10 20 30]} {[40 50]} {[60]}
  2 Comments
MANISH R on 28 Sep 2022
@Chunru Thank you, but this is a little different from what I need. The result I'm looking for is this: if I click DATA1 in the Workspace, I should get
10
20
30
if I click DATA2, then
40
50
then for DATA3
60
all separately.
Anyway, thanks for the reply.
Chunru on 28 Sep 2022
You are strongly encouraged to use a cell array instead of separate variable names such as DATA1, DATA2, and so on.
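One possible middle ground, not proposed in the thread itself, is a struct with dynamic field names: the data stays in a single variable, yet each section can be opened by name from the Workspace. The sketch below is an assumption-laden illustration. It reuses the values shown in the result output of this answer, and names is a hypothetical list of section names.

```matlab
% Values as shown in the 'result' output of this answer; 'names' is hypothetical.
result = {[10 20 30], [40 50], 60};
names  = {'DATA1', 'DATA2', 'DATA3'};

S = struct();
for k = 1:numel(result)
    field = matlab.lang.makeValidName(names{k});  % guard against invalid field names
    S.(field) = result{k}(:);                     % (:) forces a column vector
end

S.DATA1      % displays the DATA1 values as a column
```

Double-clicking S in the Workspace and then the DATA1 field shows the values one per row, without creating separately numbered variables.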



Walter Roberson on 20 Sep 2022
Use fileread() to read the file as a character vector. Use
parts = regexp(TheVector, '\*\*', 'split');
parts(cellfun(@isempty, parts)) = [];
Now parts is a cell array of character vectors. The ** has been removed, so the sections would start immediately with the DATA characters. Process each section as is appropriate for your purposes.
For example it might make sense to use textscan() of the text with 'headerlines', 1, and with format '' (empty string). When you use textscan() with empty string as the format, and the data is numeric columns, then textscan will automatically figure out how many columns are present.
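A minimal sketch of this split-then-textscan approach follows. The sample file and the names fname, names and data are assumptions made for illustration, and the sections are assumed to hold purely numeric rows. Note that 'HeaderLines', 1 only skips the single name line at the top of each section, so it does not depend on how many data lines follow.

```matlab
% Hypothetical sample file, written here only so the sketch runs on its own.
fname = fullfile(tempdir, 'segregation_demo.txt');
fid = fopen(fname, 'wt');
fprintf(fid, '**DATA1\n10\n20\n30\n**DATA2\n40\n50\n**DATA3\n60\n');
fclose(fid);

txt   = fileread(fname);                       % whole file as one char vector
parts = regexp(txt, '\*\*', 'split');          % cut at every '**'
parts(cellfun(@isempty, parts)) = [];          % drop the empty piece before the first '**'

n     = numel(parts);
names = cell(n, 1);
data  = cell(n, 1);
for k = 1:n
    % The first line of each part is the section name (e.g. 'DATA1').
    names{k} = strtrim(strtok(parts{k}, newline));
    % Skip that one name line, then let textscan detect the numeric columns.
    c = textscan(parts{k}, '', 'HeaderLines', 1);
    data{k} = [c{:}];                          % join the columns into one matrix
end
```

Here data{k} holds the numbers of section k and names{k} the matching section name.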
That said, if you fopen() the file and you ask to textscan() the fid with '' format, then it should stop reading at each point it encounters the ** since that is not something that can be interpreted as numeric. You would then fgetl() and save the variable name, and then you would run another textscan() which would pick up from after that line.
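The file-based variant could be sketched as below. It relies on the behaviour described in this answer, namely that textscan stops at the ** line and leaves it for the next fgetl call. The sample file and the names fname, names and data are assumptions for illustration only.

```matlab
% Hypothetical sample file, written here only so the sketch runs on its own.
fname = fullfile(tempdir, 'segregation_demo.txt');
fid = fopen(fname, 'wt');
fprintf(fid, '**DATA1\n10\n20\n30\n**DATA2\n40\n50\n**DATA3\n60\n');
fclose(fid);

fid   = fopen(fname, 'rt');
names = {};
data  = {};
while ~feof(fid)
    hdr = fgetl(fid);                  % header line, e.g. '**DATA1'
    if ~ischar(hdr), break; end        % fgetl returns -1 at end of file
    c = textscan(fid, '');             % reads numbers until the next non-numeric line
    names{end+1, 1} = regexprep(strtrim(hdr), '^\*+', '');  % strip the leading asterisks
    data{end+1, 1}  = [c{:}];          % join the columns into one matrix
end
fclose(fid);
```

Each pass through the loop consumes one header line and the numeric block under it, so the number of lines per section does not need to be known in advance.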
  1 Comment
MANISH R on 28 Sep 2022
@Walter Roberson Thank you. Splitting the data on ** works fine, but I'm not able to store the pieces as required. With textscan and headerlines, it becomes dependent on the number of lines, which is not ideal because each file has a different number of lines. It would be helpful if you could elaborate a little more, as I'm new to MATLAB coding. Thanks.

Release

R2022a
