reading and processing text file

Mohammad on 11 Oct 2014
Commented: Mohammad on 13 Oct 2014
Hi,
This is probably a simple question for the experts, but I am swamped by the many options. I have a text file (a sample is attached here). The file starts with a header whose first line is labelled "Structure"; this is followed by a few more text lines, then a three-column header, and then the data. After the data, another "Structure" line starts and the same pattern repeats for a couple more structures. I want to read this file and write it to an Excel file in which the first structure's header and data come first and each following structure is placed beside the previous one, in parallel columns. I have shown the layout in the attached Excel file. Since there is a lot of data and several structures, it would be great if I could do this with code. I would greatly appreciate any help. I have already tried textscan but could not get it working.
  1 Comment
Mohammad on 11 Oct 2014
My apologies. I did not upload the correct file the first time. The updated attachment is the right txt file. Also, could anybody help me extract only the numerical data from the file, excluding all the text?

Accepted Answer

Guillaume on 11 Oct 2014
Edited: Guillaume on 11 Oct 2014
While it's certainly possible to read the file in MATLAB and write it to Excel the way you want, if that's the only thing you need to do, I don't see the point.
Just open the text file in Excel and the importer will do exactly what you want.
Now, if you do insist on doing it in MATLAB:
excel = actxserver('Excel.Application');
wb = excel.Workbooks.OpenText('test_rafiq.txt'); % Just call Excel's text importer with default values
wb.SaveAs('test_rafiq.xlsx');
wb.Close;
excel.Quit;
delete(excel);
  4 Comments
Guillaume on 11 Oct 2014
Right, it's a lot more complicated. I would read the file line by line with fgetl, parse each line using regexp, and save the results into two cell arrays (one for the structures, one for the data) that expand as appropriate. You then just merge the cell arrays and write the whole lot to Excel.
Try this:
fid = fopen('test_rafiq.txt', 'rt');
instruct = false;
indata = false;
structs = {};
scount = -2; %start at -2 so first iteration puts it at 1
srow = 0;
data = {};
dcount = -3; %start at -3 so first iteration puts it at 1
drow = 0;
line = fgetl(fid);
while ischar(line)
    if isempty(line)
        instruct = false;
        indata = false;
    elseif instruct
        structs(srow, scount:scount+1) = regexp(line, '(.*:)\s+(.*)', 'tokens', 'once');
        srow = srow+1;
    elseif indata
        data(drow, dcount:dcount+2) = regexp(line, '\S+', 'match');
        drow = drow+1;
    elseif strfind(line, 'Structure:')
        scount = scount + 3;
        structs(1, scount:scount+1) = regexp(line, '(.*:)\s+(.*)', 'tokens', 'once');
        srow = 2;
        instruct = true;
    else
        %assume it's a 'Dose ...' line
        dcount = dcount + 4;
        data(1, dcount:dcount+2) = regexp(line, '\S.*?\[.*?]', 'match');
        drow = 2;
        indata = true;
    end
    line = fgetl(fid);
end
fclose(fid);
%make both cell arrays the same number of columns
if size(structs, 2) < size(data, 2)
    structs{1, size(data, 2)} = [];
elseif size(data, 2) < size(structs, 2)
    data{1, size(structs, 2)} = [];
end
xlswrite(fullfile(pwd, 'test_rafiq.xlsx'), [structs; data]);
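On the earlier request to extract only the numerical data, here is a minimal sketch. It assumes every data row consists of exactly three whitespace-separated numbers and that no header line does. The sample `lines` below is made up for illustration, since the attached file is not reproduced here; with the real file you would collect the lines with fgetl as in the loop above. The names `blocks` and `current` are hypothetical.

```matlab
% Hypothetical sample lines standing in for the real file contents.
lines = { ...
    'Structure: BODY', ...
    'Volume [cm3]: 1500.0', ...
    '', ...
    'Dose [cGy]   Relative dose [%]   Structure Volume [cm3]', ...
    '0      100     1500.0', ...
    '50     98.5    1477.5', ...
    '', ...
    'Structure: PTV', ...
    'Volume [cm3]: 80.0', ...
    '', ...
    'Dose [cGy]   Relative dose [%]   Structure Volume [cm3]', ...
    '0      100     80.0', ...
    '50     100     80.0', ...
    '100    97      77.6'};

blocks = {};    % one N-by-3 numeric matrix per structure
current = [];   % rows of the block currently being collected
for k = 1:numel(lines)
    tokens = regexp(lines{k}, '\S+', 'match');  % split on whitespace
    values = str2double(tokens);                % NaN for non-numeric tokens
    if numel(tokens) == 3 && ~any(isnan(values))
        current(end+1, :) = values;             % numeric data row
    elseif ~isempty(current)
        blocks{end+1} = current;                % a text or blank line closes the block
        current = [];
    end
end
if ~isempty(current)
    blocks{end+1} = current;                    % close the last block at end of input
end
```

Each cell of `blocks` then holds the purely numeric table of one structure, so `blocks{1}` can be passed to xlswrite or plotted directly. Rows are identified by their content rather than by position, so the text lines between tables do not need to be counted.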
Mohammad on 13 Oct 2014
Thank you so much. It solves the part I need. I appreciate your help.
Rafiq
