
textscan output into one cell

amen45 on 17 Dec 2015
Commented: amen45 on 18 Dec 2015
I'm reading in a text file that contains many sets of strings, separated first by line and then by tab. I need to use textscan or something similar because the txt files I'll be using vary. Currently, I'm using code like the example below to split each line's strings into their own cell array, and then haphazardly group them into one cell array (my desired output). I'm wondering whether there is a more efficient way to go about this, and whether it's possible to do it without the second loop or even the second textscan call. Thank you in advance for your help!
for i = 1:length(data)
    check = char(data{i});
    row = textscan(check, '%s', 'Delimiter', ' ');
    for j = 1:length(row{1})
        a{i,j} = row{1}{j}
    end
end
Example text format:
disorganizedstring
disorganized string string
stringtype1 stringtype2 stringtype3 stringtype4
stringtype1 stringtype2 stringtype3 stringtype4
stringtype1 stringtype2 stringtype3 stringtype4
stringtype1 stringtype2 stringtype3 stringtype4
stringtype1 stringtype2 stringtype3 stringtype4
Current output:
'stringtype1' 'stringtype2' 'stringtype3' 'stringtype4'
'stringtype1' 'stringtype2' 'stringtype3' 'stringtype4'
'stringtype1' 'stringtype2' 'stringtype3' 'stringtype4'
'stringtype1' 'stringtype2' 'stringtype3' 'stringtype4'
'stringtype1' 'stringtype2' 'stringtype3' 'stringtype4'

Accepted Answer

Walter Roberson on 18 Dec 2015
filecontent = fileread('YourFile.txt');
content_by_line = regexp(filecontent, '\r?\n', 'split');        % one cell per line
content_by_field = regexp(content_by_line(:), '\t', 'split');   % one cell of fields per line
max_fields = max(cellfun(@length, content_by_field));           % widest line
cellpad = repmat({{}}, 1, max_fields);                          % padding of empty cells
first_n_fields = @(C, n) C(1:n);
padded_content = cellfun(@(C) first_n_fields([C, cellpad], max_fields), content_by_field, 'UniformOutput', false);
desired_output = vertcat(padded_content{:});                    % stack rows into a 2D cell array
This reads the file, splits it into lines, and then splits each line into fields on tabs. It then finds the line with the most fields and builds padding of that length. Each line has the padding appended and is then cut back to the first max_fields entries, so every line ends up the same length without any test of how many fields it originally had. Once you have a cell array of cell arrays that are all the same length, a single vertcat call converts it to a 2D cell array.
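To make the padding step concrete, here is a minimal sketch of the same approach run on a small in-memory string instead of a file. The sample text built with sprintf is a made-up stand-in for fileread('YourFile.txt'); everything else follows the steps described above.

```matlab
% Hypothetical sample text: three lines with 1, 2 and 3 tab-separated fields.
filecontent = sprintf('a\nb\tc\nd\te\tf');

% Split into lines, then split each line into fields on tabs.
content_by_line  = regexp(filecontent, '\r?\n', 'split');
content_by_field = regexp(content_by_line(:), '\t', 'split');

% Widest line decides how many columns the result needs.
max_fields = max(cellfun(@length, content_by_field));

% Append padding to every line, then keep only the first max_fields entries.
cellpad = repmat({{}}, 1, max_fields);
first_n_fields = @(C, n) C(1:n);
padded_content = cellfun(@(C) first_n_fields([C, cellpad], max_fields), ...
    content_by_field, 'UniformOutput', false);

% Stack the equal-length rows into one 2D cell array.
desired_output = vertcat(padded_content{:});
```

With this input, desired_output is a 3-by-3 cell array whose unused positions hold empty cells ({}). If empty strings would suit later processing better, one possible variation (not part of the answer above) is to build the padding with repmat({''}, 1, max_fields) instead.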
  1 Comment
amen45 on 18 Dec 2015
Thank you so much! this was very helpful!
