Why won't textscan read all rows?

Question

Aidan Goy on 4 Dec 2020


Commented: dpb on 4 Dec 2020

car_data.txt

fileID = fopen('car_data.txt');
data = textscan(fileID,'%d %d %d %d %d %f %d %d %s', 'headerLines', 1);
fclose(fileID);

Why does this only read the first row of my text file and store it as a 1x9? I want it to read all lines and store them as 406x9. Have I missed some arguments that make it continue reading the following lines?

When I view the 1x9 array it creates, here is the result:


Answer 1

dpb on 4 Dec 2020


MPG	Cylinders	displacement	horsepower	weight	acceleration	model year	origin	car name
8	307	130	3504	12	70	1	chevrolet chevelle malibu
8	350	165	3693	11.5	70	1	buick skylark 320
8	318	150	3436	11	70	1	plymouth satellite
8	304	150	3433	12	70	1	amc rebel sst
8	302	140	3449	10.5	70	1	ford torino
...

'Cuz it's tab-delimited and the string data with embedded blanks aren't quote-delimited.

You can use

fmt=[repmat('%d',1,8) '%s'];
data=textscan(fileID,fmt,'delimiter','\t','headerLines', 1);

and joy should ensue, but...

I'd strongly suggest using readtable and the new(ish) table object instead.
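To illustrate the delimiter point above, here is a minimal sketch on two made-up rows (a cut-down, hypothetical version of the file layout, passed to textscan as text rather than through a file ID). It only shows how the car-name field is treated with and without the tab delimiter; the three-column format is an assumption for brevity, not the format needed for the real file.

```matlab
% Two made-up, tab-delimited rows: two integer fields and a name with blanks.
txt = sprintf('8\t307\tchevrolet chevelle malibu\n8\t350\tbuick skylark 320\n');

% Default delimiter (white space): the name is split at every blank, so the
% second pass tries to read the word 'chevelle' with %d and textscan stops early.
c1 = textscan(txt, '%d %d %s');

% Tab as the only delimiter: the whole name stays in one %s field and
% every row is read.
c2 = textscan(txt, '%d %d %s', 'Delimiter', '\t');

disp(c2{3})   % one cell per row, each holding the complete car name
```

Comparing numel(c1{3}) with numel(c2{3}) shows how many rows each call managed to read.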


Answer 2

Mathieu NOE on 4 Dec 2020


Hello,

Your last column has more than one word (2 or 3, so variable size),

so with the given arguments textscan can't read more than the first line, and then only the first word of the name.

Why not use readtable?

T = readtable('car_data.txt'); 
% gives : 
% T =
% 
%   406×9 table
% 
%     MPG     Cylinders    displacement    horsepower    weight    acceleration    modelYear    origin                    carName                 
%     ____    _________    ____________    __________    ______    ____________    _________    ______    ________________________________________
% 
%       18        8             307           130         3504           12           70          1       {'chevrolet chevelle malibu'           }
%       15        8             350           165         3693         11.5           70          1       {'buick skylark 320'                   }
%       18        8             318           150         3436           11           70          1       {'plymouth satellite'                  }
%       16        8             304           150         3433           12           70          1       {'amc rebel sst'                       }
%  and so on...
      
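If you prefer to spell the options out rather than rely on automatic detection, something like the following sketch may help. It writes a tiny made-up sample (three columns only, with hypothetical values, not the real car_data.txt) to a temporary file and reads it back with an explicit tab delimiter, an explicit format, and 'NA' treated as missing. The format string '%f%f%s' is an assumption that matches this small sample only.

```matlab
% Write a small tab-delimited sample to a temporary file (made-up rows).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'MPG\tCylinders\tcar name\n');
fprintf(fid, '18\t8\tchevrolet chevelle malibu\n');
fprintf(fid, 'NA\t8\tford torino\n');
fclose(fid);

% Explicit delimiter and format; 'NA' in a numeric column becomes NaN.
T = readtable(fname, 'Delimiter', '\t', 'Format', '%f%f%s', ...
              'TreatAsMissing', 'NA');
delete(fname);

mpg   = T.MPG;      % numeric column, NaN where the file had NA
names = T{:, 3};    % cell array of complete car names
disp(T)
```

The columns can then be pulled out by name (T.MPG) or by position (T{:,3}) without any further string handling.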


Answer 3

Star Strider on 4 Dec 2020


The first column causes problems because it includes ‘NA’ in a field that uses '%d'.

The fix for that is to read it as a string, use strrep to replace 'NA' with 'NaN', then use str2double to convert it to a double array:

fileID = fopen('car_data.txt');
data = textscan(fileID,'%s %d %d %d %d %f %d %d %s', 'HeaderLines',1, 'Delimiter','\t');
fclose(fileID);
data{1} = str2double(strrep(data{1}, 'NA','NaN'));

There are likely other ways to deal with it, such as the one I suggested in my previous Answer to "How do I read in this text file using fopen fclose and fscanf and then split each column into variables?".
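One such alternative, sketched here under the assumption that the affected columns can be read as doubles, is to let textscan do the replacement itself with its 'TreatAsEmpty' option, which applies to numeric fields. The sample file below is made up (three columns, hypothetical values) purely to keep the example self-contained.

```matlab
% Write a small tab-delimited sample with an NA entry (made-up rows).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'MPG\tacceleration\tcar name\n');
fprintf(fid, '18\t12\tchevrolet chevelle malibu\n');
fprintf(fid, 'NA\t11.5\tford torino\n');
fclose(fid);

% Read numeric fields with %f so that NA can be stored as NaN.
fid = fopen(fname);
c = textscan(fid, '%f %f %s', 'Delimiter', '\t', 'HeaderLines', 1, ...
             'TreatAsEmpty', 'NA');
fclose(fid);
delete(fname);

mpg = c{1};   % double column; the NA row holds NaN
```

Note that this relies on %f: an integer field (%d) cannot hold NaN, so columns that may contain NA are safer read as doubles.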

4 Comments

Star Strider on 4 Dec 2020

Joseph Wilson —

I fail to understand the reason for that. The code reads the file correctly, and would (if you allow it to) create a full matrix.

I would still go with readtable, as I suggested previously.

dpb on 4 Dec 2020

You're correct, sorry. The '\t' delimiter fixes the parsing of the blank-containing strings; I was still thinking of the default delimiter.

Like you, I also suggested using readtable as much simpler.


Answer 4

Joseph Wilson on 4 Dec 2020


fileID = fopen('car_data.txt');
% %q reads each of the first 8 fields as text; %*[^\n] skips the rest of the row, whose word count varies
fmt = [repmat('%q',1,8) '%s %*[^\n]'];
data = textscan(fileID,fmt,'headerLines',1);
fclose(fileID);
data_double = zeros(numel(data{1}), numel(data)-1); % preallocate the numeric matrix
for count = 1:numel(data)-1
    data_double(:,count) = str2double(strrep(data{count}, 'NA','NaN')); % this yields a 406x8 matrix
end
data_string = data{9}; % this yields a 406x1 cell array with all the make data
