How can I ignore characters while reading a numeric txt file?
I have multiple multicolumn txt files that contain some non-numeric entries (the characters "NA"). How can I skip them while reading? I already skip the first row using the dlmread command. This is the code I am using:
%Specifying file directory
IGNfolder = '/home/krishna/Dokumente/Master_Ali/16/IDT/test';
% To get all the files in that directory and with desired file name pattern.
Allfiles = dir(fullfile(IGNfolder, '*.txt'));
allData = [];
for k = 1:length(Allfiles)
initialFileName = Allfiles(k).name;
fullFileName = fullfile(IGNfolder, initialFileName);
fprintf('%2d: %s\n', k, initialFileName); % format string first, then the values to print
READ=dlmread(fullFileName,'',1,0);
allData(end+1:end+size(READ,1), :) = READ;
end
2 Comments
Stephen23
on 3 Sep 2018
dlmread will not handle this. Use textscan, importdata, or readtable, with the appropriate options.
Please upload a sample file by clicking the paperclip button.
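As a rough illustration of the readtable route mentioned above, here is a minimal sketch. The sample file contents, the comma delimiter, and the single header row are assumptions made only for this example, since no sample file is shown at this point in the thread; adjust the options to match the real files.

```matlab
% Write a small hypothetical sample file: one header row, comma-delimited,
% with 'NA' in place of some numbers (assumed layout, not the asker's file).
fname = [tempname, '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'a,b,c\n1,2,3\n4,NA,6\n7,8,NA\n');
fclose(fid);

% 'TreatAsMissing' tells readtable to treat 'NA' as missing data in numeric columns.
T = readtable(fname, 'Delimiter', ',', 'TreatAsMissing', 'NA');

% Convert to a plain numeric matrix; the missing entries become NaN.
M = table2array(T);

delete(fname); % clean up the temporary file
```

The header row is used for the table's variable names here, so no separate header-skipping option is needed in this sketch.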
Answers (2)
Stephen23
on 3 Sep 2018
Edited: Stephen23
on 3 Sep 2018
Using importdata (untested):
S = dir(fullfile(IGNfolder,'*.txt'));
C = cell(1,numel(S));
for k = 1:numel(S)
F = fullfile(IGNfolder,S(k).name)
T = importdata(F,',',1);
C{k} = T.data;
end
M = vertcat(C{:});
Using textscan:
opt = {'HeaderLines',1,'CollectOutput',true,'Delimiter',',','TreatAsEmpty',{'NA','N/A'}};
X = repmat('%f',1,26);
S = dir(fullfile(IGNfolder,'*.txt'));
C = cell(1,numel(S));
for k = 1:numel(S)
F = fullfile(IGNfolder,S(k).name)
[fid,msg] = fopen(F,'rt');
assert(fid>=3,msg)
C(k) = textscan(fid,X,opt{:});
fclose(fid);
end
M = vertcat(C{:});
Output:
>> M
M =
Columns 1 through 12:
0.51000 10.44027 1.53846 10.81340 1.53846 10.83553 1.53846 10.89902 1.53846 10.85667 1.53846 10.85213
1.47000 10.21903 1.42857 9.80351 1.42857 9.91497 1.42857 9.96833 1.42857 10.20028 1.42857 10.08012
1.46000 10.09988 1.33333 10.24402 1.33333 10.30637 1.33333 10.33871 1.33333 10.77707 1.33333 10.65089
1.41000 9.85219 1.25000 11.00083 1.25000 10.94897 1.25000 11.00017 1.25000 11.34980 1.25000 11.28627
1.36000 9.79367 1.17647 11.43716 1.17647 11.42240 1.17647 11.44660 1.17647 11.77617 1.17647 11.75801
1.33000 9.89546 1.11111 10.94117 1.11111 10.94683 1.11111 10.93697 1.11111 11.33102 1.11111 11.32472
1.28000 9.99880 1.05263 10.12867 1.05263 10.15378 1.05263 10.13399 1.05263 10.53235 1.05263 10.53300
1.25000 10.25309 1.00000 9.28602 1.00000 9.33928 1.00000 9.31170 1.00000 9.68811 1.00000 9.69369
1.20000 10.55945 0.95238 8.45368 0.95238 8.54671 0.95238 8.51130 0.95238 8.84071 0.95238 8.84865
1.18000 10.67868 0.90909 7.62452 0.90909 7.76574 0.90909 7.72400 0.90909 7.99120 0.90909 7.99571
1.16000 10.55945 0.86957 6.79553 0.86957 6.98521 0.86957 6.93953 0.86957 7.14597 0.86957 7.14055
1.14000 10.52541 0.83333 5.97972 0.83333 6.21388 0.83333 6.16201 0.83333 6.32453 0.83333 6.30598
1.10000 10.23996 0.80000 5.20458 0.80000 5.47753 0.80000 5.41209 0.80000 5.55368 0.80000 5.52393
1.09000 10.20359 0.76923 4.50164 0.76923 4.80133 0.76923 4.71530 0.76923 4.85276 0.76923 4.81838
0.89000 7.54433 0.74074 3.89087 0.74074 4.19544 0.74074 4.08659 0.74074 4.22140 0.74074 4.18695
0.85000 7.18539 0.71429 3.36941 0.71429 3.65163 0.71429 3.52320 0.71429 3.64749 0.71429 3.61175
0.80000 6.32794 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.79000 6.04025 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.75000 5.29832 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.74000 4.78749 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.50000 9.53242 1.54000 10.55340 1.54000 10.55349 1.54000 10.60159 1.54000 10.75720 1.54000 10.77853
1.48000 9.24956 1.43000 8.88561 1.43000 8.89266 1.43000 8.95441 1.43000 8.87256 1.43000 8.88452
1.44000 8.72583 1.33000 7.68780 1.33000 7.73128 1.33000 7.80267 1.33000 7.76883 1.33000 7.71848
1.43000 8.34522 1.25000 7.38802 1.25000 7.45453 1.25000 7.51660 1.25000 7.71713 1.25000 7.60990
1.43000 8.48673 1.18000 7.82531 1.18000 7.82510 1.18000 7.88502 1.18000 8.12907 1.18000 8.05167
1.36000 8.01301 1.11000 8.36919 1.11000 8.35548 1.11000 8.42055 1.11000 8.58844 1.11000 8.55749
1.34000 8.05833 1.05000 8.33397 1.05000 8.34118 1.05000 8.34404 1.05000 8.64799 1.05000 8.64084
1.33000 8.10772 1.00000 7.75942 1.00000 7.78272 1.00000 7.76964 1.00000 8.10170 1.00000 8.10535
1.28000 7.96555 0.95200 7.06867 0.95200 7.11203 0.95200 7.09231 0.95200 7.41760 0.95200 7.42678
1.20000 8.10772 0.90900 6.36924 0.90900 6.43896 0.90900 6.41492 0.90900 6.71244 0.90900 6.72284
1.18000 8.15479 0.87000 5.68250 0.87000 5.78452 0.87000 5.75712 0.87000 6.01190 0.87000 6.01947
1.12000 8.82026 0.83300 5.00680 0.83300 5.14335 0.83300 5.11278 0.83300 5.32080 0.83300 5.32209
1.10000 8.77338 0.80000 4.33923 0.80000 4.50824 0.80000 4.47272 0.80000 4.64301 0.80000 4.63641
1.07000 8.21879 0.76900 3.68612 0.76900 3.88324 0.76900 3.83849 0.76900 3.98955 0.76900 3.97577
1.06000 8.48673 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1.05000 8.58298 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1.05000 8.16622 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.97700 7.29980 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.93400 6.91771 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.90600 6.70319 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.88000 6.03787 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.87900 6.30810 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.85600 5.72685 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.84300 5.58350 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
0.78000 4.82028 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... etc
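To see in isolation what the 'TreatAsEmpty' option does in the textscan call above, here is a small sketch that parses a character vector instead of a file. The three-column sample text is made up for illustration and has no header line, so 'HeaderLines' is left out.

```matlab
% Hypothetical sample data: three comma-separated columns, with 'NA' and 'N/A'
% standing in for missing numbers.
txt = sprintf('1,2,3\n4,NA,6\n7,N/A,9\n');

% Same style of options as in the answer, minus 'HeaderLines'.
opt = {'CollectOutput',true,'Delimiter',',','TreatAsEmpty',{'NA','N/A'}};

% textscan also accepts a character vector as its first input.
C = textscan(txt, repmat('%f',1,3), opt{:});

% With 'CollectOutput' the three numeric columns are merged into one matrix.
% Fields treated as empty are filled with NaN (the default 'EmptyValue').
M = C{1};
```

Because the 'NA' fields end up as NaN, every file gives a rectangular numeric matrix, which is what allows the vertcat at the end of the loop to work.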
4 Comments
Stephen23
on 4 Sep 2018
Edited: Stephen23
on 4 Sep 2018
"However the problem still persist. I just want empty cell not "NaN/NA/0""
There is no such thing in MATLAB as an empty numeric element: all numeric elements must have a value (which could be Inf or NaN). While you could use a cell array (which does allow empty elements), I would strongly recommend against using a cell array as it makes processing numeric data pointlessly complex.
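A minimal sketch of why keeping NaN in a numeric matrix is usually easier to work with than empty cells. The small matrix below is invented for illustration.

```matlab
% Hypothetical numeric data with one missing value stored as NaN.
M = [1 2; 3 NaN; 5 6];

% Keep only the rows that contain no NaN.
rowsOK = ~any(isnan(M), 2);
Mclean = M(rowsOK, :);

% Or keep everything and ignore the NaN values in the calculation itself.
colMean = mean(M, 1, 'omitnan');

% The cell-array alternative does allow empty elements, but then ordinary
% numeric operations such as mean no longer work on it directly.
Ccell = num2cell(M);
Ccell(isnan(M)) = {[]};
```

Here isnan and the 'omitnan' flag handle the missing values in one line each, whereas the cell array would need to be converted back before any arithmetic.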
"I changed a bit and it gave what i wanted."
Your code change does not make any sense, and makes the loop totally useless:
- if you are not interested in collecting the data together then why do you still use C to store the data from all imported files?
- calling M = C{:} will allocate the content of the first cell to M, and discard all the rest. So you will pointlessly read all of the file data, just to throw it away (except for the first file):
>> C = {1,2,3};
>> M = C{:}
M = 1
In my answer I showed you how to collect all of the file data into one cell array C, and then concatenate that into one numeric matrix M: that gives you all of the file data in two convenient forms to use. If this is not what you want then please clearly explain what you require.
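The difference between the two assignments can be sketched with a small cell array of matrices. The values are made up, standing in for the data imported from three files.

```matlab
% Hypothetical per-file data: each cell holds the matrix read from one file.
C = {[1 2; 3 4], [5 6], [7 8; 9 10]};

% Assigning the comma-separated list to a single output keeps only the first cell.
M1 = C{:};

% vertcat stacks the contents of all cells into one numeric matrix.
M = vertcat(C{:});
```

M1 holds only the first file's 2x2 matrix, while M has all five rows, and C still gives access to each file's data separately.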