readtable with variable names exceeding number of columns

I have been using readtable to read data files through release R2014b, and after updating to R2015b I am now getting errors. The number of columns differs from file to file, and in the header row the final column name is followed by the delimiter (a comma in this case). The columns I care about always have the same names, which makes the VariableNames property convenient. However, I now get the following error when reading the files with readtable:
Cannot interpret data in the file '\\path\whatever.txt'. Found 45 variable names but 44 data columns. You may need to specify a different format string, delimiter, or number of header lines.
The columns are titled as follows in the raw data file:
"Column 1, Column 2, ... Column N ,"
I believe the problem is the last comma, which shouldn't be there because it implies that another variable follows. The trailing comma is not present in the data rows. I believe readtable ignored this comma through R2014b and no longer ignores it in R2015b. How do I ignore this last blank variable using readtable? Is it possible to create the table using other functions as a workaround? I am thinking of iteratively calling fgetl, or using textscan or something similar. There are too many data files to simply delete the trailing comma by hand, given that the number of columns changes and the row containing the headers also changes from file to file.

Accepted Answer

Kelly Kearney on 5 Nov 2015
I would recommend reading and parsing the variable names separately. For example:
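fid = fopen(file, 'r');
str = fgetl(fid);   % header row (assumed to be the first line of the file)
fclose(fid);
vars = regexp(str, ',', 'split');
if isempty(vars{end})
    vars = vars(1:end-1);   % drop the empty name produced by the trailing comma
end
% Skip the header row and read only the data
Tbl = readtable(file, 'Delimiter', ',', 'HeaderLines', 1, 'ReadVariableNames', false);
Tbl.Properties.VariableNames = vars;   % names must be valid MATLAB identifiers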
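The question mentions that the header row is not always on the same line. One possible way to extend the approach above is to scan for the header row by a column name that is known to appear in every file. This is only a sketch: the function name readTableTrailingComma and the knownName argument are hypothetical, and it assumes the data rows start immediately after the header row.

```matlab
function Tbl = readTableTrailingComma(file, knownName)
% Hypothetical helper: find the header row by a known column name,
% drop the empty name left by a trailing comma, and read the data below it.
fid = fopen(file, 'r');
if fid == -1
    error('Could not open %s', file);
end
nHeader = 0;    % number of lines up to and including the header row
vars = {};
str = fgetl(fid);
while ischar(str)                 % fgetl returns -1 at end of file
    nHeader = nHeader + 1;
    if ~isempty(strfind(str, knownName))
        vars = strtrim(regexp(str, ',', 'split'));
        break
    end
    str = fgetl(fid);
end
fclose(fid);
if isempty(vars)
    error('No header row containing "%s" found in %s', knownName, file);
end
if isempty(vars{end})
    vars = vars(1:end-1);         % discard the blank name after the trailing comma
end
% Skip everything through the header row and read only the data
Tbl = readtable(file, 'Delimiter', ',', 'HeaderLines', nHeader, ...
    'ReadVariableNames', false);
% Convert the raw headers to valid identifiers before assigning them
Tbl.Properties.VariableNames = matlab.lang.makeValidName(vars);
end
```

It could be called as Tbl = readTableTrailingComma(file, 'Column 1'), where 'Column 1' stands in for whichever column name is present in all of the files.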
  2 Comments
Thomas on 5 Nov 2015
This works well, but creates a new problem: how do I coerce the variable names to be valid variable names? Names like 'Frequency (GHz)' don't map well.
Thomas on 5 Nov 2015
It looks like the following works:
validVars = matlab.lang.makeValidName(vars)
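One caveat is that two different headers can map to the same valid name, and table variable names must be unique. A minimal sketch of one way to guard against that, using matlab.lang.makeUniqueStrings and keeping the original headers in the table's VariableDescriptions property (the header strings and the placeholder table here are made up for illustration):

```matlab
% Illustrative header names; the first two are likely to collide once cleaned
vars = {'Frequency (GHz)', 'Frequency [GHz]', 'Power (dBm)'};

validVars = matlab.lang.makeValidName(vars);            % make each name a valid identifier
validVars = matlab.lang.makeUniqueStrings(validVars);   % disambiguate any duplicates

% Placeholder table standing in for the one returned by readtable
Tbl = array2table(zeros(2, numel(vars)));
Tbl.Properties.VariableNames = validVars;
Tbl.Properties.VariableDescriptions = vars;             % keep the original headers for reference
```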
