All tables being vertically concatenated must have the same number of variables.

81 views (last 30 days)
hi,
I am using the code below to extract the specific column from 79 csv files but I am getting the error saying all variables must be the same number of variables when I check the csv files, they all are of same length but I am not sure why I am getting this error. I am attaching one of the csv file and code along with error.
Code:
clearvars
close all
clc
%% mention the name of the parameter to extract
param='DisplayLength_m_s_'; %Change default prameter name with required parameter
%% select folder
dataFolder = uigetdir();
filePattern = fullfile(dataFolder, '*.csv');
list = dir(filePattern);
% Read each CSV file, extract rows and store them to 'Output'
Output = table();
for kk = 1:numel(list)
data = readtable(fullfile(list(kk).folder, list(kk).name));
VarNames =data.Properties.VariableNames;
ColIdx = find(strcmp(VarNames, param));
Output = [Output; data(:,ColIdx)];
end
Error:
Error using rowMean_practise (line 20)
All tables being vertically concatenated must have the same number of variables.
  4 Comments
Peter Perkins
Peter Perkins on 17 Jun 2022
Stephen23 has answered this below, but to be clear: you are reading files that apparently contain different numbers of columns, which means that your tables have different number of varaibles, which means you can't directly vertically concatenate them.
You need to either weed the extra varfiables out of your tables so that they all have the same variables, or (more likely) you need to grab just one variable from each table. That's what Stephen23 shows.
Stephen23
Stephen23 on 17 Jun 2022
"That's what Stephen23 shows."
Not quite: even if all of the files contain the same number of columns, on the first loop iteration there is a mismatch.

Sign in to comment.

Accepted Answer

Stephen23
Stephen23 on 17 Jun 2022
It is not clear to me why you want/need a table anyway. Why not simply concatenate that variable directly?:
prm = 'DisplayLength_m_s_';
out = []; % could be cell array, etc.
for kk = 1:numel(list)
fnm = ullfile(list(kk).folder, list(kk).name);
tbl = readtable(fnm);
out =[out;tbl.(prm)];
end

More Answers (2)

Jan
Jan on 17 Jun 2022
Edited: Jan on 17 Jun 2022
Output = table();
for kk = 1:numel(list)
data = readtable(fullfile(list(kk).folder, list(kk).name));
Output = outerjoin(Output, data);
end
  1 Comment
Peter Perkins
Peter Perkins on 17 Jun 2022
Edited: Peter Perkins on 17 Jun 2022
To clarify the advanced move that I think Jan was suggesting:
A join operation can be thought of as kind of like a horzcat, using keys to match up rows. But an outer join is special: it can preserve rows that don't match up. In that sense it is both kind of like horzcat and kind of like vertcat all at the same time. Pictures help:
> t1 = table([1;2;3],rand(3,1),rand(3,1),VariableNames=["key","X" "Y"])
t1 =
3×3 table
key X Y
___ _______ _______
1 0.71836 0.32515
2 0.96865 0.10563
3 0.53133 0.61096
>> t2 = table([3;4;5],rand(3,1),rand(3,1),VariableNames=["key","Y" "Z"])
t2 =
3×3 table
key Y Z
___ ________ _______
3 0.7788 0.26647
4 0.42345 0.15366
5 0.090823 0.28101
>> outerjoin(t1,t2,Key="key",MergeKeys=true)
ans =
5×5 table
key X Y_t1 Y_t2 Z
___ _______ _______ ________ _______
1 0.71836 0.32515 NaN NaN
2 0.96865 0.10563 NaN NaN
3 0.53133 0.61096 0.7788 0.26647
4 NaN NaN 0.42345 0.15366
5 NaN NaN 0.090823 0.28101
In an extreme case, no rows might match, and there might be no data vars in common, so outerjoin would put the left table in its own block in the upper left, and the right table alone in the bottom right.
So: Jan's suggestion is to use outerjoin to get what is in effect vertcat when the tables don't all have exactly the same vars. The fly in the ointment is that outerjoin won't merge the common data vars ((Y_t1 and Y_t2 in the above). But if you stop thinking of joining tables that have key variables, and just let outerjoin use all the common data vars as keys:
>> t1 = table(rand(3,1),rand(3,1),VariableNames=["X" "Y"])
t1 =
3×2 table
X Y
_______ _______
0.44009 0.87537
0.52714 0.51805
0.45742 0.94362
>> t2 = table(rand(3,1),rand(3,1),VariableNames=["Y" "Z"])
t2 =
3×2 table
Y Z
_______ _______
0.63771 0.67612
0.95769 0.28906
0.24071 0.67181
>> outerjoin(t1,t2,MergeKeys=true)
ans =
6×3 table
X Y Z
_______ _______ _______
NaN 0.24071 0.67181
0.52714 0.51805 NaN
NaN 0.63771 0.67612
0.44009 0.87537 NaN
0.45742 0.94362 NaN
NaN 0.95769 0.28906
you get the desired result, although in a weird row order (sorted by the joined Y). You'd probably want to add a temp var to control that:
>> t1 = table([1;2;3],rand(3,1),rand(3,1),VariableNames=["Ord" "X" "Y"])
t1 =
3×3 table
Ord X Y
___ _______ _______
1 0.24285 0.7655
2 0.91742 0.18866
3 0.26906 0.2875
>> t2 = table([4;5;6],rand(3,1),rand(3,1),VariableNames=["Ord" "Y" "Z"])
t2 =
3×3 table
Ord Y Z
___ ________ _______
4 0.091113 0.54659
5 0.57621 0.42573
6 0.68336 0.64444
>> outerjoin(t1,t2,MergeKeys=true)
ans =
6×4 table
Ord X Y Z
___ _______ ________ _______
1 0.24285 0.7655 NaN
2 0.91742 0.18866 NaN
3 0.26906 0.2875 NaN
4 NaN 0.091113 0.54659
5 NaN 0.57621 0.42573
6 NaN 0.68336 0.64444
This is advanced data munging.

Sign in to comment.


Karim
Karim on 17 Jun 2022

Categories

Find more on Tables in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!