Unable to Join Tables and Timetables

24 views (last 30 days)
I am trying to join two tables and timetables, but I keep getting the following error:
Error using tabular/join (line 144)
The key variable for the right table must have unique values.
Error in JoinTables (line 21)
tabC = join(tabA,tabB); %Trying to join tables
This happens also for the time table line. I thought this was because it wasn't able to identify the proper key variable to index by and so I tried using:
tabC = join(tabA,tabB,'Time',tabA(1,'Time'));
To force it to use the time column, but apparently it's an "invalid parameter name" and so can't recognize it?
Error using tabular/join (line 114)
Invalid parameter name: Time.
So, I tried creating table A, without a time column to get rid of the uniqueness issue, but then it just says there is a common variable to use as a key variable:
Cannot find a common table variable to use as a key variable.
Therefore, it seems like this is a closed loop issue that keeps circling back on itself. This is very frustrating. What am I doing wrong here?
Code:
Input = '2023-01-01 20:14:58';
TimeIn = datetime(Input,'InputFormat','yyyy-MM-dd HH:mm:ss');
datA = [1; 2; 3;];
timA = [TimeIn; TimeIn; TimeIn];
datB = [4; 5; 6];
timB = [TimeIn; TimeIn; TimeIn];
tabA = table(timA,datA);
tabB = table(timB,datB);
tabA = renamevars(tabA,'timA','Time');
tabA = renamevars(tabA,'datA','Data_A');
tabB = renamevars(tabB,'timB','Time');
tabB = renamevars(tabB,'datB','Data_B');
ttabA = table2timetable(tabA);
ttabB = table2timetable(tabB);
%tabC = join(tabA,tabB,'TimA',tabA(1,'TimA')); %Trying to join tables
tabC = join(tabA,tabB); %Trying to join tables
ttabC = join(ttabA,ttabB); %Trying to join time tables
ttabC = synchronize(ttabA,tabB); %Trying to sync time tables
  5 Comments
Stephen23
Stephen23 on 8 Mar 2023
Edited: Stephen23 on 8 Mar 2023
Note that your desired output table has repeated timestamps, which as the cyclist points out means that you do not have unique key values, which means that you are not doing a JOIN operation.
"I then tried explcitly specifying what the key variable names were when calling the join function"
The JOIN documentation currently lists these optional parameter names: "Keys", "LeftKeys", "RightKeys", "LeftVariables", "RightVariables", and "KeepOneCopy". I do not see "Time" in that list. If you want to specify the keys, then use the "Keys" parameter name together with the desired values (i.e. key names), just as the documentation explains and as the documentation examples show.
"When you say they must share variable names, does this mean that all the dimension names must align?"
It is unclear what you mean by "dimension names", I cannot find this term in the MATLAB table documentation.
"I was trying to essentially concatenate the two tables side by side using their "Time" dimension as the key variable since they share the same time indexing."
Horizontal concatenation is easy with your toy examples:
Time = datetime(2023,[1;1;1],1,20,14,58);
datA = [1;2;3];
datB = [4;5;6];
ttA = timetable(Time,datA)
ttA = 3×1 timetable
Time datA ____________________ ____ 01-Jan-2023 20:14:58 1 01-Jan-2023 20:14:58 2 01-Jan-2023 20:14:58 3
ttB = timetable(Time,datB)
ttB = 3×1 timetable
Time datB ____________________ ____ 01-Jan-2023 20:14:58 4 01-Jan-2023 20:14:58 5 01-Jan-2023 20:14:58 6
ttZ = [ttA,ttB]
ttZ = 3×2 timetable
Time datA datB ____________________ ____ ____ 01-Jan-2023 20:14:58 1 4 01-Jan-2023 20:14:58 2 5 01-Jan-2023 20:14:58 3 6
If the tables have different numbers of rows then clearly you will need some kind of JOIN operation, but you would need to consider the fact that your timestamps are duplicated (and ergo are NOT unique, which is a prerequisite for the LHS table in a JOIN operation). This is not a MATLAB problem, but a conceptual problem that you need to understand and figure out first: your statement "they share the same time indexing" is something that does not help: given lots of repeated timestamps, how is MATLAB supposed to uniquely indentify the rows to match if they cannot be uniquely identified? Perhaps you would need e.g. to specify an extra column/variable, e.g. a incremental integer which resets for each timestamp.
Solve that first, then the JOIN/INNERJOIN/whatever is easy.
Lets start with a simple question: do you want to match the rows of the two tables by:
  • location (i.e. row 1 matches with row1, row 2 matches with row 2, etc)
  • data value (i.e. a unique set of key values in the LH table which correspond to key values in the RH table)
  • Or a combination of these, e.g. a timestamps may be duplicated, but for any one timestamp you wish to keep the same relative row order?
"It definitely would be easier and more efficient to upload the actual data to work with, but unfortunately I cannot share that "
Replace your top-secret data with some random numbers and then upload it. Toy examples are rarely representative.
John
John on 14 Mar 2023
"The JOIN documentation currently lists these optional parameter names: "Keys", "LeftKeys", "RightKeys", "LeftVariables", "RightVariables", and "KeepOneCopy". I do not see "Time" in that list. If you want to specify the keys, then use the "Keys" parameter name together with the desired values (i.e. key names), just as the documentation explains and as the documentation examples show."
Yes, I tried doing exactly that, but realized that this doesn't work because the variable names do not match up with each defined timetable.
"It is unclear what you mean by "dimension names", I cannot find this term in the MATLAB table documentation."
This was a mistake on my part. I was interchanging terminology but was referring to the same thing.
"Replace your top-secret data with some random numbers and then upload it. Toy examples are rarely representative."
It is not my choice that I cannot share the data. I did create another toy example, yet this time the data structures are exactly the same as what I am working with so it is directly applicable. Although it is a continuation of this question, I created a new question since I already accepted a solution for this question and it is somewhat of a slightly different problem I am having.

Sign in to comment.

Accepted Answer

Sulaymon Eshkabilov
Sulaymon Eshkabilov on 6 Mar 2023
Edited: Sulaymon Eshkabilov on 7 Mar 2023
Note that both tables should have identical variable names to concatenate them, see - your example 1 (corrected):
Input = '2023-01-01 20:14:58';
TimeIn = datetime(Input,'InputFormat','yyyy-MM-dd HH:mm:ss');
datA = [1; 2; 3;];
timA = [TimeIn; TimeIn; TimeIn];
datB = [4; 5; 6];
timB = [TimeIn; TimeIn; TimeIn];
tabA = table(timA,datA);
tabB = table(timB,datB);
tabA = renamevars(tabA,'timA','Time');
tabA = renamevars(tabA,'datA','Data_A');
tabB = renamevars(tabB,'timB','Time');
tabB = renamevars(tabB,'datB','Data_A');
TTAB=[tabA;tabB]
TTAB = 6×2 table
Time Data_A ____________________ ______ 01-Jan-2023 20:14:58 1 01-Jan-2023 20:14:58 2 01-Jan-2023 20:14:58 3 01-Jan-2023 20:14:58 4 01-Jan-2023 20:14:58 5 01-Jan-2023 20:14:58 6
  2 Comments
John
John on 8 Mar 2023
Thank you @Sulaymon Eshkabilov, this answers the part of how to join tables at least. I changed my main code to do this and it worked.

Sign in to comment.

More Answers (0)

Categories

Find more on Creating and Concatenating Matrices in Help Center and File Exchange

Products


Release

R2020b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!