join tables by categorical variable

I want to join two tables with indices of type categorical. However, I get an error.
I can convert both indices to string, then join, and then convert back to categorical again, but that seems ludicrous.
a = table(categorical(["a", "b", "c"]'), [1:3]')
b = table(categorical(["a", "b", "d"]'), [4:6]')
join(a,b,'Keys','Var1')
Gives "Left and right key variables 'Var1' and 'Var1' have incompatible types."

 Accepted Answer

cl on 24 Jul 2020
It turns out that my MATLAB installation was corrupted, which caused the misleading error message I received. After re-installing, the error message changed, and using innerjoin instead of join, as Akira suggested, the code executed without errors. Thanks.
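For readers wondering why switching from join to innerjoin matters here, the following is a minimal sketch, assuming a working installation. join expects every key value in the left table to have a match in the right table, while innerjoin keeps only the keys present in both. The names hasMatch and missingKeys are illustrative and not from the thread.

```matlab
% Tables from the question
a = table(categorical(["a", "b", "c"]'), (1:3)');
b = table(categorical(["a", "b", "d"]'), (4:6)');

% join needs each key in the left table to appear in the right table.
% Here 'c' has no partner in b, so join(a,b,'Keys','Var1') cannot succeed.
hasMatch = ismember(a.Var1, b.Var1);   % one logical entry per row of a
missingKeys = a.Var1(~hasMatch);       % keys of a that are absent from b

% innerjoin drops the unmatched rows instead of raising an error.
c1 = innerjoin(a, b, 'Keys', 'Var1');
```

If missingKeys is non-empty, innerjoin or outerjoin is probably the better fit than join.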

More Answers (1)

Please try the innerjoin or outerjoin functions, for example:
c1 = innerjoin(a,b,'Keys','Var1');
c2 = outerjoin(a,b,'Keys','Var1','MergeKeys',true);
The outputs are as follows:
>> c1
c1 =
  2×3 table
    Var1    Var2_a    Var2_b
    ____    ______    ______
    a       1         4
    b       2         5
>> c2
c2 =
  4×3 table
    Var1    Var2_a    Var2_b
    ____    ______    ______
    a       1         4
    b       2         5
    c       3         NaN
    d       NaN       6

8 Comments

This still doesn't work for me; I get the same error message saying that the keys have incompatible types. I'm on R2019b.
That seems strange. I've confirmed this works as expected on R2019b. To investigate, could you share the whole error message?
cl on 20 Jul 2020
Edited: cl on 20 Jul 2020
Thank you for looking into this. The whole message in the command window is:
Error using tabular/innerjoin (line 101)
Left and right key variables 'Var1' and 'Var1' have incompatible types.
Error in test (line 4)
c1 = innerjoin(a,b,'Keys','Var1');
OK. This error indicates that a.Var1 and b.Var1 are different types of array.
So I would recommend confirming the type of a.Var1 and b.Var1 (based on your sample code, both should be 3-by-1 categorical arrays).
In my example in the OP, they are both created as categoricals. I confirmed this with
class(a.Var1)
class(b.Var1)
It only says they are categorical, without giving size.
The whole example reads as follows
a = table(categorical(["a", "b", "c"]'), [1:3]')
b = table(categorical(["a", "b", "d"]'), [4:6]')
class(a.Var1)
class(b.Var1)
c1 = innerjoin(a,b,'Keys','Var1');
I get the same error message.
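Since class reports only the type, a few extra checks can confirm the size as well. This is a sketch using the tables from the example above; the variable names classA, sizeA, sizeB and bothCategorical are illustrative.

```matlab
a = table(categorical(["a", "b", "c"]'), (1:3)');
b = table(categorical(["a", "b", "d"]'), (4:6)');

% class reports only the type; size reports the dimensions.
classA = class(a.Var1);   % 'categorical'
sizeA  = size(a.Var1);    % [3 1]
sizeB  = size(b.Var1);    % [3 1]

% Type check that returns a logical value rather than text.
bothCategorical = iscategorical(a.Var1) && iscategorical(b.Var1);

% summary prints a per-variable overview of the table.
summary(a)
```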
Thank you for the clarification. I've confirmed that the above code works at least in my R2019b environment (more specifically, R2019b Update 3 on Windows 10).
Thus, it might be related to your specific running environment (OS, MATLAB and/or Update version, etc). So, at this stage, I would recommend contacting our Technical Support team.
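When reporting a problem like this, it may help to collect the environment details first. The commands below are a sketch of one way to do that; checking for shadowed functions with which is an extra step suggested here, not something the thread itself tried.

```matlab
% Version string of the running MATLAB, including the release name.
v = version;

% Version information for MATLAB itself.
ver('MATLAB')

% Platform identifier (for example PCWIN64 on 64-bit Windows).
platform = computer;

% List every innerjoin and categorical on the path. An unexpected entry
% outside the MATLAB installation folder could indicate shadowing.
which innerjoin -all
which categorical -all
```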
The problem, I believe, is that the categorical array used for Var1 in a does not have the same categories as the categorical array used for Var1 in b.
categories(a.Var1)
categories(b.Var1)
If you had two categorical arrays, one listing pieces of a house (door, window, chimney) and one listing pieces of a car (door, window, motor) those are both categorical arrays but they're different because each has a category the other doesn't (chimney, motor.)
As long as the categoricals are not ordinal, the different categories should not matter. If they are ordinal, the categories must be the same, with the same order.
But in any case, the OP says, "Turns out that my MATLAB installation was corrupted, causing the misleading error message I received."
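To check the points about categories and ordinality on the example data, something like the following could be used. It is a sketch, and aligning the category sets with setcats is optional for non-ordinal keys; the names catsA, catsB, eitherOrdinal and allCats are illustrative.

```matlab
a = table(categorical(["a", "b", "c"]'), (1:3)');
b = table(categorical(["a", "b", "d"]'), (4:6)');

catsA = categories(a.Var1);   % {'a'; 'b'; 'c'}
catsB = categories(b.Var1);   % {'a'; 'b'; 'd'}

% Ordinal categoricals must share the same ordered category set.
eitherOrdinal = isordinal(a.Var1) || isordinal(b.Var1);   % false here

% Optional: give both key columns the same category set before joining.
allCats = union(catsA, catsB);
a.Var1 = setcats(a.Var1, allCats);
b.Var1 = setcats(b.Var1, allCats);

c1 = innerjoin(a, b, 'Keys', 'Var1');
```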

Release: R2019b

Asked: cl on 13 Jul 2020

Commented: on 24 Jul 2020
