How do I do an outer join of tables with conditional ranges on Key variables?

4 views (last 30 days)
I have two tables and would like to perform an outer join operation on them with a conditional tolerance on matching the values in the Key Variable Columns. Let's say I have the following two tables:
Range = [1.1 1.8 6.3 5.2 11.1 0.9]';
Codes = categorical(["a", "b", "c", "d", "e", "f"])';
Tleft = table(Range, Codes)
RefRange = [1 6 11]';
RefCodes = categorical(["a1", "c1", "e1"])';
Tright = table(RefRange,RefCodes)
I would like to do an outerjoin to connect the variables of Tright to the variables of Tleft when abs(RefRange-Range) <= 0.5.
My desired output would look something like this:
RefRange = [1 NaN 6 NaN 11 1]';
RefCodes = categorical(["a1", "undefined", "c1", "undefined","e1","a1"])';
T_desired = table(Range,Codes,RefRange,RefCodes)
  2 Comments
Stephen23
Stephen23 on 22 Sep 2024
Edited: Stephen23 on 22 Sep 2024
It would be nice if OUTERJOIN had an option for a match function (default=EQ).
What should happen if there are two or more matches?
Richard Lavery
Richard Lavery on 22 Sep 2024
I agree. I was hoping there as a solution that did not involve iterating through a loop.

Sign in to comment.

Accepted Answer

Animesh
Animesh on 22 Sep 2024
To perform custom outer join, you will need to iterate over each row in "Tleft" and find matching rows in "Tright" based on the tolerance condition "abs(RefRange - Range) <= 0.5". Here is the sample code snippet for the same:
% Initialize variables for the desired output
matchedRefRange = NaN(height(Tleft), 1);
matchedRefCodes = categorical(repmat("undefined", height(Tleft), 1));
for i = 1:height(Tleft)
matchIdx = find(abs(Tright.RefRange - Tleft.Range(i)) <= 0.5, 1);
if ~isempty(matchIdx)
matchedRefRange(i) = Tright.RefRange(matchIdx);
matchedRefCodes(i) = Tright.RefCodes(matchIdx);
end
end
T_desired = table(Tleft.Range, Tleft.Codes, matchedRefRange, matchedRefCodes, ...
'VariableNames', {'Range', 'Codes', 'RefRange', 'RefCodes'});

More Answers (1)

Venkat Siddarth Reddy
Venkat Siddarth Reddy on 22 Sep 2024
Hi Richard,
To achieve this, you can perform a condition-based join operation through a iterative process over "Tleft" and "Tright" variables.
Please refer to the following code snippet to achieve the outer join for the provided tables:
% Provided Data
Range = [1.1 1.8 6.3 5.2 11.1 0.9]';
Codes = categorical(["a", "b", "c", "d", "e", "f"])';
Tleft = table(Range, Codes);
RefRange = [1 6 11]';
RefCodes = categorical(["a1", "c1", "e1"])';
Tright = table(RefRange, RefCodes);
RefRange_out = NaN(size(Range));
RefCodes_out = categorical(repmat("undefined", size(Range)));
tolerance = 0.5;
for i = 1:height(Tleft)
for j = 1:height(Tright)
if abs(Tleft.Range(i) - Tright.RefRange(j)) <= tolerance
RefRange_out(i) = Tright.RefRange(j);
RefCodes_out(i) = Tright.RefCodes(j);
break; % Assuming only the first match is needed
end
end
end
T_desired = table(Range, Codes, RefRange_out, RefCodes_out)
T_desired = 6x4 table
Range Codes RefRange_out RefCodes_out _____ _____ ____________ ____________ 1.1 a 1 a1 1.8 b NaN undefined 6.3 c 6 c1 5.2 d NaN undefined 11.1 e 11 e1 0.9 f 1 a1
I hope it helps!
Regards
Venkat Siddarth V

Products


Release

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!