How do I do an outer join of tables with conditional ranges on Key variables?
4 views (last 30 days)
Show older comments
Richard Lavery
on 22 Sep 2024
Commented: Richard Lavery
on 22 Sep 2024
I have two tables and would like to perform an outer join operation on them with a conditional tolerance on matching the values in the Key Variable Columns. Let's say I have the following two tables:
Range = [1.1 1.8 6.3 5.2 11.1 0.9]';
Codes = categorical(["a", "b", "c", "d", "e", "f"])';
Tleft = table(Range, Codes)
RefRange = [1 6 11]';
RefCodes = categorical(["a1", "c1", "e1"])';
Tright = table(RefRange,RefCodes)
I would like to do an outerjoin to connect the variables of Tright to the variables of Tleft when abs(RefRange-Range) <= 0.5.
My desired output would look something like this:
RefRange = [1 NaN 6 NaN 11 1]';
RefCodes = categorical(["a1", "undefined", "c1", "undefined","e1","a1"])';
T_desired = table(Range,Codes,RefRange,RefCodes)
2 Comments
Accepted Answer
Animesh
on 22 Sep 2024
To perform custom outer join, you will need to iterate over each row in "Tleft" and find matching rows in "Tright" based on the tolerance condition "abs(RefRange - Range) <= 0.5". Here is the sample code snippet for the same:
% Initialize variables for the desired output
matchedRefRange = NaN(height(Tleft), 1);
matchedRefCodes = categorical(repmat("undefined", height(Tleft), 1));
for i = 1:height(Tleft)
matchIdx = find(abs(Tright.RefRange - Tleft.Range(i)) <= 0.5, 1);
if ~isempty(matchIdx)
matchedRefRange(i) = Tright.RefRange(matchIdx);
matchedRefCodes(i) = Tright.RefCodes(matchIdx);
end
end
T_desired = table(Tleft.Range, Tleft.Codes, matchedRefRange, matchedRefCodes, ...
'VariableNames', {'Range', 'Codes', 'RefRange', 'RefCodes'});
More Answers (1)
Venkat Siddarth Reddy
on 22 Sep 2024
Hi Richard,
To achieve this, you can perform a condition-based join operation through a iterative process over "Tleft" and "Tright" variables.
Please refer to the following code snippet to achieve the outer join for the provided tables:
% Provided Data
Range = [1.1 1.8 6.3 5.2 11.1 0.9]';
Codes = categorical(["a", "b", "c", "d", "e", "f"])';
Tleft = table(Range, Codes);
RefRange = [1 6 11]';
RefCodes = categorical(["a1", "c1", "e1"])';
Tright = table(RefRange, RefCodes);
RefRange_out = NaN(size(Range));
RefCodes_out = categorical(repmat("undefined", size(Range)));
tolerance = 0.5;
for i = 1:height(Tleft)
for j = 1:height(Tright)
if abs(Tleft.Range(i) - Tright.RefRange(j)) <= tolerance
RefRange_out(i) = Tright.RefRange(j);
RefCodes_out(i) = Tright.RefCodes(j);
break; % Assuming only the first match is needed
end
end
end
T_desired = table(Range, Codes, RefRange_out, RefCodes_out)
I hope it helps!
Regards
Venkat Siddarth V
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!