Find closest values from two different matrices

91 views (last 30 days)
Hello there,
I deal with spatial datasets. I have two mat files, each of them contain a variable which vary in space. Let's go in to the detail. The first mat file, namely data_rough.mat; it contains parameter rough which is distributed in Lon and Lat space. The second mat file, namely data_dat.mat, it contains parameter ep which is distributed in x and y.
My intention is, I want to get the values of rough where the location (Lon, Lat) is close to the locations of values ep (x, y). So, the output I want is x, y, ep, and rough and if possible, the Lon and Lat of the closest locations to the location of of ep).
PS.
The provided datasets (data_dat.mat) I attached might be close each other and may result the same values of rough (and their positions). It's fine.
  4 Comments
Torsten
Torsten on 12 Nov 2024 at 11:19
Edited: Torsten on 12 Nov 2024 at 11:19
x and y are also given as latitude and longitude or do they somehow have to transformed before they can be compared with Lat and Lon ?
Stephen23
Stephen23 on 12 Nov 2024 at 12:36
"No. It's nothing to do with interpolation at all. It's more like finding indices close to certain values (here, the closest the locations)."
The NEAREST option is supported by INTERP1, INTERP2, INTERP3, etc.

Sign in to comment.

Accepted Answer

Madheswaran
Madheswaran on 12 Nov 2024 at 12:29
Edited: Madheswaran on 12 Nov 2024 at 12:29
Hi,
If XY data lies in different coordinate system (say Cartesian coordinate system), then, finding the "closest point" between these different systems would not be straightforward because distance measurements have different meanings in each system. Direct comparison would be mathematically incorrect and could lead to unreliable results.
A recommended approach would be to convert the XY coordinates to Latitude/Longitude format. The transformation of XY coordinates to geographic coordinates might involve rotation, scaling and possibly flipping of the coordinates (if your problem allows it). This transformation process is coupled with optimization objectives as per your problem needs.
Some examples of the optimization objectives are listed below:
  • Minimize the total sum of distances between matched points (minimum aggregate distance)
  • Minimize the maximum distance between any matched pair (minimax objective)
This would become computationaly expensive and wouldn't be trivial to solve.
However if the XY data lies in the same coordinate system as the Latitude and Longitude, then this problem can be solved with a simple brute-force solution. I am assuming the X and Y are longitudes and latitudes respectively. Also, for calculating distance between two points, I have used Euclidean distance which might be good approximation for shorter distances, but to get the exact distance use haversine distance.
The below code implements a nearest neighbor search where for each point in the XY dataset, it finds the closest point in the Latitude-Longitude grid by computing distances to all grid points and selecting the minimum
load('rough_dat.mat'); % Contains Lat, Lon, rough
load('data_dat.mat'); % Contains x, y, ep
% Pre-allocate arrays to store results
n_points = length(x);
nearest_rough = zeros(n_points, 1);
nearest_lat = zeros(n_points, 1);
nearest_lon = zeros(n_points, 1);
% For each point in XY
for i = 1:n_points
min_dist = Inf;
% Compare with each point in Latitude - Longitude grid
for j = 1:length(Lat)
for k = 1:length(Lon)
% Calculate distance using Euclidean formula
dist = sqrt((x(i) - Lon(k))^2 + (y(i) - Lat(j))^2);
if dist < min_dist
min_dist = dist;
nearest_rough(i) = rough(j,k);
nearest_lat(i) = Lat(j);
nearest_lon(i) = Lon(k);
end
end
end
end
results = table(x, y, ep, nearest_rough, nearest_lat, nearest_lon, ...
'VariableNames', {'X', 'Y', 'EP', 'Rough', 'Latitude', 'Longitude'});
disp(results);
Hope this helps!
  2 Comments
Adi Purwandana
Adi Purwandana on 12 Nov 2024 at 12:43
Excellent! Thank you @Madheswaran, that's really helping!
Pavl M.
Pavl M. on 12 Nov 2024 at 14:12
Edited: Pavl M. on 13 Nov 2024 at 11:46
The solution is the best so far, while it is yet not optimal.Whether to sharpen the code by eliminating the 2 for loops by calculating the distance on matrices?
%Trained AI produced next code with my prompt engineering:
% Load data
load('rough_dat.mat'); % Contains Lat, Lon, rough
load('data_dat.mat'); % Contains x, y, ep
% Create meshgrid for Lat and Lon
[Lon_grid, Lat_grid] = meshgrid(Lon, Lat);
% Flatten the grids and rough matrix
Lon_flat = Lon_grid(:);
Lat_flat = Lat_grid(:);
rough_flat = rough(:);
% Combine Lat and Lon into a single matrix for knnsearch
LatLon = [Lat_flat, Lon_flat];
% Perform nearest neighbor search
[indices, distances] = knnsearch(LatLon, [y, x]);
%or, select which is better for you:
% Calculate the distance matrix
x_diff = x(:) - Lon_flat';
y_diff = y(:) - Lat_flat';
distances = sqrt(x_diff.^2 + y_diff.^2);
% Find the minimum distance and corresponding indices
[min_distances, min_indices] = min(distances, [], 2);
% Extract the nearest roughness, latitude, and longitude
nearest_rough = rough_flat(indices);
nearest_lat = Lat_flat(indices);
nearest_lon = Lon_flat(indices);
% Create results table
results = table(x, y, ep, nearest_rough, nearest_lat, nearest_lon, ...
'VariableNames', {'X', 'Y', 'EP', 'Rough', 'Latitude', 'Longitude'});
disp(results);
Here is the pseudocode which demnostrates the more optimal algorithm
l1 = length(Lon);
l2 = length(x);
l3 = length(Lat);
l4 = length(y);
%l5 = length(Az);
%l6 = length(z);
%this is from my code from before (original, no AI is trained to produce it yet):
%dimensions fitting:
Lon = Lon';
col = true;
%for x,y column vectors
if col
if l1 > l2
xl = [x; ones(l1-l2,1)*NaN]
elseif l1 < l2
xl = [Lon; ones(l2-l1,1)*NaN]
end
if l3 > l4
yl = [y; ones(l3-l4,1)*NaN]
elseif l3 < l4
yl = [Lat; ones(l4-l3,1)*NaN]
end
else
%ToDo: doable for row vectors as well:
if l1 > l2
xl = [x ones(1,l1-l2)*NaN]
elseif l1 < l2
xl = [Lon ones(1,l2-l1)*NaN]
end
if l3 > l4
yl = [y ones(1,l3-l4)*NaN]
elseif l3 < l4
yl = [Lat ones(1,l4-l3)*NaN]
end
end
depending on which vector of coordinates is lengthy:
dist = haversined(yl,xl,Lat,Lon);
dist is a matrix by length of max(x,Lon), max(y,Lat)
[c,d] = min(dist) % or similar which finds indeces of minimal value in matrix
nearest_rough = rough(d)
nearest_lat = Lat(preproc(d))
nearest_lon = Lon(preproc(d))
nearest_x = x(preproc(d))
nearest_y = y(preproc(d))
nearest_ep = ep(preproc(d))
It's advantage is more distance metrics incorporation and elimination of slowing down loops by replacement on matrix operations.
Let me know should I implement it in full (I can if you order from me).
When ground vehicles Haversine+(which more utile) distance, when aero winged vehicles than there is another, not Euclidean also distance. Which distance metrics (mahalanobis, manhattan, what more from me to add I have more distance metrics implemented already in TCE and Python) to make for Hybrid Manufaturing CNC machines, High Speed Machining, 5 Axes Machining, Automatic and Robotic Machining, CNC Swiss-Type Lathes, Multitsking machining and 3D printers? (Proposal to cooperate on more construction for
the specific industries).
If you need I can implement more distances as well, including logistic and special structures distance and for flights
Which utile, usable transformation to do?
What will be necessities to add to the code more features, z and azimuth, Bhattachariya distances...?
%for 2 looped vectors version:
% function dh = haversined(lat1,lon1,lat2,lon2)
dLat = (lat2 - lat1)*pi/180.0;
dLon = (lon2 - lon1)*pi/180.0;
lat1 = (lat1)*pi/180.0;
lat2 = (lat2)*pi/180.0;
a = sin(dLat/2).^2+(sin(dLon/2).^2).*cos(lat1).*cos(lat2);
rapprox = 6371;
c = 2*asin(sqrt(a));
%or
%c = 2*atan(sqrt(a),sqrt(1-a));
dh = rapprox*c;
return;
load('rough_dat.mat'); % Contains Lat, Lon, rough
load('data_dat.mat'); % Contains x, y, ep
% Pre-allocate arrays to store results
n_points = length(x);
nearest_rough = zeros(n_points, 1);
nearest_lat = zeros(n_points, 1);
nearest_lon = zeros(n_points, 1);
% For each point in XY
for i = 1:n_points
min_dist = Inf;
% Compare with each point in Latitude - Longitude grid
for j = 1:length(Lat)
for k = 1:length(Lon)
% Calculate distance using Euclidean formula
dist = haversined(y(i),x(i),Lat(j),Lon(k));
if dist < min_dist
min_dist = dist;
nearest_rough(i) = rough(j,k);
nearest_lat(i) = Lat(j);
nearest_lon(i) = Lon(k);
end
end
end
end
str.nearest_lat = nearest_lat
str.nearest_lon = nearest_lon
str.nearest_rough = nearest_rough
cl = {x, y, ep, nearest_rough, nearest_lat, nearest_lon}
results = table(x, y, ep, nearest_rough, nearest_lat, nearest_lon, ...
'VariableNames', {'X', 'Y', 'EP', 'Rough', 'Latitude', 'Longitude'});
disp(results);

Sign in to comment.

More Answers (2)

Pavl M.
Pavl M. on 12 Nov 2024 at 10:39
Edited: Pavl M. on 12 Nov 2024 at 13:12
home
clear all
close all
load('rough_dat.mat')
load('data_dat.mat')
x
y
ep
Lat
Lon
rough
thresh = eps*1000; %Any threshold as per custom objections
resclosestloc = [];
rescloseststr = [];
for i = 1:length(Lon)
for j = 1:length(Lat)
%Approach 1:
a2 = find(y==Lat(j))
a1 = find(x==Lon(i))
c = find(a1==a2);
if c~=0
str2.x = x(a1)
str2.y = y(a2)
str2.ep = ep(1) %a2 will also do, since x and y are in same length
str2.rough = rough(j,i)
rescloseststr = [rescloseststr,str2];
end
%Approach 2:
for k =1:length(y)
for l = 1:length(x)
if (sign(Lat(j)) == sign(y(k)))&&(abs(abs(Lat(j))-abs(y(k)))<thresh)&&(sign(Lon(i))==sign(x(l)))&&(abs(abs(Lon(i))-abs(x(l)))<thresh)
%found closest location
resclosestloc = [resclosestloc; x(l) y(k) ep(l) rough(j,i)];
end
end
end
end
end
%Constructed from needing help code by
%https://independent.academia.edu/PMazniker
%+380990535261
%https://diag.net/u/u6r3ondjie0w0l8138bafm095b
%https://github.com/goodengineer
%https://orcid.org/0000-0001-8184-8166
%https://willwork781147312.wordpress.com/portfolio/cp/
%https://www.youtube.com/channel/UCC__7jMOAHak0MVkUFtmO-w
%https://nanohub.org/members/130066
%https://pangian.com/user/hiretoserve/
%https://substack.com/profile/191772642-paul-m
  5 Comments
Pavl M.
Pavl M. on 12 Nov 2024 at 12:12
Edited: Pavl M. on 12 Nov 2024 at 13:31
corrected, so far:
Finds only corresponding equal entries on same places
#in
#Lat - y
#Lon - x
I am preparing more precise with each even on same places finding with intersect(...) function.
see Proof of Work, it really traverses and extracts the cells in multidimensional arrays as per custom objections.
It depends on TCE NCE MPPL version. The logic is correct, which check that c array of indices is not 0 you can substitute?
I can make more on it, for example to reduce the number of loops from 2 to 1, to fill resultant points to cell, table or multidim array, to add custom criterias for the 2 data lakes processing (for instance norms or sophisticated distances (see rationale: sophisticated doesn't mean complicated, while simple as not too simple means compex). I have more novel codes as for novel distances (in addition to existing Euclidean, Fro, Inf distances I have running codes in Python and Octave for Mahalanobis and more scientific metrics, tell me should there be a customer willing to buy I may deliver as per order/request sufficiently well here or via Skype/e-mail).
In Latest versions of TCE Matlab they replaced C style != to ~=.
So use ~= in TCE Matlab instead of common to C/C++ != .
In Octave beauty it works also for if(c!=0), the 2nd approach works on both, right?
As let us continue benign commited to utile high quality products and services and complete customers' satisfaction.
You may need interpolation / extrapolation for this kind of questions as well, to handle more valuably space-time multidimensional data by regressing missing values.
Please let me know which functionality to gently append.
What more to add to to the clean solution, which more features who will be requiring?
Can you accept my the answer?

Sign in to comment.


Pavl M.
Pavl M. on 12 Nov 2024 at 13:55
Edited: Pavl M. on 12 Nov 2024 at 14:43
%Version with zero loops.
%Runs both on TCE MPPL Octave and Matlab. Finds not only corresponding equal entries on same places
%in
%Lat - y
%Lon - x
%But also contains improvement, runs for intersection of values
home
clear all
close all
load('rough_dat.mat')
load('data_dat.mat')
x
y
ep
%Lat(3) = y(3) %For testing purposes
%Lon(3) = x(3)
%Lat(4) = y(4)
%Lon(4) = x(4)
rough
Lon
Lat
thresh = eps*1000; %Any threshold as per custom objections
resclosestloc = [];
str2 = [];
xl = x;
yl = y;
l1 = length(Lon);
l2 = length(x);
l3 = length(Lat);
l4 = length(y);
%l5 = length(Az);
%l6 = length(z);
%dimensions fitting:
Lon = Lon';
col = true;
%for x,y column vectors
if col
if l1 > l2
xl = [x; ones(l1-l2,1)*NaN]
elseif l1 < l2
xl = [Lon; ones(l2-l1,1)*NaN]
end
if l3 > l4
yl = [y; ones(l3-l4,1)*NaN]
elseif l3 < l4
yl = [Lat; ones(l4-l3,1)*NaN]
end
else
%ToDo: doable for row vectors as well:
if l1 > l2
xl = [x ones(1,l1-l2)*NaN]
elseif l1 < l2
xl = [Lon ones(1,l2-l1)*NaN]
end
if l3 > l4
yl = [y ones(1,l3-l4)*NaN]
elseif l3 < l4
yl = [Lat ones(1,l4-l3)*NaN]
end
end
if l3 > l4
if l1 > l2
%Approach 1:
a1 = find(yl==Lat);
a2 = find(xl==Lon);
c = find(a1==a2);
if c~=0
[a1,b1,c1] = intersect(yl,Lat);
[a2,b2,c2] = intersect(xl,Lon);
str2.x = x(c2);
str2.y = y(c1);
str2.ep = ep(c2);
str2.rough = diag(rough(c2,c1));
end
else
%Approach 1:
a1 = find(yl==Lat);
a2 = find(xl==x);
c = find(a1==a2);
if c~=0
[a1,b1,c1] = intersect(yl,Lat);
[a2,b2,c2] = intersect(xl,x);
str2.x = x(c2);
str2.y = y(c1);
str2.ep = ep(c2);
str2.rough = diag(rough(c2,c1));
end
endif
else
if l1 > l2
%Approach 1:
a1 = find(yl==y);
a2 = find(xl==Lon);
c = find(a1==a2);
if c~=0
[a1,b1,c1] = intersect(yl,y);
[a2,b2,c2] = intersect(xl,Lon);
str2.x = x(c2);
str2.y = y(c1);
str2.ep = ep(c2);
str2.rough = diag(rough(c2,c1));
end
else
%Approach 1:
a1 = find(yl==y);
a2 = find(xl==x);
c = find(a1==a2);
if c~=0
[a1,b1,c1] = intersect(yl,y);
[a2,b2,c2] = intersect(xl,x);
str2.x = x(c2);
str2.y = y(c1);
str2.ep = ep(c2);
str2.rough = diag(rough(c2,c1));
end
endif
endif
%res = struct2cell(str2)
%ToDo: doable for additional z and azimuth dimensions, just twice the if checks and additional pre-filling
%Contact/order me more to do and I will do it.
%Constructed from needing help code by
%https://independent.academia.edu/PMazniker
%+380990535261
%https://diag.net/u/u6r3ondjie0w0l8138bafm095b
%https://github.com/goodengineer
%https://orcid.org/0000-0001-8184-8166
%https://willwork781147312.wordpress.com/portfolio/cp/
%https://www.youtube.com/channel/UCC__7jMOAHak0MVkUFtmO-w
%https://nanohub.org/members/130066
%https://pangian.com/user/hiretoserve/
%https://substack.com/profile/191772642-paul-m
Proof of Commitment (Work) that it runs in TCE:

Products


Release

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!