How to calculate the minimum distance between two points?

Hello,
I have a question about some code. I have two input files (file1.txt and file2.txt), each with three columns: the 1st holds the latitude, the 2nd the longitude, and the 3rd a numerical value. The second .txt file has 200000 lines and the first .txt file has 1000 lines. For each point in file1.txt, I would like to find the closest point in file2.txt and the distance to it.
I am using the following lines,
clc
clear
filename1= 'file1.txt';
[d1,tex]= importdata(filename1);
lat1=d1.data(:,2);
lon2=d1.data(:,1);
t=importdata('file2.txt');
lat2=t(:,2);
lon2=t(:,1);
z2=t(:,3);
for z=1:size(lat1,1);
for j=1:size(t,1);
output(z,j)=(deg2km(distance(lat1(z),lon1(z),lat2(j),lon2(j))));
end
end
V=output';
%==================DISTANCE================================
[M,I]=min(V,[],2);
but I realise that it is too slow and time-consuming.
Is there any way to make this code faster?
  2 Comments
the cyclist on 22 Sep 2022
Can you upload the data files? (Use the paper clip icon from the INSERT section of the toolbar.)
Ivan Mich on 22 Sep 2022
I am uploading these files. I hope that you can help me...
Thank you


Accepted Answer

the cyclist on 22 Sep 2022
I am not absolutely certain that this code is equivalent to what you wrote, because I cannot test it on my local machine. (I do not have a toolbox with the distance function.)
However, I believe that you can vectorize one of the inputs to that distance function, and you should get a large speedup. (The non-vectorized version did not even finish within the timeout period here.)
% Start timer
tic
% Read data from online files. (You should read from your local files instead.)
filename1= 'https://www.mathworks.com/matlabcentral/answers/uploaded_files/1133265/file1.txt';
filename2= 'https://www.mathworks.com/matlabcentral/answers/uploaded_files/1133270/file2.txt';
% Read in the data. (Used readtable instead of importdata.)
d1 = readtable(filename1);
d2 = readtable(filename2);
% Extract the latitude and longitude variables from the two files.
% (Uses curly brackets, to get the column contents, rather than a single-column table.)
lat1 = d1{:,2};
lon1 = d1{:,1};
lat2 = d2{:,2};
lon2 = d2{:,1};
% Preallocate the output, for better memory management
output = zeros(size(d1,1),size(d2,1));
% Calculate the distances
for i1 = 1:size(lat1,1)
output(i1,:)=(deg2km(distance(lat1(i1),lon1(i1),lat2,lon2)));
end
V=output';
%==================DISTANCE================================
[M,I]=min(V,[],2);
toc
Elapsed time is 2.003897 seconds.
Note that I made a couple of other changes. First, I used readtable instead of importdata, so that it would work here. Second, I changed several variable names so that they are used consistently across your two files.
I suggest you thoroughly debug what I changed (perhaps on a small subset of your data), to make sure you are still getting the result you expect.
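One way to do that check on a small subset is sketched below. It compares the original double loop against the row-at-a-time vectorized loop. To keep it runnable without the Mapping Toolbox, it uses a hypothetical stand-in, hav, for deg2km(distance(...)): a haversine great-circle distance on a sphere of assumed radius 6371 km. The coordinates are made up for illustration. The minimum is taken along the second dimension of the untransposed matrix, on the assumption that one nearest point per row of the first file is what is wanted.

```matlab
% Hypothetical stand-in for deg2km(distance(...)): haversine distance in km
% on a sphere of radius 6371 km (an assumption, not the toolbox code).
R = 6371;
hav = @(la1,lo1,la2,lo2) 2*R*asin(min(1, sqrt( ...
    sind((la2-la1)/2).^2 + cosd(la1).*cosd(la2).*sind((lo2-lo1)/2).^2)));

% Small synthetic subset (made-up coordinates, for illustration only)
rng(0);
lat1 = 35 + 5*rand(5,1);   lon1 = 20 + 5*rand(5,1);
lat2 = 35 + 5*rand(50,1);  lon2 = 20 + 5*rand(50,1);

% Reference: double loop, one pair of points at a time
ref = zeros(numel(lat1), numel(lat2));
for z = 1:numel(lat1)
    for j = 1:numel(lat2)
        ref(z,j) = hav(lat1(z), lon1(z), lat2(j), lon2(j));
    end
end

% Vectorized over the second set, one row per point of the first set
vec = zeros(numel(lat1), numel(lat2));
for i1 = 1:numel(lat1)
    vec(i1,:) = hav(lat1(i1), lon1(i1), lat2, lon2);
end

% Both versions should agree to rounding error and pick the same neighbours
maxDiff = max(abs(ref(:) - vec(:)));
[~, Iref] = min(ref, [], 2);   % nearest point of set 2 for each point of set 1
[~, Ivec] = min(vec, [], 2);
sameNearest = isequal(Iref, Ivec);
```

If maxDiff is near zero and sameNearest is true on the subset, the same comparison can be repeated with the toolbox functions in place of hav before running the full files.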
I hope that helps!

More Answers (1)

the cyclist on 22 Sep 2022
I doubt that this is the main issue slowing down the code, but you should preallocate the memory for your output array. Growing an array incrementally can lead to very inefficient memory usage. See more detail here.
Put the line
output = zeros(size(lat1,1),size(t,1));
before your for loops.
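On the memory side, a 1000-by-200000 array of doubles holds 2e8 values, which is about 1.6 GB. If only the nearest point and its distance are needed, a possible alternative is to skip the full matrix and keep just the minimum for each point of the first file. The sketch below assumes that is the goal. It uses made-up coordinates and a hypothetical haversine stand-in, hav, for deg2km(distance(...)), with an assumed sphere radius of 6371 km, so that it runs without the Mapping Toolbox.

```matlab
% Hypothetical stand-in for deg2km(distance(...)); sphere radius 6371 km assumed
R = 6371;
hav = @(la1,lo1,la2,lo2) 2*R*asin(min(1, sqrt( ...
    sind((la2-la1)/2).^2 + cosd(la1).*cosd(la2).*sind((lo2-lo1)/2).^2)));

% Made-up coordinates, for illustration only
rng(1);
lat1 = 35 + 5*rand(4,1);    lon1 = 20 + 5*rand(4,1);
lat2 = 35 + 5*rand(100,1);  lon2 = 20 + 5*rand(100,1);

n1 = numel(lat1);
M = zeros(n1,1);   % distance in km to the nearest point of the second set
I = zeros(n1,1);   % row index of that nearest point in the second set
for i1 = 1:n1
    d = hav(lat1(i1), lon1(i1), lat2, lon2);  % distances from one point to all of set 2
    [M(i1), I(i1)] = min(d);                  % keep only the minimum and its index
end
```

Here M and I are n1-by-1 vectors, so memory use stays proportional to the size of the second file rather than to the product of the two file sizes.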
