How to get random samples with certain distance between them?

Hello everybody,
I have two columns that hold the x and y coordinates of a set of points. A shortened example is shown below; my real matrix is A(27889,2), with values ranging from 0 to 5.
A = 0 0
1 0
2 0
3 0
4 0
0 1
1 1
2 1
3 1
4 1
0 2
1 2
My goal is to select a number of random samples (2, 3, etc.) such that the distance between every pair of samples lies between a minimum and a maximum value. The code below works, but it is not robust: it keeps drawing 3 random samples until all of them satisfy the distance condition, so the narrower the range, the longer the program runs.
n = prod(size(A));
nsemillas = 3; % For 3 random samples
min_dist = 2; % Minimum distance
max_dist = 5; % Maximum distance
while (true)
    semillas_indexes = randperm(n,nsemillas);
    [row,col] = ind2sub(size(A),semillas_indexes);
    for i = 1:nsemillas
        semilla(i,:) = A(row(1,i),:);
    end
    dist1 = sum(abs(semilla(1,:)));
    dist2 = sum(abs(semilla(2,:)));
    dist3 = sum(abs(semilla(3,:)));
    dif1 = abs(dist1-dist2);
    dif2 = abs(dist2-dist3);
    dif3 = abs(dist1-dist3);
    if dif1 > min_dist & dif2 > min_dist & dif3 > min_dist
        if dif1 < max_dist & dif2 < max_dist & dif3 < max_dist
            break;
        end
    else
        continue;
    end
end
I would like to know if there is any way to make the program more robust with this distance condition between samples.
Thanks in advance.
J.F.

Accepted Answer

Jan on 15 Feb 2021
You have a [27889 x 2] matrix containing values from 0 to 5. If the values are integers, each coordinate can take only 6 values, so there are only 6 x 6 = 36 different rows. This means that most of your input data are repeated, and it is then very unlikely to draw a set of 3 rows that are pairwise distinct from each other.
Your method of selecting values even allows the same row to be picked more than once. A better approach:
uA = unique(A, 'rows');
nA = size(uA, 1); % not prod(size(uA)), which is numel(uA) by the way
nsemillas = 3; % For 3 random samples
min_dist = 2; % Minimum distance
max_dist = 5; % Maximum distance
while true
    row = randperm(nA, nsemillas); % Distinct row indices
    semilla = uA(row, :); % Take the samples from the unique rows
    dist1 = sum(semilla(1,:)); % No need for ABS() here
    dist2 = sum(semilla(2,:));
    dist3 = sum(semilla(3,:));
    dif1 = abs(dist1-dist2);
    dif2 = abs(dist2-dist3);
    dif3 = abs(dist1-dist3);
    if dif1 > min_dist && dif2 > min_dist && dif3 > min_dist
        if dif1 < max_dist && dif2 < max_dist && dif3 < max_dist
            break;
        end
    end
end
I'm not really sure what you want to achieve. How do you define "distance"? Why are you searching in a non-unique data set? Can you provide some input data and the desired output?
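If "distance" is meant as the Euclidean distance between the points, the loop can check all pairs at once and then works for any number of samples. The following is only a minimal sketch under that assumption: the small grid standing in for A, the limit maxTries, and the names D and pairD are illustrative and not taken from the question.

```matlab
% Assumption: "distance" is the Euclidean distance between two rows of A.
[x, y] = meshgrid(0:4, 0:2);      % small stand-in for the real A
A  = [x(:), y(:)];
uA = unique(A, 'rows');
nA = size(uA, 1);
nsemillas = 3;                    % number of random samples
min_dist  = 2;                    % minimum distance
max_dist  = 5;                    % maximum distance
maxTries  = 1e5;                  % hypothetical safety limit, avoids an endless loop
found = false;
for k = 1:maxTries
    row     = randperm(nA, nsemillas);
    semilla = uA(row, :);
    % Pairwise distance matrix via implicit expansion (R2016b or later)
    dx = semilla(:,1) - semilla(:,1).';
    dy = semilla(:,2) - semilla(:,2).';
    D  = sqrt(dx.^2 + dy.^2);
    pairD = D(triu(true(nsemillas), 1));   % each pair once
    if all(pairD > min_dist & pairD < max_dist)
        found = true;
        break;
    end
end
if ~found
    error('No valid set found in %d tries.', maxTries);
end
```

The try limit turns an overly narrow range into an error message instead of a loop that never ends.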
  3 Comments
Jan on 15 Feb 2021
Edited: Jan on 15 Feb 2021
As far as I understand, the input matrix A consists of pairwise distinct rows. Is this correct?
I do not understand how A and the 167x167 matrix are related. This matrix is symmetric, so is this the pairwise distance between the rows?
Because your matrix is small, with 27889x2 elements, it is possible to precompute the distances between all pairs of points. Storing each of the 27889*27888/2 pairs once as a double needs about 3.1 GB. Based on this set, you could select 2 matching points and determine a 3rd point dynamically. Would single precision also satisfy your needs? This would reduce the memory consumption by 50%.
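Choosing the next point dynamically can also be sketched without storing the full distance table: after each pick, keep only the candidates whose distance to every point chosen so far lies in the range. This again assumes Euclidean distance; the stand-in grid, maxRestarts, and the names cand and chosen are illustrative only.

```matlab
% Assumption: Euclidean distance; A is a small stand-in for the real data.
[x, y] = meshgrid(0:4, 0:2);
A = [x(:), y(:)];
nsemillas   = 3;
min_dist    = 2;
max_dist    = 5;
maxRestarts = 1000;               % hypothetical limit on restarts
ok = false;
for attempt = 1:maxRestarts
    cand   = true(size(A, 1), 1); % rows that are still allowed
    chosen = zeros(1, nsemillas);
    ok     = true;
    for k = 1:nsemillas
        idx = find(cand);
        if isempty(idx)           % dead end: no candidate left, restart
            ok = false;
            break;
        end
        chosen(k) = idx(randi(numel(idx)));
        % Distance from the new point to all rows (implicit expansion)
        d = sqrt(sum((A - A(chosen(k), :)).^2, 2));
        cand = cand & (d > min_dist) & (d < max_dist);
    end
    if ok
        break;
    end
end
if ~ok
    error('No valid set found after %d restarts.', maxRestarts);
end
semilla = A(chosen, :);
```

Each pick needs only one distance vector with as many elements as A has rows, so the memory stays small. Note that this does not draw all valid sets with equal probability, which may or may not matter for the application.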
You have mentioned a distance. The code "dist1 = sum(semilla(1,:));" looks confusing, because this is an unusual way to define a distance. If all you need really is the sum of the components, why not calculate sum(A,2) once before searching for the points? So please explain your mathematical definition of "distance".
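If the sum of the components really is the intended measure, the row sums (sum(A, 2), one value per row) can be computed once and then indexed inside the search. A small sketch using the sample rows from the question; the names s and dif are illustrative:

```matlab
% Sample rows from the question
A = [0 0; 1 0; 2 0; 3 0; 4 0; 0 1; 1 1; 2 1; 3 1; 4 1; 0 2; 1 2];
s   = sum(A, 2);                  % sum of the components, one value per row
row = randperm(size(A, 1), 3);    % 3 distinct row indices
dif = abs(s(row) - s(row).');     % 3x3 matrix of pairwise differences
```

The loop then only has to test the entries of dif above the diagonal against min_dist and max_dist.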
Can you post a small working example that produces the desired output? This would clarify your needs.
Javier Fuster on 16 Feb 2021
Thanks for your answer and for the desire to help.
Thanks to your messages I have restructured the way of doing it and I have achieved what I wanted.
Once again, thank you very much for the help.
J.F.

Release

R2019b
