Hi.
I have a logical matrix C (230x312 logical) that represents locations in a country along a grid of longitudes and latitudes. true represents locations that are within the country's borders and false represents locations that aren't within the country's borders.
C looks something like this (scaled down):
C = [0 0 0 0 1 0 0 1 1 0 ;...
0 1 1 0 0 0 1 1 0 1;...
0 1 1 1 1 1 1 0 0 0;...
0 1 1 1 1 1 1 1 1 0;...
1 1 1 1 1 1 1 1 1 1;...
0 1 1 1 1 1 1 1 0 0;...
0 0 1 1 1 0 0 1 0 0;...
0 0 0 0 1 0 0 0 0 0;...
0 0 0 0 0 1 0 0 0 0];
In total, there are about 40000 locations within the borders.
I also have vectors, for example D (1x40000 single), containing data, where D corresponds to find(C). So if
then
There are various vectors like D that contain longitudes, latitudes, and other statistical data. I am trying to pick around 1000 to 3000 locations that are as evenly distributed across the country as possible. Because the country C is not a perfect square,
locs = vC(1:floor(length(vC)./40):end);
does not return an even distribution.
idx = randperm(40000,1000);
locs = vC(idx);
returns slightly better results - however, because of the number of locations to choose from, there are almost always clusters of locations that are too close together. I also tried using randsample() with a weight for each location (according to the number of locations east/west & north/south from the respective location). But because of the large number of locations to choose from, this does not return better results than randperm().
The only other thing I can think of would be to use a while loop and calculate the distance between each location in locs using the longitudes and latitudes and keep the loop going until all locations are at least x kilometers apart and at most y km apart. I'm pretty sure that this would take forever though.
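To make the distance idea concrete, here is a rough sketch of what I have in mind, written as a single greedy pass rather than a repeated while loop. It only enforces the "at least x km apart" part, not the "at most y km" part. The names latAll, lonAll and minDistKm are made up for this example, and the coordinates are fabricated from the grid indices of the scaled-down C above, so treat it as an illustration under those assumptions and not as working code for my real data.

```matlab
% Sketch only: greedy thinning so that every kept location is at least
% minDistKm away from all previously kept ones.
% latAll, lonAll and minDistKm are hypothetical names; the coordinates
% below are invented from the grid indices of the scaled-down C.
C = logical([0 0 0 0 1 0 0 1 1 0; ...
             0 1 1 0 0 0 1 1 0 1; ...
             0 1 1 1 1 1 1 0 0 0; ...
             0 1 1 1 1 1 1 1 1 0; ...
             1 1 1 1 1 1 1 1 1 1; ...
             0 1 1 1 1 1 1 1 0 0; ...
             0 0 1 1 1 0 0 1 0 0; ...
             0 0 0 0 1 0 0 0 0 0; ...
             0 0 0 0 0 1 0 0 0 0]);
vC = find(C);                 % linear indices of locations inside the borders
[r, c] = find(C);             % same order as vC
latAll = 50 - 0.1 * r;        % made-up latitudes in degrees
lonAll = 10 + 0.1 * c;        % made-up longitudes in degrees
minDistKm = 25;               % assumed minimum spacing
earthRadiusKm = 6371;         % mean Earth radius
toRad = pi / 180;

rng(0);                       % fixed seed so the run is repeatable
order = randperm(numel(vC));  % visit candidates in random order
keep = false(numel(vC), 1);
for k = order
    picked = find(keep);
    % Haversine distance from candidate k to every location kept so far
    dLat = (latAll(picked) - latAll(k)) * toRad;
    dLon = (lonAll(picked) - lonAll(k)) * toRad;
    a = sin(dLat / 2).^2 + cos(latAll(k) * toRad) ...
        .* cos(latAll(picked) * toRad) .* sin(dLon / 2).^2;
    d = 2 * earthRadiusKm * asin(sqrt(a));
    if all(d >= minDistKm)    % all([]) is true, so the first candidate is kept
        keep(k) = true;
    end
end
locs = vC(keep);              % selected locations as linear indices into C
```

Each candidate is compared only against the locations already kept, in one vectorized step, so there is no outer loop that has to restart. The number of selected locations is not fixed in advance, though; it depends on minDistKm, so hitting 1000 to 3000 would mean tuning that value.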
So, if anyone has a better/faster idea - I'd be forever in your debt!
Thanks!
P.S. It might also be worth noting that the prime factors of the actual total number of locations are [2] and [23063].