# Random sampling of elements from an array based on a target condition in MATLAB

10 views (last 30 days)
Edited: Richard Brown on 7 Nov 2019
I have an array (let's call it ElmInfo) of size Nx2 representing a geometry. In that array the element number and element volume are on the column 1 and column 2 respectively. The volume of elements largely vary. The sum of the volume of all elements leads to a value V which can be obtained in MATLAB as:
V=sum(ElmInfo(:,2));
I want to randomly sample elements from the array ElmInfo in such a way that the volume of sampled elements (with no repetition) will lead to a target volume V1. Note: V1 is less than V. So I don't know the number of elements to be sampled. I am giving an example. For a sampling case number of sampled element can be '10' whereas in other sampling number of sampled element can be '15'.
There is no straightforward MATLAB in-built function to meet the target condition. How can I implement the code in MATLAB?

Richard Brown on 5 Nov 2019
If your process is just to sample until you are within a tolerance of V_1 and you want things to be truly random, how about the following algorithm:
• Randomly shuffle the entries of V
• Take a cumulative sum
• If there is an entry within a certain tolerance of V_1, keep the corresponding entries and exit
• Otherwise, repeat
This is a completely naive brute-force approach, which won't work well at all if your tolerance is small. But if your tolerance is generous, the following might do the job well enough, and will actually sample things at random.
n = 300;
v = rand(n, 1);
V1 = 17;
tol = 1e-2;
sample = [];
maxits = 10000;
for count = 1:maxits
p = randperm(n);
s = cumsum(v(p));
k = find(abs(s - V1) < tol);
if ~isempty(k)
sample_indices = p(1:k(1));
sample = v(sample_indices);
fprintf('sample found after %d iterations\n', count);
break
end
end
@Richard, Thanks a lot.
This works for me. I have tried with a tolerance of 1e-5.

John D'Errico on 4 Nov 2019
Edited: John D'Errico on 4 Nov 2019
Are you looking for an exact match to V1? An approximate sum equal to V1? If so, then how close of a tolerance do you need?
Is there anything already in MATLAB to do this exactly? Absolutely not that I know of. Certainly not in MATLAB proper. You might find something on the file exchange, depending on whether you need an exact match to V1 or one with a tolerance.
Essentially, I would call this a random boolean (or 0-1) knapsack problem. An element from your set either appears in the knapsack, or it does not, but not more than one copy of any element. The problem is there may well be many possible subsets that fit your goal,
I could offer a solution that uses intlinprog, since that is a classical way to solve such a problem, although much depends on whether you need an exact solution or not.
Richard Brown on 7 Nov 2019
Edited: Richard Brown on 7 Nov 2019
My histogram is fine :)
If you look a little closer, you'll see I specified the bin centres (1:nsols) -- you can see that the 11 bars correspond to the 11 solutions. And the bars are exactly as expected - all solutions of the same class (e.g. a two and two ones) have around the same height, which is why there are three different heights evident, corresponding to the different ways of choosing (1,1,1,1), (2, 1, 1), and (3, 1)
While my toy example involving integers is admittedly contrived, it makes the point I was trying to make that each possible solution (where replicates are considered distinct) is not equally likely to be chosen! No more, no less!
And yes, the random scheme I proposed (before you suggested your elegant random objective orientation) is no better ... no arguments there