How to generate matrices that satisfies constraints on sum of row elements and sum of column elements?

25 views (last 30 days)
An example to clarify what I mean:
Generate some 2x3 matrices such that:
1) sum of row elements: row_sum=[100; 200; 300]
2) sum of column elements: column_sum=[500 100]
3) All elements greater than or equal to zero.
Possible_matrix_1=[100 0;200 0;200 100]
Possible_matrix_2=[50 50; 200 0;250 50]
and so on...
I understand that one way to is to generate random matrices and eliminate the ones that don't satisfy the conditions. However, I'd prefer something more efficient if it is possible.
Thank you
Kind regards,
Sharadhi
  3 Comments
sharadhi
sharadhi on 16 Aug 2019
Hi John,
No. They don't have to be integers. The elements can be floating point numbers.
I've made an edit to the question to be more specific. The matrices required are not always square. Further, the values of each row sum and each coloumn sum are different numbers.
John D'Errico
John D'Errico on 17 Aug 2019
Sigh. As is often the case, the problem has morphed in a new direction. The basic techniques I posed are still workable, but they need to be modified, as you might expect.

Sign in to comment.

Accepted Answer

Bruno Luong
Bruno Luong on 16 Aug 2019
Edited: Bruno Luong on 8 Feb 2021
New code with speficied target row and col sum, using FEX randfixesum and requires optimization toolbox
EDIT: new code no longer requires FEX
rs = [100; 200; 300] % target row sum
cs = [500, 100] % target column sum
if abs(sum(rs)-sum(cs)) > 1e-10*abs(sum(rs))
error('sum(rs) must be equal to sum(cs)');
end
% This part need to be done once if rs and cs are unchanged
m = length(rs);
n = length(cs);
I = zeros(m,n);
I(:) = 1:m*n;
Aeq = accumarray([repmat((1:m)',n,1) I(:); repelem(m+(1:n)',m,1) I(:)],1);
beq = [rs(:); cs(:)];
lb = zeros(m*n,1);
ub = inf(m*n,1);
B = zeros(m*n);
for k=1:m*n
f = accumarray(k, 1, [m*n 1]);
B(:,k) = linprog(f, [], [], Aeq, beq, lb, ub);
end
% This part generate new random A
x=-log(rand(m*n,1)); x=x./sum(x);
% x = randfixedsum(m*n,1,1,0,1); % Fex
A = reshape(B*x,[m n])
% Check
sum(A,2) % ~ rs
sum(A,1) % ~ cs
  33 Comments
enrico maggiolini
enrico maggiolini on 23 Feb 2023
@Bruno Luong how can i avoid some values from solutions? examples:
i want that a solution is an integer positive number but no 1 or 2, so it could be 0 or 3 or 4 or....
if i put lb=3 i exclude the 0 that is a good value for me.
Bruno Luong
Bruno Luong on 23 Feb 2023
@enrico maggiolini you must be aware that if you add constraints to a problem of generating random variables, the underline pdf will changed and the initial algorithm in most case no longer suitable.
I stop to follow your request, since you keep changing constantly, and to be changing once makes my answer no longer applicable.

Sign in to comment.

More Answers (2)

John D'Errico
John D'Errico on 16 Aug 2019
Edited: John D'Errico on 17 Aug 2019
You have 9 unknowns, with 6 linear equality constraints. The unknowns are bounded between 0 and 100. You don't say an upper bound, but if the sum is 100, and they are all positive, then I'll leave it to you to decide what the upper bound is. ;-)
The simple solution is to generate a random sample, then iteratively rescale it to have the desired properties. The problem is this will generate biased sets of random numbers, even if you can do things that way.
So my first algorithm will be a simple one...
M = rand(3,3)*100;
targetsum = 100;
err = inf;
tol = 1e-15;
while err > tol
M = targetsum*M./sum(M,1);
M = targetsum*M./sum(M,2);
err = max(max(abs(sum(M,1) - targetsum)) , max(abs(sum(M,2) - targetsum)));
% dump out err at each iteration just to see how the process is converging.
err
end
This simplsitic scheme coverges reasonably well. I've written out the error estimates at each iteration.
err =
6.70807649349402
err =
0.205005892822271
err =
0.00718973760706376
err =
0.000254603955340826
err =
9.02174510031273e-06
err =
3.19693384653874e-07
err =
1.13286660052836e-08
err =
4.01456645704457e-10
err =
1.42250655699172e-11
err =
4.9737991503207e-13
err =
2.8421709430404e-14
err =
2.8421709430404e-14
err =
0
And, while you can see it has converged to a "random" matrix" with the desired properties, A serious issue is that the matrix elements will tend to lie rarely in the corners of the 9 dimensional hypercube compared to that which a truly uniform random generator would produce. That is, we have a biased sampling scheme. (Doing better can be difficult though. But it is still entirely possible. In fact, it is possible to gain a truly uniform sample over the domain of interest.)
sum(M,1)
ans =
100 100 100
sum(M,2)
ans =
100
100
100
M
M =
24.0818968029536 31.2112084641951 44.7068947328513
33.9697588671318 36.734740106533 29.2955010263352
41.9483443299146 32.0540514292719 25.9976042408136
My point is, all of the elements will tend to lie close to 100/3 = 33.33333.... The renormalization scheme tends to produce exactly that kind of event.
Algorithm 2:
Here, the trick is to find the hyperplane all such matrices lie in, then work over that region, so a subset of a planar subspace of R^9. We can find some basic matrices all of which satisfy the requirement.
M0 = eye(3)*100;
P = perms(1:3);
A = zeros(6,9);
for i = 1:6
A(i,:) = reshape(M0(P(i,:),:),[1,9]);
end
Abar = A - 100/3;
% Rotate the rows of Abar to lie in a subspace. In this case,
% it will be a 4-dimensional subspace.
proj = orth(Abar');
Abarproj = Abar*proj;
% Now, tessellate those points. This requires only 3 simplexes, each of
% which have the same identical volume, in the 4-dimensional subspace.
tess = delaunayn(Abarproj);
% We can generate a random point inside that hypervolume by the
% following scheme:
% choose a random simplex from the set of three in the tessellation
sind = randi(3,1,1);
X = Abarproj(tess(sind,1),:);
for i = 2:5
Y = Abarproj(tess(sind,i),:);
p = rand().^(1./(i-1));
X = X.*repmat(p,1,4) + Y.*repmat(1-p,1,4);
end
X = reshape(X*proj',3,3) + 100/3;
Did it work? Here is the resulting matrix
X
X =
56.0547109547294 25.0034662806935 18.9418227645771
0.373141694283696 70.0790469577327 29.5478113479837
43.5721473509869 4.91748676157381 51.5103658874393
>> sum(X)
ans =
100 100 100
>> sum(X,2)
ans =
100
100
100
The row sums and column sums are all correct. And better, the elements of X are uniformly distrbuted inside the necessary volume. As you can see, there are some elements that are both small and large.

David Goodmanson
David Goodmanson on 16 Aug 2019
Edited: David Goodmanson on 17 Aug 2019
Hi sharadhi,
Here is an example for 3x3, where all the row and column sums are 100 (looks like you edited the original question). There are six equations, but only five of them are independent. Since the sum of each row is 100, the sum of all the elements is 300. If the sum of each of the first two columns is 100, then since the sum of all elements is 300, the sum of the third column has to be 100. You don't need the sixth equation. With nine elements and five equations, you are free to choose values for four elements. Suppose the matrix is
a b c
d e f
g h i
and you choose the elements in the upper left square a, b, d, e. Then the sum rules for rows and columns force the values of the remaining five elements and you are done. However, if you want all the elements to be >= 0, then there are some initial conditions on a,b,d,e.
c = 100-a-b, so c not negative means that a+b <=100. Similarly for three other cases, so there are the following initial conditions:
a+b <= 100
d+e <= 100
a+d <= 100
b+e <= 100
Also the element i turns out to be (a+b+d+e) -100, and to keep that element nonnegative requires that
a+b+d+e >= 100
There are plenty of solutions in the integers if that is advantageous.
Many other cases with non-square matrices, and with row and column sums that are not all the same, can be done in a similar fashion. Given arbitrary row and column sums (with of course the condition that sum(row sums) = sum(column sums) I am not sure that there will always be a solution with all nonnegative elements. However for the case you stated (you actually meant 3x2 rather than 2x3), then there are six elements and 5 -> 4 equations, so you can choose two elements. Suppose in
a b
c d
e f
you choose a and c. Then the rest of the elements are forced,with the conditions
a <= 100
c <= 200
a+c >= 200

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!