Using lsqnonneg to determine if there exists a solution to a (generally) under-determined linear system, but error is too large to be convincing.

I would like to run many randomly generated instances of a specific problem. While I believe the problem always has a solution, I cannot find an analytical proof, so I want to test it numerically. I have written the problem as a linear system, A*x=b, where A is a matrix, x and b are vectors, and the elements of the solution x should be nonnegative. I am using lsqnonneg to solve each instance; my code is included below.

The solution x has length Nk*Njp, and what I would like to see is that, dividing x into Njp blocks of length Nk, each block sums to 1. (This condition is included as part of the system A*x=b and should be satisfied by any "correct" solution; the quantity res toward the end of the main program checks how closely it is satisfied.)

When I run the code, I get exitflag=1, indicating convergence to a solution. However, the norm of the residual, resnorm, is generally of order 10^-3 (as is res), which I do not find a convincing indicator that the solution is "correct". I have used fsolve in the past on different kinds of problems, obtaining fval of order 10^-12 and sometimes even smaller. I can readily accept 10^-12 as convincing evidence that the code converged to a good solution, but 10^-3 is not very convincing.

Is there a way to "encourage" lsqnonneg to do better and find a solution with smaller error? Or is there something about the algorithm that rules this out? Or is there something else I'm missing? Thanks much for any suggestions.
[Note: Somewhat surprisingly, as I increase the size of the problem (for example, dim, Nk, or Njp), the error seems to get smaller. I suppose this may have something to do with the problem itself rather than the numerical implementation. The problem arises from a research project in physics, which I could explain if anyone is interested, but it didn't seem appropriate for this venue.]
Main program code here:
global matA vecb x Nk Njp dim2
dim=4;    % dimension of Hilbert space on which Ek and Ejp act
dim2=dim*dim;
Njp=5;    % number of Ejp
Nk=2;     % number of Ek
my_opt = optimset('TolFun',1e-20,'TolX',1e-32,'MaxFunEvals',5e6,'LargeScale','on');
for nn=1:1   % to repeat with different Ejp,Ek many times; collapsed to do only one time here
    % randomly generate the E_j^\prime >= 0
    sumjp=0;
    for ii=1:Njp-1
        Ajp{ii}=rand(dim);
        Ejp{ii}=Ajp{ii}'*Ajp{ii};
        sumjp=sumjp+Ejp{ii};
    end
    tmp=1/eigs(sumjp,1);
    sumjp=tmp*sumjp;
    Ejp{Njp}=eye(dim)-sumjp;   % sum of the Ejp is the identity matrix
    for ii=1:Njp-1
        Ejp{ii}=Ejp{ii}*tmp;
    end
    % next randomly generate coefficients between 0 and 1 such that the Ek are
    % inside the zonotope Z^\prime generated by the Ejp
    c0kj=rand(Nk,Njp);
    sumEk=zeros(dim);
    for ii=1:Nk-1
        sumk=0;
        for jj=1:Njp
            sumk=sumk+c0kj(ii,jj)*Ejp{jj};
        end
        Ek{ii}=sumk;
        sumEk=sumEk+Ek{ii};
    end
    tmp=1/eigs(sumEk,1);
    sumk=zeros(dim);
    for ii=1:Nk-1
        Ek{ii}=Ek{ii}*tmp;
        sumk=sumk+Ek{ii};
    end
    Ek{Nk}=eye(dim)-sumk;   % sum of the Ek is also the identity matrix
    % form vector b of the linear system matA*x=vecb
    vecb=[];
    for ii=1:Nk
        tmp=reshape(Ek{ii},dim2,1);
        vecb=[vecb;tmp];
    end
    vecb=[vecb;ones(Njp,1)];
    % form matrix A of the linear system
    matA=[];
    tmprow=[ones(1,Nk) zeros(1,(Njp-1)*Nk)];
    for ii=1:dim
        for jj=1:dim
            tmpmatA=[];
            for kk=1:Njp
                tmpmatA=[tmpmatA Ejp{kk}(ii,jj) zeros(1,Nk-1)];
            end
            matA=[matA;tmpmatA];
        end
    end
    tmp=matA;
    for ll=1:Nk-1
        tmp=circshift(tmp,1,2);
        matA=[matA;tmp];
    end
    for ii=1:Njp
        matA=[matA;tmprow];
        tmprow=circshift(tmprow,Nk,2);
    end
    % find a solution to A*x=b
    [X,RESNORM,RESIDUAL,EXITFLAG] = lsqnonneg(matA,vecb,my_opt)
    resn(nn)=RESNORM;
    % check that the sum of each block of X of length Nk equals 1
    cnt2=0;
    for mm=1:Njp
        cnt1=cnt2+1;
        cnt2=cnt1+Nk-1;
        res(mm)=sum(X(cnt1:cnt2))-1;   % this should vanish, ideally
    end
end
X = 10x1
0.8877 0.1122 0.0346 0.9654 1.0000 0 1.0007 0 1.0022 0
RESNORM = 2.5930e-04
RESIDUAL = 37x1 (first 10 entries shown)
0.0047 0.0038 -0.0024 0.0000 0.0038 -0.0009 -0.0001 0.0028 -0.0024 -0.0001
EXITFLAG = 1
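For intuition about what resnorm can and cannot tell you, here is a toy nonnegative least-squares illustration (sketched in Python with SciPy, using made-up matrices rather than the ones above): when an exact nonnegative solution exists the residual sits near machine precision, and when it does not the residual is orders of magnitude larger.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
A = rng.random((8, 5))

# Solvable case: b is built from a known nonnegative x, so the
# nonnegative least-squares residual should be near machine precision.
x_true = rng.random(5)
b_good = A @ x_true
x1, res1 = nnls(A, b_good)

# Unsolvable case: knock b off the column space of A;
# the residual jumps by many orders of magnitude.
b_bad = b_good - np.eye(8)[0]
x2, res2 = nnls(A, b_bad)
print(res1, res2)
```

A residual of 10^-3 sits between these two regimes, which is why it is ambiguous on its own.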
  2 Comments
Torsten on 3 Mar 2024
If the sum condition is important for you, maybe it's better to use lsqlin with this condition implemented in Aeq and beq and the non-negativity of the solution implemented in lb.
Scott Cohen on 3 Mar 2024
Thanks much, @Torsten. That does an excellent job of forcing the sum condition to be satisfied to good accuracy, of order 10^-12 to 10^-15 for the most part. However, RESNORM (the norm of the residual) is still in the 10^-3 to 10^-5 range, in general, so I'm still not entirely convinced these are true solutions. If you have any other ideas, I'd love to hear them. Also, something I'm curious about that doesn't seem to warrant a separate question: I'm using lb as you suggested. This, along with the sum condition, automatically imposes an upper bound of 1 on each entry of the solution x. Is it better to leave ub=[] (empty) under these circumstances, or should I use ub=ones(length,1)?
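Torsten's lsqlin suggestion, minimizing ||A*x-b||^2 subject to the sum condition as an equality constraint and nonnegativity as a lower bound, can be sketched outside MATLAB as a small equality-constrained quadratic program. This is a toy Python/SciPy analogue with invented data, not the poster's system:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
A = rng.random((6, 4))
x_feas = rng.random(4)
x_feas /= x_feas.sum()      # a nonnegative point whose entries sum to 1
b = A @ x_feas              # so an exact constrained solution exists

# Equality constraint (stand-in for Aeq, beq): entries sum to 1.
cons = [{"type": "eq", "fun": lambda x: x.sum() - 1.0}]
bnds = [(0, None)] * 4      # nonnegativity, like lb in lsqlin

sol = minimize(lambda x: np.sum((A @ x - b) ** 2), np.full(4, 0.25),
               method="SLSQP", constraints=cons, bounds=bnds)
print(sol.fun, abs(sol.x.sum() - 1.0))
```

As in the lsqlin runs, the equality constraint is enforced essentially exactly, while the objective value measures how well the remaining equations can be satisfied.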


Answers (3)

Torsten on 3 Mar 2024
Moved: Torsten on 4 Mar 2024
If you are convinced that there should exist a "true" solution for your system, extract a non-singular 10x10 submatrix of matA and solve for x. The resulting solution should then satisfy the 27 remaining equations. But this procedure doesn't give a feasible (non-negative) solution for x:
global matA vecb x Nk Njp dim2
dim=4;    % dimension of Hilbert space on which Ek and Ejp act
dim2=dim*dim;
Njp=5;    % number of Ejp
Nk=2;     % number of Ek
my_opt = optimset('TolFun',1e-20,'TolX',1e-32,'MaxFunEvals',5e6,'LargeScale','on');
for nn=1:1   % to repeat with different Ejp,Ek many times; collapsed to do only one time here
    % randomly generate the E_j^\prime >= 0
    sumjp=0;
    for ii=1:Njp-1
        Ajp{ii}=rand(dim);
        Ejp{ii}=Ajp{ii}'*Ajp{ii};
        sumjp=sumjp+Ejp{ii};
    end
    tmp=1/eigs(sumjp,1);
    sumjp=tmp*sumjp;
    Ejp{Njp}=eye(dim)-sumjp;   % sum of the Ejp is the identity matrix
    for ii=1:Njp-1
        Ejp{ii}=Ejp{ii}*tmp;
    end
    % next randomly generate coefficients between 0 and 1 such that the Ek are
    % inside the zonotope Z^\prime generated by the Ejp
    c0kj=rand(Nk,Njp);
    sumEk=zeros(dim);
    for ii=1:Nk-1
        sumk=0;
        for jj=1:Njp
            sumk=sumk+c0kj(ii,jj)*Ejp{jj};
        end
        Ek{ii}=sumk;
        sumEk=sumEk+Ek{ii};
    end
    tmp=1/eigs(sumEk,1);
    sumk=zeros(dim);
    for ii=1:Nk-1
        Ek{ii}=Ek{ii}*tmp;
        sumk=sumk+Ek{ii};
    end
    Ek{Nk}=eye(dim)-sumk;   % sum of the Ek is also the identity matrix
    % form vector b of the linear system matA*x=vecb
    vecb=[];
    for ii=1:Nk
        tmp=reshape(Ek{ii},dim2,1);
        vecb=[vecb;tmp];
    end
    vecb=[vecb;ones(Njp,1)];
    % form matrix A of the linear system
    matA=[];
    tmprow=[ones(1,Nk) zeros(1,(Njp-1)*Nk)];
    for ii=1:dim
        for jj=1:dim
            tmpmatA=[];
            for kk=1:Njp
                tmpmatA=[tmpmatA Ejp{kk}(ii,jj) zeros(1,Nk-1)];
            end
            matA=[matA;tmpmatA];
        end
    end
    tmp=matA;
    for ll=1:Nk-1
        tmp=circshift(tmp,1,2);
        matA=[matA;tmp];
    end
    for ii=1:Njp
        matA=[matA;tmprow];
        tmprow=circshift(tmprow,Nk,2);
    end
    % find a solution to A*x=b
    % Inserted
    [Xsub,idx] = licols(matA.',1e-10);   % linearly independent rows of matA
    Xsub = Xsub.';
    bsub = vecb(idx);
    xx = Xsub\bsub
    norm(matA*xx-vecb)
    % End inserted
    [X,RESNORM,RESIDUAL,EXITFLAG] = lsqnonneg(matA,vecb,my_opt);
    resn(nn)=RESNORM;
    % check that the sum of each block of X of length Nk equals 1
    cnt2=0;
    for mm=1:Njp
        cnt1=cnt2+1;
        cnt2=cnt1+Nk-1;
        res(mm)=sum(X(cnt1:cnt2))-1;   % this should vanish, ideally
    end
end
xx = 10×1
1.3230 -0.3230 1.8469 -0.8469 0.1702 0.8298 0.0245 0.9755 0.9417 0.0583
ans = 2.2014e-16
function [Xsub,idx]=licols(X,tol)
%Extract a linearly independent set of columns of a given matrix X
%
% [Xsub,idx]=licols(X)
%
%in:
%
% X: The given input matrix
% tol: A rank estimation tolerance. Default=1e-10
%
%out:
%
% Xsub: The extracted columns of X
% idx: The indices (into X) of the extracted columns
if ~nnz(X)   % X has no non-zeros and hence no independent columns
    Xsub=[]; idx=[];
    return
end
if nargin<2, tol=1e-10; end
[~, R, E] = qr(X,0);
if ~isvector(R)
    diagr = abs(diag(R));
else
    diagr = abs(R(1));
end
% rank estimation
r = find(diagr >= tol*diagr(1), 1, 'last');
idx = sort(E(1:r));
Xsub = X(:,idx);
end
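For readers outside MATLAB, the same rank-revealing trick (column-pivoted QR, keeping the columns whose |diag(R)| exceeds a tolerance) can be sketched in Python with SciPy; the 2x3 example matrix is made up:

```python
import numpy as np
from scipy.linalg import qr

def licols(X, tol=1e-10):
    """Return sorted indices of a linearly independent set of columns of X."""
    if not np.any(X):          # X has no non-zeros, hence no independent columns
        return np.array([], dtype=int)
    # Column-pivoted QR orders |diag(R)| in decreasing magnitude.
    _, R, piv = qr(X, mode="economic", pivoting=True)
    diagr = np.abs(np.diag(R))
    r = int(np.sum(diagr >= tol * diagr[0]))   # rank estimate
    return np.sort(piv[:r])

# Column 0 is half of column 1, so only two of the three columns are independent.
X = np.array([[1., 2., 3.],
              [2., 4., 1.]])
print(licols(X))
```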
Is it better to leave ub=[] (empty) under these circumstances, or should I use ub=ones(length,1)?
I don't know. Usually, the fewer constraints, the better. Do you get different results in the two cases?
  5 Comments
Torsten on 4 Mar 2024
Moved: Torsten on 4 Mar 2024
Your system matrix matA is 37x10. Thus your system is overdetermined.
Pick a submatrix matAsub of size 10x10 with full rank and solve matAsub*xsub = vecbsub.
If your 37x10 system had a true solution x, then x would also have to satisfy matAsub*x = vecbsub. Since matAsub is non-singular, that forces x = xsub. But the solution we get for xsub is not non-negative in all its components. Thus your full system matA*x = vecb cannot have a true solution with all components non-negative.
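This argument can be replayed on a toy overdetermined system (Python/NumPy, invented data): when a consistent system's unique solution, recovered from a non-singular square subsystem, has a negative entry, no nonnegative solution can exist.

```python
import numpy as np

rng = np.random.default_rng(2)
x_star = np.array([1.5, -0.5, 1.0])    # the unique solution has a negative entry
A = rng.random((7, 3))
b = A @ x_star                          # consistent 7x3 system by construction

x_sub = np.linalg.solve(A[:3], b[:3])   # solve a square (generically full-rank) subsystem

consistent = bool(np.allclose(A @ x_sub, b))   # the remaining 4 equations hold too...
nonneg = bool(np.all(x_sub >= 0))              # ...but the solution is not nonnegative
print(consistent, nonneg)
```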
Scott Cohen on 4 Mar 2024
Moved: Torsten on 4 Mar 2024
Oh, sorry, you're right. I lost track of the inputs in the specific example I used for the original post. So, this appears to be an answer to my question. If you'd like to post it as an answer, I'll accept it. Thanks much!



Matt J on 3 Mar 2024
There is nothing that can be inferred from the resnorm alone about the quality of the solution. 10^-3 is a very good result if the second derivatives at the solution are on the order of 10^12, for example.
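One concrete way to see that resnorm has no absolute meaning (a Python/SciPy toy, not the poster's system): rescaling both C and d rescales resnorm proportionally without moving the minimizer at all.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(3)
A = rng.random((8, 4))
b = rng.random(8) + 1.0             # generically has no exact nonnegative solution

x1, r1 = nnls(A, b)
x2, r2 = nnls(1e-3 * A, 1e-3 * b)   # identical system, units rescaled

# Same minimizer; the residual norm simply scales by 1e-3.
print(np.allclose(x1, x2), r2 / r1)
```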
  8 Comments
Scott Cohen on 4 Mar 2024
Apologies for the confusion. I guess I was somewhat confused about the use of such large values, but that is probably because I'm not much of a computer person and didn't realize its importance.
I would have thought the cost function was something simple, such as a sum of squares, in which case the second derivatives would be independent of the solution and therefore not so relevant. But it now occurs to me that the cost function isn't so simple after all, is that correct?
Matt J on 4 Mar 2024
Edited: Matt J on 4 Mar 2024
I wouldn't say the cost function is complicated. The cost function minimized by lsqnonneg(C,d) is
f(x) = ||C*x - d||^2
The matrix of second derivatives (i.e., the Hessian) of this cost function is 2*C'*C.
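Since the cost ||C*x - d||^2 is quadratic, its Hessian 2*C'*C is constant in x; a quick finite-difference check (Python/NumPy, arbitrary made-up C and d) confirms the formula:

```python
import numpy as np

rng = np.random.default_rng(4)
C = rng.random((5, 3))
d = rng.random(5)
f = lambda x: np.sum((C @ x - d) ** 2)   # the lsqnonneg cost ||C*x - d||^2

# Central-difference Hessian at an arbitrary point
# (exact for quadratics, up to roundoff).
x0, h = rng.random(3), 1e-5
H = np.empty((3, 3))
for i in range(3):
    for j in range(3):
        ei, ej = h * np.eye(3)[i], h * np.eye(3)[j]
        H[i, j] = (f(x0+ei+ej) - f(x0+ei-ej)
                   - f(x0-ei+ej) + f(x0-ei-ej)) / (4 * h * h)

print(np.allclose(H, 2 * C.T @ C, atol=1e-4))
```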



Matt J on 4 Mar 2024
Edited: Matt J on 4 Mar 2024
A better way to investigate the existence of a solution to Aeq*x=beq, x>=0 might be (as below) to find the minimum distance between the equality constrained region and the positive orthant. This way, you avoid the dependencies of resnorm on the scale of Aeq, beq and other algebraic properties.
x=optimvar('x',width(Aeq));
y=optimvar('y',size(x),'Lower',0);
prob=optimproblem('Objective',sum((x-y).^2), 'Constraints', Aeq*x==beq);
[~,fval,exitflag]=solve(prob);
minDistance=sqrt(fval); %Measure of "solvability"
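The problem-based formulation above can be mirrored with a general-purpose solver (a Python/SciPy sketch with assumed Aeq and beq; the vector z stacks x and y):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 5
Aeq = rng.random((3, n))
beq = Aeq @ rng.random(n)    # built from a nonnegative x, so the distance should be ~0

obj = lambda z: np.sum((z[:n] - z[n:]) ** 2)           # ||x - y||^2
cons = [{"type": "eq", "fun": lambda z: Aeq @ z[:n] - beq}]
bnds = [(None, None)] * n + [(0, None)] * n            # only y is bounded below by 0

sol = minimize(obj, np.ones(2 * n), method="SLSQP",
               constraints=cons, bounds=bnds)
min_distance = np.sqrt(sol.fun)                        # near 0 means solvable
print(min_distance)
```

Because the distance is between the affine set Aeq*x=beq and the positive orthant, it is insensitive to the kinds of scaling effects discussed above.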

Release: R2020b