fmincon: any way to enforce linear inequality constraints at intermediate iterations?
I am solving an optimization problem with the interior-point algorithm of fmincon.
My parameters have both lower and upper bounds and linear inequality constraints.
A_ineq = [1 -1 0 0 0;
-1 2 -1 0 0;
0 -1 2 -1 0;
0 0 -1 2 -1;
0 0 0 1 -1];
b_ineq=[0 0 0 0 0];
I observed that fmincon satisfies the bounds at all iterations, as expected, but not the linear inequality constraints.
However, the linear inequality constraints are crucially important in my case: if they are violated, my optimization problem cannot be solved.
Is there anything one can do to ensure that linear inequality constraints are satisfied at intermediate iterations?
6 Comments
Walter Roberson
on 2 Mar 2023
The fmincon interior-point, sqp, and trust-region-reflective algorithms satisfy bound constraints at each iteration, but active-set can violate bound constraints.
Those refer to bound constraints, though, not to linear constraints.
SA-W
on 2 Mar 2023
@Walter Roberson
I am aware of that. But I thought there might be some manipulations of the linear constraints such that fmincon gives them a higher priority at intermediate iterations.
SA-W
on 2 Mar 2023
Scaling the matrix A_ineq by 1e4 or any other large value really helps in my case to enforce the linear constraints (in a least-squares sense) at intermediate iterations.
Matt J
on 6 May 2023
Edited: Matt J
on 6 May 2023
I'd like to point out that it violates the theoretical assumptions of fmincon, and probably of all the Optimization Toolbox solvers as well, when the domain of your objective function and constraints is not an open subset of R^n. If you forbid evaluation outside the closed set defined by your inequality constraints, that is essentially the situation you are creating. It's not entirely clear what hazards this creates in practice, but it might mean one of the Global Optimization Toolbox solvers is more appropriate.
Accepted Answer
Matt J
on 5 May 2023
Edited: Matt J
on 23 May 2023
Within your objective function, project the current x onto the constrained region using lsqlin:
fun=@(x) myObjective(x, A_ineq,b_ineq,lb,ub);
x=fmincon(fun,x0,[],[],[],[],lb,ub,[],options)
function fval=myObjective(x, A_ineq,b_ineq,lb,ub)
C=speye(numel(x));
x=lsqlin(C,x,A_ineq,b_ineq,[],[],lb,ub); %EDIT: was quadprog by mistake
fval=...
end
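To see what the projection does, here is a minimal sketch on a hypothetical 2-D feasible set {z | z(1) <= z(2)}. With C = eye(n), lsqlin minimizes ||C*z - x||^2 = ||z - x||^2 subject to the constraints, so a feasible x is returned unchanged and an infeasible x is moved to the nearest feasible point:
A = [1 -1]; b = 0; % feasible set: z(1) <= z(2)
xFeas = [0; 1]; % satisfies the constraint
xInfeas = [1; 0]; % violates the constraint
zFeas = lsqlin(eye(2), xFeas, A, b) % returns [0; 1], i.e. x unchanged
zInfeas = lsqlin(eye(2), xInfeas, A, b) % returns [0.5; 0.5], the nearest feasible point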
47 Comments
SA-W
on 6 May 2023
If I understand this snippet correctly, you want to invoke fmincon without passing any constraints and bounds and, at every iteration of fmincon, let quadprog project the x such that all constraints are satisfied?
If so, I have two questions:
- If the constraints are already satisfied, quadprog would not change x, right?
- Could I also use this strategy with lsqnonlin?
Matt J
on 6 May 2023
Edited: Matt J
on 6 May 2023
I changed my mind about passing bounds. You should put those in. Yes, I guess you could do it with lsqnonlin as well.
I'm not sure about the performance of this strategy compared to my other answer. It violates differentiability assumptions. Also, because you will be using finite-difference derivative approximations, quadprog will need to be called N times per iteration (once per perturbed point), so it wouldn't be a good strategy for high-dimensional problems.
SA-W
on 7 May 2023
And why would you not pass the linear constraints to fmincon but only to quadprog?
Matt J
on 7 May 2023
It is a quadratic minimization problem,
P(x) = argmin_{z in C} 1/2*||z - x||^2
where C is the set defined by the constraints.
SA-W
on 7 May 2023
I know. But why should I not pass the linear constraints to fmincon? I do not see the connection with the subsequent call to quadprog.
Matt J
on 7 May 2023
Edited: Matt J
on 7 May 2023
Ah, well, there's no reason fmincon should need to worry about enforcing linear inequality constraints if every test point is going to be projected onto a feasible candidate anyway. The objective function is incapable of evaluating anything outside C.
Come to think of it, though, it would probably be wise to apply a penalty on distance from the feasible set as well to remove ambiguity in the solution.
function fval=myObjective(x, A_ineq,b_ineq,lb,ub, weight )
C=speye(numel(x));
[x,dist]=quadprog(C,x,A_ineq,b_ineq,[],[],lb,ub);
fval=...
fval=fval+weight*dist
end
SA-W
on 23 May 2023
I tried your suggested strategy, but it is not working as expected:
function fval=myObjective(x, A_ineq,b_ineq,lb,ub, weight )
%1st iteration, i.e. x is the start vector
x =
0.004000000000000
0.001000000000000
0
0.001000000000000
0.004000000000000
0.009000000000000
0.016000000000000
0.025000000000000
0.036000000000000
0
0.020250000000000
0.063245553203368
0.083495553203368
0.081000000000000
0.144245553203368
0.089442719099992
0.109692719099992
0.170442719099992
all(A_ineq*x <= b_ineq) % == 1, i.e. start vector satisfies constraints
C=speye(numel(x));
[x,dist]=quadprog(C,x,A_ineq,b_ineq,[],[],lb,ub);
x =
1.0e-03 *
0.178567411241192
0.050029327861103
0
0.000014275626508
0.000030351196081
0.000053745685964
0.000087040217878
0.000137579529376
0.000245546982784
0
0.000003261476952
0.000013844333853
0.000023204570239
0.000023729908401
0.000054189214599
0.000019086149826
0.000029710389433
0.000063797849365
% not possible to work with the new x, which is several magnitudes lower
end
Calling quadprog with the start vector, which satisfies all constraints and bounds, results in a new vector x where nearly all components are several orders of magnitude lower.
I attached the matrix A_ineq; b_ineq = 0, and the bounds lb and ub are listed in my update below. Basically, A_ineq encodes that parameters are less or greater than neighboring parameters.
Ideally, if the constraints were satisfied, I would expect quadprog not to change x significantly.
Does the projected x make sense to you in my case?
SA-W
on 23 May 2023
Update:
The issue seems to be related not to the linear constraints but to the bounds.
C=speye(numel(x));
[x,dist]=quadprog(C,x,[],[],[],[],lb,ub);
x =
1.0e-03 *
0.001101980588430
0.160659227282455
0
0.160659227282455
0.001101980588430
0.000065909263473
0.000037138227992
0.000023769741153
0.000016507062926
0
0.000029344843098
0.000009396094623
0.000007117296309
0.000007336573768
0.000004119807053
0.000006644060468
0.000005417528065
0.000003486590875
lb =
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
ub =
1
1
0
1
1
1
1
1
1
0
2
2
2
2
2
2
2
2
Calling quadprog without bounds and linear constraints, I get the vector that I pass multiplied by (-1).
Do you have any idea as to why the bounds are causing this issue?
Torsten
on 23 May 2023
Edited: Torsten
on 23 May 2023
You want to find a vector x that minimizes
1/2*(x-x_passed).'*(x-x_passed)
under the constraints
A_ineq*x <= b_ineq
where x_passed is the vector you get from fmincon.
Thus f in the list of inputs to quadprog must be -x_passed, not x_passed:
[x,dist]=quadprog(C,-x,A_ineq,b_ineq,[],[],lb,ub);
instead of
[x,dist]=quadprog(C,x,A_ineq,b_ineq,[],[],lb,ub);
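To spell out the equivalence (a sketch using the same variables; note that the second output of quadprog is the optimal objective value, not the projection distance):
% quadprog minimizes 1/2*z'*H*z + f'*z. With H = I and f = -x, this equals
% 1/2*||z - x||^2 minus the constant 1/2*x'*x, so the minimizer is the
% projection of x onto the feasible set.
H = speye(numel(x));
[xProj, fq] = quadprog(H, -x, A_ineq, b_ineq, [], [], lb, ub);
distSq = 2*fq + x'*x; % squared projection distance, recovered from fq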
Matt J
on 23 May 2023
@SA-W Sorry, I made a mistake. Instead of quadprog, I meant for you to use lsqlin:
fun=@(x) myObjective(x, A_ineq,b_ineq,lb,ub);
x=fmincon(fun,x0,[],[],[],[],lb,ub,[],options)
function fval=myObjective(x, A_ineq,b_ineq,lb,ub)
C=speye(numel(x));
x=lsqlin(C,x,A_ineq,b_ineq,[],[],lb,ub);
fval=...
end
But I want to reiterate that I think this whole thing you are pursuing is probably very inefficient. You should just abort the objective function with inf, as I suggest in my other answer.
SA-W
on 23 May 2023
Thanks to both of you. It makes sense to pass -x to quadprog.
@Matt J I have an analytical expression for the Jacobian. Hence, quadprog or lsqlin has to be called only once per iteration of lsqnonlin/fmincon. That said, I think the overall strategy is not so inefficient.
Anyway, why would you use lsqlin instead of quadprog?
Matt J
on 23 May 2023
Edited: Matt J
on 24 May 2023
@SA-W In the code I originally posted, quadprog was being called with the input argument syntax of lsqlin, so that probably explains why it didn't work. lsqlin is more specialized than quadprog, so I tend to think it is better to use it when it is applicable.
Regarding your analytical gradient, I am not sure it is valid anymore. If your original objective function was f(x) with gradient g(x), the new one with the projection onto the linear constraints is f(P(x)), where P is my notation for the projection operator. The new gradient would be nablaP.' * g(P(x)), but I don't immediately see any easy way to compute nablaP.
EDIT: Possibly, you can take
nablaP = (I - pinv(A)*A).'
as the linear projection operator (transposed) onto the active linear constraints.
SA-W
on 23 May 2023
"The new gradient would be *g(P(x)), but I don't immediately see any easy way to compute nabla P "
I see what you mean. If I were using finite differencing at fmincon/lsqnonlin level, I would not have to worry about that, right?
" EDIT: Possibly, you can take nabla P as the linear projection operator (transposed) onto the active linear constraints. "
Does this help in calculating nabla P?
Matt J
on 23 May 2023
I see what you mean. If I were using finite differencing at fmincon/lsqnonlin level, I would not have to worry about that, right?
Right, but it would slow you down a lot.
Does this help in calculating nabla P?
Yes, projecting onto the active constraints is a simple matrix multiplication.
SA-W
on 23 May 2023
"Yes, projecting onto the active constraints is a simple matrix multiplication."
Could you please explain a little bit how that could be implemented?
SA-W
on 23 May 2023
Do the expressions look the same for inequalities instead of equalities, i.e. if the set is given by
{x | A*x <= b} instead of {x | A*x = b}?
Matt J
on 23 May 2023
Edited: Matt J
on 23 May 2023
No, but the case {x | A*x <= b} doesn't apply here. You will be projecting onto the active constraints. Active constraints are the ones satisfied with equality at the solution lsqlin gives you.
SA-W
on 23 May 2023
Say A*x > 0 in some components, with x given by fmincon. After invoking lsqlin, I observed that A*x = 0 at those components. Is this the expected behavior?
SA-W
on 23 May 2023
"I don't know what you mean by "with x given by fmincon". In the approach we are talking about, fmincon can does not act independently of lsqlin. You are calling lsqlin inside the objective function, which is in turn called iteratively by fmincon."
Yes, that's clear. Within the objective function, say some constraints are violated, i.e. A_ineq*x > 0 for some components. Then I call lsqlin to project x onto the feasible set and what I observe then is that A_ineq*x = 0 at those components, i.e. those constraints are active now.
Does it make sense that the constraints are active after calling lsqlin?
Torsten
on 23 May 2023
They can be active or inactive after calling "lsqlin". The purpose of the call to "lsqlin" is to make the x from "fmincon" feasible, and feasible means: a_i*x = b_i or a_i*x < b_i.
SA-W
on 24 May 2023
Suppose a_i*x > b_i before calling lsqlin. After calling lsqlin, is this constraint active or inactive, or is there no general statement?
Matt J
on 24 May 2023
Edited: Matt J
on 24 May 2023
The general statement is this - an inequality constraint a_i*x <= b_i can either be active, inactive, or violated at a given point x. In ideal mathematical terms, these cases are as follows:
a_i*x < b_i (inactive)
a_i*x = b_i (active)
a_i*x > b_i (violated)
An equality constraint will always be active, when satisfied. Otherwise, it is violated.
The optimization strategy we are pursuing here requires that you find which constraints are active at the projected point generated by lsqlin. As you do this, you have to be mindful that you are working with a finite-precision computer, so you have to modify the ideal notion of what it means for a constraint to be active. You must instead consider a constraint active if
abs( a_i*x - b_i ) < tol
where tol >= 0 is a tolerance parameter to be chosen by you.
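In code, the classification could look like this (a sketch; tol is the user-chosen tolerance):
tol = 1e-6;
res = A_ineq*x - b_ineq; % signed constraint residuals
active = abs(res) < tol; % treated as equalities
inactive = res <= -tol; % strictly satisfied
violated = res >= tol; % infeasible beyond the tolerance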
SA-W
on 24 May 2023
"The new gradient would be nablaP.' * g(P(x)), but I don't immediately see any easy way to compute nablaP."
"EDIT: Possibly, you can take nablaP as the linear projection operator (transposed) onto the active linear constraints."
So, in the equation
nablaP = (I - pinv(A)*A).';
the matrix A contains only the rows associated with active constraints?
Torsten
on 24 May 2023
Suppose a_i*x > b_i before calling lsqlin. After calling lsqlin, is this constraint active or inactive, or is there no general statement?
No general statement is possible in my opinion.
SA-W
on 13 Feb 2025 at 7:01
Edited: SA-W
on 13 Feb 2025 at 7:04
You suggested projecting onto the constraint set with lsqlin in every call to the objective function, using
x = lsqlin(C,x,A_ineq,b_ineq,[],[],lb,ub);
The projection P(x) and its gradient are then given by
nablaP = (I - pinv(A)*A).'
with A holding the active constraint rows. In the objective function, we can then return
fval = f(P(x)) and grad = nablaP.' * g(P(x))
where g(x) is my notation for the gradient my objective function would return without any lsqlin projection.
First of all, I do not think the equation for nablaP is correct if we pass the bounds (lb and ub) to lsqlin. Consider a toy problem in 1d (lb = 0, ub = 1), where the "projection function" is already not differentiable at the bounds. Correct me if I am wrong, but when passing the bounds to lsqlin, there is no hope for an analytical nablaP.
But that's ok, I can formulate my problem without bounds. More important for me is the case where some components of x are fixed values (i.e. x(i) = c(i) for some i). For this, let's partition x = [x_c; x_f] and A = [A_c, A_f], where x_c refers to the fixed (constrained) values, x_f to the free values, and A is the linear inequality matrix. Then the projection problem reduces to finding the free components:
z_f = argmin ||z - x_f||^2 subject to A_f*z <= b - A_c*x_c
and, with Af_active and Ac_active denoting the rows of the active constraints,
nablaP_f = (I - pinv(Af_active)*Af_active).'
So the overall nablaP being used for the gradient of the objective has a 2x2 block structure,
nablaP = [1, 0; -pinv(Af_active)*Ac_active, I - pinv(Af_active)*Af_active]
(i) Is this mathematical derivation correct? It looks a bit suspicious to me to have this zero block at the top right, but intuitively it makes sense (the fixed values do not change if the free values are changed...).
(ii) Below is a minimal example demonstrating the projection step and the extraction of the active constraints at the projected solution. Does this look reasonable?
The reason why I show this is that the gradient of my objective function (probably the nablaP part) no longer passes verification against a FD approximation. Without the projection, the gradient check works. So I might have a logical mistake in the code below, maybe in the way I am extracting the active constraints and constructing the final nablaP.
% Points and mock values x
pt = [3 5 7 9 11]; % 1x5 interpolation sites
x = [0 0.046506608245001 0.096706757662298 0.153809034551104 0.205280568411077];
% Compute linear constraints
k = 4; % cubic spline (order 4)
B = spapi(k, pt, eye(numel(pt))); % function basis
Bd = fnder(B, 1); % first derivative basis
Bdd = fnder(B, 2); % second derivative basis
% Implement [f >= 0; fd >= 0; fdd >= 0] as Aineq*x <= b
A = [-B.coefs'; -Bd.coefs'; -Bdd.coefs']; % Shape 12x5
b = zeros(size(A, 1), 1); % Shape 12x1
% Assume x = [x_c; x_f] where x_c = x(1) = 0
xc = 0; % Shape 1x1
xf = x(2:end)'; % Shape 4x1
Ac = A(:, 1); % Shape 12x1
Af = A(:, 2:end); % Shape 12x4
% Projection onto the feasible set Af*zf <= b - Ac*xc
C = eye(numel(xf));
zf = lsqlin(C, xf, Af, b - Ac*xc);
% Find active constraints at projected solution
idx_active = find(abs(Af*zf - (b - Ac*xc)) < 1e-6); % idx = [1, 12], i.e., two ctr are active
Af_active = Af(idx_active,:);
nablaP = eye(size(Af_active, 2)) - pinv(Af_active) * Af_active;
% Concatenated solution z and gradient nablaP
z = vertcat(0, zf);
nablaP = blkdiag(0, nablaP); % Create 2x2 block shape
nablaP(2:end, 1) = -pinv(Af_active) * Ac(idx_active); % Set (2,1) block
nablaP(1, 1) = 1.0; % Set (1,1) block
Matt J
on 13 Feb 2025 at 13:24
Edited: Matt J
on 13 Feb 2025 at 13:43
"Correct me if I am wrong, but when passing the bounds to lsqlin, there is no hope for an analytical nablaP."
I don't see why the presence of bounds would invalidate anything, as long as you are treating the bounds the same way you are treating the other inequality constraints. Remember, bounds are a special case of linear inequality constraints, with Aineq the identity matrix: eye(N)*x <= ub, -eye(N)*x <= -lb. So you also need to look at which bounds are active when you form the projection operator pinv(A).
"More important for me is the case where some components of x are fixed values."
You haven't explained how you are implementing that. Ideally, fixed and a priori known values wouldn't be among the x(i) at all - because they're not unknowns. In that case, they wouldn't have any role to play in the projection.
If you are including them in the unknowns, it would have to mean that you have set lb(i) = ub(i) = c(i). If that's the case, the only change that should be needed is to treat these as bound constraints that are always active, as mentioned above.
"The reason why I show this is that the gradient of my objective function (probably the nablaP part) no longer passes verification against a FD approximation. Without the projection, the gradient check works."
You haven't shown us the numbers, but there is no reason to expect agreement with an FD check anymore, or at least not everywhere. The function is now only piecewise differentiable. At points where the set of active constraints changes (such as at the boundary of the linearly constrained region) there will be a discontinuity in the derivative.
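A sketch of that bookkeeping (hypothetical names; assumes the fixed entries are encoded via lb(i) == ub(i), and idx_active indexes the active inequality rows as in the code above):
n = numel(lb);
fixed = find(lb == ub); % indices of the fixed variables
I = eye(n);
E = I(fixed,:); % one always-active e_i' row per fixed x(i)
A_act = [Aineq(idx_active,:); E]; % active inequalities plus always-active bound rows
nablaP = (I - pinv(A_act)*A_act).';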
SA-W
on 13 Feb 2025 at 14:16
Edited: SA-W
on 13 Feb 2025 at 15:45
You haven't shown us the numbers, but there is no reason to expect agreement with an FD check anymore, or at least not everywhere. The function is now only piecewise differentiable.
This makes sense. I did not keep that in mind. However, I managed to create a toy problem which does not converge to the expected solution.
Specifically, I am creating reference data (xq --> yq = xq.^2) and want to fit a quadratic spline to this data. The objective function is ||f(xq) - yq||^2, where f(xq) represents the sampled spline values. Since the spline is quadratic, it can perfectly fit the data, which is reproduced by the code.
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1826083/image.jpeg)
Then, I implemented the linear constraints f''>=0 and the projection as discussed herein and fmincon converged to a wrong solution:
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1826084/image.jpeg)
You can reproduce the output with this code. Do you see a mistake in the implementation?
% Generate data
xq = linspace(3, 11, 30);
yq = xq.*xq;
% Quadratic spline f and basis for df/dy
k = 3;
x = [3 5 7 9 11];
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
bineq = zeros(size(Aineq, 1), 1); % Shape 3x1
% Function handle and solver options
objectiveFunc = @(y) Objective_func(y, k, x, Aineq, bineq, xq, yq);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Optimizer (no constraints)
lb = []; ub = []; A = []; b = [];
x0 = -x .*x; % x0 violates the constraints
sol = fmincon(objectiveFunc, x0, A, b, [], [], lb, ub, [], options);
% Plot data and solution
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+', 'DisplayName', 'data');
legend show;
function [fval, grad] = Objective_func(y, k, x, Aineq, bineq, xq, yq)
% y are the current parameters
% Project y onto active set
C = eye(numel(y));
options = optimset('Display', 'off');
y = lsqlin(C, y(:), Aineq, bineq, [], [], [], [], [], options);
% Find active ctr
idx_active = find(abs(Aineq * y(:) - bineq) < 1e-6);
Aineq_active = Aineq(idx_active,:); %#ok<FNDSB>
nablaP = eye(numel(y)) - pinv(Aineq_active) * Aineq_active;
% Compute residual and sum of squares
f = spapi(k, x, y);
residual = fnval(f, xq) - yq;
fval = sum(residual.*residual);
% Compute jacobian and gradient
if nargout > 1
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = 2 * jacobianT * residual(:);
grad = nablaP' * grad;
end
end
Matt J
on 13 Feb 2025 at 17:11
For a quadratic spline, you need k=2:
% Generate data
xq = linspace(3, 11, 30);
yq = xq.*xq;
% Quadratic spline f and basis for df/dy
k = 2;
x = [3 5 7 9 11];
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs';
bineq = zeros(size(Aineq, 1), 1);
% Function handle and solver options
objectiveFunc = @(y) Objective_func(y, k, x, Aineq, bineq, xq, yq);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Optimizer (no constraints)
lb = []; ub = []; A = []; b = [];
x0 = -x .*x; % x0 violates the constraints
sol = fmincon(objectiveFunc, x0, A, b, [], [], lb, ub, [], options);
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
% Plot data and solution
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+', 'DisplayName', 'data');
legend show;
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1826095/image.png)
SA-W
on 13 Feb 2025 at 17:42
Edited: SA-W
on 13 Feb 2025 at 19:22
For a quadratic spline, you need k=2:
The curve you plotted is a piecewise-linear function. For the spapi function, k refers to the order of the spline, which is the degree + 1. So we need k = 3 for a quadratic spline (degree 2).
So, for k = 3, can you fit the data as I showed in the first picture of my previous post? As I said, when the projection terms are included, fmincon does not converge to the solution.
Matt J
on 14 Feb 2025 at 0:19
Edited: Matt J
on 14 Feb 2025 at 4:26
% Generate data
xdata = linspace(3, 11, 30); ydata = xdata.^2;
% Quadratic spline f and basis for df/dy
k = 3;
xcp = [3 5 7 9 11]; %control point x-locations
fcn = @(z) fnval(spapi(k, xcp, z),xdata);
tic
S=func2mat(fcn,xcp,'doSparse',false);
toc
Elapsed time is 0.055264 seconds.
dS=diff(S,1,1);
% Linear constraints
Aineq = -dS;
bineq = zeros(height(dS), 1);
% Function handle and solver options
objectiveFunc = @(z) Objective_func(z,Aineq, bineq, ydata,S);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Optimizer (no constraints)
lb = []; ub = []; A = []; b = [];
z0 = -xcp(:); % initial guess; violates the constraints (checked below)
violation = any(Aineq*z0>bineq) %Initial guess z0 violates the constraint
violation = logical
1
z = fmincon(objectiveFunc, z0, A, b, [], [], lb, ub, [], options);
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
% Plot data and solution
close all
xFit=linspace( xdata(1),xdata(end),1e4);
yFit=fnval(spapi(k, xcp, z), xFit );
plot(xFit,yFit,'-b',xdata,ydata, 'ro'); hold off
legend('Fitted Spline Samples','Data',Location='northwest'); hold off
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1826179/image.png)
function [fval, grad] = Objective_func(z,Aineq, bineq, ydata,S)
% z are the current parameters
z=z(:); ydata=ydata(:); I = speye(numel(z));
% Project z onto constraining set
options = optimoptions('lsqlin','Display', 'none');
zp = lsqlin(I, z, Aineq, bineq, [], [], [], [], [], options);
errData=S*zp-ydata; %data error term (depends only on zp)
errReg=zp-z; %penalty term on constraint violation
% Compute objective
fval=norm(errData)^2/2 + norm(errReg)^2/2;
% Compute jacobian and gradient
if nargout > 1
idx_active = abs(Aineq * zp - bineq) < 1e-6;
A = Aineq(idx_active,:);
JacP = I - pinv(A) * A;
grad=(S*JacP).'*errData + (JacP-I).'*errReg;
end
end
SA-W
on 14 Feb 2025 at 6:01
Edited: SA-W
on 14 Feb 2025 at 6:02
Thanks for the working example.
fcn = @(z) fnval(spapi(k, xcp, z),xdata);
In my real problem, the evaluation points (xdata) are different for every fmincon iteration. So I would probably need to set up fcn and Aineq anew inside the objective function.
Anyway, the minimal example code I provided is closer to my real code base, including the logic of computing the constraints as Aineq = -Bdd.coefs'. So it is important for me to understand why the code below converges to the wrong solution.
What needs to be changed below? The gradient at the (wrong) solution is zero, so I suspect something must be wrong with the gradient, but I cannot see a mistake.
% Generate data
xq = linspace(3, 11, 30);
yq = xq.*xq;
% Quadratic spline f and basis for df/dy
k = 3;
x = [3 5 7 9 11];
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
bineq = zeros(size(Aineq, 1), 1); % Shape 3x1
% Function handle and solver options
objectiveFunc = @(y) Objective_func(y, k, x, Aineq, bineq, xq, yq);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Optimizer (no constraints)
lb = []; ub = []; A = []; b = [];
x0 = -x .*x; % x0 violates the constraints
sol = fmincon(objectiveFunc, x0, A, b, [], [], lb, ub, [], options);
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
% Plot data and solution
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+', 'DisplayName', 'data');
legend show;
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1826185/image.png)
function [fval, grad] = Objective_func(y, k, x, Aineq, bineq, xq, yq)
% y are the current parameters
% Project y onto active set
C = eye(numel(y));
options = optimset('Display', 'off');
y = lsqlin(C, y(:), Aineq, bineq, [], [], [], [], [], options);
% Find active ctr
idx_active = find(abs(Aineq * y(:) - bineq) < 1e-6);
Aineq_active = Aineq(idx_active,:); %#ok<FNDSB>
nablaP = eye(numel(y)) - pinv(Aineq_active) * Aineq_active;
% Compute residual and sum of squares
f = spapi(k, x, y);
residual = fnval(f, xq) - yq;
fval = sum(residual.*residual);
% Compute jacobian and gradient
if nargout > 1
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = 2 * jacobianT * residual(:);
grad = nablaP' * grad;
end
end
Matt J
on 14 Feb 2025 at 6:42
Edited: Matt J
on 14 Feb 2025 at 6:58
The main problem seems to be that you are not adding a penalty term on the projection residual, as I advised in this earlier comment. This creates lots of local minima. Your code does converge as is when a better initial point is provided, e.g., x0 =x.^2-1. By adding a penalty term, as below, it also converges from x0=-x.^2:
% Generate data
xq = linspace(3, 11, 30);
yq = xq.*xq;
% Quadratic spline f and basis for df/dy
k = 3;
x = [3 5 7 9 11];
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
bineq = zeros(size(Aineq, 1), 1); % Shape 3x1
% Function handle and solver options
objectiveFunc = @(y) Objective_func(y, k, x, Aineq, bineq, xq, yq);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Optimizer (no constraints)
lb = []; ub = []; A = []; b = [];
x0 = -x .*x; % x0 violates the constraints
sol = fmincon(objectiveFunc, x0, A, b, [], [], lb, ub, [], options);
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
% Plot data and solution
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+', 'DisplayName', 'data');
legend show;
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1826187/image.png)
function [fval, grad] = Objective_func(z, k, x, Aineq, bineq, xq, yq)
% z are the current parameters
z=z(:); xq=xq(:); yq=yq(:); I = eye(numel(z));
% Project z onto linearly constrained region
options = optimset('Display', 'off');
zp = lsqlin(I, z, Aineq, bineq, [], [], [], [], [], options);
% Compute norms of residuals
f = spapi(k, x, zp);
dataResidual = fnval(f, xq) - yq;
projectionResidual=zp-z;
fval = norm(dataResidual).^2/2 + norm(projectionResidual)^2/2;
% Compute jacobian and gradient
if nargout > 1
% Find active ctr
idx_active = abs(Aineq * zp - bineq) < 1e-6;
A = Aineq(idx_active,:);
JacP = I- pinv(A) * A;
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = (jacobianT.' * JacP).'*dataResidual(:) + (JacP-I).'*projectionResidual;
end
end
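Since FD gradient checks come up repeatedly here, a small spot check along these lines can help (a sketch; the test point and step size are arbitrary, and agreement is only expected where the active set is locally constant):
zt = randn(numel(x), 1); % arbitrary test point
[f0, g] = Objective_func(zt, k, x, Aineq, bineq, xq, yq);
h = 1e-6; gFD = zeros(size(g));
for i = 1:numel(zt)
e = zeros(size(zt)); e(i) = h;
gFD(i) = (Objective_func(zt + e, k, x, Aineq, bineq, xq, yq) - f0)/h;
end
max(abs(gFD - g)) % should be small away from changes in the active set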
SA-W
on 14 Feb 2025 at 12:00
The main problem seems to be that you are not adding a penalty term on the projection residual, as I advised in this earlier comment. This creates lots of local minima.
Ah, I was not aware that neglecting the constraint violation would lead to multiple local minima since the problem without projection is quadratic.
Below, I added the constraint x(1) = 0, implemented by setting lb(1) = ub(1) = 0 and passing the bounds to both fmincon and the lsqlin solver. Keeping all parameters (also the fixed ones) is easier in my interfaces. You said ...
So, you also need to look at which bounds are active when you form the projection operator pinv(A).
In this example, lb(1) and ub(1) are active by design. But how can we include this when forming JacP?
% Find active ctr
idx_active = abs(Aineq * zp - bineq) < 1e-6;
A = Aineq(idx_active,:); % Active linear constraints, How to include active lb(1) and ub(1) ?
JacP = I- pinv(A) * A;
One idea I have (based on your comment) is to add the unit-vector rows ([-1 0 0 0 0] and [1 0 0 0 0]) to Aineq ...
lb1 = [-1 0 0 0 0]; ub1 = [1 0 0 0 0];
Aineq = [lb1; ub1; Aineq];
bineq = [0; 0; bineq];
% Find active ctr
idx_active = abs(Aineq * zp - bineq) < 1e-6; % Now always includes rows 1 and 2
A = Aineq(idx_active,:);
JacP = I - pinv(A) * A;
but I feel this is not the most efficient way to do this.
Code with x(1) = lb(1) = ub(1) = 0:
% Generate data
xq = linspace(3, 11, 30);
yq = xq.*xq;
% Quadratic spline f and basis for df/dy
k = 3;
x = [3 5 7 9 11];
B = spapi(k, x, eye(numel(x)));
% Linear constraints
Bdd = fnder(B, 2);
Aineq = -Bdd.coefs'; % Shape 3x5
bineq = zeros(size(Aineq, 1), 1); % Shape 3x1
% Bound constraints to fix x(1) = 0
lb = -Inf(numel(x), 1);
ub = +Inf(numel(x), 1);
lb(1) = 0.0;
ub(1) = 0.0;
% Function handle and solver options
objectiveFunc = @(y) Objective_func(y, k, x, lb, ub, Aineq, bineq, xq, yq);
options = optimoptions("fmincon", ...
"FiniteDifferenceStepSize", 1e-6, ...
"SpecifyObjectiveGradient", true);
% Call optimizer
x0 = -x .*x; % x0 violates the constraints
sol = fmincon(objectiveFunc, x0, [], [], [], [], lb, ub, [], options);
% Plot data and solution
f = spapi(k, x, sol);
fnplt(f); hold on;
plot(xq, yq, 'r+', 'DisplayName', 'data');
legend show;
function [fval, grad] = Objective_func(z, k, x, lb, ub, Aineq, bineq, xq, yq)
% z are the current parameters
z=z(:); xq=xq(:); yq=yq(:); I = eye(numel(z));
% Project z onto linearly constrained region
options = optimset('Display', 'off');
zp = lsqlin(I, z, Aineq, bineq, [], [], lb, ub, [], options);
% Compute norms of residuals
f = spapi(k, x, zp);
dataResidual = fnval(f, xq) - yq;
projectionResidual=zp-z;
fval = norm(dataResidual).^2/2 + norm(projectionResidual)^2/2;
% Compute jacobian and gradient
if nargout > 1
% Find active ctr
idx_active = abs(Aineq * zp - bineq) < 1e-6;
A = Aineq(idx_active,:);
JacP = I- pinv(A) * A;
B = spapi(k, x, eye(numel(x)));
jacobianT = fnval(B, xq);
grad = (jacobianT.' * JacP).'*dataResidual(:) + (JacP-I).'*projectionResidual;
end
end
Matt J
on 14 Feb 2025 at 14:28
Edited: Matt J
on 14 Feb 2025 at 15:14
One idea I have (based on your comment) is to add the unit vectors ([-1 0 0 0 0] and [1 0 0 0 0]) to Aineq ...
That's more or less what I would do. If the dimension is large, you might be able to improve efficiency using sparse operations:
function A = getActiveConstraints(zp,Aineq,bineq,Aeq,beq,lb,ub)
arguments
zp (:,1)
Aineq
bineq
Aeq
beq
lb (:,1)
ub (:,1)
end
I=speye(numel(zp));
A=cell(4,1);
if ~isempty(Aineq)
idx=abs(Aineq * zp - bineq) < 1e-6;
A{1}=Aineq(idx,:);
end
% if ~isempty(Aeq)
% idx=abs(Aeq * zp - beq) < 1e-6;
% A{2}=Aeq(idx,:);
% end
A{2}=Aeq; %If lsqlin was successful, equality constraints will always be active
if ~isempty(ub)
idx=abs(zp - ub) < 1e-6;
A{3}=full(I(idx,:));
end
if ~isempty(lb)
idx=abs(zp - lb) < 1e-6;
A{4}=full(-I(idx,:));
end
A=vertcat(A{:});
end
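A hypothetical usage inside the objective function, right after the lsqlin projection (names follow the code above):
Aact = getActiveConstraints(zp, Aineq, bineq, [], [], lb, ub);
I = eye(numel(zp));
JacP = I - pinv(Aact)*Aact; % projector onto the null space of the active rows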
SA-W
on 14 Feb 2025 at 19:45
Edited: SA-W
on 14 Feb 2025 at 19:47
Thank you. Maybe a problem-specific detail:
if ~isempty(Aineq)
idx=abs(Aineq * zp - bineq) < 1e-6;
A{1}=Aineq(idx,:);
end
How should I set the tolerance (here 1e-6) to identify active constraints? In general, my splines have a small co-domain (say the interpolation values are in [0, 1]). Hence, the entries of Aineq, which encodes constraints such as convexity or positivity, are also rather small.
Example: say Aineq has two rows and Aineq*zp produces [1e-5, 1e-7]. With a tolerance of 1e-6, we would only identify the second constraint as active. How does neglecting constraints close to our specified tolerance affect nablaP?
Regardless of the scale of the parameters, I think it is not so easy to find a threshold which separates active constraints from inactive ones at the projected point.
Matt J
on 14 Feb 2025 at 20:24
Edited: Matt J
on 14 Feb 2025 at 20:46
"Hence, the entries of Aineq, which encodes constraints such as convexity or positivity, are also rather small."
Remember that the matrix entries defining a constraint are unique only up to a positive scalar multiple. Accordingly, I usually normalize my constraints so that the rows always have norm =1, e.g.,
s=vecnorm(Aineq,2,2);
Aineq=Aineq./s;
bineq=bineq./s;
This is the same as saying that the hyperplanes that bound your linearly constrained region are represented with unit normals. Once normalized, though, the selection of a threshold tolerance is a subjective thing. You will just have to experiment with what gives good results.
Matt J
on 14 Feb 2025 at 23:51
This is the same as saying that the hyperplanes that bound your linearly constrained region are represented with unit normals.
And consequently, constraints normalized this way will mean that the condition,
abs(Aineq(i,:) * zp - bineq(i))==d
implies that zp is exactly a distance d (in L2-norm) from the constraining hyperplane Aineq(i,:)*x = bineq(i). I don't know if that helps you choose your tolerance, but maybe it will.
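A quick 2-D check of that distance interpretation (hypothetical numbers):
a = [3 4]/5; bi = 1; % normalized row: norm(a) == 1
zp = [2; 1];
abs(a*zp - bi) % = 1, the L2 distance from zp to the hyperplane a*x == bi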
SA-W
on 15 Feb 2025 at 9:13
I don't know if that helps you choose your tolerance, but maybe it will.
Not sure how it might help, but normalizing the rows of Aineq is in any case a good idea if different constraint sets (e.g. f >= 0 and f'' >= 0) are assembled into Aineq, so that the different sets are weighted equally. Does it make sense to formulate it like that?
Another hyperparameter is the weighting factor associated with the projection residual:
zp = lsqlin(I, z, Aineq, bineq, [], [], [], [], [], options);
projRes = zp-z;
fval = fval + weight * norm(projRes).^2
Usually, such a weight is chosen on a case-by-case basis and serves to balance the different contributions to the residual.
In our case, the value we assign to "weight" does not really matter, as the solution is characterized by projRes = 0. So it may affect the path taken to the solution, but it should not affect the solution itself, right?
Matt J
on 15 Feb 2025 at 17:42
"but it should not affect the solution itself, right?"
I dont think it should, no.
More Answers (2)
Shubham
on 5 May 2023
Hi SA-W,
Yes, there are a few options you can try to ensure that the linear inequality constraints are satisfied at intermediate iterations of the interior-point algorithm in fmincon:
- Tighten the tolerances: By tightening the tolerances in the fmincon options, you can force the algorithm to take smaller steps and converge more slowly, but with higher accuracy. This may help to ensure that the linear inequality constraints are satisfied at all intermediate iterations.
- Use a barrier function: You can try using a barrier function to penalize violations of the linear inequality constraints. This can be done by adding a term to the objective function that grows very large as the constraints are violated. This will encourage the algorithm to stay within the feasible region defined by the constraints.
- Use a penalty function: Similar to a barrier function, a penalty function can be used to penalize violations of the linear inequality constraints. However, instead of growing very large, the penalty function grows linearly with the degree of violation. This can be a more computationally efficient approach than a barrier function.
- Use a combination of methods: You can try using a combination of the above methods to ensure that the linear inequality constraints are satisfied at all intermediate iterations. For example, you could tighten the tolerances and use a penalty function or barrier function to further enforce the constraints.
It's important to note that these methods may increase the computational cost of the optimization problem, so it's important to balance the accuracy requirements with the available resources.
Matt J
on 5 May 2023
Within your objective function, check whether the linear inequalities are satisfied. If they are not, abort the function and return Inf. Otherwise, compute the output value as usual.
fun=@(x) myObjective(x, A_ineq,b_ineq);
x=fmincon(fun, x0,A_ineq,b_ineq,[],[],lb,ub,[],options)
function fval=myObjective(x, A_ineq,b_ineq)
if ~all(A_ineq*x<=b_ineq)
fval=Inf; return
end
fval=...
end
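Note that this relies on the solver recovering from non-finite objective values (the interior-point and sqp algorithms are designed to do so), and the start point must itself satisfy the inequalities for the search to get going. A quick guard (a sketch):
assert(all(A_ineq*x0 <= b_ineq), 'x0 must satisfy the linear inequality constraints');
options = optimoptions('fmincon', 'Algorithm', 'interior-point');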