# DerivativeCheck and finite differencing in lsqcurvefit/lsqnonlin

Matt J on 2 Dec 2015
Commented: Matt J on 10 Dec 2015
I am trying to verify my Jacobian calculation for use in lsqcurvefit using the DerivativeCheck option. I typically find that the check fails the 1e-6 tolerance, but only barely:
____________________________________________________________
DerivativeCheck Information
Objective function derivatives:
Maximum relative difference between user-supplied
and finite-difference derivatives = 1.0823e-06.
User-supplied derivative element (27,2): -0.0547181
Finite-difference derivative element (27,2): -0.0547171
____________________________________________________________
Wondering how genuine these failures were, I coded my own finite differencer and found that all elements of my finite-difference Jacobian Jn(i,j) and analytical Jacobian Ja(i,j) could be made to agree well for some (i,j)-dependent choice of the finite differencing stepsize, delta(i,j).
It occurs to me, however, that the Optimization Toolbox finite differencer probably uses only a j-dependent stepsize in its normal mode, rather than an (i,j)-dependent one. Otherwise, the objective function would have to be called numel(Jn) times instead of only size(Jn,2) times. On the other hand, maybe the finite differencer doesn't operate in normal mode when DerivativeCheck is 'on'.
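To make the cost argument concrete, here is what a column-wise (j-dependent) forward differencer looks like. This is a sketch in Python/NumPy for illustration, not the Toolbox's actual internals, and the sqrt(eps)-scaled step is just a common textbook choice, not necessarily what lsqcurvefit uses:

```python
import numpy as np

def fd_jacobian(f, x, steps):
    """Forward-difference Jacobian with one stepsize per column (j-dependent).

    Costs len(x) + 1 evaluations of f. An (i,j)-dependent stepsize would
    instead cost one evaluation per Jacobian *element*, i.e. numel(J) calls.
    """
    f0 = np.asarray(f(x))
    J = np.empty((f0.size, x.size))
    for j in range(x.size):
        xj = x.copy()
        xj[j] += steps[j]                       # perturb one coordinate only
        J[:, j] = (np.asarray(f(xj)) - f0) / steps[j]
    return J

# Example: f(x) = [x0*x1, sin(x0)] has Jacobian [[x1, x0], [cos(x0), 0]]
f = lambda x: np.array([x[0]*x[1], np.sin(x[0])])
x = np.array([0.5, 2.0])
steps = np.sqrt(np.finfo(float).eps) * np.maximum(1.0, np.abs(x))  # textbook choice
J = fd_jacobian(f, x, steps)
```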
My question is whether the Toolbox always uses only a j-dependent delta in its finite differencing. If so, it might be impossible in some cases ever to pass the DerivativeCheck tolerance. The example below attempts to illustrate this: it plots the worst error in the finite-difference Jacobian as a function of the stepsize delta, and the resulting plot shows that the 1e-6 tolerance threshold is never achieved over a broad range of deltas.
f = @(x) [1000+0.9*x ; cos(100*x)];
Ja = [0.9; 0];                  % true Jacobian at x = 0
delta = logspace(-15,-1,100);   % stepsizes to try
Error = zeros(numel(Ja), numel(delta));
for i = 1:numel(delta)
    Error(:,i) = abs( (f(delta(i)) - f(0))/delta(i) - Ja );  % forward-difference error
end
loglog(delta, max(Error));
ylabel 'Worst Error'
xlabel 'Stepsize'
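For anyone who wants to reproduce this without MATLAB, the same sweep in Python/NumPy (double precision, so the same arithmetic) shows the floor: the rounding error from the 1000 offset and the truncation error from cos(100*x) cannot both be made small with a single delta.

```python
import numpy as np

# Same experiment as the MATLAB snippet above, at x = 0
f = lambda x: np.array([1000 + 0.9*x, np.cos(100*x)])
Ja = np.array([0.9, 0.0])                       # analytic Jacobian at x = 0

deltas = np.logspace(-15, -1, 100)
worst = np.array([np.max(np.abs((f(d) - f(0.0)) / d - Ja)) for d in deltas])

# Even the best single stepsize leaves a worst-case error above the
# 1e-6 DerivativeCheck tolerance:
best_achievable = worst.min()
```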
Matt J on 9 Dec 2015
As further evidence of this problem, I've applied DerivativeCheck to the simplified example from my initial post, looping over a range of values for DiffMinChange/DiffMaxChange. The code fails to find any settings that succeed. I don't think my Jacobian formulas are wrong. Is there something I'm missing about how to make this work?
N = 100;
delta = logspace(-12,-1,N);
count = 0;
for i = 1:N
    options = optimoptions(@lsqnonlin,'DerivativeCheck','on',...
        'Jacobian','on',...
        'DiffMinChange',delta(i)/1000,'DiffMaxChange',delta(i));
    try
        lsqnonlin(@fun,0,[],[],options);
    catch ME   % a DerivativeCheck failure is thrown as an error
        count = count + 1;
    end
end
if count==N, disp 'All delta choices failed'; end

function [F,J] = fun(x)
    F = [1000+0.9*x ; cos(100*x)];
    J = [0.9; -100*sin(100*x)];
end

Steve Grikschat on 9 Dec 2015
Hi Matt,
Your assessment is correct: the stepsize is determined relative to the scale of the variables, i.e., one step per column of J. See the note in the TypicalX description. The relative stepsize is chosen to minimize the approximation error.
This case highlights that the error bound is determined by the functions themselves. The tolerance used by DerivativeCheck is not adaptive in the same way (or at all, actually). The check could be improved by using an adaptive tolerance, or by accepting a user-specified one.
Using central finite-differences may improve the "hit" rate for DerivativeCheck since the error bound is tighter, but keep in mind that it will be used throughout the optimization.
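The reason central differencing helps: its truncation error is O(h^2) rather than O(h), so the best achievable error is much smaller. A quick illustration in Python (not Toolbox code; exp is just a convenient test function whose derivative at 0 is exactly 1):

```python
import numpy as np

f = np.exp            # test function; f'(0) = 1 exactly
h = 1e-5

fwd = (f(h) - f(0.0)) / h         # forward:  truncation error ~ f''(0)*h/2
ctr = (f(h) - f(-h)) / (2*h)      # central:  truncation error ~ f'''(0)*h**2/6
```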
For now, the more realistic approach is to use DerivativeCheck with the fixed tolerance in mind and apply your own judgment about what is acceptable.
Matt J on 10 Dec 2015
> Using central finite-differences may improve the "hit" rate for DerivativeCheck since the error bound is tighter, but keep in mind that it will be used throughout the optimization.
Thanks, Steve. I do indeed find that central finite-differences improves the hit rate to 42% of the range of deltas that I loop over in my toy example.
However, I'm not sure what you meant by "keep in mind that it will be used throughout the optimization". DerivativeCheck only appears to work when the Jacobian option is 'on' and, in that case, finite differencing is not used in the optimization, in any way that is obvious to me.
Accordingly, I wonder if what this case really highlights is that DerivativeCheck should be its own separate function, rather than part of the solver. It is, after all, something you only need to do once, to make sure your analytical derivative formulas are correct, rather than every time you solve. Moreover, in a separate function that you only run once, you can afford more computationally expensive, adaptive finite differencing (e.g., (i,j)-dependent stepsizes) to get more robust results.
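To sketch what such a standalone checker might look like (Python/NumPy, and `check_jacobian` is a hypothetical name, not an existing Toolbox function): for each Jacobian element, scan a range of stepsizes and keep the best agreement, so a correct analytic formula is recognized even when no single stepsize works for every element.

```python
import numpy as np

def check_jacobian(f, Jfun, x, deltas=np.logspace(-12, -3, 40)):
    """Element-wise adaptive derivative check (hypothetical helper).

    For each element (i,j), take the smallest discrepancy between the
    analytic Jacobian and a forward difference over a range of stepsizes,
    then report the worst such element.
    """
    Ja = np.atleast_2d(np.asarray(Jfun(x)))
    f0 = np.asarray(f(x))
    best = np.full(Ja.shape, np.inf)
    for j in range(x.size):
        for d in deltas:
            xj = x.copy()
            xj[j] += d
            col = (np.asarray(f(xj)) - f0) / d            # forward difference
            best[:, j] = np.minimum(best[:, j], np.abs(col - Ja[:, j]))
    return best.max()   # worst element after per-element stepsize tuning

# Matt's toy example: no single delta passes, but the adaptive check does.
f    = lambda x: np.array([1000 + 0.9*x[0], np.cos(100*x[0])])
Jfun = lambda x: np.array([[0.9], [-100*np.sin(100*x[0])]])
err = check_jacobian(f, Jfun, np.array([0.0]))

# A deliberately wrong Jacobian (0.9 -> 1.0) is still flagged:
err_bad = check_jacobian(f, lambda x: np.array([[1.0], [-100*np.sin(100*x[0])]]),
                         np.array([0.0]))
```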