How to find the derivative of a function with sums and conditions?

Question

Klaudio Luku on 11 Mar 2021

0
Link

Direct link to this question

https://nl.mathworks.com/matlabcentral/answers/769702-how-to-find-the-derivative-of-a-function-with-sums-and-conditions

Edited: Klaudio Luku on 17 Mar 2021

Accepted Answer: Klaudio Luku

Open in MATLAB Online

Hi

I'm trying to solve this problem, in my function, that i show you below.

function p_out = quantreg_descent(x,y,tau)
% 
%
% ---------- Quantile Regression -------------
%
%
pmean=x\y; 
rho_fun = @(r) sum(abs(r).*abs(tau-(r<0))); 
p_fun = @(p) rho_fun(y-x*p);
x0 = pmean;
tol = 10^-5;
kmax = 500;
meth = 3;
p_out = descent(p_fun, diff_p_fun, x0, tol, kmax, meth);
end

What i need to do is to put into the 'descent' function the p_fun and the diff_p_fun that should be his derivative. I've tried to use symbolic toolbox but it seems that matlab has problems with symbolic functions like this.

To be sure i will put all the code of the other functions.

function fmin = descent(fun, grad, x0, tol, kmax, meth, varargin)
%
% DESCENT Metodo di discesa per il calcolo di minimi.
%
%   [x, err, iter]= descent(fun, grad, x0, tol, kmax, meth, hess)
%   approssima un punto di minimo della funzione 'fun' mediante il metodo 
%   di discesa con direzioni di Newton (meth=1), BFGS (meth=2),
%   del gradiente (meth=3) o del gradiente coniugato con beta_k di
%   Fletcher and Reeves (meth=41), beta_k di Polak and Ribiere (meth=42),
%   beta_k di Hestenes and Stiefel (meth=43).
%   - Il passo è costruito con la tecnica di backtracking
%   - fun, grad ed hess (quest'ultima usata solo se METH=1) sono function 
%     handle associati alla funzione obiettivo, al suo gradiente
%     ed alla matrice Hessiana.
%   - Se mets=2, hess è una matrice approssimante l'Hessiana nel punto
%     iniziale x0 della successione.
%   - tol è la tolleranza per il test d'arresto e kmax è il numero
%     massimo di iterazioni. 
%   
%   Si richiama le function backtrack.m
%% Verifica sui parametri di imput
if nargin>6
    if meth==1, hess=varargin{1};
    elseif meth==2, H=varargin{1};
    end
end
%% Definizione parametri
err = tol+1;  k = 0;   xk = x0(:);  gk = grad(xk);  dk = -gk;
eps2 = sqrt(eps);
%% Routine
while err>tol && k< kmax
    % Metodi
    if meth==1        
       H = hess(xk);  dk = -H\gk;       % Newton
    elseif meth==2     
           dk = -H\gk;                  % BFGS
    elseif meth==3     
           dk = -gk;                    % gradient
    end
    [xk1,~] = backtrack(fun,xk,gk,dk);
    gk1 = grad(xk1);
    if meth==2  % BFGS update
       yk = gk1-gk;  sk = xk1-xk;  yks = yk'*sk;
       if yks>eps2*norm(sk)*norm(yk)
          Hs = H*sk;
          H = H+(yk*yk')/yks - (Hs*Hs')/(sk'*Hs);
       end
       elseif meth>=40  % CG upgrade
       if meth == 41
          betak = (gk1'*gk1)/(gk'*gk);    % FR
       elseif meth == 42
              betak = (gk1'*(gk1-gk))/(gk'*gk);   % PR
       elseif meth == 43
              betak = (gk1'*(gk1-gk))/(dk'*(gk1-gk));    % HS
       end
      dk = -gk1+betak*dk;
    end
    xk = xk1;  gk = gk1;  k = k+1;  xkt = xk1;
    for i=1:length(xk1);  xkt(i)=max([abs(xk1(i)),1]);
    end
    err = norm((gk1.*xkt)/max([abs(fun(xk1)),1]),inf);
end
x = xk;  iter = k;
fmin = fun(x);
%% Messaggio di errore
if (k==kmax && err > tol)
  fprintf(['descent si e'' arrestato senza aver ',...
   'soddisfatto l''accuratezza richiesta, avendo\n',...
   'raggiunto il massimo numero di iterazioni\n']);
end

and the backtrack function

function [x, alphak] = backtrack(fun, xk, gk, dk, varargin)
%
% BACKTRACK Metodo backtracking per line-search
%
%   [x, alphak] = backtrack(fun, xk, gk, dk) calcola x_{k+1} = x_k+alpha_k 
%   e d_k del metodo di discesa, in cui alpha_k è costruito con la tecnica 
%   di backtracking, con sigma = 1.e-4 e rho = 1/4.
%
%   [x, alphak] = backtrack(fun, xk, gk, dk, sigma, rho) permette di
%   precisare i valori dei parametri sigma e rho. 
%   - Tipicamente 1*e-4<sigma<0.1 e 1/10<rho<1/2
%   - 'fun' é un function handle associato alla funzione obiettivo.
%
%   xk contiene l'elemento x_k della successione, gk il gradiente di 'fun'
%   in xk e dk la direzione d_k.
%% Verifica sul numero di imput
if nargin==4
    sigma = 1.e-4;  rho = 1/4;
else
    sigma=varargin{1}; rho=varargin{2};
end
%% Parametri ed inizializzazione
alphamin = 1.e-5; % valore minimo per il passo alpha
alphak = 1;  fk = fun(xk);
k = 0;  x = xk+alphak*dk;
%% Routine
while fun(x)>fk+sigma*alphak*gk'*dk  &&  alphak>alphamin
    alphak = alphak*rho;
    x = xk+alphak*dk;  k = k+1;
end
end

sample values of x,y, and tau are the seguent:

x_1 = [1:17]';
x = [ones(17,1), x_1];
y = [27.2; 27.3; 27.4; 27.8; 28.2; 28.6; 29.0; 29.3; 29.3; 29.4; 29.6; 29.8; 30.2; 30.5; 30.7; 30.7; 30.8];
tau = 0.5;

Hope you could help me

8 Comments
Show 6 older commentsHide 6 older comments

Klaudio Luku on 11 Mar 2021

Edited: Klaudio Luku on 11 Mar 2021

Open in MATLAB Online

Unfortunately there are no constrains i can put on my imputs or on the process. It is substantially a unconstrained optimization problem, and i cant use any given function to solve it.

The real issue is that using for example the symbolic toolbox, the function generated by Matlab its really complex and hard to even understand what is doing.

The function generated is a vector of 2 symbolic equation like the ones below.

val = 
piecewise(154/5 < p, (17*p)/2 - 2479/10 < (p - 29)*(p - 57/2) + (p - 30)*(p - 61/2) +...
    (p - 136/5)*(p - 267/10) + (p - 137/5)*(p - 269/10) + (p - 134/5)*(p - 273/10) +...
    (p - 139/5)*(p - 273/10) + (p - 141/5)*(p - 277/10) + (p - 143/5)*(p - 281/10) +...
    (p - 147/5)*(p - 289/10) + 2*(p - 144/5)*(p - 293/10) + (p - 148/5)*(p - 291/10) +...
    (p - 149/5)*(p - 293/10) + (p - 151/5)*(p - 297/10) + (p - 154/5)*(p - 303/10) +...
    2*(p - 151/5)*(p - 307/10))
val = 
piecewise(136/5 < p, (153*p)/2 - 2479/10 < (7*p - 29)*(7*p - 57/2) + (14*p - 30)*(14*p - 61/2) +...
    (2*p - 134/5)*(2*p - 273/10) + (3*p - 137/5)*(3*p - 269/10) + (4*p - 139/5)*(4*p - 273/10) +...
    (5*p - 141/5)*(5*p - 277/10) + (6*p - 143/5)*(6*p - 281/10) + (8*p - 144/5)*(8*p - 293/10) +...
    (9*p - 144/5)*(9*p - 293/10) + (10*p - 147/5)*(10*p - 289/10) + (11*p - 148/5)*(11*p - 291/10) +...
    (12*p - 149/5)*(12*p - 293/10) + (13*p - 151/5)*(13*p - 297/10) + (15*p - 151/5)*(15*p - 307/10) +...
    (16*p - 151/5)*(16*p - 307/10) + (17*p - 154/5)*(17*p - 303/10) + (p - 136/5)*(p - 267/10))

Walter Roberson on 12 Mar 2021

Edited: Walter Roberson on 16 Mar 2021

Open in MATLAB Online

I am having difficulty telling whether p is intended to be a scalar or a column vector of length 2. You seem to be assuming it is a scalar, but when it is then because x is a vector with two columns, x*p is generating a 2D array and you sum() across one dimension leaving rho_fun to emit a vector of length 2. Your pmean is also a vector of length 2, and that becomes x0. I suspect that your descent is going to lead you to the p value becoming a vector of length 2 even if it starts out a scalar.

Whereas if p is a column vector of length 2 then x*p is N x 2 * 2 x 1 giving N x 1 and sum() of that gives a scalar, which is plausible.

But if p is a scalar then the question of what you are taking the derivative with respect to becomes easy, but if p is a column vector of length 2 then it is not clear what the diff() is with respect to.

If p is intended to be a scalar, then with x being N x 2 so your calculations becoming N x 2, is your sum intended to be along the first dimension, giving you an 1 x 2 result, or is it intended to be along the second dimension giving you an N x 1 result?

When I make p a scalar and assume summation along the first dimension to give a 1 x 2 result, then the derivative starts looking like

piecewise(p == 29, -2, p == 61/2, 5, p == 136/5, -8, p == 137/5, -6, p == 139/5, -5

-- a bunch of individual constants . That's because rho_fun becomes things like

piecewise(p <= 136/5, 2479/10 - (17*p)/2, 154/5 <= p, (17*p)/2 - 2479/10

and derivative of those terms linear in p is piecewise constants.

We might hypothesize that MATLAB is making an error in converting the <= into == when we take the derivative (something that would need further investigations), but you would still have the fundamental problem that you have a piecewise constant derivative and you will find that the descent algorithms you are using fail in such cases.

Walter Roberson on 16 Mar 2021

Open in MATLAB Online

If p is a vector, then with diff_p_fun being the derivative of p_fun, what variable is the derivative taken over?

Do you think that there is any other possibility to solve this problem

No.

By examination we can see that

sum(abs(r).*abs(tau-(r<0)))

can be rewritten in terms of the sum of a number of piecewise() terms that together get rid of the abs() calls

piecewise(r < 0 & tau-1 <= 0, -r .* -(tau-1), ...
          r < 0 & tau-1 > 0, -r .* (tau-1), ...
          r >= 0 & tau <= 0, r .* -tau, ...
          r >= 0 & tau > 0, r .* tau)

and rho_fun is the sum of those terms over all r values. The summation can be rewritten to move tau mostly outside. You can see that each of the piecewise terms is linear in r, so the summation will be linear in any variable contained within r that appears linearly in r

rho_fun is invoked on y-x*p with y and x constant. Whether p is scalar or column vector, x*p is linear in the variables contained in p, and y minus that does not change the linearity. Then when you put y-x*p through rho_fun the resulting summation is going to be linear in the variables in p, except as modified by the action of the piecewise boundary conditions.

Now take the derivative of the rho_fun to try to satisfy the requirement that diff_p_fun is the derivative of p_fun. But with respect to which variable? Pick any one of the variables and since each of the terms is linear, the derivative is constant.

Thus, diff_p_fun is going to be a series of piecewise constants. But all of your methods that actually use the derivative, assume that the derivative is a continuous function.

Klaudio Luku on 17 Mar 2021

I found out how to solve the problem, i will post the answer below.

Sign in to comment.

Sign in to answer this question.

Answer 1

Klaudio Luku on 17 Mar 2021

0
Link

Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/769702-how-to-find-the-derivative-of-a-function-with-sums-and-conditions#answer_650487

Edited: Klaudio Luku on 17 Mar 2021

Open in MATLAB Online

The best way to define this derivative function is the seguent:

function grad = diff_p_fun(x,y,tau,p)
%
% -------- Derivata della funzione p_fun --------
%
%% Definizione parametri
r = y-x*p;
x1 = x(:,1);
x2 = x(:,2);
gamma1 = ((r)'*(-x1))/(norm(r));
gamma2 = ((r)'*(-x2))/(norm(r));
%% Calcolo della derivata e setup output
diff_p_fun_1 = sum(   (gamma1).*abs(tau-(r<0)) + ...
                      abs(r).*(((tau-(r<0))'*ones(17,1))/(norm(tau-(r<0))))   );
diff_p_fun_2 = sum(   (gamma2).*abs(tau-(r<0)) + ...
                      abs(r).*(((tau-(r<0))'*ones(17,1))/(norm(tau-(r<0))))   );
grad1 = diff_p_fun_1;
grad2 = diff_p_fun_2;
grad = [grad1;grad2];
end

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

How to find the derivative of a function with sums and conditions?

8 Comments
Show 6 older commentsHide 6 older comments

Accepted Answer

0 Comments
Show -2 older commentsHide -2 older comments

More Answers (0)

See Also

Categories

Tags

Products

Release

Community Treasure Hunt

How to find the derivative of a function with sums and conditions?

8 Comments Show 6 older commentsHide 6 older comments

Accepted Answer

0 Comments Show -2 older commentsHide -2 older comments

More Answers (0)

See Also

Categories

Tags

Products

Release

Community Treasure Hunt

8 Comments
Show 6 older commentsHide 6 older comments

0 Comments
Show -2 older commentsHide -2 older comments