
Using dlarray with betarnd/randg

7 views (last 30 days)
Jack Hunt on 20 Jun 2024
Commented: Jack Hunt on 22 Jun 2024
I am writing a custom layer with the DL toolbox, and part of the forward pass of this layer makes draws from a beta distribution whose b parameter is to be optimised as part of the network training. However, I seem to be having difficulty using betarnd (and by extension randg) with a dlarray-valued parameter.
Consider the following, which works as expected.
>> betarnd(1, 0.1)
ans =
0.2678
However, if I instead do the following, then it does not work.
>> b = dlarray(0.1)
b =
1×1 dlarray
0.1000
>> betarnd(1, b)
Error using randg
SHAPE must be a full real double or single array.
Error in betarnd (line 34)
g2 = randg(b,sizeOut); % could be Infs or NaNs
Is it not possible to use such functions with parameters to be optimised via automatic differentiation (hence dlarray)?
Many thanks

Accepted Answer

Matt J
Matt J on 20 Jun 2024
Edited: Matt J on 20 Jun 2024
Random number generation operations do not have derivatives in the standard sense. You will have to define some approximate derivative for yourself by implementing a backward() method.
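Purely as an illustration of where such a backward() method would live, a custom layer using the forward/backward/memory form of the layer API might be structured roughly as follows. The class name, the way the layer gates its input with a single Beta draw, and the finite-difference rule for the derivative are all assumptions made for this sketch, not something prescribed by the toolbox; the reparametrisation idea is spelled out in more detail in the comments below.
classdef betaSampleLayer < nnet.layer.Layer
    % Illustrative layer: gates its input with a draw S ~ Beta(Alpha, B),
    % where B is a learnable parameter.
    properties (Learnable)
        B       % second shape parameter, optimised during training
    end
    properties
        Alpha   % fixed first shape parameter
    end
    methods
        function layer = betaSampleLayer(alpha)
            layer.Alpha = alpha;
            layer.B = 0.1;
        end
        function [Z, memory] = forward(layer, X)
            % Sample on plain numerics (betainv, like betarnd, cannot take dlarray)
            b = layer.B;
            if isa(b, 'dlarray'), b = extractdata(b); end
            U = rand();                        % saved so backward reuses the same draw
            S = betainv(U, layer.Alpha, b);    % reparametrised Beta sample
            Z = S .* X;
            memory = struct('U', U, 'S', S);
        end
        function Z = predict(layer, X)
            Z = forward(layer, X);             % same sampling at inference time
        end
        function [dLdX, dLdB] = backward(layer, X, ~, dLdZ, memory)
            % Approximate dS/dB by a finite difference at the *same* uniform U
            b = layer.B;
            if isa(b, 'dlarray'), b = extractdata(b); end
            delta = 1e-6;
            dSdB = (betainv(memory.U, layer.Alpha, b + delta) - memory.S) / delta;
            dLdX = memory.S .* dLdZ;                 % Z = S .* X
            dLdB = sum(dLdZ .* X, 'all') .* dSdB;    % chain rule through the scalar S
        end
    end
end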
  2 Comments
Matt J on 20 Jun 2024
Edited: Matt J on 22 Jun 2024
You will have to define some approximate derivative for yourself by implementing a backward() method.
One candidate would be to reparametrize the beta distribution in terms of uniform random variables, U1 and U2, which you would save during forward propagation,
function [Z, U1, U2] = forward_pass(alpha, beta)
    % Generate uniform random variables
    U1 = rand();
    U2 = rand();
    % Generate Gamma(alpha, 1) and Gamma(beta, 1) using the inverse CDF (ppf)
    X = gaminv(U1, alpha, 1);
    Y = gaminv(U2, beta, 1);
    % Combine to get Beta(alpha, beta)
    Z = X / (X + Y);
end
During backpropagation, your backward() method would differentiate non-stochastically with respect to alpha and beta, using the saved U1 and U2 as fixed, given values,
function [dZ_dalpha, dZ_dbeta] = backward_pass(alpha, beta, U1, U2, grad_gaminv)
    % Differentiate gaminv with respect to the shape parameters alpha and beta
    dX_dalpha = grad_gaminv(U1, alpha);
    dY_dbeta = grad_gaminv(U2, beta);
    % Compute partial derivatives of Z with respect to X and Y
    X = gaminv(U1, alpha, 1);
    Y = gaminv(U2, beta, 1);
    dZ_dX = Y / (X + Y)^2;
    dZ_dY = -X / (X + Y)^2;
    % Use the chain rule to compute gradients with respect to alpha and beta
    dZ_dalpha = dZ_dX * dX_dalpha;
    dZ_dbeta = dZ_dY * dY_dbeta;
end
This assumes you have provided a function grad_gaminv() which can differentiate gaminv(), e.g.,
function grad = grad_gaminv(U, shape)
    % Placeholder for the actual derivative computation of gaminv with respect to the shape parameter
    % Here we use a numerical approximation for demonstration
    delta = 1e-6;
    grad = (gaminv(U, shape + delta, 1) - gaminv(U, shape, 1)) / delta;
end
DISCLAIMER: All code above was ChatGPT-generated.
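As a rough, purely illustrative sanity check of the helpers above, one can fix U1 and U2 (so that Z becomes a deterministic function of beta) and compare the dZ_dbeta returned by backward_pass against a central finite difference computed directly through gaminv; the parameter values here are arbitrary.
% Fixed uniforms make Z a deterministic function of (alpha, beta)
alpha = 1;  b = 0.1;
U1 = 0.3;  U2 = 0.6;
% Gradient from the backward helper above
[~, dZ_dbeta] = backward_pass(alpha, b, U1, U2, @grad_gaminv);
% Central finite difference of Z(beta) at the same U1, U2
Zfun = @(bb) gaminv(U1, alpha, 1) / (gaminv(U1, alpha, 1) + gaminv(U2, bb, 1));
h = 1e-5;
dZ_dbeta_fd = (Zfun(b + h) - Zfun(b - h)) / (2*h);
fprintf('backward_pass: %g, finite difference: %g\n', dZ_dbeta, dZ_dbeta_fd);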
Jack Hunt on 22 Jun 2024
I see, so I do indeed need to use a closed-form gradient. I had naively assumed that the autodiff engine would treat the stochastic (rng) quantities as non-stochastic and basically do as you have described above.
Thank you for the answer. I shall work through the maths (re-derive the derivatives) and implement it the manual way. I have been spoiled by autodiff in the last decade or so; it’s been some time since I explicitly wrote a backward pass!


More Answers (0)
