How to create an initialize function for a custom layer where the learnable parameters have the same size as the input?

I want to write an initialize function inside a custom layer where the learnable parameters have the same size as the (unknown) input. Is it possible? I understood from Define Custom Deep Learning Layer with Learnable Parameters - MATLAB & Simulink that this can be achieved by using networkDataLayout objects. For instance, while creating the deep learning network, MATLAB will analyze the network using an input with a batch size of 1, and later, during training, this will change based on the batch size provided in the training options. Is there any way to initialize the custom layer in that way?

Answers (1)

I am assuming you are using MATLAB R2024b.
You can initialize such a layer by implementing the initialize() method of your custom layer. The initialize() method takes two arguments: the layer object itself and a networkDataLayout object that represents the input layout of the layer (usually referred to as layout in the method's implementation):
function layer = initialize(layer,layout)
    % (Optional) Initialize layer learnable and state parameters.
    %
    % Inputs:
    %     layer  - Layer to initialize
    %     layout - Data layout, specified as a networkDataLayout
    %              object
    %
    % Outputs:
    %     layer - Initialized layer
    %
    % - For layers with multiple inputs, replace layout with
    %   layout1,...,layoutN, where N is the number of inputs.

    % Define layer initialization function here.
end
To access the input size, you can use the Size property of the networkDataLayout object. When you train a network which contains your custom layer, MATLAB will automatically create a networkDataLayout object with the size of the incoming inputs to the layer and pass it to this method for layer initialization.
The example you shared also shows an implementation of the initialize() function where the parameters are initialized based on the channel dimension of the input: https://www.mathworks.com/help/deeplearning/ug/define-custom-deep-learning-layer.html#mw_0679ac65-be66-477c-9a76-912c32c1ab27.
You can adapt the example to use all the dimensions of the input. For example:
classdef customLayer < nnet.layer.Layer
    properties (Learnable)
        Parameter
    end

    % Other code

    methods
        function layer = initialize(layer, layout)
            if isempty(layer.Parameter)
                % Assign to the layer property, not a local variable
                layer.Parameter = randn(layout.Size);
            end
        end
    end
end
If you'd like to test the initialization without creating a full-fledged network, you can do something like this:
% Define input size - This will vary based on what your layer does
inputSize = [224 224 3];
% Manually create a networkDataLayout object
layout = networkDataLayout(inputSize, "SSC");
% Create layer
layer = customLayer();
% Manually initialize the layer
layer = initialize(layer, layout);
% Check the size of the parameter
isequal(size(layer.Parameter), inputSize) % returns true (logical 1)
Hope this helps!

13 Comments

Thank you @Malay Agarwal for the response. I have tried Parameter = randn(layout.Size); to initialize the parameter, but this error is shown:
"The function threw an error and could not be executed.
Error using randn
NaN and Inf not allowed."
I am using MATLAB R2023a.
Could you share the implementation of the initialize() method and how you're calling it?
I hereby attach the initialize function file that I have used. Kindly check and thank you for the help.
The issue doesn't seem to be reproducible on my end. I am able to initialize the layer successfully:
layer = sample1();
inputSize = [224 224 3];
% Pass the input size and the input format to create the layout
layout = networkDataLayout(inputSize, "SSC");
layer = initialize(layer, layout)
layer =
  sample1 with properties:

    Name: ''

  Learnable Parameters
    Wq: [224x224x3 dlarray]
    Wk: [224x224x3 dlarray]
    Wv: [224x224x3 dlarray]
    Wo: [224x224x3 dlarray]

  State Parameters
    No properties.

Use properties method to see a list of all properties.
Make sure you're creating the networkDataLayout object correctly. You might be using a NaN in the input size to denote the batch dimension. Generally speaking, layer parameters do not depend on the batch dimension (due to vectorization). You can do something like this:
function layer = initialize(layer,layout)
    % All dimensions except the last one, which is NaN
    sz = layout.Size(1:end-1);
    if isempty(layer.Wq)
        layer.Wq = dlarray(rand(sz,'single'));
    end
    if isempty(layer.Wk)
        layer.Wk = dlarray(rand(sz,'single'));
    end
    if isempty(layer.Wv)
        layer.Wv = dlarray(rand(sz,'single'));
    end
    if isempty(layer.Wo)
        layer.Wo = dlarray(rand(sz,'single'));
    end
end
For a more robust solution, you can extract the specific dimensions you care about using the finddim function and initialize the parameters based on that: https://www.mathworks.com/help/releases/R2023a/deeplearning/ref/dlarray.finddim.html.
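As a rough sketch (sizes here are assumed for illustration), finddim lets you pick out a dimension by its label instead of relying on its position in the size vector:

```matlab
% Create a layout with an unknown batch size (NaN)
layout = networkDataLayout([64 NaN], "CB");

% Find the index of the channel ("C") dimension by label
cdim = finddim(layout, "C");

% Use that index to read the channel count from the layout
numChannels = layout.Size(cdim);   % 64
```

This way the initialization keeps working even if the dimension order of the incoming layout changes.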
Thank you @Malay Agarwal. Actually, my input data has a dimension format of "CB". Now the initialize function is not throwing the error, but in the predict function
function Z = predict(layer, X) % only part of the predict function is shown
    X = stripdims(X);
    S = pagemtimes(X, layer.Wq);
    Z = dlarray(S, "CB");
end
an error is thrown:
"Error using pagemtimes: Incorrect dimensions for matrix multiplication. Check that the number of columns in the first array matches the number of rows in the second array."
Even when I used the transpose of the first and second variables inside pagemtimes, it did not work. Is there a problem with the layer extracting the batch dimension? Without taking the batch dimension, how is the pagemtimes execution possible, given that stripdims is required to remove the dimension format? Is there anything wrong in how I am dealing with the batch dimension?
Initially, I used the finddim function to extract a specific dimension, but when I checked the DAGNetwork after training, the layers had learnable parameters of size 1-by-1 (singleton); that's why I raised this doubt.
To use pagemtimes, at least one of the matrices needs to have 3 or more dimensions: https://www.mathworks.com/help/releases/R2023a/matlab/ref/pagemtimes.html#mw_b21f0712-c0fb-44a9-b169-cdf4f3e00218.
If your input is in "CB" format, this is not possible and pagemtimes is not going to work.
Is there any reason why you're using pagemtimes? Maybe you can provide more details about the kind of layer you're creating such as the input it expects (images, sequences) and the equation for the predict function?
Inputs to the network are images in "SSCB" format, but the input to the above-mentioned custom layer comes from a fully connected layer (followed by a flatten layer). So the format of the input to the custom layer is "CB". I want to multiply the input of the custom layer (format "CB") with the learnable parameters Wq, Wk, Wv and Wo, i.e. X*Wq, X*Wk, X*Wv and X*Wo. Is it possible to do the multiplication like this inside the predict function? Instead of pagemtimes, * can be used for matrix multiplication, but the error is the same.
Actually, I just checked and pagemtimes does work with 2-D matrices; it simply does a normal multiplication between the matrices.
X = rand(3, 2);
Y = rand(2, 3);
pagemtimes(X, Y)
ans = 3×3
    0.4673    0.5950    0.5225
    0.4376    0.5502    0.4481
    0.4874    0.6189    0.5352
This suggests the sizes of X and Wq are not compatible. If you want to do matrix multiplication, the number of columns in the left operand should be the same as the number of rows in the right operand. I assume you are initializing Wq as:
sz = layout.Size(1:end-1);
Wq = randn(sz);
Since your input is in "CB" format, sz is a single number. When randn is passed a single number for the size, it creates a square matrix.
randn(3)
ans = 3×3
   -0.4397    0.1507   -0.6891
    0.2824   -0.9414    0.6363
   -0.8052   -0.4188    0.1079
So it's possible that Wq is actually c-by-c, where c is the number of channels. I think what you need is a 1-by-c matrix (row vector). You can do this instead in the initialize() method:
sz = layout.Size(1:end-1);
Wq = randn([1 sz]); % Create a row vector
Then in the predict() method, you can multiply as follows:
Z = layer.Wq * X; % 1xc * cxb
This will output a 1-by-b matrix in "CB" format.
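Putting the pieces above together, a sketch of the multiplication in predict() could look like this (parameter and method names taken from the thread; this is not a complete layer implementation):

```matlab
function Z = predict(layer, X)
    % X arrives as a formatted dlarray in "CB" format (c-by-b)
    Xs = stripdims(X);                 % drop the labels for plain matrix math
    Z = dlarray(layer.Wq * Xs, "CB");  % (1-by-c) * (c-by-b) -> 1-by-b, "CB"
end
```

Using stripdims before the multiplication mirrors the pattern already used elsewhere in this thread and avoids any ambiguity about how formatted dlarray dimensions are matched.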
Thank you @Malay Agarwal for your kind patience. Yes, pagemtimes works as normal matrix multiplication for 2-D inputs. I will elaborate my problem again:
  1. X is the input, a formatted dlarray of format "CB" with size M-by-N.
  2. Wq, Wk, Wv and Wo are learnable parameters. Every parameter is of size N-by-N.
  3. Q = X*Wq; K = X*Wk; V = X*Wv % Q, K and V are not formatted dlarray objects, so Q, K and V are obtained with sizes of M-by-N.
  4. A = attention(Q,K,V,NumHeads,AttentionMask="causal",DataFormat="CB") % so A will be of size M-by-N with data format "CB".
  5. A = stripdims(A);
     Z = pagemtimes(A, layer.Wo);
     Z = dlarray(Z, "CB");
So the output has the same size and format as the input. Therefore, if Wq, Wk, Wv and Wo can be defined using dlarray(rand(sz,'single')), where sz is of size N or N-by-N, the problem will be solved. I initially tried finddim to extract the batch size, but it was extracted as a singleton. Is it possible to extract the batch dimension of the input in the custom layer?
If you are implementing attention, you actually need the channel dimension and not the batch dimension for the multiplication so the current dimensions of Wq are correct. You only need to reverse the order of the multiplication:
Z = layer.Wq * X;
In other frameworks, the batch dimension is usually the first dimension, followed by the channel dimension, which might be causing confusion here. In MATLAB, the channel dimension comes first, followed by the batch dimension.
You can take a look at this resource for understanding how attention works (here, the batch dimension is in the front): https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html.
Thank you for spending time on this question. Yes, what you said is right; however, since I am using a flatten layer before the custom layer, the channel dimension is large, so while training the system runs out of memory. This is the reason I tried extracting the batch dimension. Is it possible to extract any dimension other than the channel dimension?
I think if you use the layer in a network, you can access the batch dimension just like any other dimension using finddim by passing "B" to the label argument. The issue with NaN that you were facing earlier was only because you were passing NaN as the last dimension in the input size.
inputSize = [3 NaN];
But if you do use the batch dimension, your implementation might not be correct and you may not get the results you're expecting.
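To illustrate the point above (with an assumed layout), the batch dimension can be located the same way as any other, but its size is typically NaN at analysis time, which is why parameters should not depend on it:

```matlab
% "CB" layout with 10 channels and an unknown batch size
layout = networkDataLayout([10 NaN], "CB");

bdim = finddim(layout, "B");       % index of the batch dimension
batchSize = layout.Size(bdim);     % NaN until real data flows through
```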
Yes, maybe this is right for higher-dimensional input data with formats like "SSCB", "SSCBT", etc. But in this case the input is 2-D, so both rows and columns engage in the multiplication operation, and it may not affect the efficiency of the network. Anyway, I will try that again.


Asked: on 25 Sep 2024
Commented: on 26 Sep 2024
