Hi,
I've been writing a deep learning neural network from scratch, so I can get an intuitive understanding of how these models work. The code I've written works fine, and I've spent a great deal of time optimizing it, but I seem to have hit a bottleneck with the GPU code. The network is dynamic: it is built as a struct array, where each element of the array represents one layer. The model uses sigmoid activation functions and a cross-entropy cost function.
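For reference, a minimal sketch of the activation and per-sample cost I mean (sigmoid and xent are just illustrative names here, not functions from the files below):

sigmoid = @(z) 1 ./ (1 + exp(-z));                         % layer activation
xent    = @(a, y) -sum(y.*log(a) + (1 - y).*log(1 - a));   % cross-entropy cost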
First things first, there are three files: the main script, and the backprop and feed_forward functions.
The main script
clc;
clear all; close all;
load ('numbers.mat');

% Convert each scalar label into a 1x10 one-hot vector
for i = 1:length(numbers)
    temp = numbers(i).label;
    numbers(i).label = zeros(1, 10);
    numbers(i).label(temp+1) = 1;
end

validation = numbers(1:10000);
training = numbers(10001:end);

batch_size = 10;
numEpochs = 5;
% Learning-rate schedule, decaying from 0.5 towards 0.025 over the epochs
rateFunc = interp1(0.5 ./ (1:20), linspace(1, 20, numEpochs));

numInput = size(training(1).data, 1) * size(training(1).data, 2);
net = create_net([numInput 100 10]);
numLayers = length(net);

average = [];
for epoch = 1:numEpochs
    tic;
    randIndex = randperm(size(training, 2));
    % Mini-batch stochastic gradient descent over a shuffled training set
    for i = 1:batch_size:length(training)-batch_size
        [net, gradient] = backprop(net, training(randIndex(i:i+batch_size-1)), rateFunc(epoch));
    end
    fprintf('Epoch(%d): %fs', epoch, toc);
    [average(end+1), error] = validate_net(net, validation);
    % Every 5 epochs, check training accuracy and stop early above 99%
    if mod(epoch, 5) == 0
        train_error = validate_net(net, training);
        fprintf('\nError(Training): %f\n', train_error);
        if train_error >= 0.99, break; end
    end
    fprintf('\nError: %f', average(end));
    fprintf('\n---------------\n');
end
function [average, error] = validate_net(net, inputData)
    error = [];
    for i = 1:size(inputData, 2)
        layer = feed_forward(net, inputData(i).data);
        [~, ix] = max(layer(end).a);
        [~, iy] = max(inputData(i).label);
        error = [error; [ix-1, iy-1]];
        average = mean(error(:,1) == error(:,2));
    end
end
function net = create_net(structure)
    numLayers = length(structure) - 1;
    net = struct('b', [], 'w', cell(1, numLayers));
    for i = 1:numLayers
        net(i).w = randn(structure(i), structure(i+1)) / sqrt(structure(i));
        net(i).b = randn(1, structure(i+1));
    end
end
Backprop
function [net, gradient] = backprop(net, inputData, rate)
    numLayers = length(net);
    delta = struct('b', [], 'w', cell(1, length(net)));
    gradient = struct('b', 0, 'w', num2cell(zeros(1, length(net))));
    for i = 1:length(inputData)
        layer = feed_forward(net, inputData(i).data);
        % Output layer: with sigmoid output and cross-entropy cost the delta reduces to (a - y)
        delta(numLayers).b = layer(numLayers).a - inputData(i).label;
        delta(numLayers).w = layer(numLayers-1).a' * delta(numLayers).b;
        % Hidden layers: back-propagate through the sigmoid derivative a.*(1-a)
        for L = numLayers-1:-1:2
            delta(L).b = (delta(L+1).b * net(L+1).w') .* 1./(1 + exp(-layer(L).z)) .* (1 - 1./(1 + exp(-layer(L).z)));
            delta(L).w = layer(L-1).a' * delta(L).b;
        end
        delta(1).b = (delta(2).b * net(2).w') .* 1./(1 + exp(-layer(1).z)) .* (1 - 1./(1 + exp(-layer(1).z)));
        delta(1).w = inputData(i).data' * delta(1).b;
        % Accumulate the gradients over the mini-batch
        for L = 1:numLayers
            gradient(L).b = gradient(L).b + delta(L).b;
            gradient(L).w = gradient(L).w + delta(L).w;
        end
    end
    % Gradient-descent update, averaged over the mini-batch
    for L = 1:numLayers
        net(L).b = net(L).b - rate/length(inputData)*gradient(L).b;
        net(L).w = net(L).w - rate/length(inputData)*gradient(L).w;
    end
end
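As a side note on the first line inside the sample loop: the output delta uses (a - y) directly because, with a sigmoid output and cross-entropy cost, the chain rule collapses to exactly that. A small standalone check (not part of the files above):

z = randn(1, 10);                        % arbitrary pre-activations
y = zeros(1, 10); y(3) = 1;              % one-hot target
a = 1 ./ (1 + exp(-z));                  % sigmoid activation
dCda = -(y ./ a) + (1 - y) ./ (1 - a);   % d(cross-entropy)/da
dadz = a .* (1 - a);                     % d(sigmoid)/dz
max(abs(dCda .* dadz - (a - y)))         % ~1e-16, i.e. identical to (a - y)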
Feed_forward
function layer = feed_forward(net, inputData)
    layer = struct('z', [], 'a', cell(1, length(net)));
    layer(1).z = inputData * net(1).w + net(1).b;
    layer(1).a = 1 ./ (1 + exp(-layer(1).z));
    for i = 2:length(net)
        layer(i).z = layer(i-1).a * net(i).w + net(i).b;
        layer(i).a = 1 ./ (1 + exp(-layer(i).z));
    end
end
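For reference, a single prediction with these functions looks like this (a minimal sketch, assuming each .data field is a 1-by-784 row vector, which is what feed_forward expects):

layer = feed_forward(net, training(1).data);   % layer(end).a is the 1x10 network output
[~, ix] = max(layer(end).a);                   % predicted digit is ix - 1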
The dataset I'm using is the classic MNIST digit-recognition problem, and I've been able to get close to 98% accuracy on it. Training takes roughly 5 seconds per epoch on the CPU, but on the GPU it takes about 6 times as long. I use the GPU by changing the create_net function, like so:
function net = create_net(structure)
    numLayers = length(structure) - 1;
    net = struct('b', [], 'w', cell(1, numLayers));
    for i = 1:numLayers
        net(i).w = gpuArray(randn(structure(i), structure(i+1)) / sqrt(structure(i)));
        net(i).b = gpuArray(randn(1, structure(i+1)));
    end
end
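Everything else is unchanged, so the image data itself still lives in CPU memory; the only gpuArrays are the weights and biases. In case it matters, a hypothetical sketch of what moving the data over as well would look like (I haven't restructured the code around this):

% Hypothetical: push the image data to the GPU once, up front, so each
% feed_forward/backprop call doesn't mix CPU data with gpuArray weights
for i = 1:length(training)
    training(i).data = gpuArray(training(i).data);
end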
Am I doing something wrong here? I would appreciate any feedback on optimizing the code, and on how to resolve this GPU issue.
Thanks for reading