Sparse gpuArray accumulation in for-loop

I am getting an 'Out of memory' error when accumulating sparse gpuArrays in a for-loop.
The following code is inside a function. I need to accumulate the result 'KernelCurrent' of every iteration into the global sparse gpuArray 'Kernel'. Within this function, 'KernelCurrent' is also a sparse gpuArray and has the same size as 'Kernel' (262144×262144).
I have tested every other line of code in this function, and the 'Out of memory' error is caused by the addition (accumulation) step. The memory required to store 'Kernel' and 'KernelCurrent' is less than the 'AvailableMemory' of the gpuDevice.
Kernel = gpuArray(sparse(num_row, num_col))
for
.
.
.
KernelCurrent = Result_oneLoop; % 'KernelCurrent' has the same size as 'Kernel'
Kernel = Kernel + KernelCurrent; % Causing the 'Out of memory' problem
end
The gpuDevice that I can access:
Is there an alternative way to code this that avoids the problem? Thanks in advance!
  2 Comments
Andrea Picciau on 9 Oct 2019
Hi Chen,
How many elements do your sparse matrices have?
CHEN ZIXIANG on 10 Oct 2019
Hi Andrea,
Both sparse matrices (Kernel and KernelCurrent) are of size 262144×262144.


Accepted Answer

Matt J on 10 Oct 2019
Edited: Matt J on 10 Oct 2019
I would guess that your Kernel matrix is becoming less and less sparse as you accumulate, until its memory consumption grows beyond the GPU's capacity. Add the line below and re-run to check.
Kernel = gpuArray(sparse(num_row, num_col))
for
.
.
.
KernelCurrent = Result_oneLoop;
Kernel = Kernel + KernelCurrent;
percent_density=nnz(Kernel)/numel(Kernel)*100, %<---- Add this
end
How large does the percent_density become before the "Out of memory" occurs?
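To illustrate the fill-in effect described above, here is a minimal CPU-only sketch. The matrix size, per-term density, and number of terms are assumptions chosen so it runs quickly, and `sprand` is only a stand-in for the real `Result_oneLoop`; the memory layout of a sparse gpuArray may differ from what `whos` reports for a CPU sparse matrix.

```matlab
% Hypothetical small-scale illustration (not the original Kernel data).
n = 2000;                  % assumed size, much smaller than 262144
density_per_term = 0.001;  % assumed density of each KernelCurrent
num_terms = 20;            % assumed number of loop iterations

rng(0);                    % fixed seed for repeatability
K = sparse(n, n);          % all-zero sparse accumulator
percent_density = zeros(num_terms, 1);
bytes_used = zeros(num_terms, 1);

for k = 1:num_terms
    KernelCurrent = sprand(n, n, density_per_term);  % stand-in result
    K = K + KernelCurrent;                           % accumulation step
    percent_density(k) = nnz(K) / numel(K) * 100;    % same check as above
    info = whos('K');
    bytes_used(k) = info.bytes;                      % CPU storage of K
end

disp([percent_density, bytes_used]);  % density (%) and bytes per iteration
```

Because each term places its nonzeros at different positions, the stored nonzero count of the sum grows with every iteration, and the storage grows with it.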
  1 Comment
CHEN ZIXIANG on 11 Oct 2019
Thank you for your answer!
Yes, the sparsity decreases very quickly as the accumulation goes on.
In the end I preserved the sparsity of Kernel with a sparsity-control vector (size 262144×1) whose entries are 1 or 0 (only 6 elements of the vector have the value 1). The code now becomes:
Kernel = sparse([]);
parfor
.
.
.
KernelCurrent = Result_oneLoop; % 'KernelCurrent' now has size 262144×1
KernelCurrent = KernelCurrent.*Sparsity_Control_Vector;
Kernel = [Kernel, KernelCurrent];
end
As you can see, I no longer use 'gpuArray'. The parallel pool still works, though, and my problem is now solved.
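For anyone who wants to try the masking idea on a small case, here is a minimal sketch. The sizes, the six kept row indices, and `rand` as a stand-in for `Result_oneLoop` are all assumptions for illustration. It uses a plain `for` loop and collects the columns in a cell array before concatenating once, which is a swapped-in variation on the `[Kernel, KernelCurrent]` reduction above rather than the exact code I ran.

```matlab
% Hypothetical small-scale version of the sparsity-control approach.
n = 1000;                                  % assumed column length
num_iters = 50;                            % assumed number of iterations
keep_rows = [3; 120; 121; 500; 777; 999];  % assumed positions of the six 1s

% 0/1 control vector stored as a sparse column
Sparsity_Control_Vector = sparse(keep_rows, 1, 1, n, 1);

cols = cell(1, num_iters);
for k = 1:num_iters
    KernelCurrent = rand(n, 1);            % stand-in for Result_oneLoop
    % keep only the masked entries and store the column as sparse
    cols{k} = sparse(KernelCurrent .* Sparsity_Control_Vector);
end

Kernel = [cols{:}];                        % n-by-num_iters sparse matrix
percent_density = nnz(Kernel) / numel(Kernel) * 100;
```

Each column keeps at most six nonzeros, so the density of Kernel stays fixed no matter how many iterations are run.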


Release

R2019a
