'Out of memory on device' error for GPU

Peter on 3 May 2016
Commented: Joss Knight on 9 May 2016
When trying to replace a value in an existing large GPU array A, e.g. like this:
A(1,1)=rand(1,1,'single')
I get the error:
"Error using gpuArray/subsasgn
Out of memory on device. To view more detail about available memory on the GPU, use 'gpuDevice()'. If the problem persists, reset the GPU by calling 'gpuDevice(1)'."
I don't understand this, since I would not expect the operation to need any additional GPU memory. Closer investigation showed, however, that even this simple assignment only succeeds when the amount of free GPU memory is roughly equal to the memory taken by A itself.
When working with large variables, this means that a chunk of GPU memory about the size of A has to stay free at all times, which is quite inefficient. Any explanation of or solution to this issue would be very welcome, thanks!
I'm using MATLAB R2016a.
Here's the output from gpuDevice:
Name: 'Tesla K20c'
Index: 1
ComputeCapability: '3.5'
SupportsDouble: 1
DriverVersion: 7.5000
ToolkitVersion: 7.5000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 5.0330e+09
AvailableMemory: 2.4498e+09
MultiprocessorCount: 13
ClockRateKHz: 705500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1

Answers (1)

Edric Ellis on 3 May 2016
Edited: Edric Ellis on 3 May 2016
This behaviour is related to what MATLAB calls "in-place optimization". There's some background in this entry in Loren's blog. Basically, MATLAB treats assignment of the form:
A(x) = B;
as equivalent to a function call a bit like this:
A = subsasgn(A, substruct('()', {x}), B);
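The equivalence can be checked on an ordinary CPU array, so no GPU is needed. This is only a minimal sketch: the array size and the index 3 are arbitrary choices for illustration, and the index is wrapped in a cell array because that is the form substruct expects for '()' subscripts.

```matlab
% Two copies of the same small array (sizes chosen only for illustration).
A = zeros(1, 5);
B = zeros(1, 5);

% Ordinary indexed assignment.
A(3) = 7;

% The same assignment written as an explicit subsasgn call.
B = subsasgn(B, substruct('()', {3}), 7);

% Both forms should give the same result.
assert(isequal(A, B));
```

Written this way, it is easier to see why a copy can appear: the function form takes the array as an input and returns a new value for it, so without the in-place optimization the input and the output both exist at once.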
To avoid duplicating A, the in-place optimization must take place. Unfortunately, this doesn't happen when the commands are executed at the command line, but it does happen inside functions. So, for example, on my GPU device, which has 6 GB of RAM, the following two commands error at the command line but succeed inside a function (A alone occupies 4 GB, so a second copy of it cannot fit):
A = gpuArray.zeros(1, 2e9, 'uint16');
A(1) = 2;
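As a rough sketch of the "inside a function" variant, the same two commands could be wrapped as below. The function name inplaceAssignDemo and the parameter n are made up for illustration and are not part of the original answer. Whether the assignment really avoids a second copy depends on the MATLAB release and on the in-place optimization being applied, so treat this as something to try rather than a guarantee.

```matlab
function firstValue = inplaceAssignDemo(n)
% INPLACEASSIGNDEMO  Hypothetical wrapper around the two commands above.
%   n is the number of uint16 elements to allocate on the GPU
%   (2e9 elements corresponds to the 4 GB array in the example).

% A is created inside the function, so it only exists in this
% function's workspace.
A = gpuArray.zeros(1, n, 'uint16');

% Indexed assignment on the local variable; per the answer above, this is
% the situation in which the in-place optimization can take place.
A(1) = 2;

% Bring a single element back to the host so the caller can check it.
firstValue = gather(A(1));
end
```

Calling inplaceAssignDemo(2e9) would reproduce the 4 GB case from the example, while a small value such as inplaceAssignDemo(1e6) is enough to check that the function runs.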
  4 Comments
Peter on 4 May 2016
Joss, thanks a lot! I hope I get it: Loren indeed doesn't return x to the global workspace. When she calls inplaceTest(x), a local copy of x does have to be made in the workspace of inplaceTest, which explains the initial increase in memory usage in her Page File Usage History. Once inside she uses the in-place optimization by doing the operation from a function within a function.
So my approach is indeed not working since initially a copy of A is required for which there's not sufficient space in the GPU memory.
In my code, part of A is replaced with freshly incoming data in a loop. Every j'th loop iteration, some additional operations are done with A leading to some program output.
Do I get it right that the only way to do in-place operations is to put the entire program in a function which is again in a function?
Joss Knight on 9 May 2016
It's missing the point to trivially say just put a function inside a function. The fundamental discriminator is whether you are operating on data that is exposed in the MATLAB workspace. You could go twenty nested functions deep and you'd still never get in-place operations if the data being operated on exists in the MATLAB workspace. Can you see it in the Workspace browser window? If yes, operating on it will create a copy.
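One possible way to structure the loop described above, as a sketch only: the large array is created and updated entirely inside one function, so it never appears as a variable in the base workspace. The function name streamIntoGpuArray, its parameters, the random data standing in for the incoming data, and the choice of output every 10th iteration are all assumptions for illustration. Whether the indexed assignment is actually performed in place is decided by MATLAB and is not guaranteed by this layout.

```matlab
function total = streamIntoGpuArray(numRows, numCols, numIters)
% STREAMINTOGPUARRAY  Hypothetical sketch of a loop that overwrites part of
% a large GPU array on each iteration without exposing it to the base
% workspace.

% The large array is local to this function.
A = gpuArray.zeros(numRows, numCols, 'single');
total = 0;

for k = 1:numIters
    % Stand-in for freshly incoming data (random values, for illustration).
    newData = rand(numRows, 1, 'single', 'gpuArray');

    % Overwrite one column of A, cycling through the columns.
    col = mod(k - 1, numCols) + 1;
    A(:, col) = newData;

    % Every 10th iteration, compute some output from A and return only
    % that small result to the host.
    if mod(k, 10) == 0
        total = gather(sum(A(:)));
    end
end
end
```

Only the small summary value leaves the function. If the full array were needed afterwards, returning it would place it in the caller's workspace again, and later assignments made from there would be subject to the copying behaviour discussed in this thread.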
