Difference in output of find when using gpuArray

Hello,
I have this piece of code:
a = zeros(12,12,12);
for i=1:12, for j=1:12, for k=1:12
a(i,j,k)=((i-5)^2+(j-6)^2+(k-7)^2)<6;
end, end, end
b = find(a);
Now I've done the same thing using gpuArray.zeros instead of zeros, and the output of find is different. On the CPU, b is of size 57x1. The output of the gpuArray version (call it b1) is of size 134x1. The first 57 entries of b and b1 are the same; after that, b1 contains a bunch of zeros and some other numbers. Any idea why find seems to be misbehaving for gpuArray?
Thanks in advance, Ranjan
EDIT: Here's some version info in case it helps.
CUDA version: 6.5
The command nvidia-smi returns this:
+------------------------------------------------------+
| NVIDIA-SMI 346.47 Driver Version: 346.47 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla C2050 Off | 0000:05:00.0 Off | 0 |
| 52% 87C P0 N/A / N/A | 79MiB / 2687MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 29631 C /usr/local/MATLAB/R2014b/bin/glnxa64/MATLAB 71MiB |
+-----------------------------------------------------------------------------+

Answers (1)

I tried this in R2015a using a Tesla C2070 GPU and found no difference.
a1 = gpuArray.zeros(12,12,12);
a2 = zeros(12,12,12);
for i=1:12, for j=1:12, for k=1:12
a1(i,j,k)=((i-5)^2+(j-6)^2+(k-7)^2)<6;
a2(i,j,k)=((i-5)^2+(j-6)^2+(k-7)^2)<6;
end, end, end
b1 = find(a1);
b2 = find(a2);
% The following assertion passes
assert(isequal(b1, b2))
What release of MATLAB are you using, and what GPU device?
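For reference, here is a loop-free way to build the same mask on the host, which gives a known-good result to compare against. This is only a sketch: the names n, ii, jj, kk and mask are mine, not from the original post, and it does not address whatever is going wrong on the GPU.

```matlab
% Build the index grids once instead of looping over i, j, k.
n = 12;
[ii, jj, kk] = ndgrid(1:n);

% Same test as in the loop; the result is a logical array.
mask = ((ii-5).^2 + (jj-6).^2 + (kk-7).^2) < 6;

% Linear indices of the true elements (57 of them, matching the CPU result above).
b = find(mask);
fprintf('%d nonzero elements\n', numel(b));
```

Because mask is computed in one step on the host, it never goes through the element-by-element indexed assignment into a gpuArray that the loop version uses.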

10 Comments

I just tested your code on my platform and the assertion failed. I'm using Tesla C2050 on MATLAB R2014b.
Here are the details of my gpuDevice:
Name: 'Tesla C2050'
Index: 1
ComputeCapability: '2.0'
SupportsDouble: 1
DriverVersion: 7
ToolkitVersion: 6
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [65535 65535 65535]
SIMDWidth: 32
TotalMemory: 2.8180e+09
AvailableMemory: 2.7242e+09
MultiprocessorCount: 14
ClockRateKHz: 1147000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
Very strange, I just tried this in R2014b on Linux (C2070) and Windows (M2075), and did not see the assertion failure... I think you should probably contact technical support to see if they can help with this.
Which flavor of Linux did you use? I'm using CentOS.
Oh alright.
I've found something else too. This issue doesn't seem to come up for grid sizes of 9x9x9 or smaller. The assertion fails only from 10x10x10 onward.
On further inspection, the arrays a1 and a2 themselves are not equal. On the GPU, array elements that should be 0 are 4.4501e-308 instead. Any thoughts?
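A possible way to pin down which elements differ is to compare the two arrays on the host and convert the mismatching linear indices to subscripts. The sketch below runs without a GPU: a1_host is a stand-in for gather(a1), and I inject the tiny value into three arbitrary positions purely for illustration (the positions and variable names are hypothetical). On the affected machine you would use a1_host = gather(a1) instead.

```matlab
% Host reference array, same values as the loop in the question.
[ii, jj, kk] = ndgrid(1:12);
a2 = double(((ii-5).^2 + (jj-6).^2 + (kk-7).^2) < 6);

% Stand-in for gather(a1): copy a2 and corrupt three zero entries
% with 2^-1021 (about 4.4501e-308). The positions are made up.
a1_host = a2;
zeroIdx = find(a2 == 0);
a1_host(zeroIdx(1:3)) = 2^-1021;

% Locate the elements that differ and report their subscripts.
mismatch = find(a1_host ~= a2);
[mi, mj, mk] = ind2sub(size(a2), mismatch);
fprintf('%d mismatched elements\n', numel(mismatch));
disp([mi, mj, mk]);

% find treats the tiny values as nonzero; comparing against 1 ignores them.
fprintf('find(a1_host): %d, find(a1_host == 1): %d\n', ...
    numel(find(a1_host)), numel(find(a1_host == 1)));
```

Comparing against 1 is only a workaround for this particular 0/1 array. It hides the stray values rather than explaining where they come from.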
Still very mysterious, especially since the computation for the elements of a1 is performed on the host and the value is then simply sent to the GPU, and that value is a logical which gets converted to a double. What's the output of running that code and then doing:
format hex
unique(a1(:))
?
This is the output:
ans =
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0000000000000000
0020000000000000
0020000000000000
0020000000000000
0020000000000000
0020000000000000
0020000000000000
0020000000000000
0020000000000000
0020000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
3ff0000000000000
Those values 0020000000000000 are giving the odd results. It's very strange to me that the error is a single bit. Is that really what unique returned? As you can clearly see, there are duplicate values in there...
Yes, that's the output. I just double checked.
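To connect the hex output with the 4.4501e-308 seen earlier, the bit patterns can be decoded on the host with hex2num. This is just a small check of the values quoted above; the variable names are mine.

```matlab
% Decode the two nonzero bit patterns from the format hex output.
x   = hex2num('0020000000000000');
one = hex2num('3ff0000000000000');

fprintf('x   = %.4e\n', x);     % the stray value
fprintf('one = %g\n', one);     % the expected "true" value

% x is exactly 2^-1021: exponent field 0x002, zero mantissa.
disp(x == 2^-1021);

% Exactly one bit is set in the 64-bit representation of x.
bitsSet = nnz(bitget(typecast(x, 'uint64'), 1:64));
fprintf('bits set in x: %d\n', bitsSet);

% find and nnz count x as nonzero, which inflates the result.
fprintf('nnz([0 x one]) = %d\n', nnz([0 x one]));
```

So the stray value is a normal (not denormal) double that differs from 0 in a single exponent bit, and it is large enough for find to count it as nonzero.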
