Struggling a bit with writing a function and passing it to the GPU with via arrayfun

3 views (last 30 days)
I have a long equation that I am currently looping though element by element and its taking forever so I want to make it faster. I’ve read that you can use arrayfun to apply a function to each element using the GPU which could sometimes decrease the computational time a lot. Not sure that method is applicable in all situations (like this one for example), but since I have the parallel computing toolbox I thought I would try it, but I’m not getting it to work, maybe I’m writing the function In the wrong way, I’m trying to understand this but I’m still stuck a bit in my old mindset of working with nested loops which Im trying to get away from.
This is the function file I have created:
function [o1] = IncAngle(DailyDeclination,DailyHourAngle,Azimuth,Tilt)
o1=acosd(cosd(DailyDeclination)*sind(DailyHourAngle)*sind(Tilt)*sind(Azimuth)+cosd(DailyDeclination)*cosd(DailyHourAngle)*sind(57.6)*sind(Tilt)*cosd(Azimuth)-sind(DailyDeclination)*cosd(57.6)*sind(Tilt)*cosd(Azimuth)+cosd(DailyDeclination)*cosd(DailyHourAngle)*cosd(57.6)*cosd(Tilt)+sind(DailyDeclination)*sind(57.6)*cosd(Tilt));
As you can see it’s a bit long and complicated, but the outcome depends on four variables:
DailyDeclination % a 1x1440, each value representing one minute of a day
DailyHourAngle % a 1x1440, each value representing one minute of a day
Azimuth=(-179:2:180)'; % a 180x1, each value representing one orientation on a compass
Tilt=(0:1:180)'; % a 181x1 , each value representing one degree tilt of a surface
And I want the output to be stored as 1440x181x180 representing all minutes during the day for all combinations of tilts and orientations. This is how I try to pass the function to the GPU which I am basing on something similar I saw in an example:
DailyDeclination = gpuArray(DailyDeclination);
DailyHourAngle = gpuArray(DailyHourAngle);
Azimuth = gpuArray(Azimuth);
Tilt = gpuArray(Tilt);
[o1]=arrayfun(@IncAngle,DailyDeclination,DailyHourAngle,Azimuth,Tilt)
But it only gives me the error: Error using gpuArray/arrayfun Non-singleton dimensions of input arrays must match each other.
I suspect that the error here isn’t just because I’m sending it to the GPU, but that I’m not writing the function input/outputs correctly. How do I specify for the function in which order the elements should be combined and how I want my output to look without using for loops?

Accepted Answer

Edric Ellis
Edric Ellis on 19 Mar 2015
To use arrayfun in this way, you need each of your vectors to define a different dimension. Something like this simplified example:
% a column vector
v1 = reshape(gpuArray.colon(1, 3), [], 1);
% a row vector
v2 = gpuArray.colon(1, 4);
% 3rd-dimension vector
v3 = reshape(gpuArray.colon(1, 5), 1, 1, []);
% add together elements
arrayfun(@(x,y,z) x + y + z, v1, v2, v3)
In your case, you'll need transpose DailyDeclination and DailyHourAngle to be columns, Tilt needs to be a row (so don't transpose that), and you need to reshape Azimuth to be a 3rd-dimension vector.
  2 Comments
PetterS
PetterS on 19 Mar 2015
Ok that worked and I simply cannot believe what I am seeing! I used to leave this calculation on overnight and in the morning about 6 of them would be done. With this method it took somewhere around 1-2 seconds to complete. Seriously. Thank you!
So this obviously changed my life forever, but I did find one thing that’s a bit odd: if I calculate it this way and store in say a matrix A and then have another matrix B of the same thing that I calculated the slow way and then type A==B to see if they are identical, about every third value or so is 0 (and the rest are 1). They have the same max, min and mean value but are not 100% identical, is the GPU approximating values differently or why are the results not exactly the same?
Edric Ellis
Edric Ellis on 20 Mar 2015
Yes, the precise double-precision arithmetic on the GPU doesn't always exactly match the CPU - especially when using trig functions etc. The results should be close to within a few eps though depending on how many operations are involved.

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!