spmd, composites and hidden overheads

Thomas Lai on 7 Jun 2012
I have a few questions regarding the inner workings of the spmd block that I wish PCT engineers would give some pointers on.
1) Is there any way to access codistributor information from the local master client without doing spmd, getCodistributor(x), end? I need to do this many times throughout my code, and I was wondering whether entering an spmd block just to query the codistributor incurs unnecessary performance penalties.
2) My application requires data to be distributed in a highly specific way, but since I am using mostly overloaded object methods I have little use for the functionality of the codistributed data type, save for calling getLocalPart(x) after the data has been redistributed according to the correct sizes. Thus I have been experimenting with composites as an alternative means of data storage, but I am unsure what the hidden overheads are, since I have to store them as object properties and copy them around a lot. Can you provide explicit documentation on the overheads and communication that go on under the hood when using composites?
3) I need to overload the spmd block to work with my custom object classes. If that is possible, can you provide detailed documentation on the functionality and specifications of the method?

Answers (1)

Jill Reese on 8 Jun 2012
Thomas,
(1) The getCodistributor() function operates only on codistributed arrays; therefore, it must be called from within an spmd block. There is no equivalent function for distributed arrays, which is how a codistributed array appears outside of an spmd block. getCodistributor() is a simple query function that requires no communication, so you won't take a performance hit by calling it.
As for (2) and (3), could you provide some more details on what you are trying to do in your code and where you find the available codistributors lacking? In particular, a simple MATLAB code sample would be helpful. There may be some advanced maneuvers you could use early on that will allow you to simplify the remaining code.
Best,
Jill
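A rough way to check the cost raised in (1) on a given machine or cluster is to time an spmd block that does nothing but the query. The sketch below is not from the original posts: it assumes Parallel Computing Toolbox with an open parallel pool, and the test array `D`, the repetition count `nReps` and the variable `tPerCall` are made up for illustration.

```matlab
% Sketch only: assumes Parallel Computing Toolbox and an open parallel pool.
D = distributed(rand(4, 4, 8));    % hypothetical test array

nReps  = 20;                       % arbitrary repetition count
tStart = tic;
for k = 1:nReps
    spmd
        codist = getCodistributor(D);   % the single query being timed
    end
end
tPerCall = toc(tStart) / nReps;    % average seconds per spmd entry plus query
fprintf('Average time per spmd query: %.4f s\n', tPerCall);
```

The number this prints depends on the pool and the hardware, so it is only meaningful when measured on the system in question.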
Thomas Lai on 8 Jun 2012
Jill,
Thanks for your speedy response! I think I wasn't specific enough when framing my question, though. I was referring to the performance hit of going in and out of the spmd block just to execute that one line of getCodistributor(). I need the distribution dimensions and partitions for my class constructor (which runs on the local MATLAB master client). Here is a code sample from my class constructor:
if isdistributed(Cube) % Distributed 3D Matrix
    warn3D = {0};
    spmd
        % Inside spmd, Cube is seen as a codistributed array
        numcodist = getCodistributor(Cube);
        if numcodist.Dimension ~= 3
            warn3D = 1; % flag data not distributed along dimension 3
        end
    end
    % Back on the client, read the flag as seen by worker 1
    if warn3D{1}
        error('3D Matrix is not distributed correctly!');
    end
    clear warn3D;
    opList = Cube;
else % Undistributed 3D Matrix
    error('3D Matrix is not distributed!');
end
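One possible way to limit the number of spmd entries is to query the codistributor once and hand the distribution dimension and partition back to the client, where they could be cached (for example as object properties). This is a sketch under assumptions, not the poster's code: it needs an open parallel pool, `Cube` here is a small hypothetical stand-in array, and `distDim` and `part` are names chosen for illustration.

```matlab
% Sketch only: assumes Parallel Computing Toolbox and an open parallel pool.
Cube = distributed(rand(4, 4, 8));    % hypothetical stand-in for the real data

spmd
    codist  = getCodistributor(Cube); % Cube is codistributed inside spmd
    distDim = codist.Dimension;       % distribution dimension
    part    = codist.Partition;       % slices held by each worker
end

% distDim and part come back as Composites. Every worker reports the same
% codistributor, so reading the copy from worker 1 is enough.
distDim = distDim{1};
part    = part{1};

if distDim ~= 3
    error('3D Matrix is not distributed correctly!');
end
```

After this, `distDim` and `part` are ordinary client-side values, so later checks on them do not need another spmd block.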
As for (2), what I wanted to do was to create and store large data arrays with specific (non-default) distributions and avoid unnecessary redistribution as much as possible, because I rarely have any use for the default distribution. Distributed/codistributed arrays were unnecessary because I never really needed matrix multiplication directly on distributed arrays; rather, I apply matrix-free operators to their local parts every time.
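The workflow described for (2) might look roughly like the following, where each worker creates its own slab and the array is assembled with a user-chosen partition instead of being redistributed afterwards. This is only an illustrative sketch: it assumes an open parallel pool, and the sizes `n1`, `n2`, `n3`, the partition rule and the "operator" (a multiplication by 2) are all hypothetical placeholders.

```matlab
% Sketch only: assumes Parallel Computing Toolbox and an open parallel pool.
n1 = 4; n2 = 4; n3 = 64;                 % hypothetical global size

spmd
    % labindex/numlabs are called spmdIndex/spmdSize in recent releases.
    % Hypothetical non-default partition along dimension 3: worker 1 takes
    % the remainder, every other worker gets an equal share.
    base    = floor(n3 / numlabs);
    part    = base * ones(1, numlabs);
    part(1) = part(1) + (n3 - base * numlabs);

    codist = codistributor1d(3, part, [n1 n2 n3]);

    % Each worker builds only its own slab, so nothing is redistributed.
    localPart = labindex * ones(n1, n2, part(labindex));
    C = codistributed.build(localPart, codist);

    % A matrix-free operator can work on the local part directly.
    result = 2 * getLocalPart(C);        % stand-in for a real operator
end
```

On the client, `C` is a distributed array with the chosen partition and `result` is a Composite with one entry per worker.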
As for (3), I am building a data container class (which holds distributed data as a property) that can be passed in and out of spmd blocks transparently, but obviously that violates spmd's transparency restrictions. I can of course set up checks before each and every spmd block and strip the distributed data out of the container, but that kinda defeats the purpose of the data container class itself, which is transparency, robustness and ease of use, on top of storing vital information about the data that cannot be carried along in MATLAB numerical arrays or distributed arrays, such as units and origins.
If you have the interest, and the leisure to look into the classes in question, you can find them here:
https://github.com/slimgroup/pSPOT
https://github.com/slimgroup/SeisDataContainer
Our MATLAB developers also have other burning questions about PCT and its inner workings. If you could indulge us in an online conference with you or one of your engineers, it would be immensely helpful. Thank you.
Cheers,
Thomas
