When using MATLAB Parallel Cloud, do we need to transfer/upload files to the cloud?

Hi, I'm using MATLAB Parallel Cloud for parallel computing. The description can be found on this page: http://www.mathworks.com/products/parallel-computing/parallel-computing-on-the-cloud/
I've followed every step to set up the cloud, and I think I'm now able to connect to the cloud (it shows it's connected to 16 workers).
If I understand correctly (please correct me if I'm wrong), we write MATLAB code on the local computer, start the MATLAB cloud pool, and run the code; the code then runs on the cloud, NOT locally. My question is: do we need to upload our code files and some .mat files to the cloud (as we would for MATLAB-mobile parallel cloud)? If the files are stored on my local computer while the code is executed on the cloud, wouldn't this be inefficient, since the data would have to be transferred from the local computer to the cloud before the computation can be done there?
Thank you!

Answers (1)

6 Comments

addAttachedFiles is useful for transferring data files (e.g. .mat files) - but for MATLAB code files needed by your parfor loops, the automatic dependency analysis usually does what you need.
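As a minimal sketch of the comment above: the snippet below attaches one data file to the current pool and then lists what the pool has attached. The file name `inputData.mat` and its contents are hypothetical stand-ins, created here only so the snippet runs on its own.

```matlab
% Create a small hypothetical data file so the sketch is self-contained.
data = magic(4);
dataFile = fullfile(pwd, 'inputData.mat');
save(dataFile, 'data');

% Get the current pool (this may start one with the default profile).
pool = gcp();

% Explicitly send the data file to every worker in the pool.
addAttachedFiles(pool, {dataFile});

% Show the files attached manually and those found by dependency analysis.
disp(pool.AttachedFiles);
listAutoAttachedFiles(pool);
```

Code files called inside a parfor loop normally do not need this step, since the dependency analysis attaches them automatically.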
Thank you Walter and Edric. I tried addAttachedFiles, but the speed doesn't seem to be improved much...
Edric, could you tell a bit more about what you meant by "the automatic dependency analysis usually does what you need"? Thank you!
When you create a function handle outside the parfor, then the dependency analysis is not always sufficient to find the associated source and transfer it.
This is partly because function handles do not always reference the version of the function that was in scope at the time the function handle was created. The documentation on which functions get launched is not adequate to describe which functions get launched in practice; see previous discussion
But anyhow, for whatever reason, even though functions() can be applied to a function handle to find the file that the handle usually references, the file does not get automatically attached if the handle was built outside the parfor; see another discussion
addAttachedFiles is not about speed (really), it is about whether the file is even visible to the worker.
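One possible workaround for the function-handle case, sketched under assumptions: look up the file behind the handle with functions() and attach it explicitly. The function name `myHelper` is hypothetical; the guard skips the attach step when no source file is found.

```matlab
pool = gcp();

% Handle created outside the parfor loop; myHelper is a hypothetical name.
fh = @myHelper;

% functions() reports, among other things, the file the handle refers to.
info = functions(fh);

% Attach the source file only if one was actually found on disk.
if ~isempty(info.file) && exist(info.file, 'file') == 2
    addAttachedFiles(pool, {info.file});
end
```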
If network transfers during execution are a speed problem, it can sometimes be worth copying a .zip of the source to the workers, then unzipping it and executing there from what would then be local disk. The .zip reduces network traffic not just because the source is compressed but because it saves round trips when accessing the file attributes.
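A rough sketch of the .zip idea, not a definitive recipe: the archive is read once on the client, its bytes are sent into an spmd block, and each worker unpacks it into its own temporary folder and adds that folder to its path. The archive and the function `helperFcn` are hypothetical and are built here only so the example runs end to end.

```matlab
% Build a small hypothetical archive containing one function file.
srcDir = tempname; mkdir(srcDir);
fid = fopen(fullfile(srcDir, 'helperFcn.m'), 'w');
fprintf(fid, 'function y = helperFcn(x)\ny = 2 * x;\n');
fclose(fid);
zipFile = [tempname '.zip'];
zip(zipFile, 'helperFcn.m', srcDir);

% Read the archive once on the client as raw bytes.
rfid = fopen(zipFile, 'r');
zipBytes = fread(rfid, Inf, '*uint8');
fclose(rfid);

gcp();  % make sure a pool is open
spmd
    % Each worker writes the archive to its own local temporary folder.
    localDir = tempname; mkdir(localDir);
    localZip = fullfile(localDir, 'code.zip');
    wfid = fopen(localZip, 'w');
    fwrite(wfid, zipBytes);
    fclose(wfid);
    % Unpack on local disk and make the code visible to this worker.
    unzip(localZip, localDir);
    addpath(localDir);
end

% The workers now run helperFcn from their local copy.
out = zeros(1, 4);
parfor k = 1:4
    out(k) = helperFcn(k);
end
```

Using tempname for the target folder keeps workers that share a machine from unpacking into the same directory.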
Thank you, Walter. I have encountered the same problem. In my case, I call a function I wrote myself inside a parfor-loop, and it takes four matrices as input parameters. One run of the function takes 0.1 s, and I have to run it 200 times in a for-loop; these 200 iterations are independent of each other and only some input parameters change. I ran my code on an 8-core cloud cluster and on a 16-core cloud cluster, and the former is faster than the latter. I think the latter spends a lot of time sending the data (the four matrix parameters) and collecting the results for the subsequent computation. So is it possible to send these four matrices to the cloud once, before the parfor loop runs, since they do not change during the parfor execution? Thank you.
https://www.mathworks.com/help/distcomp/parallel.pool.constant.html
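A minimal sketch of how the linked parallel.pool.Constant could be applied to the four-matrix case described above. The matrices and the computation inside the loop are hypothetical stand-ins for the poster's data and function.

```matlab
% Hypothetical stand-ins for the four fixed input matrices.
A = rand(200); B = rand(200); C = rand(200); D = rand(200);

gcp();  % make sure a pool is open

% Wrap the matrices once. Each worker receives its own copy, which
% stays available there for as long as the Constant exists.
fixedInputs = parallel.pool.Constant({A, B, C, D});

results = zeros(1, 200);
parfor k = 1:200
    m = fixedInputs.Value;   % worker-local copy of the cell array
    % Stand-in for the user-written function; only k changes per iteration.
    results(k) = k * (trace(m{1} * m{2}) + trace(m{3} * m{4}));
end
```

The benefit is expected mainly when the same data is reused across several parfor loops, because the Constant is transferred only once rather than once per loop.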

Asked: 1 Oct 2015

Commented: 21 Apr 2018