When using MATLAB Parallel Cloud, do we need to transfer/upload files to the cloud?
Hi, I'm using MATLAB Parallel Cloud to do parallel computing. The description can be found on this page: http://www.mathworks.com/products/parallel-computing/parallel-computing-on-the-cloud/
I've followed every step to set up the cloud, and I think I'm now able to connect to the cloud (it shows it's connected to 16 workers).
If I understand correctly (please correct me if I'm wrong), we write MATLAB code on the local computer, start the MATLAB cloud pool, and run the code; the code then runs on the cloud, NOT locally. My question is: do we need to upload our code files and some .mat files to the cloud (as we would for MATLAB-mobile parallel cloud)? If the files are stored on my local computer while the code is executed on the cloud, wouldn't this be inefficient, since the data would have to be transferred from the local computer to the cloud before the computation can be done there?
Thank you!
Answers (1)
Walter Roberson
on 1 Oct 2015
1 vote
6 Comments
Edric Ellis
on 1 Oct 2015
addAttachedFiles is useful for transferring data files (e.g. .mat files) - but for MATLAB code files needed by your parfor loops, the automatic dependency analysis usually does what you need.
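As a minimal sketch of attaching a data file to the pool: the file name inputData.mat and the variable A are made up for illustration, and the sketch assumes the attached copy is found by name on each worker.

```matlab
% Create a small data file locally (hypothetical name, for illustration only).
A = magic(4);
save('inputData.mat', 'A');

pool = gcp();                               % current pool; starts one with the default profile if none is open
addAttachedFiles(pool, {'inputData.mat'});  % copy the file to every worker in the pool

colSums = zeros(1, 4);
parfor k = 1:4
    s = load('inputData.mat');    % assumed to resolve to the worker's copy of the attached file
    colSums(k) = sum(s.A(:, k));  % each iteration sums one column
end
```

Code files called directly inside the parfor body are normally picked up by the dependency analysis, so this explicit step is mainly for data files.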
Shunyuan Zhang
on 1 Oct 2015
Walter Roberson
on 1 Oct 2015
When you create a function handle outside the parfor, then the dependency analysis is not always sufficient to find the associated source and transfer it.
This is partly because function handles do not always reference the version of the function that was in scope at the time the function handle was created. The documentation on which functions get launched is not adequate to describe which functions get launched in practice; see previous discussion.
But anyhow, for whatever reason, even though functions() can be applied to a function handle to find the file that the handle usually references, the file does not get automatically attached if the handle was built outside the parfor; see another discussion
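One possible workaround, sketched under assumptions: look up the handle's file with functions() and attach it explicitly. The function myObjective is a made-up example written to disk here only so the sketch is self-contained; whether the explicit attach is actually needed may depend on the release.

```matlab
% Write a tiny hypothetical function file so the sketch is self-contained.
fid = fopen('myObjective.m', 'w');
fprintf(fid, 'function y = myObjective(x)\ny = x.^2 + 1;\nend\n');
fclose(fid);
rehash;                           % make sure MATLAB notices the new file

fh = @myObjective;                % handle created outside the parfor loop
pool = gcp();

info = functions(fh);             % struct describing the handle, including a 'file' field
if ~isempty(info.file)
    addAttachedFiles(pool, {info.file});   % send the source file to the workers explicitly
end

out = zeros(1, 8);
parfor k = 1:8
    out(k) = fh(k);               % workers call the function through the handle
end
```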
Walter Roberson
on 1 Oct 2015
addAttachedFiles is not about speed (really), it is about whether the file is even visible to the worker.
If network transfers during execution are a speed problem, it can sometimes be worth copying a .zip of the source to the workers, then unzipping and executing it there from what would then be local disk. The .zip reduces network traffic not just because the source is compressed but because it saves round trips when accessing the file attributes.
yanlu zhao
on 21 Apr 2018
Thank you, Walter. I have encountered the same problem. In my case, within a parfor-loop I call a function I wrote myself, which takes four matrices as input parameters. One call takes 0.1 s, and I have to run it 200 times in a for-loop; these 200 iterations are independent of each other and differ only in some input parameters. I ran my code on an 8-core cloud cluster and on a 16-core cloud cluster, and the former is faster than the latter. I think the latter spends a lot of time sending the data (the four matrix parameters) and collecting the results for later computation. Is it possible to send these four matrices to the cloud once, before the parfor loop runs, since they do not change during the parfor execution? Thank you.
Walter Roberson
on 21 Apr 2018
https://www.mathworks.com/help/distcomp/parallel.pool.constant.html
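The linked page describes parallel.pool.Constant, which copies a value to each worker once so it can be reused across iterations. A minimal sketch for the four-matrix case follows; the matrices and the per-iteration computation are placeholders, not the poster's actual function, and it assumes a release that provides parallel.pool.Constant (R2015b or later).

```matlab
% Four placeholder matrices standing in for the fixed inputs.
M1 = eye(3);
M2 = 2 * eye(3);
M3 = ones(3);
M4 = magic(3);

% Wrap them in a Constant: the cell array is sent to each worker once,
% rather than being retransmitted for every parfor loop that uses it.
C = parallel.pool.Constant({M1, M2, M3, M4});

results = zeros(1, 200);
parfor k = 1:200
    m = C.Value;                  % worker-local copy of the cell array
    % Placeholder computation; only k changes between iterations.
    results(k) = k * trace(m{1} * m{2} + m{3} * m{4});
end
```

With only about 20 s of total work (200 calls at 0.1 s each), pool and transfer overhead can still dominate, so it is worth timing both cluster sizes after this change.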