Parallel server dir() for files local to the server
Hello,
I wish to move computation from a local machine with .m files stored locally to a parallel compute server with the .m files stored on the server.
Processing the files sequentially on my local machine usually looks something like this:
Files = dir('C:\my_data'); % Retrieve all patients .m files names
for i=1:length(Files) %
load(strcat('C:\my_data,Files(i).name')) % Load each file in turn
% Put functions to run on data
end
I now want to move this compute to a parallel server. I have a Parallel Server license and the server is validated; I have also uploaded the files to the server.
However, I cannot figure out how to call the dir() command so that it queries the files on the server (they are about 1 TB in total, so too large to transfer to the remote server each time). I had thought it would look something like this:
Files = dir('~/home/user/Database/Physionet/training/'); % Rather than query locally, query the data on the server
However, the directory isn't found correctly. Can anyone explain how to point to this data on the parallel compute server? Or, if anyone has suggestions on better ways to do this, please let me know!
Kind regards,
Christopher
Accepted Answer
Raymond Norris
on 25 Apr 2022
For starters, you don't want to hard code files/paths in your code. Your code should be functions so that you can pass in root folder locations to where you want to read/write. I'll show you an example, but first a couple of questions.
How do you submit your code to the cluster? Are you using parpool or batch? For example:
c = parcluster('cluster');
pool = c.parpool(16);
Files = dir('~/home/user/Database/Physionet/training/');
parfor i=1:length(Files)
% Had a typo in your line. Also, will want to make sure Files(i).name
% is always a MAT-file (think at least about . and ..)
load(strcat('C:\my_data',Files(i).name))
...
end
Or
c = parcluster('cluster');
job = c.batch(@mycode,...,'Pool',16);
I'm guessing you want the former, but you're probably going to need the latter. It also depends on what you're going to do with the data after the parfor finishes (or while it's running). I have a thought, but you might need to update to R2022a.
13 Comments
Christopher McCausland
on 26 Apr 2022
Edited: Christopher McCausland
on 26 Apr 2022
Hi Raymond,
Thank you for the reply!
- I usually hard code paths for testing purposes; I have used uigetdir() in the past, but it can be a little annoying for lots of testing. I would really like to see how you would suggest doing it if there is a better way though!
In terms of the additional information, I should mention that I have used MATLAB for some time, but this is my first time using MATLAB with a cluster (as the processing was taking too long locally).
- To submit a job to the cluster I have been using Environment -> Parallel -> Select a default cluster -> my cluster and then running the job, as this seemed to handle the submission part?
- I hadn't been using parfor to loop around the files, as the load() command is incompatible with it; if you know of a way around this I would love to hear it. I have a large amount of computing resource, so if I can compute different patient files in parallel this should massively speed things up!
Sorry if these are all simple questions. The MATLAB documentation is very well maintained, but there isn't as much information on the Parallel Server end (I guess because all servers are different, and only a small percentage of the MATLAB community uses this functionality). I would love to see an 'idiots' guide to Parallel Server processing several hundred files through the same set of functions!
In terms of R2022a, I could get it put on the server, but this would take time. I would love to know the idea and possibly use it in the future.
In terms of what my processing pipeline looks like: I have 1000 eight-hour recordings of 13 channels @ 200 Hz. I want to take six of these channels, which represent brainwaves, and perform statistical analysis on them in the time and frequency domains. Currently I cycle through each of the 1000 files locally with for(), load each with load() into my workspace, put it through a set of custom functions to extract statistical properties, save some workspace variables with save(), and then move on to the next file, rinse and repeat. This takes quite some time, as you can imagine. If there is a better way to do this with a cluster I would love to know!
Thank you in advance, if you need any more information just let me know!
Christopher
Raymond Norris
on 27 Apr 2022
Here's how I would write your code
function TF = mccausland(brain_in_dir,brain_out_dir)
% BRAIN_IN_DIR is the directory of all the brainwave MAT-files.
% BRAIN_OUT_DIR is the directory to store all the resulting MAT-files.
% Assume everything works fine. Will return true (success) by default.
TF = true;
try
if nargin==0
% If no directories were provided, assume everything is in the
% current folder. This is helpful if we're running everything
% local on our machine and don't want to have to pass in the
% folders to read/write to each time.
brain_in_dir = pwd;
brain_out_dir = pwd;
end
brain_mat_files = dir(fullfile(brain_in_dir,'*.mat'));
if isempty(brain_mat_files)
% Didn't find any brainwave files (either BRAIN_IN_DIR is bad or
% there aren't any MAT-files in it). In either case, exit early.
error("Failed to find any brainwave files in ""%s"".", brain_in_dir)
end
% Ensure output directory exists. If not, create it first.
if exist(brain_out_dir)~=7 %#ok<EXIST>
disp("Creating folder:" + brain_out_dir)
[PASSED, emsg, eid] = mkdir(brain_out_dir);
if PASSED~=true
% Invalid BRAIN_OUT_DIR name or failed write permissions. In
% either case, exit early.
error(eid,emsg)
end
end
parfor bidx = 1:numel(brain_mat_files)
% BRAIN_FILE is a structure, containing all MAT-files. We
% specifically will need "brain_file.name" and "brain_file.folder".
brain_file = brain_mat_files(bidx);
% Operate on BRAIN_FILE. Write results to BRAIN_OUT_DIR.
unit_of_work(brain_file, brain_out_dir)
end
catch E
% Found an error. Display the last error and return failed.
disp(E.message)
TF = false;
end
end
function unit_of_work(bfile, brain_out_dir)
brain_file = fullfile(bfile.folder,bfile.name);
% Load brain file
load(brain_file) %#ok<LOAD>
% For this example, let's assume the variables "ch1" and "ch2" were
% stored in the MAT-file.
ch1 = rand(1,1000);
ch2 = rand(1,1000);
% Work with brainwaves / perform statistical analysis
...
% Save results (for example, channel variables)
% Need to determine unique output file name. In this case, we'll use
% the name of the input brainwave file and concatenate a suffix
[~, ifile] = fileparts(bfile.name);
rfile = fullfile(brain_out_dir,ifile + "_results");
save(rfile,"ch1","ch2")
end
This way you could run your code locally on a smaller set of brainwave files but then also run it on the cluster. To submit your job to the cluster, try the following
function job = submit_brainwave_job()
cluster = parcluster();
idir = "~/home/user/Database/Physionet/training/";
odir = "~/home/user/Database/Physionet/training/RESULTS";
job = cluster.batch(@mccausland,1,{idir,odir}, ...
"AutoAddClientPath",false, "CaptureDiary",true, ...
"CurrentFolder",".", "Pool",100);
end
Obviously, change the Pool size to something that best fits your cluster/algorithm.
Christopher McCausland
on 29 Apr 2022
Hi Raymond,
This is very lovely code. It's really interesting to see how MATLAB staff code and to pick up a few tips and tricks! Thank you so much.
I have a few questions, some of which are technical and some of which are more to understand your thought process. I am a student and I always try to learn as much as I can from people who know what they are doing, so I can improve myself.
1) In terms of the paths function mccausland, if I still have to pass in the variables brain_in_dir and brain_out_dir as a string, is this not still technically hard coding? I know I could use uigetdir() for this locally but not so much on the cluster.
I think having this as a function rather than a script will reduce the initial workspace transfer overhead to the cluster, which is good; is that why you have suggested moving to a function-based approach?
2) If mccausland is run without input you use pwd to look in the current directory (pretty smart!). Is this the preferred error handling technique for file paths etc.?
3) To run this, currently I have a script file which acts as a 'main' to call functions; would you suggest changing this to a function and just running from the command window instead?
4) What is the difference between batch and parpool? I have read the documentation, but I cannot figure out what the fundamental difference between the submission methods is.
5) For the function unit_of_work, is there any method to load multiple files and compute separate patients in parallel, or is this a terrible idea?
6) submit_brainwave_job(), that's a really clean way to do things! In terms of the function file being copied to the worker, will this also copy the 'helper functions'/additional required function files?
7) "AutoAddClientPath",false - This is because we are adding our own paths to the cluster storage with idir,odir?
8) "CurrentFolder","." - Will the current folder be okay on the local machine or should this be moved to the cluster? What does it mean for this to be the folder the script/function executes in?
Sorry for all the questions; I really appreciate you taking the time to answer them and share your knowledge! Once again, thank you for the code above too; it's so helpful to see how it's done professionally, and I have learnt a lot to apply to my own work moving forward!
Kind regards,
Christopher
Raymond Norris
on 29 Apr 2022
1) In terms of the paths function mccausland, if I still have to pass in the variables brain_in_dir and brain_out_dir as a string, is this not still technically hard coding? I know I could use uigetdir() for this locally but not so much on the cluster.
I think having this as a function rather than a script will reduce the initial workspace transfer overhead to the cluster, which is good; is that why you have suggested moving to a function-based approach?
It's OK to pass in file/folder names at the top level; you want to avoid hardcoding names in the bowels of the code. mccausland doesn't know anything about the folders it should read or write; it's just told via the input variables.
Functions are MUCH better than scripts here. Otherwise, batch will send EVERYTHING in your workspace and bring EVERYTHING back from the MATLAB running on the cluster (including what you sent over to begin with). I'm betting somewhere along the line you have (large?) temporary variables that aren't needed. Parameterizing your code allows you to bound what gets passed back and forth.
2) If mccausland is run without input you use pwd to look in the current directory (pretty smart!) Is this the preferred error handling technique for file paths etc.?
Obviously you could swap out pwd for any other local directory, but for illustration purposes, I'm using pwd. You could also call uigetdir if you don't pass in folder names, since you can assume you're running on your local machine.
3) To run this, currently I have a script file which acts as a 'main' to call functions, would you suggest changing this to a function and just running from the command window instead?
Yes, but I might need to see main.
4) What is the difference between batch and parpool? I have read the documentation, but I cannot figure out what the fundamental difference between the submission methods is.
parpool and batch are job launchers. parpool spawns a job from the current MATLAB client, and the MATLAB client (and the machine running it) needs to run continuously. batch is an asynchronous call that launches a job but doesn't require the MATLAB client (nor the machine running it) to run continuously; instead, batch adds +1 worker to act as the "proxy MATLAB client". Let me give you a couple of examples.
parpool
MATLAB is running on my local Windows machine. I start a local parallel pool of 4 workers. Once started, I can then call parfor/spmd/etc. MATLAB sends instructions/data to each of the workers, there's compute, and then the data comes back to MATLAB. For instance
parpool("local",4)
parfor idx = 1:N
A(idx) = rand;
end
plot(A)
While the parfor is running, my MATLAB client is blocked. If I need more than 4 workers, I'll create a new cluster profile and launch a larger size job on the cluster. For instance
parpool("pbs",400);
parfor idx = 1:N
A(idx) = rand;
end
plot(A)
I'm skipping over a lot here, because I just want to focus on the differences between parpool and batch. Again, while parfor is running, the MATLAB client can't run any other code.
batch
To avoid blocking MATLAB or if I want to run lots of jobs at the same time, I can bundle my code up and pass it to batch. In this case, one of the input arguments to batch is the size of the pool that should be launched later (not on my local machine). For instance
% Batch job will start 401 workers (400 + 1)
job = batch(@mycode,.., 'Pool',400);
Now the question is, how do I get back A that was assigned in mycode? When I ran the parallel pool, after the parfor, I could just reference A (A is automagically pulled back to the client MATLAB for me.) First, you need to ensure that the job (calling mycode) has finished running. When you call parfor in your client MATLAB, you either (A) know parfor was finished because it couldn't run plot until it was or (B) you have a visual cue that parfor is finished because the MATLAB prompt comes back. For batch, you need to use the job object to get the state and then you use the job object to "fetch" the results (for those familiar with the new R2022a ValueStore feature, I'm disregarding that for now).
% Wait for the job to finish running
job.wait
% Fetch the results
A = job.fetchOutputs{1};
% Use the results
plot(A)
One advantage batch has is that you can offload several jobs at once
job1 = batch(@mycode1,.., 'Pool',200);
job2 = batch(@mycode2,.., 'Pool',200);
The bottom line is that running parpool in your current MATLAB client gives a more natural flow by calling parfor/spmd/etc. directly. batch allows you to push everything off of the MATLAB client (adding an additional worker to the job), and provides an API to manage the job object.
NOTE: look at the batch documentation for an example workflow: https://www.mathworks.com/help/parallel-computing/run-a-batch-job.html
5) Function unit_of_work, is there any method to load multiple files and compute separate patients in parallel, or is this a terrible idea?
I need more information to follow this. Flesh out unit_of_work if that helps.
6) submit_brainwave_job(), that's a really clean way to do things! In terms of the function file being copied to the worker, will this also copy the 'helper functions'/additional required function files?
MATLAB will run a dependency checker on submit_brainwave_job. Every function you wrote or MAT-file you depend on will be attached to the job (parpool works the same way). When the dependency checker hits a MATLAB built-in function, it stops and goes back up the tree, because built-ins of course are already on the cluster. There's a bit of art here though. You're going to start running submit_brainwave_job many times. Each job will need to traverse the dependencies and attach to the job. You might find this costly after a while if there are a lot of files, especially ones that you don't modify. You might consider bundling these files up and placing them on the cluster, and then specifying AdditionalPaths. Only do this for "static" files -- ones that you don't change at all. If you're constantly changing the files locally on your machine, you're better off having MATLAB find them and then attach them to the job for you.
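A minimal sketch of the two approaches, assuming a hypothetical helper folder /home/user/matlab-helpers already copied to the cluster (the folder name and helper file names are illustrative, not from the thread):

```matlab
cluster = parcluster();
idir = "~/home/user/Database/Physionet/training/";
odir = "~/home/user/Database/Physionet/training/RESULTS";

% Option A: "static" helpers already live on the cluster; point the
% workers at them with AdditionalPaths instead of attaching the files.
job = cluster.batch(@mccausland, 1, {idir, odir}, ...
    "AdditionalPaths", "/home/user/matlab-helpers", ...
    "AutoAddClientPath", false, "Pool", 100);

% Option B: helpers change often on the local machine; let the
% dependency checker find and attach them automatically (the default),
% or list them explicitly with AttachedFiles.
job = cluster.batch(@mccausland, 1, {idir, odir}, ...
    "AttachedFiles", {"my_stats.m", "my_filter.m"}, ...
    "AutoAddClientPath", false, "Pool", 100);
```

Option A skips re-transferring unchanging files on every submission; Option B keeps the convenience of always running your latest local edits.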
7) "AutoAddClientPath",false - This is because we are adding our own paths to the cluster storage with idir,odir?
This is because our local machine has a different file system than the cluster, so we don't want to automatically add our client path to the path of the workers. For example, my Windows path (c:\work\matlab) can't be added to the worker path running on Linux (/mnt/home/raymond).
8) "CurrentFolder","." - Will the current folder be okay on the local machine or should this be moved to the cluster? What does it mean by this being the folder the script/function executes in?
CurrentFolder is where the workers should start on the cluster. I default to '.', which is a shortcut for my home directory. But you could also set it, for example, to /home/raymond/matlab/project-1. It's really important to understand where the workers are running because of (A) paths -- workers might not find a file in your home directory on the cluster, so you'll want the workers to start elsewhere and (B) if the workers save a file to the current directory, where exactly is the "current directory" on the cluster? Explicitly setting it in your call to batch gives you a clearer picture where you are writing to.
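If in doubt, a quick sketch for checking where a worker actually starts (pwd runs on the cluster, not locally):

```matlab
% Ask the cluster where a worker starts; useful before relying on
% relative paths in your own job submissions.
cluster = parcluster();
j = cluster.batch(@pwd, 1, {}, "CurrentFolder", ".");
j.wait
disp(j.fetchOutputs{1})   % the worker's current folder on the cluster
```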
Christopher McCausland
on 2 May 2022
Hi Raymond,
Thank you so much for taking the time again. You have been absolutely brilliant! My one final question is this: within mccausland you have the following block of code:
parfor bidx = 1:numel(brain_mat_files)
% BRAIN_FILE is a structure, containing all MAT-files. We
% specifically will need "brain_file.name" and "brain_file.folder".
brain_file = brain_mat_files(bidx);
% Operate on BRAIN_FILE. Write results to BRAIN_OUT_DIR.
unit_of_work(brain_file, brain_out_dir)
end
Am I correct in saying:
- This will process multiple .mat files simultaneously (basically replacing the original for loop, with the parallel nature meaning it processes faster)?
- How can this block 'load' .mat file data (I think that's what the variable brain_file is doing?) without using load()? Equally, I know you can't use load() in a parfor loop.
- Lastly, while not a concern currently, how would you do the same with, say, a .csv file?
I think this will answer all my questions, so I will accept the answer after this one and give you peace for now! Thank you so so much for taking the time to reply to all of these, the knowledge has been wonderful and so helpful! I am already starting to apply it to my own programming!
Kind regards,
Christopher
Raymond Norris
on 3 May 2022
Within mccausland you have the following block of code. I am correct in saying:
This will process multiple .mat files simultaneously (Basically replacing the original FOR loop but the parallel nature will mean it processes faster).
Correct. Each iteration will process a MAT-file. MATLAB will chunk up the MAT-file names and disperse them to the workers. It's then up to unit_of_work what to do with them. A brute-force approach would be for MATLAB to give each worker an equal number of MAT-files to process. However, MATLAB doesn't assume each iteration takes the same amount of time to process. Therefore, MATLAB will give a subportion to each. When a worker has processed all of its subiterations, MATLAB will give out more, so that ideally all workers are busy for the duration of the parfor.
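If you ever need to override this dynamic scheduling, parforOptions (R2019a+) lets you control the partitioning; a sketch (the subrange size of 10 is illustrative):

```matlab
% Sketch: hand out fixed subranges of 10 iterations at a time instead
% of letting MATLAB choose the subrange sizes.
pool = gcp;   % assumes a parallel pool is already running
opts = parforOptions(pool, "RangePartitionMethod", "fixed", ...
    "SubrangeSize", 10);
parfor (bidx = 1:100, opts)
    % process brain_mat_files(bidx) here, as in unit_of_work
end
```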
How can this block 'load' .mat file data (I think that's what the variable brain_file is doing?) without using load()? Equally I know you can't use load() in a parfor loop.
The following is how we select the brain file to use, but it's not what loads the brain file; that's done in unit_of_work:
brain_file = brain_mat_files(bidx);
It's helpful to understand why load/save are problematic in parfor-loops. In fact, your code can call load/save.
load
Take the following example:
parfor idx = 1:N
%%%%%%%% run on another machine %%%%%%%%
load XY
z = x .* y .* idx;
%%%%%%%% run on another machine %%%%%%%%
end
Of course the parfor block doesn't need to "run on another machine", but I want to emphasize a point here, which is: what would it take to run this code in a completely different process on a completely different machine (maybe a different OS)? We need the file XY and we need the "tokens" x, y, and idx. What do I mean by "token"? MATLAB evaluates the code and identifies the elements x, y, and idx to be either variables or functions. You see the code x and y and think they must be defined in XY.mat, but you wouldn't if the MAT-file was called MONDAY.mat. Likewise, MATLAB doesn't deduce from the MAT-file name that x and y are in it. If MATLAB doesn't know what a token is, it assumes it's a function (that can be resolved on the workers running on the other machine). MATLAB tells the workers that x and y are functions and that idx is a variable. Then the workers run the block of code, load XY, which in turn creates the variables x and y, contradicting what the workers were told by MATLAB.
There are a couple of ways to resolve this.
Solution #1
parfor idx = 1:N
%%%%%%%% run on another machine %%%%%%%%
data = load('XY');
z = data.x .* data.y .* idx;
%%%%%%%% run on another machine %%%%%%%%
end
I'm not really a fan of this, because that's not how I'd write it in a for-loop. But in a pinch, it'll work.
Solution #2
What I coded.
parfor idx = 1:N
%%%%%%%% run on another machine %%%%%%%%
unit_of_work(idx)
%%%%%%%% run on another machine %%%%%%%%
end
function unit_of_work(idx)
load XY
z = x .* y .* idx;
end
This also has its drawbacks. What if we need x and y later in the code? The way it's written scopes the variables to the subfunction. But the reason these work is that we're not introducing new variables into the parfor-loop on the fly (e.g., eval would also cause issues). In solution #1, we already know that data is a variable. In solution #2, we aren't introducing any new variables at all. All the code is pushed into our refactored code, unit_of_work (I just use that name; it can be called anything).
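If x and y really are needed after the loop, one sketch is to return them from the subfunction instead of leaving them scoped inside it (unit_of_work_out is a hypothetical variant; XY.mat holding scalars x and y is the assumption carried over from the example above):

```matlab
parfor idx = 1:N
    % The helper returns what the loop needs; nothing "poofs" into the
    % parfor workspace, so MATLAB can classify every variable.
    [x(idx), y(idx), z(idx)] = unit_of_work_out(idx);
end

function [x, y, z] = unit_of_work_out(idx)
% Hypothetical variant of unit_of_work that returns its results.
data = load("XY");
x = data.x;
y = data.y;
z = x .* y .* idx;
end
```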
save
This is a little trickier. Unlike load, you can't call save directly in a parfor-loop, but it too can be refactored and called in a subfunction, as in solution #2. save requires knowing something about the workspace that called it. Take the following example:
A = rand;
for idx = 1:N
save RESULT
end
What gets stored in RESULT? MATLAB looks at its workspace and finds A, idx, and N, which are all stored in RESULT. Now write this as a parfor
A = rand;
parfor idx = 1:N
%%%%%%%% run on another machine %%%%%%%%
save RESULT
%%%%%%%% run on another machine %%%%%%%%
end
MATLAB didn't send the workers any variables, and the workers can't reach back asking for variables, so how can they save A, idx, and N? Here's a workaround:
A = rand;
parfor idx = 1:N
%%%%%%%% run on another machine %%%%%%%%
unit_of_work(A,idx,N)
%%%%%%%% run on another machine %%%%%%%%
end
function unit_of_work(A,idx,N)
save RESULT A idx N
end
You probably want a unique MAT-file name -- refer back to what I showed in the code.
Two salient points here
- For parfor loops to work properly, they must (ultimately) behave and provide the same results as a for-loop, but quicker.
- Whenever you modify your code to parallelize it, go back and run it as a for-loop to ensure you haven't changed the behavior/output.
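One convenient sketch for that verification step: the optional second argument to parfor caps the number of workers, and a cap of 0 runs the loop serially in the client without touching the loop body:

```matlab
% Serial run for verification: M = 0 means "use at most 0 workers",
% so the iterations execute in the client, like a plain for-loop.
parfor (bidx = 1:numel(brain_mat_files), 0)
    unit_of_work(brain_mat_files(bidx), brain_out_dir)
end
```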
Lastly, while not a concern currently how would you do the same but with say a .csv file?
There are a couple of small changes and one larger change. To begin with, you'll want to pass in the file format of the source files. Let's make fext the file extension (and update submit_brainwave_job as well).
function TF = mccausland(brain_in_dir,brain_out_dir,fext)
and then use it here
brain_mat_files = dir(fullfile(brain_in_dir, "*." + fext));
Replace any references to "mat" in the code, for instance change the variable to
brain_files = dir(fullfile(brain_in_dir, "*." + fext));
The piece that requires more work is the file "reader" (and maybe "writer" if you want to write non MAT-files). Before, we knew it was MAT-files, so we just call load. Now we need to provide the workers with the proper function to call to read the data.
Here's a modified version
function TF = mccausland2(brain_in_dir,brain_out_dir,fext)
% BRAIN_IN_DIR is the directory of all the brainwave files.
% BRAIN_OUT_DIR is the directory to store all the resulting files.
% FEXT is the file extension of file we want to read.
% Assume everything works fine. Will return true (success) by default.
TF = true;
try
if nargin==0
% If no directories were provided, assume everything is in the
% current folder. This is helpful if we're running everything
% local on our machine and don't want to have to pass in the
% folders to read/write to each time.
%
% Default to reading and writing MAT-files.
brain_in_dir = pwd;
brain_out_dir = pwd;
fext = 'mat';
end
brain_files = dir(fullfile(brain_in_dir, "*." + fext));
if isempty(brain_files)
% Didn't find any brainwave files (either BRAIN_IN_DIR is bad or
% there aren't any files in it). In either case, exit early.
error("Failed to find any brainwave files in ""%s"".", brain_in_dir)
end
% Ensure output directory exists. If not, create it first.
if exist(brain_out_dir)~=7 %#ok<EXIST>
disp("Creating folder:" + brain_out_dir)
[PASSED, emsg, eid] = mkdir(brain_out_dir);
if PASSED~=true
% Invalid BRAIN_OUT_DIR name or failed write permissions. In
% either case, exit early.
error(eid,emsg)
end
end
% Define readers and writers for the brain files
switch fext
case 'mat'
helper_fcns.reader_fcn = @load;
helper_fcns.writer_fcn = @save;
case {'csv', 'txt'}
% Select the appropriate reader/writer from the list. Using an
% example here (readmatrix, writematrix)
% https://www.mathworks.com/help/matlab/import_export/supported-file-formats-for-import-and-export.html
helper_fcns.reader_fcn = @readmatrix;
helper_fcns.writer_fcn = @writematrix;
otherwise
error('Unsupported file format: %s', fext)
end
parfor bidx = 1:numel(brain_files)
% BRAIN_FILES is a structure, containing all brain files. We
% specifically will need "brain_file.name" and "brain_file.folder".
brain_file = brain_files(bidx);
% Operate on BRAIN_FILE. Write results to BRAIN_OUT_DIR. Provide
% helper functions for reading/writing brain files.
unit_of_work(brain_file, brain_out_dir, helper_fcns)
end
catch E
% Found an error. Display the last error and return failed.
disp(E.message)
TF = false;
end
end
function unit_of_work(bfile, brain_out_dir, hfcns)
brain_file = fullfile(bfile.folder,bfile.name);
% Read in brain file
hfcns.reader_fcn(brain_file)
%%%%% CAUTION %%%%%
% For this example, let's assume the variables "ch1" and "ch2" were
% stored in the file.
ch1 = rand(1,1000);
ch2 = rand(1,1000);
% Work with brainwaves / perform statistical analysis
...
% Write results (for example, channel variables)
% Need to determine unique output file name. In this case, we'll use
% the name of the input brainwave file and concatenate a suffix
[~, ifile] = fileparts(bfile.name);
rfile = fullfile(brain_out_dir,ifile + "_results");
hfcns.writer_fcn(rfile,"ch1","ch2")
end
OK, here's the tricky part (CAUTION). You need reader_fcn to return the same format of data so that you can work on the data "in the blind". By that I mean: load will load the variables stored in the MAT-file, but readmatrix will assign the data to your output variable. For instance:
% MAT-file (load individual variables into the workspace)
hfcns.reader_fcn("foo.mat");
% MAT-file (load all variables into the structure A)
A = hfcns.reader_fcn("foo.mat");
% CSV file (load all data into the variable A)
A = hfcns.reader_fcn("foo.csv");
In the first example, MATLAB automagically imports the variables ("ch1", "ch2", etc.). But for the CSV file, we store everything in the variable A. You need to adapt to both so that the rest of your code isn't switching between working with MAT-files and working with CSV files.
The webpage I listed above gives a whole slew of file formats and output types (cell arrays, tables, etc.). You'll want to give some thought to how best to design this so that your code is as flexible as possible. One option is to store the data in your MAT-files as a single table. Then, if your data is a structure, you know you can index into it to get the table. For example:
T = hfcns.reader(brain_file);
% T is either a struct containing a table (read from a MAT-file) or it is a
% table (read from a CSV file);
if isa(T, 'struct')
% The variable "T" is a struct that contains all the variables in the
% MAT-file. In this case, there's only one variable, also called "T"
% and is a table. The table T contains all the channels.
T = T.T;
end
% At this point now, regardless of how we read the data in, we have a
% variable T, which is of type table. The columns are the channels.
You'll want to do something similar for writing your results as well, if you don't want to write only MAT-files. This kind of abstraction allows you to operate on a whole slew of data formats (databases, HDF5, etc.) without unit_of_work ever knowing what it's working on.
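One sketch of such a writer, mirroring the table-based reader idea above (write_results and the writetable choice are illustrative, not from the thread):

```matlab
function write_results(fext, rfile, T)
% Hypothetical writer front-end: unit_of_work calls this with a table T
% and never branches on the file type itself.
switch fext
    case 'mat'
        save(rfile, "T");                    % MAT-file holds one table, T
    case {'csv', 'txt'}
        writetable(T, rfile + "." + fext);   % tables map naturally to text
    otherwise
        error("Unsupported file format: %s", fext)
end
end
```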
Christopher McCausland
on 3 May 2022
Hi Raymond,
This is absolutely brilliant, thank you so much for all the help! I have learnt so much from this and I really appreciate you taking the time to give such complete and informative answers! I am going to accept the answer now, as you have been brilliant and I know I've taken a lot of your time.
One final, final question (and then I will leave you in peace): you mentioned that you would write the for-loop in Solution #1 differently. If it's not too much trouble, could you show me how you would do it?
Many, many thanks!
Christopher
Raymond Norris
on 3 May 2022
I meant more that I wouldn't write the for-loop as such
for idx = 1:N
data = load('XY');
z = data.x .* data.y .* idx;
end
I would have had it
for idx = 1:N
load XY
z = x .* y .* idx;
end
The structure data, with the fields x and y, will be slightly larger than just the variables x and y. The second version is how you'd typically write the for-loop. Of course, all of this code is quite gibberish, since you wouldn't load the same MAT-file on every iteration of the for-loop. Here's a better example, where each iteration loads a different MAT-file and then stores the result in the array z:
for idx = 1:N
load(sprintf("XY_%d",idx))
z(idx) = x .* y .* idx;
end
Christopher McCausland
on 4 May 2022
Hi Raymond,
I understand what you mean now, thank you!
I lost access to the cluster for a few days while it was shut down for upgrades, so I am finally able to do more testing with the code you outlined. One issue I am facing is that I get the following error when running:
>> diary(ans)
--- Start Diary ---
--- End Diary ---
Task with properties:
ID: 1
State: finished
Function: @parallel.internal.cluster.executeFunction
Parent: Job 1
SchedulerID: 6720
StartDateTime: 04-May-2022 11:35:20
RunningDuration: 0 days 0h 0m 4s
Error: none
Warnings: Worker unable to add the following folders to the MATLAB search path at the start of the job:
<none>
This can occur when the worker has a different file system to the client. Try one of the following:
* Do not include these folders in the 'AdditionalPaths' parameter when creating a job.
* Do not include these folders in the 'AdditionalPaths' field of the cluster profile.
* Set the 'AutoAddClientPath' parameter to false when creating a job to prevent adding folders from your client's MATLAB search path.
Warning Stack: JobPathHelper>JobPathHelper.addAdditionalPaths (line 108)
dctEvaluateTask>iAddJobDependencies (line 473)
dctEvaluateTask>iEvaluateTask/nEvaluateTask (line 206)
dctEvaluateTask>iEvaluateTask (line 175)
dctEvaluateTask (line 81)
distcomp_evaluate_filetask_core>iDoTask (line 154)
distcomp_evaluate_filetask_core (line 52)
distcomp_evaluate_filetask (line 17)
Task with properties:
ID: 2
State: finished
Function: @parallel.internal.pool.poolWorkerFcn
Parent: Job 1
SchedulerID: 6720
StartDateTime: 04-May-2022 11:35:20
RunningDuration: 0 days 0h 0m 6s
Error: none
Warnings: Worker unable to add the following folders to the MATLAB search path at the start of the job:
<none>
This can occur when the worker has a different file system to the client. Try one of the following:
* Do not include these folders in the 'AdditionalPaths' parameter when creating a job.
* Do not include these folders in the 'AdditionalPaths' field of the cluster profile.
* Set the 'AutoAddClientPath' parameter to false when creating a job to prevent adding folders from your client's MATLAB search path.
Warning Stack: JobPathHelper>JobPathHelper.addAdditionalPaths (line 108)
dctEvaluateTask>iAddJobDependencies (line 473)
dctEvaluateTask>iEvaluateTask/nEvaluateTask (line 206)
dctEvaluateTask>iEvaluateTask (line 175)
dctEvaluateTask (line 81)
distcomp_evaluate_filetask_core>iDoTask (line 154)
distcomp_evaluate_filetask_core (line 52)
distcomp_evaluate_filetask (line 17)
I can tell that the warning is probably related to idir and odir not being added to the search path. However, I don't understand why they aren't, and the &lt;none&gt; entry makes it unclear whether they were added or not.
I was going to add these paths with 'AdditionalPaths', and I might give this a go anyway, although I can see the output log warning me not to do this. As the whole thing finishes in under ten seconds, it's not running as it should, but it returns as finished with these two warnings rather than errors, even though the code hasn't completed as expected.
I am hoping this is the last thing and I can leave you in peace soon!
Kind regards,
Christopher
Christopher McCausland
on 4 May 2022
For anyone following this, I found the source of the very weird error: I was using a preconfigured cluster profile. Within Validation -> Properties -> Files and Folders, the empty text field (which should only contain paths) had been filled in with &lt;none&gt;, hence the warning.
This seems to have been a red herring, however. Once it was fixed and the warning removed, the entire program still completes in ~10 seconds, with the output not saved to the odir file path. My current working theory is that idir is not seen either, and therefore, with nothing to process, the code completes in seconds. I can't figure out why idir and odir cannot be seen from the workers, though.
Kind Regards,
Christopher
Raymond Norris
on 4 May 2022
As noted, this is a warning (not an error) and is a red herring. You can reproduce it like so:
>> c = parcluster("local");
>> j = c.batch(@pwd,1,{},'AdditionalPaths','\\does\not\exist');
>> j.wait
>> j.Tasks(1)
ans =
Task with properties:
ID: 1
State: finished
Function: @parallel.internal.cluster.executeFunction
Parent: Job 26
StartDateTime: 04-May-2022 11:13:34
RunningDuration: 0 days 0h 0m 2s
Error: none
Warnings: Worker unable to add the following folders to the MATLAB search path at the start of the job:
\\does\not\exist
This can occur when the worker has a different file system to the client. Try one of the following:
* Do not include these folders in the 'AdditionalPaths' parameter when creating a job.
* Do not include these folders in the 'AdditionalPaths' field of the cluster profile.
* Set the 'AutoAddClientPath' parameter to false when creating a job to prevent adding folders from your client's MATLAB search path.
If the workers couldn't find the idir, then brain_files should be empty and MATLAB should throw the error message
Failed to find any brainwave files in "/path/to/files".
Next, we don't see any error from creating the output folder, so I'm assuming we get to the parfor-loop. What I would suggest is putting disp statements throughout the code to see how far things get (including in and after the parfor-loop), and displaying the size of brain_files -- is it the number of files you expected? Also display the name of the results file (it should be the full path) -- is it being stored where you think it is?
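A minimal sketch of that kind of instrumentation, assuming the hypothetical names from this thread (idir, odir, brain_files, and a placeholder processing step -- the actual script will differ):
```matlab
function process_brainwaves(idir, odir)
% Sketch only: idir/odir, the *.mat pattern, and the "analysis" step
% are assumptions based on this thread, not the real script.
brain_files = dir(fullfile(idir, '*.mat'));
disp("Found " + numel(brain_files) + " files in " + idir)
assert(~isempty(brain_files), ...
    'Failed to find any brainwave files in "%s".', idir)
if ~isfolder(odir)
    mkdir(odir)
end
parfor i = 1:numel(brain_files)
    data = load(fullfile(idir, brain_files(i).name)); %#ok<NASGU>
    result = brain_files(i).name;   % placeholder for the real analysis
    outfile = fullfile(odir, sprintf('result_%03d.mat', i));
    disp("Saving results to: " + outfile)   % confirm the full output path
    parsave(outfile, result)
end
disp('parfor-loop finished')
end

function parsave(fname, result)
% save() cannot be called directly inside a parfor body, so wrap it
save(fname, 'result')
end
```
The disp calls show up in the job diary, so you can see exactly where execution stops and whether the paths resolve on the workers.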
Christopher McCausland
on 10 May 2022
Hi Raymond,
As a final round-up for anyone following: to add directories local to the cluster to the workers' search path, the easiest method is Home -> Parallel -> Manage Clusters -> select the relevant cluster -> Properties -> Edit -> Files and Folders -> manually specify folders to add to the workers' search path, OR use the 'AdditionalPaths' parameter.
This adds the paths for the workers and is quite a nice way of doing things!
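For the per-job route, the call looks something like this -- a sketch, assuming a profile named 'myCluster' and the hypothetical process_brainwaves function from earlier in the thread (substitute your own names and paths):
```matlab
% Submit a job and make a server-side folder visible to the workers
c = parcluster('myCluster');            % placeholder profile name
j = c.batch(@process_brainwaves, 0, {idir, odir}, ...
    'AdditionalPaths', {'/home/user/Database/Physionet/training'});
wait(j)     % block until the job finishes
diary(j)    % show any disp output produced on the workers
```
Because 'AdditionalPaths' is set on the batch call, it applies only to this job and does not touch the cluster profile.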
Lastly, Raymond, thank you so much for all the help and being so patient, I really appreciate all the time and effort you put in, i've learnt a lot from you! The in depth answers were brillent!
Kind regards,
Christopher
Raymond Norris
on 10 May 2022
Keep in mind that if you place the additional folder names in the profile, they will be used for every job you submit to the cluster, whereas adding them to the call to batch explicitly sets them for that job only. In the case of adding paths, there's no overhead to speak of. However, wait until you need to debug a job where you can't understand why it fails, only to discover that you included another path (listed in the profile) that was shadowing your function. Listing the additional paths in the call to batch doesn't solve this issue, but it hopefully at least puts it in your face that you are adding /home/cmcausland/work/... to your job.
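One way to catch that kind of shadowing is to ask a worker which file it would actually run. A sketch, where 'myCluster' and 'my_function' are stand-ins for your own profile and function names:
```matlab
% Ask a worker which copy of a function is on top of its search path
c = parcluster('myCluster');                    % placeholder profile name
j = c.batch(@() which('my_function'), 1, {});   % 'my_function' is a stand-in
wait(j)
out = fetchOutputs(j);
disp(out{1})   % full path of the file the worker would actually execute
```
If the displayed path points somewhere unexpected, a folder listed in the profile (or in 'AdditionalPaths') is shadowing the function you meant to run.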