Progress Update from Neural Network Trainer running on Parallel Server Cloud

I am training a neural network using fairly standard code:
[net1,tr] = train(net1,X,Y1,'useParallel','yes')
When I train locally, the Neural Network Training window is updated constantly, so I can follow the progress of the epochs and the performance, and abort the training if something is not working.
However, I also have Amazon AWS set up as cloud support for the Parallel Computing Toolbox.
When I start the parallel pool and the training runs in the cloud, I get no updates at all. The Neural Network Training window stays at Epoch 0 of 10000 until all training is completed, even if that takes several hours. :(
Is there a way to control how often updates are reported back from the parallel pool? Or do I need to script it myself, limiting each training call to 100 epochs and repeatedly passing the trained network back and forth?
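The chunked approach described above could be sketched roughly as follows. This is only an illustration, not a confirmed fix for the missing updates: the toy data set, the network size, and the variable names `chunkEpochs`, `totalEpochs`, and `epochsDone` are placeholders. Note that each call to `train` restarts the training state (for example, validation-failure counters and, with the default random data division, the train/validation split), so the result may differ from a single uninterrupted run.

```matlab
% Placeholder data and network; substitute your own net1, X and Y1.
[X, Y1] = simplefit_dataset;
net1 = feedforwardnet(10);

chunkEpochs = 100;     % epochs per call to train (assumed chunk size)
totalEpochs = 10000;   % overall epoch budget, as in the question
epochsDone  = 0;

while epochsDone < totalEpochs
    % Ask for at most one chunk, without exceeding the overall budget.
    net1.trainParam.epochs = min(chunkEpochs, totalEpochs - epochsDone);

    % Continue training from the weights returned by the previous call.
    [net1, tr] = train(net1, X, Y1, 'useParallel', 'yes');

    % Report progress on the client after every chunk.
    epochsDone = epochsDone + tr.num_epochs;
    fprintf('Epochs %d of %d, performance %g (%s)\n', ...
        epochsDone, totalEpochs, tr.perf(end), tr.stop);

    % If the chunk ended early, a stopping criterion other than the
    % epoch limit was met, so stop looping.
    if tr.num_epochs < net1.trainParam.epochs
        break;
    end
end
```

Because the loop runs on the client, it can also be interrupted between chunks without losing the most recent `net1`.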
Thanks in advance
  3 Comments
SMEAC on 19 Oct 2023
When I run the parallel pool on my own computer, I get updates too.
But when the Parallel Pool is remote on AWS, I get no update at all.
Sam Marshalik on 20 Oct 2023
I spun up a MATLAB Parallel Server cluster using Cloud Center and ran the job there without any issues. I could see the epoch progress, just like I could with a local parallel pool.
I will note that it took a bit for the epochs to start moving, so wait a minute or so before concluding that nothing is populating there.
If you are running MATLAB Parallel Server in Cloud Center like I am, I would expect you to see the progress as I can. I would suggest reaching out to Technical Support.
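Before contacting support, it may help to confirm which cluster the pool is actually running on and to request command-line progress in addition to the training window. The following is a minimal sketch under those assumptions: the placeholder network stands in for your own `net1`, and whether command-line output is relayed from a remote pool any differently than the window is not established in this thread.

```matlab
% Placeholder network; substitute your own net1.
net1 = feedforwardnet(10);

% Inspect the current pool without starting a new one.
pool = gcp('nocreate');   % returns [] if no pool is open
if isempty(pool)
    disp('No parallel pool is open.');
else
    fprintf('Pool on cluster profile "%s" with %d workers\n', ...
        pool.Cluster.Profile, pool.NumWorkers);
end

% Request progress in the Command Window as well as in the GUI.
net1.trainParam.showCommandLine = true;
net1.trainParam.show = 10;   % epochs between command-line displays
```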


Answers (0)
