Sometimes training diverges, and it's wasteful to re-run the whole training just to stop at a specific epoch. I could use checkpoints and sift through the trainingInfo to find the epoch with the lowest loss, but that carries costs in memory and time. Also, since I am using the Experiment Manager, I'd have to add special code to load the correct checkpoint in my custom metric function. Having a training option to return the net that minimizes the loss would let me easily compare multiple experiments where divergence may occur at different training epochs. Also, minimizing the loss is the definition of training, so it just makes sense.
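
For reference, here is a rough MATLAB sketch of the checkpoint workaround described above (not a built-in option). It assumes 'CheckpointPath' was set in trainingOptions, that info.TrainingLoss returned by trainNetwork holds the per-iteration loss, and that checkpoint files follow the default 'net_checkpoint__<iteration>__<timestamp>.mat' naming; the exact filenames and checkpoint frequency may differ by release.

```matlab
% Sketch of the checkpoint workaround (assumes 'CheckpointPath' was set in
% trainingOptions and that a checkpoint was saved regularly during training).
[net, info] = trainNetwork(XTrain, YTrain, layers, options);

% Iteration at which the recorded training loss was lowest.
[~, bestIter] = min(info.TrainingLoss);

% Checkpoint files are assumed to follow the default
% 'net_checkpoint__<iteration>__<timestamp>.mat' naming.
ckpts = dir(fullfile(options.CheckpointPath, 'net_checkpoint__*.mat'));
iters = arrayfun(@(f) sscanf(f.name, 'net_checkpoint__%d__'), ckpts);

% Pick the checkpoint saved nearest to the best iteration and load its net.
[~, idx] = min(abs(iters - bestIter));
ckpt = load(fullfile(options.CheckpointPath, ckpts(idx).name), 'net');
bestNet = ckpt.net;
```

In an Experiment Manager metric function the same selection could in principle be folded in, but that is exactly the kind of per-experiment boilerplate a built-in training option would avoid.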