I am running a fairly straightforward loop on a single machine, and want to take advantage of parallel processing (parfor on heaps).
Each iteration of the loop is supposed to take about the same time.
I am splitting the iterations equally among the workers, but I am unsure how to treat the remainder, and whether it even makes a difference.
Let's assume I have 3 workers and 19 iterations. I could split them 7, 7, 5 or 7, 6, 6. In theory, both splits are limited by the slowest worker, which in either case is (on average) one with 7 iterations, so it doesn't seem like it would make much of a difference.
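To make the comparison concrete, here is a minimal Python sketch of the arithmetic. It assumes every iteration takes exactly one time unit and ignores scheduling overhead, which is an idealization rather than a description of how parfor actually schedules work. The names `makespan` and `balanced_split` are hypothetical helpers I made up for illustration.

```python
def makespan(chunks, time_per_iteration=1.0):
    """Wall-clock time of a static split: the busiest worker finishes last."""
    return max(chunks) * time_per_iteration


def balanced_split(n_iterations, n_workers):
    """Spread the remainder one iteration at a time over the first workers."""
    base, extra = divmod(n_iterations, n_workers)
    return [base + 1 if i < extra else base for i in range(n_workers)]


split_a = [7, 7, 5]               # remainder handled by shrinking the last chunk
split_b = balanced_split(19, 3)   # [7, 6, 6]

# Under the equal-time assumption both splits finish after 7 time units.
print(makespan(split_a), makespan(split_b))  # 7.0 7.0
```

Under this idealized model the two splits are identical, so any real difference would have to come from variation in iteration times, which the sketch does not model.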
Any insight on this would be helpful, in particular which approach I should use.