Are there MATLAB use cases for which a Windows laptop connected to 1 to 6 GPUs in external Thunderbolt enclosures is useful?
I recently noticed that three GPUs in Thunderbolt enclosures, connected through a Thunderbolt 4 hub to a Thunderbolt port on a Windows 11 PC with a 12th-generation Intel CPU, appear to be fully functional.
I am a test engineer at a company that sells an item of the kind I have mentioned. I understand if an admin would prefer to delete this message. If so, could you put me in touch with a technical peer at MathWorks so we can talk shop?
I am here to learn if there are MATLAB use cases for which this configuration might be useful.
An advantage of this configuration is being able to easily connect and disconnect a laptop to a set of GPUs.
Drawbacks of this configuration are the limited PCIe bandwidth between GPU and CPU and the expense of Thunderbolt-PCIe enclosures. Following https://www.mathworks.com/help/parallel-computing/measuring-gpu-performance.html, I measured a shared CPU-to-GPU bandwidth of 2.2 GB/s for this 3-GPU configuration, so it is best suited for MATLAB sessions in which CPU-GPU data transfer is a small part of the workload.
To test scaling, I used the parfeval example from https://www.mathworks.com/help/parallel-computing/run-matlab-functions-on-multiple-gpus.html as a guide, with N = 1000, numSimulations = 3, and numIterations = 3000000, running on three Nvidia GTX 1060 GPUs. Completion times relative to one GTX 1060 GPU in an internal PCIe slot are:
1 GPU = 1.00
2 GPUs = 0.77
3 GPUs = 0.43
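For reference, the multi-GPU pattern I followed looks roughly like this. It is a simplified sketch of the linked parfeval example, not my exact test code, and myGPUSimulation is a placeholder for the per-simulation work:

```matlab
% Sketch of the multi-GPU parfeval pattern from the linked example.
% Values match those quoted above; myGPUSimulation is a stand-in.
N = 1000;
numSimulations = 3;
numIterations = 3000000;

% One worker per GPU; each pool worker is assigned its own GPU by default.
pool = parpool('Processes', gpuDeviceCount);

for i = numSimulations:-1:1
    f(i) = parfeval(pool, @myGPUSimulation, 1, N, numIterations);
end
results = fetchOutputs(f);  % blocks until all simulations finish

function out = myGPUSimulation(N, numIterations)
    % Placeholder workload: each worker computes on its assigned GPU.
    A = rand(N, 'gpuArray');
    for k = 1:numIterations
        A = A .* 1.0;  % stand-in for real per-iteration GPU work
    end
    out = gather(sum(A, 'all'));
end
```

Because each simulation runs entirely on one GPU and data crosses the Thunderbolt link only at the start and end, this is the kind of workload where the 2.2 GB/s bus bandwidth matters least.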
I am seeking feedback on whether this is an interesting result, as well as suggestions for workloads to test.
I would love to test "RTX 6000 Ada Generation" GPUs in this configuration, but GTX 1060 is the only type of Nvidia card I have.
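For anyone who wants to reproduce the bandwidth number, here is a minimal sketch along the lines of the Measuring GPU Performance example. The 256 MB transfer size is my choice, not from the example:

```matlab
% Measure host-to-device transfer bandwidth for each connected GPU.
sizeBytes = 256*1024^2;              % 256 MB of double-precision data
hostData = randn(sizeBytes/8, 1);

for idx = 1:gpuDeviceCount
    gpuDevice(idx);                  % select this GPU
    sendFcn = @() gpuArray(hostData);
    sendTime = gputimeit(sendFcn);   % times the transfer, incl. sync
    fprintf('GPU %d: host-to-device %.1f GB/s\n', ...
        idx, sizeBytes/sendTime/1e9);
end
```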
Jack on 25 Mar 2023
Having multiple GPUs connected to a Thunderbolt hub can be useful for certain use cases, such as data processing or machine learning tasks that can be parallelized and distributed across multiple GPUs. However, as you mentioned, the limited PCIe bandwidth may be a bottleneck for some workloads.
In terms of your results, relative completion times of 0.77 and 0.43 for 2 and 3 GPUs, respectively, compared to a single GPU represent a good improvement. However, the performance gain may vary depending on the specific workload and how well it can be parallelized across multiple GPUs.
Regarding the RTX 6000 GPUs, they are high-end GPUs designed for professional applications such as scientific computing and machine learning. They have a higher memory bandwidth and more CUDA cores than the GTX 1060, so they may provide even better performance gains when used in a multi-GPU setup. However, they are also more expensive, so it may not be practical for everyone to use them.