Building a Deep Learning Machine – Part 1: Components

I have started building a desktop machine for fitting machine learning models, including deep learning models. The Reinforcement Learning and Decision Making (RLDM) class in the OMS CS program at Georgia Tech motivated me to build a machine suited to machine learning as I start the second half of the master's program. I was able to complete RLDM on my laptop, which has a 2.16 GHz Celeron processor and 8 GB of RAM, but I plan to use the new machine for upcoming machine learning classes.

I have prioritized designing the machine learning desktop around the GPU(s). I will take a short detour to explain why. GPUs have overtaken CPUs as the main engine for the computations behind data science models. A GPU has many more arithmetic logic units (ALUs) than a CPU, which lets it perform many simple operations in parallel. Machine learning, artificial intelligence, and deep learning problems generally reduce to matrix operations, and these can be accelerated by computing them in parallel. My design goals were:

  • A powerful GPU with enough memory to be well suited for computer vision applications. Some users report that 8 GB of GPU memory is the minimum for training computer vision models, and that an upgrade to 11 GB is beneficial. The GPU should also be broadly supported by machine learning libraries. The cuDNN library, built on top of Nvidia's CUDA programming framework, is used by major deep learning frameworks including TensorFlow and PyTorch. I decided on the EVGA GeForce GTX 1080 Ti, which has an Nvidia processor and 11 GB of memory.
  • Sufficient RAM to handle a future upgrade to two GPUs. The machine should have at least as much system RAM as the combined memory of its GPUs. Two 11 GB GPUs would need at least 22 GB, so I have purchased 32 GB of RAM to be ready for that upgrade.
  • A 40-lane CPU that can accommodate an upgrade to two GPUs, each on a full 16-lane PCIe link, so that as many PCIe lanes as possible are available for data transfer between the CPU and the GPUs. Depending on the application, this transfer across the PCIe lanes can become a bottleneck that slows the GPU down. I chose an Intel Xeon E5-1620 v4 3.5 GHz processor over an i7 series processor because the Xeon has 40 PCIe lanes, which allows two GPUs to use 16 lanes apiece. *07/02/21 UPDATE: My research in 2018 indicated that the number of PCIe lanes may restrict GPU performance. Given the cost of the GPU, I wanted to be sure that data transfer limits would not hold the system back. However, more recent posts suggest that deep learning may be limited by GPU memory but is little affected by PCIe lane count and data transfer with the CPU. Tim Dettmers has a nice article discussing GPU selection for deep learning: https://timdettmers.com/2020/09/07/which-gpu-for-deep-learning/.
  • A motherboard that suits the GPU and CPU and handles an upgrade to two GPUs while making the most of the PCIe 3.0 lanes for data transfer between the CPU and the GPUs. PCIe 3.0 is recommended for multiple GPU machines. To have space for two 1080 Ti GPUs, the motherboard needs two x16 graphics slots spaced for dual-width cards. The motherboard should also have an LGA 2011-v3 processor socket for the Xeon processor. I chose the ASUS STRIX X99 motherboard, which exposes the CPU's 40 PCIe 3.0 lanes and supports a 16/16/8 slot configuration.
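The lane and memory budgets behind these goals are simple arithmetic, and a short Python sketch makes them easy to re-check for other configurations. The component figures come from the list above. The per-lane bandwidth is the theoretical PCIe 3.0 rate (8 GT/s with 128b/130b encoding), so the transfer times are best-case estimates and not measurements. The function names are my own and purely illustrative.

```python
"""Sanity-check the PCIe lane, RAM, and transfer-time budget of the build."""

# Figures taken from the component choices described above.
CPU_PCIE_LANES = 40   # Xeon E5-1620 v4
LANES_PER_GPU = 16    # one full-width link per GPU
GPU_MEMORY_GB = 11    # GeForce GTX 1080 Ti
SYSTEM_RAM_GB = 32

# Theoretical PCIe 3.0 rate: 8 GT/s per lane, 128b/130b encoding, 8 bits per byte.
# This is roughly 0.985 GB/s per lane in each direction; real throughput is lower.
PCIE3_GB_PER_S_PER_LANE = 8 * (128 / 130) / 8


def spare_lanes(num_gpus: int) -> int:
    """Lanes left after each GPU gets an x16 link (negative means over budget)."""
    return CPU_PCIE_LANES - num_gpus * LANES_PER_GPU


def ram_covers_gpus(num_gpus: int) -> bool:
    """True if system RAM is at least the combined memory of the GPUs."""
    return SYSTEM_RAM_GB >= num_gpus * GPU_MEMORY_GB


def transfer_ms(megabytes: float, lanes: int) -> float:
    """Best-case time in milliseconds to move a payload over a PCIe 3.0 link."""
    gb_per_s = lanes * PCIE3_GB_PER_S_PER_LANE
    # MB divided by GB/s gives milliseconds (using 1 GB = 1000 MB).
    return megabytes / gb_per_s


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} GPU(s): spare lanes = {spare_lanes(n)}, "
              f"RAM sufficient = {ram_covers_gpus(n)}")

    # Hypothetical payload: a batch of 64 float32 images of size 224x224x3.
    batch_mb = 64 * 224 * 224 * 3 * 4 / 1e6
    for lanes in (16, 8):
        print(f"x{lanes}: {batch_mb:.1f} MB batch in "
              f"{transfer_ms(batch_mb, lanes):.2f} ms (theoretical)")
```

With two GPUs at 16 lanes each, 8 of the 40 lanes remain, which matches the 16/16/8 configuration of the motherboard. Halving a link from x16 to x8 doubles the theoretical transfer time, though whether that matters in practice depends on how much of each training step is spent moving data.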

I have provided a full list of the components I chose on PCPartPicker: https://pcpartpicker.com/list/RtKCq4

References

Tim Dettmers: https://timdettmers.com/2020/09/07/which-gpu-for-deep-learning/

Slav Ivanov: https://blog.slavv.com/picking-a-gpu-for-deep-learning-3d4795c273b9

Yan-David Erlich: https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460