The machine booted successfully using the chassis power button, and the initial screen shows that the CPU and RAM are recognized. The image was taken when only the initial 16 GB of RAM was recognized, before I moved one RAM card to another slot.
I decided to install the latest Long Term Support (LTS) desktop release, Ubuntu 18.04.1. I chose Ubuntu over Windows 10 since many machine learning packages, such as OpenCV, are easier to set up on Linux, and applications like Docker are built on Linux and are simpler to install on Ubuntu.
I downloaded the 2 GB Ubuntu 18.04.1 ISO file and wrote it to a USB flash drive with Rufus to create bootable installation media. The machine booted off the USB flash drive without any changes to the BIOS boot settings beforehand. I chose the default Ubuntu installation options with login credentials required. The installation steps are described here in more detail.
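Before writing the image with Rufus, it is worth verifying the downloaded ISO against the checksum published on the Ubuntu release page. A minimal sketch in Python (the filename below is illustrative, and the expected digest must be copied from Ubuntu's published SHA256SUMS file):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the checksum published on the Ubuntu release page, e.g.:
# expected = "..."  # copy from SHA256SUMS for the 18.04.1 desktop ISO
# assert sha256_of("ubuntu-18.04.1-desktop-amd64.iso") == expected
```

A mismatch indicates a corrupted download, which is worth ruling out before blaming the installer.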
The only installation issue I had was that I could not get past the Ubuntu login screen: after every attempt to log in by entering the password, Ubuntu would return to the login screen. No error appeared for an incorrect password, since the password was correct. On the first installation I had selected the Use LVM with the new Ubuntu installation option. LVM stands for Logical Volume Management and allows the user to add, modify, resize, and take snapshots of partitions.
The infinite login loop issue was resolved by reinstalling Ubuntu without selecting the LVM option.
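In my case the LVM option was the culprit, but two other commonly reported causes of this login loop are a nearly full home partition and a root-owned ~/.Xauthority file. A minimal diagnostic sketch, assuming a Linux system:

```python
import os
from pathlib import Path

def free_gib(path="/"):
    """Free space on the filesystem containing `path`, in GiB."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize / 2**30

def xauthority_owned_by_user(home=Path.home()):
    """True if ~/.Xauthority is absent or owned by the current user.
    A root-owned .Xauthority prevents the X session from starting."""
    xauth = home / ".Xauthority"
    return (not xauth.exists()) or xauth.stat().st_uid == os.getuid()

# On a real system, point free_gib at the filesystem holding /home.
print(f"free space: {free_gib('/'):.1f} GiB")
print(f".Xauthority ok: {xauthority_owned_by_user()}")
```

These checks can be run from a recovery console (Ctrl+Alt+F3) before resorting to a reinstall.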
I completed building the deep learning machine this past weekend. I will describe the final steps for assembling the hardware in this Part 3 and discuss the OS installation in Part 4.
Solid State Drive
The build has a 500 GB Samsung 960 EVO M.2 solid state drive. The drive uses the NVMe protocol, which can be utilized by the M.2 socket with ‘M’ keying (read more about keying here). The Strix X99 M.2 socket runs on a PCIe 3.0 x4 lane, which it shares with a U.2 connector. The socket is compatible with the following SSD sizes: 2242/2260/2280/22110. The first two digits, ’22’, are the width (22 mm), and the remaining digits are the length (42 mm, etc.). The M.2 socket was designed to provide faster link speeds than the mini-SATA connector: SATA 3.0 has a link speed of up to 6 Gbit/s, versus the PCIe 3.0 x4 lane, which runs up to roughly 32 Gbit/s (about 4 GB/s). The 960 EVO has sequential read/write speeds of up to 3.2 GB/s and 1.8 GB/s. Read more about the performance difference between SATA and M.2 at PCWorld here.
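The form-factor codes and link-speed comparison above can be sketched in a few lines of Python (the line rates are nominal figures, not benchmarks):

```python
def decode_m2_size(code):
    """Decode an M.2 form-factor code like '2280' into (width_mm, length_mm).
    The first two digits are the width; the remaining digits are the length."""
    return int(code[:2]), int(code[2:])

# Sizes supported by the Strix X99 M.2 socket
for code in ("2242", "2260", "2280", "22110"):
    width, length = decode_m2_size(code)
    print(f"{code}: {width} mm x {length} mm")

# Rough link-speed comparison (raw line rates, in Gbit/s)
SATA3_GBPS = 6          # SATA 3.0
PCIE3_X4_GBPS = 8 * 4   # PCIe 3.0: 8 GT/s per lane, x4 lanes
print(f"PCIe 3.0 x4 is ~{PCIE3_X4_GBPS / SATA3_GBPS:.1f}x the SATA 3.0 line rate")
```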
When inserted into the M.2 socket, the SSD sits angled upward. I first installed a hex-head jackscrew to raise the screw mount level with the socket, then pressed the SSD down onto the jackscrew while fastening the mounting screw.
RAM
I have started the build with two 16 GB DDR4 RAM cards.
I initially installed the two RAM cards in the D1 and B1 motherboard locations as recommended by the motherboard manual and as shown below.
After completing the installation and booting the machine, the BIOS utility only recognized the RAM in the B1 slot, even though the D1 slot is recommended as the first slot to populate with one card.
When I researched this issue, the first solutions I found recommended overclocking the motherboard with increased RAM slot voltage to get additional RAM cards recognized. Before resorting to that, I moved the card from D1 to A1, and the BIOS utility then recognized both cards and the full 32 GB of RAM. I recommend moving RAM cards to another slot as the first troubleshooting step when a card is not recognized.
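On Linux, one way to confirm which slots the firmware actually populated is to parse the output of `sudo dmidecode --type memory`. A rough sketch, with an illustrative excerpt standing in for the real command output:

```python
def populated_slots(dmidecode_output):
    """Parse `dmidecode --type memory` text and return {locator: size}
    for slots that report an installed module."""
    slots, size = {}, None
    for line in dmidecode_output.splitlines():
        line = line.strip()
        if line.startswith("Size:"):
            size = line.split(":", 1)[1].strip()
        elif line.startswith("Locator:"):
            locator = line.split(":", 1)[1].strip()
            if size and size != "No Module Installed":
                slots[locator] = size
            size = None
    return slots

# Illustrative excerpt; real output comes from `sudo dmidecode --type memory`.
sample = """
Memory Device
	Size: 16384 MB
	Locator: DIMM_A1
Memory Device
	Size: No Module Installed
	Locator: DIMM_B1
"""
print(populated_slots(sample))  # {'DIMM_A1': '16384 MB'}
```

This gives a per-slot view, which is more useful for this kind of troubleshooting than the single total that `free -h` reports.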
GPU
The build will begin with one EVGA GTX 1080 Ti 11 GB graphics processing unit. Tests have shown the 1080 Ti's performance is comparable to the more expensive Titan X for machine learning applications. The motherboard has three PCIe x16 slots and is suited to run two GPUs utilizing x16 PCIe lanes each, or three GPUs in a x16/x16/x8 configuration.
On the Strix X99 motherboard, the third x16-length slot runs electrically at x8, and the first GPU is installed in the topmost x16 slot.
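How a 40-lane CPU's lanes get divided across GPU slots can be sketched with a toy allocation, where each GPU gets x16 while lanes last and drops to x8 otherwise (this is an illustration of the arithmetic, not how the chipset actually negotiates lane widths):

```python
def lane_allocation(total_lanes, n_gpus, max_per_gpu=16, min_per_gpu=8):
    """Greedy sketch: give each GPU x16 while lanes remain, else x8."""
    alloc, remaining = [], total_lanes
    for _ in range(n_gpus):
        lanes = max_per_gpu if remaining >= max_per_gpu else min_per_gpu
        if remaining < lanes:
            raise ValueError("not enough lanes for this many GPUs")
        alloc.append(lanes)
        remaining -= lanes
    return alloc

print(lane_allocation(40, 2))  # [16, 16]
print(lane_allocation(40, 3))  # [16, 16, 8]
```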
PSU
I chose an EVGA 1000 GQ power supply unit for the build. Since the absolute peak load for an overclocked GTX 1080 Ti is 350 W, the 1000 W PSU will be sufficient for an upgrade to two GPUs. The EVGA 1000 GQ comes with one ATX 20+4-pin cable for the main motherboard power supply, two 8(4+4)-pin CPU cables, two standalone 8(6+2)-pin and four 8(6+2)-pin x2 cables, three SATA 5-pin x4 cables, one Molex 4-pin x3 cable, and one Molex-to-FDD adapter. The '4+4' notation indicates that two 4-pin male connectors are wired adjacently so they can mate with either an 8-pin or a 4-pin female connector.
I recommend connecting all required cables to the PSU prior to installing it in the case, since access to the rear of the PSU is restricted once it is inside the case. The Phanteks Eclipse P400 case has an exhaust port along the case bottom for the PSU fan.
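A back-of-the-envelope power budget supports the headroom claim; the component figures below are estimates (the CPU number is its 140 W TDP, and the miscellaneous figure is a guess), not measurements:

```python
# Rough power budget sketch for the two-GPU upgrade scenario
PSU_CAPACITY_W = 1000

components_w = {
    "GTX 1080 Ti (overclocked peak)": 350,
    "Second GTX 1080 Ti (future upgrade)": 350,
    "Xeon E5-1620 v4 (140 W TDP)": 140,
    "Motherboard, RAM, SSD, fans, pump (estimate)": 75,
}

total = sum(components_w.values())
headroom = PSU_CAPACITY_W - total
print(f"Estimated peak draw: {total} W, headroom: {headroom} W")
```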
Motherboard and Power Connections
Once the PSU is installed, I complete the build by making all remaining motherboard and power connections. The GTX 1080 Ti requires one 8-pin and one 6-pin VGA power cable.
The motherboard has a 4-pin header for the water pump and a 4-pin header for the CPU fan. The water pump's power connector has holes for three pins and should be mated with the three header pins aligned with the bar, as shown below; the fourth header pin, which has no bar behind it, would allow pump/fan speed control via pulse-width modulation (PWM). The CPU fan connector has holes for all four pins.
The chassis is connected to the motherboard with the following connections. The power, reset, and HDD LED connections are shown in the upper left corner of the image below. The chassis USB 3.0 ports are connected at the center connector. The front panel audio connector is shown in the upper right.
The CPU is powered with both the 8-pin and 4-pin ATX connectors. The CPU water cooler is powered with a 15-pin SATA connector from the PSU as shown below in the upper left corner.
After binding the wires in the chassis behind the motherboard, the machine hardware build is now complete! In Part 4, I will describe the OS and software installation.
I have started building a desktop machine designed for fitting machine learning models, including deep learning applications. The Reinforcement Learning and Decision Making class in the OMS CS program at Georgia Tech motivated me to build a machine appropriate for machine learning applications as I start the second half of the master's program. I was able to complete RLDM with my laptop, which has a 2.16 GHz Celeron processor and 8 GB of RAM, but I plan to use the new machine for upcoming machine learning classes.
I have prioritized designing the machine learning desktop around the GPU(s), and I will take a short detour to explain why. GPUs have displaced CPUs as the main engine for the heavy computation in data science models. GPUs have many more arithmetic logic units (ALUs) than CPUs, which lets them perform many simple operations in parallel. Machine learning, artificial intelligence, and deep learning problems generally require matrix math operations that can be accelerated by solving them in parallel. My design goals were:
A powerful GPU that has sufficient RAM to be well suited for computer vision applications. Some users report that 8 GB of GPU RAM is a workable minimum for training computer vision models, but an upgrade to 11 GB is beneficial. The GPU should also have broad support for machine learning libraries; the cuDNN library, built on top of the Nvidia CUDA programming framework, is used by major deep learning frameworks including TensorFlow and PyTorch. I decided on the GeForce GTX 1080 Ti made by EVGA, which has an Nvidia processor with 11 GB of RAM.
Sufficient system RAM to handle a future upgrade to two GPUs. The machine should have at least as much RAM as the combined GPU RAM. Since I would like the machine to be ready for a possible upgrade to two GPUs in the future, I purchased 32 GB of RAM.
A 40-lane CPU that can accommodate an upgrade to two 16-PCIe-lane GPUs while maximizing the PCIe lanes for data transfer between the CPU and GPU. The data transfer between the CPU and GPU across PCIe lanes can be a bottleneck that slows GPU performance, depending on the application. I chose an Intel Xeon E5-1620 v4 3.5 GHz processor over an i7-series processor since the Xeon has 40 PCIe lanes, which will allow two GPUs to use 16 lanes apiece. *07/02/21 UPDATE: My research in 2018 indicated that PCIe lanes may restrict GPU performance. Given the cost of the GPU, I preferred to ensure my system did not restrict performance due to data transfer limitations. However, more recent posts have shown that deep learning may be restricted by memory but is rarely restricted by PCIe lanes and data transfer with the CPU. Tim Dettmers has a nice article discussing GPU selection for deep learning: https://timdettmers.com/2020/09/07/which-gpu-for-deep-learning/.
A motherboard that suits the GPU and CPU and handles an upgrade to two GPUs while maximizing the PCIe 3.0 lanes for data transfer between the CPU and GPU. PCIe 3.0 is recommended for multi-GPU machines. To have space for two 1080 Ti GPUs, the motherboard needs to support two dual-width x16 graphics slots. The motherboard should also have an LGA 2011-v3 socket for the Xeon processor. I chose the ASUS STRIX X99 motherboard, which routes the CPU's 40 PCIe 3.0 lanes to support a x16/x16/x8 configuration.
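To get a feel for whether x8 versus x16 matters, one can estimate the time to push a training batch across PCIe 3.0. The batch shape below is an assumption chosen for illustration, and the per-lane rate reflects PCIe 3.0's 8 GT/s with 128b/130b encoding:

```python
def transfer_ms(batch_bytes, lanes, gen3_gbps_per_lane=7.88):
    """Estimated time (ms) to move a batch over PCIe 3.0.
    7.88 Gbit/s per lane = 8 GT/s with 128b/130b encoding overhead."""
    gbits = batch_bytes * 8 / 1e9
    return gbits / (lanes * gen3_gbps_per_lane) * 1000

# Assumed batch: 64 RGB images at 224x224, stored as float32
batch = 64 * 3 * 224 * 224 * 4  # bytes
for lanes in (8, 16):
    print(f"x{lanes}: {transfer_ms(batch, lanes):.2f} ms per batch")
```

The x8 transfer takes exactly twice as long as x16; whether that matters depends on how the transfer time compares to the GPU compute time per batch, which is the point of the Dettmers article linked above.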