folding, RTX 3070 & single-thread execution speed…

first, some background…
Many forum posts discuss the effect of PCIe lane count on Folding GPU performance. Most folders run their GPU in the default x16 slot, and most also get acceptable results from an x8 slot, but dropping to x4 causes a significant loss in Folding PPD (points per day).
I’ve seen little to no discussion of how single-thread CPU speed affects GPU point production for Folding. I suspect the main reason is that most people fold on desktop/gaming PCs with relatively fast CPUs.

In December 2020, I started a transition to server-class Xeon systems to take advantage of the plentiful CPU-driven PCIe lanes of the Intel Xeon LGA 2011 platform, so that a single system could easily drive several GPUs for Folding. eBay is an excellent source of vintage hardware for building systems like this at a reasonable cost.

My first dual-Xeon system build for Folding:

  • Supermicro X9-DR3-LN4F+ mobo
  • Dual Xeon E5-2630 v2 2.6 GHz CPUs
  • customized Veddha mining frame
  • Noctua NH-D9L heatsinks

A classic GTX 750 Ti GPU was used for the install. I chose the Noctua heatsinks for their lower profile. As I discovered later, the Supermicro SNK-P0050AP4 is a better choice: slightly taller, much cheaper, and it fits both wide and narrow LGA 2011 sockets.

My first 3070 was installed in an X570 system running a Ryzen 5 3600 CPU. I never got good results with a second GPU in this system. In fact, the Gigabyte specs were wrong: despite the claimed PCIe x8 performance, only x4 is achievable, because the #2 and #3 PCIe x16 slots are both driven by a single x4 connection from the PCH. So this system became a single-GPU folder, which pushed me toward the architecture switch. The chassis is a stripped-down server tower laid flat, with added fans and the side opened for ventilation. This mobo and CPU will likely end up as a desktop.
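A slot's negotiated lane count can be checked directly instead of relying on the motherboard specs. The following is a minimal sketch, assuming a Linux host that exposes the standard sysfs attributes `class`, `current_link_width` and `max_link_width` under /sys/bus/pci/devices. The function name `gpu_pcie_links` is my own, chosen for illustration, and is not part of any Folding tool.

```python
from pathlib import Path


def gpu_pcie_links(sysfs_root="/sys/bus/pci/devices"):
    """Return {pci_address: (current_width, max_width)} for display-class devices.

    Assumes the Linux sysfs layout; returns an empty dict if the path is absent.
    """
    links = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return links
    for dev in sorted(root.iterdir()):
        try:
            pci_class = (dev / "class").read_text().strip()
            current = int((dev / "current_link_width").read_text().strip())
            maximum = int((dev / "max_link_width").read_text().strip())
        except (OSError, ValueError):
            # Device has no PCIe link attributes (or unreadable values): skip it.
            continue
        # PCI base class 0x03 covers display controllers (VGA, 3D, etc.).
        if pci_class.startswith("0x03"):
            links[dev.name] = (current, maximum)
    return links


if __name__ == "__main__":
    for address, (current, maximum) in gpu_pcie_links().items():
        note = "" if current == maximum else "  <-- running below the card's maximum"
        print(f"{address}: x{current} of x{maximum}{note}")
```

A card that reports x4 of x16 here is sitting on a narrower link than it supports, which is the situation described above for the second and third slots.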

3070 #2 was an EVGA notify card, installed on the Supermicro X9-DR3 shown above.
After watching and comparing the two systems for over a month, it was no big surprise that the faster Ryzen 5 system maintained an edge over the slower 2.6 GHz Xeon system. It was not a big edge, but eventually the wondering-why got the better of me. Comparing the two CPUs, the single-thread performance of the Ryzen 5 had an advantage of roughly 68%, going by the ratings in the table at the end of this post (2584 vs. 1534)!

Digging into the HFM logs made the Ryzen 5 edge more apparent: the AMD CPU dominated the Top20 PPD ranks for the 3070 cards.

Of course, the vintage motherboard limited my options for a faster CPU with better single-thread execution. Using the CPUbenchmark.net website, it was fairly easy to isolate candidate CPUs.

The first candidate I spotted was the E5-2673 v2, then, with a little scrutiny, I found the E5-2667 v2, which had a slightly better single-thread performance index.

Searching eBay, I found a local vendor selling them for $117 each.

I installed the E5-2667 v2 CPUs yesterday.

Today, I’ve been watching the two 3070s trade the top spot back and forth while folding, something I had not seen before.

I call that a WIN!
This is a Linux system. How does single-thread execution performance apply to a Windows system? Linux folders use one CPU core per GPU, whereas Windows uses two cores per GPU for Folding. A question for another day…

CPU             Base Clock   Single Thread
Ryzen 5 3600    3.6 GHz      2584
E5-2630 v2      2.6 GHz      1534
E5-2673 v2      3.3 GHz      1959
E5-2667 v2      3.3 GHz      2027
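
The gaps between these ratings are easier to judge as percentages. The sketch below uses only the single-thread ratings from the table; the helper names `advantage_pct` and `best_candidate` are hypothetical, picked for this example.

```python
# Single-thread ratings copied from the table above.
RATINGS = {
    "Ryzen 5 3600": 2584,
    "E5-2630 v2": 1534,
    "E5-2673 v2": 1959,
    "E5-2667 v2": 2027,
}


def advantage_pct(faster, slower):
    """Percent by which the rating of `faster` exceeds the rating of `slower`."""
    return (RATINGS[faster] / RATINGS[slower] - 1.0) * 100.0


def best_candidate(candidates):
    """Pick the candidate CPU with the highest single-thread rating."""
    return max(candidates, key=lambda cpu: RATINGS[cpu])


if __name__ == "__main__":
    upgrade = best_candidate(["E5-2673 v2", "E5-2667 v2"])
    print(f"Chosen upgrade: {upgrade}")
    print(f"Gain over E5-2630 v2: {advantage_pct(upgrade, 'E5-2630 v2'):.1f}%")
    print(f"Ryzen 5 3600 lead over E5-2630 v2: "
          f"{advantage_pct('Ryzen 5 3600', 'E5-2630 v2'):.1f}%")
    print(f"Ryzen 5 3600 lead over {upgrade}: "
          f"{advantage_pct('Ryzen 5 3600', upgrade):.1f}%")
```

By these figures, the E5-2667 v2 rates roughly 32% higher than the E5-2630 v2 it replaced, while the Ryzen 5 3600 still rates about 27% higher than the E5-2667 v2.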

Update:

After several days of accumulated data, the top numbers for the two systems being compared were about equal. Bumping the single-thread execution speed had equalized the results. Although the new CPUs still do not match the Ryzen 5 rating, moving from a 1534-rated CPU to a 2027-rated one was enough to shift the bottleneck to some other part of the compute architecture.