
ROG X99 ... 4-GPU, NVLink, new rad?

Level 14
My primary desktop rig at home (in my profile System Specs) is meant for "prosumer" use. It's used for daily (and nearly continuous) work-related hard computing. It's my attempt to build a "personal supercomputer" or "superworkstation" which isn't locked into a proprietary leasing structure (I want to properly own all my own hardware).

I require a true (not emulated) x86/x64 platform for maximal compatibility/performance when running heavy PC-based apps. These days the list (in no particular order) includes: Autodesk/AutoCAD/Inventor, CATIA/Solidworks/SolidworksPCB, Pantheon PCB, MATLAB, SIMUL8, Suigyodo BSch3V, TINA, NI Multisim, Siemens NX11/ST9, CIRCAD, EasyEDA, CUSPICE, OrCAD, Altium Designer 17, DipTrace ... and Crysis 3, BioShock Infinite, Fallout 4, Elite: Dangerous, Witcher 3, Grand Theft Auto V. I push a whole lot of pixels every day, even if many of them aren't actually sent to a monitor, lol.

I prefer a generic full tower PC/ATX chassis "form factor". No more ugly racks and frames and blade units for me. And I really like my pretty modded Obsidian 750D case far too much - it's elegant, spacious, quiet, cool, it provides plenty of upgrade options - for me it's a perfect balance between slick professional and aggressive gamer and modded happiness, plus it matches my workspace's brushed aluminum decor.

The cores/threads and PCIe lanes (for multi-GPU and fast SSD) and quad-channel memory bandwidth on HEDT are far more useful for me (in this machine) than the increased speed and compatibility and features offered by Z270/etc. I might upgrade to X299, once it proves stable - or even to an AMD X390/X399 (or whatever) if it's better and it can run NVIDIA GPUs, once it proves stable.

I used to run x16/x16 4GB GTX980 cards (owned by me), alongside a 24GB twin-Tesla K80 card (leased from SuperMicro).
FP64 compute consistently held ~3200 GFLOPS (~2900 CC3.7 plus ~330 CC5.2). Gaming fps was (I thought then) "superior", though in hindsight it was merely "excellent". The Tesla card wasn't directly useful for games, though I was able to use it for (overkill) PhysX emulation.

But now the SuperMicro lease has expired and the K80 is gone. I upgraded my (now embarrassing and gutless) twin GTX980 cards to twin 12GB Titan Xp cards. FP64 compute is now "only" ~760 GFLOPS (CC6.1). Gaming fps is now "unbelievable", twice as powerful as before!

And my evil boss wants more productivity (and less gaming), so he's arranging for me to get a pair of 16GB Quadro GP100 cards (leased from SuperMicro).
FP64 compute from these two cards should be ~10600 GFLOPS (CC6.0). Gaming performance should still be "unbelievable" ... but also be upper capped at a flat 60fps (albeit across up to four discrete and simultaneously-rendered 4K/2160p 60Hz DP 1.4 outputs, lol).
Apparently I fall within the exact niche NVIDIA wants to sell these cards to. And apparently they think my niche doesn't play high-fps games.

NVIDIA's 2-way Quadro NVLink Bridge would provide 32GB/s bidirectional HBM2 bandwidth. Only available from NVIDIA, only one model, only in 2-slot spacing. A leased component which would cost over $1000 to purchase (or replace), lol, so I'd be apprehensive about modding (or breaking) it.

NVIDIA's Quadro SLI HB Bridge would provide 2GB/s bidirectional bandwidth. Available from NVIDIA in 2-slot or 3-slot spacing. But also available (quite affordably and in many styles) from many non-NVIDIA sources in multiple standard/nonstandard or even flexible slot spacings.


My questions:

I currently run an X99 R5E motherboard (I do want an R5E10, I'm just unwilling to pay "major" price for "minor" upgrade).
It doesn't matter which slots the Quadro cards go into, provided they are adjacent for the NVLink Bridge's fixed 2-slot spacing.
It doesn't matter which slots the Titan Xp cards go into, though I'd prefer them to be adjacent (not in PCIE_X16_1 and PCIE_X8_4, lol).

I haven't found any decent articles or benchmarks about GP100 (fps) gaming performance. I don't know if these things would run x16/x16 (with either the SLI HB Bridge or the NVLink Bridge). As far as raw fps goes, I can't even tell which way the numbers lean, though I suspect a Titan would win by a slight margin one-vs-one but the Quadros (with NVLink) would win by a large margin two-vs-two, even taking into consideration that the PCIe3 point-to-point interconnect bus is rarely saturated during multi-GPU gaming.
Titan Xp = 12GB of 11.4Gbps 384-bit GDDR5X (547.7GB/s), 30 SMs, 3840 CUDA Cores, 3840 Shaders, 240 TMUs, 96 ROPs, 1480MHz/1582MHz.
Quadro GP100 = 16GB of 1.4GHz 4096-bit ECC HBM2 (720GB/s), 56 SMs, 3584 CUDA Cores, 3584 Shaders, 224 TMUs, 128 ROPs, 1395MHz/1430MHz.
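
As a rough sanity check on the two bandwidth figures above, here is a minimal Python sketch. It assumes peak bandwidth is simply per-pin data rate times bus width divided by 8, and it treats the "1.4GHz" HBM2 figure as an effective 1.4 Gbps per pin; the function name is just illustrative.

```python
def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: per-pin rate (Gbps) x bus width (bits) / 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


# Inputs taken from the spec lines above (rounded, as quoted in the post).
titan_xp = memory_bandwidth_gbs(11.4, 384)       # GDDR5X, 384-bit bus
quadro_gp100 = memory_bandwidth_gbs(1.4, 4096)   # HBM2, 4096-bit bus (assumed 1.4 Gbps per pin)

print(f"Titan Xp:     {titan_xp:.1f} GB/s")
print(f"Quadro GP100: {quadro_gp100:.1f} GB/s")
```

With these rounded inputs the results land slightly under the quoted 547.7GB/s and 720GB/s, presumably because the quoted data rates are themselves rounded.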

I'd have a 40-lane CPU with four GPU cards installed. Would they only run as x16/x8 "SLI", even when only two GPU cards are actually used?
Could an R5E10 (or any other X99) motherboard run the same hardware at x16/x16 "SLI" (without PLX fakery)?
Is there any way to use all four GPU cards when gaming, even if two could only be used for PhysX support?
Is there any way to deactivate/unpower "useless" GPU cards (through software) when gaming?

Is there any way to run a full 4-GPU x16/x8/x8/x8 "SLI" setup? Surely GP102 and GP100 ain't so different ... alas, GP102 is also used by Quadro P6000 cards, but they're basically just hugely overpriced versions of already-hugely-overpriced Titan Xp cards, with nearly identical spec but yucky green trim ... I'd rather have 4 Titan Xp cards, if forced.
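
To make the lane arithmetic behind these questions explicit, here is a small Python sketch. It only adds up requested link widths against the 40 CPU lanes; it assumes nothing else (M.2, extra controllers) takes CPU lanes, and it says nothing about how a given motherboard actually wires its slots. The names are illustrative.

```python
CPU_LANES = 40  # 40-lane CPU, as stated in the post


def lanes_needed(widths):
    """Total PCIe lanes requested by a list of per-card link widths."""
    return sum(widths)


def fits(widths, budget=CPU_LANES):
    """True if the requested link widths fit in the CPU lane budget (no PLX switch assumed)."""
    return lanes_needed(widths) <= budget


configs = {
    "x16/x8/x8/x8 (four cards)": [16, 8, 8, 8],
    "x16/x16/x8/x8 (four cards)": [16, 16, 8, 8],
    "x16/x16 (two cards only)": [16, 16],
}

for name, widths in configs.items():
    verdict = "fits" if fits(widths) else "exceeds the budget"
    print(f"{name}: {lanes_needed(widths)} lanes, {verdict}")
```

On paper, x16/x8/x8/x8 uses exactly 40 lanes, while giving a second card x16 with all four installed would need 48. Whether a board can reassign lanes when two cards sit idle is a separate, board-specific question.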

Four double-slot GPU cards could get a little hot and crowded. I might need to liquid cool the two cards I own (not gonna muck around with the $16000 cards I don't own, lol, they can stay on air cooling). I could install a 420mm top rad up to (exactly) 85mm thick, but which would generally be the best cooling option for a pair of mighty GPUs under consistently moderate/heavy load in a warm (30C~35C) ambient?
Two 25mm-thick push-pull fans with 35mm-thick rad?
One 25mm-thick push(?) fan with 60mm-thick rad?
Two slim (~15mm-thick) push-pull fans with ~54mm-thick rad?
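
For reference, the stack thickness of each option against the 85mm limit can be checked with a few lines of Python. This is only the clearance arithmetic (fans plus rad, ignoring shrouds, gaskets and fittings); it does not say which option cools best.

```python
MAX_STACK_MM = 85  # maximum top-mount stack thickness, as stated in the post

# (label, number of fans, fan thickness in mm, radiator thickness in mm)
options = [
    ("push-pull 25mm fans + 35mm rad", 2, 25, 35),
    ("single 25mm fan + 60mm rad", 1, 25, 60),
    ("push-pull slim 15mm fans + 54mm rad", 2, 15, 54),
]


def stack_thickness_mm(fan_count, fan_mm, rad_mm):
    """Total thickness of the fans plus the radiator, in mm."""
    return fan_count * fan_mm + rad_mm


for label, fan_count, fan_mm, rad_mm in options:
    total = stack_thickness_mm(fan_count, fan_mm, rad_mm)
    spare = MAX_STACK_MM - total
    print(f"{label}: {total}mm total, {spare}mm to spare")
```

All three fit; the first two use the full 85mm and the slim-fan option leaves about 1mm.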

I ask these things now because I have to give my evil boss a decision about whether or not I want him to lease these (SuperMicro) cards for the next 12 or 18 or 24 months. Before the end of May, lol, 'tis the season for renewed contracts, they like to lock us into leases right before the season Intel/NVIDIA/etc launches newer and better and cheaper stuff, lol.

I do have the option of buying two more Titan Xp cards instead. Or waiting until "real" Titans with better FP64 spec come out (though I agree with the "expert" consensus that they never will, NVIDIA's experiments are over and their greed spilleth over). And - who knows? - maybe AMD will actually jump back into the workhorse race. I must accept/deny my boss's leased hardware offer before deadline and I can buy/upgrade my own hardware any time - but declining his offer now basically "forces" me onto the upgrade path soon, lol, and working must take precedence over gaming (or so says the evil boss, anyhow).

I also have the option of returning my Titan Xp cards. (And yes I'm well aware of how they compare vs lower-priced GTX1080Ti cards, especially now that factory-overclocked versions of the latter are available. I paid top price to gain every advantage I could in my DPFP computes, Titan Xp might be only ~5% better than GTX1080Ti for my needs but that's still enough for them to "pay for themselves".)
"All opinions are not equal. Some are a very great deal more robust, sophisticated and well supported in logic and argument than others." - Douglas Adams


Level 9
Just pointing you to a post with great info about someone who has a functional 4-way Pascal Titan setup. Hopefully you'll find some useful info there. In his words, "Its for medical imaging, VMs, nuc med, isotopes", but he also does some gaming (though supposedly 4-way is not supported).

pharma wrote:
Just pointing you to a post with great info about someone who has a functional 4-way Pascal Titan setup. Hopefully you'll find some useful info there. His primary purpose is for "Its for medical imaging, VMs, nuc med, isotopes", but does some gaming (though supposedly 4-way is not supported).

His machine is a beast! Supermicro X10CRG-Q, C612 chipset, twin E5-2699-4 CPUs (each 2.2GHz~3.6GHz, 22C/44T, 55MB), 1024GB ECC DDR4-2400, two Samsung 961 PRO NVMes, ten Samsung 850 PRO SSDs in RAID, lol, and of course four Titan Xp GPU cards.
Odd that he doesn't have dual PSUs or a UPS, a small price to ensure such expensive hardware (and precious data) remains intact.
I doubt his x16/x16/x16/x16 SLI is what it seems, though. That mobo has shared/linked QPI busses and multiple PLX functions, it services two CPUs and up to 8 GPUs alongside a handful of other PCIe3 x8 devices (and a huge SATA array, dual M.2s, dual Gbit LANs, USB3 galore, etc, etc). I wonder what sorts of games could possibly use 44 cores and 1024GB memory.

My platform is much less manly and costs much, much less. My Titan Xps are magnificent at FP32 (and thus at gaming!) but they ain't so great at FP64. Four Titan Xps (even in the ideal x16/x16/x16/x16 SLI) would still only produce ~1520 GFLOPS. Apparently that's not an issue with VMs and nuclear medicine imaging - where it seems more cores are where it's at - but it is a major bottleneck with the stuff I do in my field of work, lol.
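
Putting the FP64 numbers from this thread side by side, as a quick Python sketch. The per-card values are just the pair totals quoted earlier divided by two (~760 GFLOPS for two Titan Xps, ~10600 GFLOPS for two Quadro GP100s), and it assumes throughput adds up linearly across cards, which real workloads may not achieve.

```python
# Approximate per-card FP64 throughput in GFLOPS, derived from the pair totals quoted in the thread.
TITAN_XP_FP64 = 760 / 2        # ~380 GFLOPS per Titan Xp
QUADRO_GP100_FP64 = 10600 / 2  # ~5300 GFLOPS per Quadro GP100


def total_fp64_gflops(n_titan_xp, n_gp100):
    """Naive aggregate FP64 throughput, assuming perfect linear scaling across cards."""
    return n_titan_xp * TITAN_XP_FP64 + n_gp100 * QUADRO_GP100_FP64


four_titans = total_fp64_gflops(4, 0)
two_quadros = total_fp64_gflops(0, 2)
mixed = total_fp64_gflops(2, 2)

print(f"4x Titan Xp:                {four_titans:.0f} GFLOPS")
print(f"2x Quadro GP100:            {two_quadros:.0f} GFLOPS")
print(f"2x Titan Xp + 2x GP100:     {mixed:.0f} GFLOPS")
print(f"2x GP100 vs 4x Titan Xp:    {two_quadros / four_titans:.1f}x")
```

On these figures, two GP100s come out at roughly seven times the FP64 throughput of four Titan Xps.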

SLI is not just about gaming, it's built into CUDA. x16/x16, in CUDA terms, is actually far faster than x16/x8. I can understand not having my two Quadro cards involved (in a x16/x8/x8/x8 SLI) - since they'd be paired with NVLink - but I wonder if there's any way I can get my X99 to use the two (active) Titan Xp cards as x16/x16.

It's all such a mess, lol, I might just go with a C612 platform myself - don't need 44C/88T so much (a mere 16C/32T would do), but I could actually fill up 2048GB ECC memory, lol. It's not ROG, but an ASUS Z10PE-D16 WS motherboard with a pair of E5-1680-4 Xeons and a mountain of memory ... hmm ... if only I can avoid the usual ugly 4U Tower setup ...
"All opinions are not equal. Some are a very great deal more robust, sophisticated and well supported in logic and argument than others." - Douglas Adams