Oct 19, 2016

Hands-On With Nvidia's Titan X (Pascal) In SLI

We test VR, SLI, Nvidia's new high-bandwidth bridges, and the concern that HBM2 might have been a better choice for Titan X.

Introduction

Two Titan X (Pascal) In SLI: Nvidia dropped the GeForce brand too late in development.

Let me start with what this article is not about. It's not about value. It's not about mainstream (or even enthusiast) gaming. It's not about comparing AMD and Nvidia.

Now that official support for three- and four-way SLI in games is gone, this article is about the pinnacle of performance achievable in late 2016 with Nvidia's Pascal architecture, assuming money is no object.

Good Luck Buying One (Or Two)

To get my hands on two Titan X cards, I set a calendar alert for August 2 to check Nvidia’s online sales page. I checked at midnight. Nothing. I signed up for the "Notify me" email, woke up, and went to work. Still no email. Then I checked at 9:15 AM EST and saw the “Buy Now” button. I took the opportunity and snatched two cards. The notification email arrived about an hour later. Oh, and I totally forgot the new "SLI HB" bridge, so I ended up ordering that separately.

Not long after, Nvidia's online store was out of stock. Since then, we've seen them in and out of stock. But as of October 17th, they're available for purchase from geforce.com. If you’re crazy and lucky enough to get your hands on a pair, consider snagging the SLI HB bridge, too.

MORE: Best Graphics Cards

MORE: Desktop GPU Performance Hierarchy Table

Less SLI Is More SLI

As you no doubt read in our Nvidia GeForce GTX 1080 Pascal Review, Nvidia is curtailing SLI quite a bit. The 1060 doesn't support SLI at all. Moreover, the 1070 and 1080 officially only support two-way configurations in real-world games. The same goes for Titan X. If you were hoping to run three or four of these, you’d have to jump through some hoops. And even then, they’ll only work in approved benchmarks (not games you'd actually play). If you absolutely must try a three- or four-way arrangement in something like 3DMark, you'll need to generate a unique hardware signature using software from Nvidia that can be used to request an “unlock” key.

Last but not least, the Pascal-based boards introduce a new SLI bridge dubbed "SLI HB Bridge," which Nvidia claims "doubles the bandwidth of previous SLI bridges." Technically, you can still use the old "soft" SLI bridges, though.

For reference, the bandwidth of old-school SLI bridges has long been officially quoted as "up to 1 GB/s." At the Pascal launch event, Nvidia mentioned the new SLI bridges supporting a higher pixel clock of 650 MHz (versus the older interface’s 400 MHz), while allowing for a dual-link connection, effectively bringing the available bandwidth into the 3 GB/s range for two-way SLI configurations. By comparison, that's less than four lanes of PCIe 3.0 bandwidth.
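To put the PCIe comparison in numbers, here is a rough sketch in Python. It assumes the standard PCIe 3.0 figures of 8 GT/s per lane with 128b/130b encoding, ignores packet and protocol overhead, and treats the "3 GB/s range" quoted for SLI HB as a round 3.0 GB/s. The function and constant names are ours, purely for illustration.

```python
# Rough comparison of SLI HB bandwidth against PCIe 3.0 links.
# Assumptions: 8 GT/s per lane, 128b/130b line encoding, no protocol overhead.

PCIE3_GT_PER_S = 8.0               # raw transfer rate per lane
ENCODING_EFFICIENCY = 128 / 130    # 128b/130b line code

# The "3 GB/s range" figure for two-way SLI HB, treated as a round number.
SLI_HB_GBS = 3.0


def pcie3_bandwidth_gbs(lanes: int) -> float:
    """Approximate one-direction bandwidth of a PCIe 3.0 link in GB/s."""
    bits_per_second = lanes * PCIE3_GT_PER_S * ENCODING_EFFICIENCY
    return bits_per_second / 8     # bits -> bytes


if __name__ == "__main__":
    for lanes in (1, 4, 8, 16):
        print(f"PCIe 3.0 x{lanes}: {pcie3_bandwidth_gbs(lanes):.2f} GB/s")
    ratio = SLI_HB_GBS / pcie3_bandwidth_gbs(4)
    print(f"SLI HB is roughly {ratio:.0%} of a PCIe 3.0 x4 link")
```

Under those assumptions, a four-lane link lands just under 4 GB/s, so the new bridge sits at roughly three quarters of that.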

Nvidia asks you to shell out $40 plus shipping and handling for these, and the company sells them directly on its site. EVGA offers them too.

From an engineering perspective, we really don't care that three- and four-way SLI support was dropped. Based on the multi-GPU rendering technique commonly used (Alternate Frame Rendering), PCs with several graphics cards contend with increased latency, diminishing scaling beyond two GPUs, compatibility issues (especially when it comes to zero-day game support), and a lack of functionality in VR. Particularly given the performance of today’s Pascal-based GPUs, the only way three- and four-way setups make sense is for synthetic benchmarking. Hopefully that means Nvidia will put more effort into improving the current state of two-way SLI.

Test System

  • Intel Core i7-6700K ($419.99, Amazon)
  • EVGA GeForce GTX 980 SC ACX 2.0 ($554.99, Newegg)
  • Nvidia Titan X ($1200.00, Nvidia)
  • Corsair Vengeance LPX 16GB (2x 8GB) ($69.99, Newegg)
  • Asus Z170-Deluxe ($319.99, Newegg)
  • Samsung SM951 512GB SSD ($499.95, Amazon)
  • Windows 10 Pro 64-bit ($139.99, Newegg)

It would have been great to compare two Maxwell-based Titan X cards to the newer Titan Xes for a generational comparison. Alas, I don’t have two of the older cards on-hand. I did, however, have a couple of EVGA GeForce GTX 980 SCs in SLI.

SLI And VR

Without going into depth on how SLI works (you’ll find more detail in Nvidia's SLI Technology In 2015: What You Need To Know), it primarily relies on the AFR technique mentioned earlier, which imposes a two-frame delay (or queue) to deliver its scaling benefits. And while you won't mind those extra milliseconds on a typical desktop display, an additional 17 ms of motion-to-photon lag in VR will affect your experience.

The workaround for VR is assigning GPUs to specific eyes for stereo rendering acceleration. This naturally requires optimization on the developer’s end, and as a result, SLI just isn’t supported by most VR titles as of mid-2016.

Consequently, we did some testing of the Oculus Rift using the only title we know punishes a GeForce GTX 980, Elite: Dangerous. We’re manually reporting performance based on the in-HMD debug tool display. Unfortunately, Oculus has not responded to our requests to enable logging-to-disk of that tool's data, so we can’t chart that experience out quantitatively.

SLI is disabled in these runs. You can leave the technology turned on in Nvidia's control panel, but the application only exploits one GPU no matter what.

Charts: Photon2, Headroom, Avg FPS

Notice that a single Titan X (Pascal) is barely able to keep up using the Ultra detail preset, with a minimal performance headroom of ~10%. Conversely, a GTX 980 just isn't fast enough to facilitate a smooth experience, averaging 73.1 FPS (below the 90 FPS target), while almost doubling motion-to-photon latency. Not even the Titan X manages to stay below the 20 ms that John Carmack of Oculus describes as the "sweet spot" of VR. You'll have to drop the quality level for an optimal VR experience in Elite: Dangerous for now.
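For a sense of what missing the 90 FPS target means per frame, the sketch below converts frame rates into frame times. It uses only the two figures quoted above (the 90 FPS target and the GTX 980's 73.1 FPS average); the helper name is ours, and average frame time is of course a simplification of what the headset actually experiences.

```python
# Convert frame rates to per-frame time budgets.
# Figures come from the text: 90 FPS target, 73.1 FPS GTX 980 average.

TARGET_FPS = 90.0
GTX_980_AVG_FPS = 73.1


def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    budget = frame_time_ms(TARGET_FPS)
    actual = frame_time_ms(GTX_980_AVG_FPS)
    print(f"Budget at {TARGET_FPS:.0f} FPS: {budget:.2f} ms per frame")
    print(f"GTX 980 at {GTX_980_AVG_FPS} FPS: {actual:.2f} ms per frame")
    print(f"Average overrun: {actual - budget:.2f} ms per frame")
```

In other words, the GTX 980 would need to shave a couple of milliseconds off every frame, on average, to hold the target.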

MORE: Best Virtual Reality Headsets

The Need to Overclock…Your CPU?

I've been using 3DMark Fire Strike as a synthetic benchmark since it came out in 2013. Its "basic" run (non-Ultra and non-Extreme) is pretty taxing on GPUs in its two graphics tests. The other two tests tend to be CPU-bound.

So here's the news: two Titan X (Pascal) cards in SLI are actually CPU-bound in the first graphics test of Fire Strike, though not in the second. And that’s with a Core i7-6700K boosting to 4.2 GHz. Forget about using Unigine Valley or other older tests to saturate the Titan Xes. GPU utilization sits at around 50% in those; the bottleneck is clearly our host processor.

Charts: DX12 TS, DX11 FS

I managed to get the two cards in SLI stable at +190 MHz core and +160 MHz memory, which represents about a +10% overclock. As you no doubt know, clock rates in SLI are synchronized, so your headroom is limited to the less-scalable GPU.

MORE: Best CPUs

MORE: Intel & AMD Processor Hierarchy

The Lack Of HBM2 Is No Big Deal

Given AMD’s use of HBM on its Fiji-based cards, hardware enthusiasts are quick to question Nvidia’s lack of HBM2 on the Titan X. The company instead chose to give this card a massive 12 GB of GDDR5X with a nominal memory bandwidth of 480 GB/s. That's a 42% increase over the previous-generation Titan X that offered 337 GB/s.
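The 42% figure is easy to check, and the 480 GB/s number itself falls out of bus width times per-pin data rate. The sketch below does both. Note that the 384-bit bus and 10 Gb/s GDDR5X data rate are our assumptions about the card's memory configuration rather than figures stated in this article, and the function names are illustrative.

```python
# Sanity-check the memory bandwidth figures.
# From the text: 480 GB/s (Pascal Titan X) vs. 337 GB/s (previous Titan X).
# Assumed, not from the text: 384-bit bus, 10 Gb/s per pin for GDDR5X.


def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new / old - 1.0) * 100.0


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Nominal bandwidth in GB/s: bus width (bits) x per-pin rate (Gb/s) / 8."""
    return bus_width_bits * data_rate_gbps / 8.0


if __name__ == "__main__":
    print(f"Generational increase: {percent_increase(337.0, 480.0):.1f}%")
    print(f"384-bit at 10 Gb/s: {memory_bandwidth_gbs(384, 10.0):.0f} GB/s")
```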

While it is true that HBM2 facilitates a bandwidth advantage over GDDR5X, the benefits of more throughput would only become apparent in situations where on-die resources aren't being fed fast enough. The concept is akin to PCIe 3.0/2.0/1.1 links between CPU and GPU, and you can already run any modern card with four lanes of third-gen PCIe with little or no performance degradation. Bumping up that bandwidth with eight or 16 lanes yields a 1% to 2% performance increase on average, and anything beyond confers negligible gains.

Below you can see a short run of The Witcher 3 running on two Titan Xes in SLI at 5K (5120x2880).

Scaling Is Good At 5K; SLI Is Overkill For 1440p

In games with proper SLI support, you might expect scaling of +70% to +80% when you add a second Titan X. That’s a fairly typical number in graphics-bound workloads.

Charts: SLI Disabled, FR 1440, FR 2880

Indeed, the Titan X’s SLI scaling falls within that range…unless you’re gaming at 2560x1440, where two GP102-powered cards become CPU-limited. Consequently, scaling pares back to +43% in Total War: Warhammer (a Gaming Evolved title) and +54% in The Witcher 3 (a GameWorks title).
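For clarity, "scaling" here means the frame rate gained by the second card relative to a single card. A minimal sketch of that arithmetic follows. The 60 FPS single-card baseline is a made-up placeholder, not a measured result; only the scaling percentages come from the text.

```python
# How SLI scaling percentages translate into frame rates.
# The 60 FPS single-card baseline is hypothetical, chosen only for illustration.

HYPOTHETICAL_SINGLE_FPS = 60.0


def sli_scaling_percent(single_fps: float, sli_fps: float) -> float:
    """Gain from the second card, relative to one card, in percent."""
    return (sli_fps / single_fps - 1.0) * 100.0


def sli_fps_from_scaling(single_fps: float, scaling_percent: float) -> float:
    """Frame rate implied by a given scaling percentage."""
    return single_fps * (1.0 + scaling_percent / 100.0)


if __name__ == "__main__":
    # Scaling figures quoted in the text: 43% and 54% at 1440p, 70-80% typical.
    for scaling in (43.0, 54.0, 70.0, 80.0):
        fps = sli_fps_from_scaling(HYPOTHETICAL_SINGLE_FPS, scaling)
        print(f"+{scaling:.0f}% scaling: {HYPOTHETICAL_SINGLE_FPS:.0f} FPS -> {fps:.1f} FPS")
```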

What If We Skip The SLI HB Bridge?

Now, you could argue that if you splurge on $2400 worth of graphics cards, skimping on the $40 SLI HB bridge that supposedly doubles (or even triples) available bandwidth between the cards is a silly thing to do.

And yes, of course, it really would be silly. But as a check of how essential that component may be to Titan X (and also GTX 1080/1070) owners, we decided to run a few tests using an old-school flexible SLI bridge.

Charts: FROT 1440, FROT 2880, SLI HB TW3, SLI HB TW

The results are interesting. Total War: Warhammer sees absolutely no difference with the low-bandwidth SLI bridge that came with our Asus motherboard, while performance in The Witcher 3 drops by 25% (at 1440p) to 35% (at 2880p) using the flexible connector.

Conclusion

Assuming you're swimming in cash and lucky enough to catch the Titan X in stock, $2400 gets you the best gaming performance that money can buy.

The card’s scaling in two-way SLI is pretty good; expect 70-80% over a single card (depending on the game) at 5K. If you plan on playing at 1440p (or lower), you’re more likely to see a 40-50% speed-up before your CPU becomes the bottleneck at higher frame rates.

We've also proven that you want that $40 SLI HB bridge on top (something that GTX 1080/1070 owners will appreciate knowing), or else you may face a 25% to 35% performance hit in certain games.

For VR, you're better off with just one card at the moment. It'll be a while before developers start taking advantage of multiple GPUs; growth on that front has been slow thus far. Still, a single Titan X is capable of running Elite: Dangerous using the Ultra preset at a steady 90 FPS on an Oculus Rift (although motion-to-photon latency is still a sub-optimal 27 ms).

Whether your rig is a $500 value box with second-hand parts scavenged off of eBay or a $10,000 behemoth, we'll leave you with the same letters Nvidia inscribed inside the box of the Titan X:

GL HF

MORE: Best Deals

MORE: Hot Bargains @PurchDeals
