Ah yes, "the world's fastest GPU", a tagline bounced around by the likes of Nvidia and AMD each and every time they release a new flagship GPU. While both companies have held that seemingly coveted title over the years, with the release of the GTX 980, Nvidia is back on top once more. There's no doubt that the GTX 980 is currently the best single-GPU card you can buy, but its admirable frame-pushing performance isn't the most interesting thing about it.
Like the GTX 750 Ti before it, the GTX 980 makes use of Nvidia's Maxwell architecture, the successor to the Kepler architecture featured in almost all of the 6-series and 7-series GPUs. With Maxwell, Nvidia has continued to focus on the efficiency and performance per watt improvements it started with Kepler, resulting in a flagship GPU with a TDP of just 165W. By comparison, the GTX 780 Ti--the previous holder of the "world's fastest GPU" moniker--had a TDP of 250W.
That 85W difference is significant, not just for power consumption savings and the option to use a less demanding power supply, but in heat, noise, and performance too. With a lower TDP the GTX 980 runs cooler, which means its cooling fan can spin slower, which results in less noise. It also means that the 980 can run at its faster boost clock more of the time without having to clock down to combat heat. That's of even more benefit if you're running dual, triple, or even quad-SLI configurations, where multiple GPUs tend to require more cooling.
Nvidia has accomplished this power-saving feat by making some significant changes with its Maxwell architecture. Just like Kepler, the Maxwell GM204 chip featured in the 980 is built on a 28nm process and is made up of an array of graphics processing clusters (GPCs), streaming multiprocessors (SMs), and memory controllers. Inside each GPC is a dedicated raster engine (which handles the conversion of vector graphics into pixels), along with four SMs, each containing 128 CUDA cores, a Polymorph Engine, and eight texture units. That adds up to a total of 2048 CUDA cores and 128 texture units.
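Those totals are just multiplication. Here's a quick sketch of the arithmetic, assuming four GPCs; the GPC count isn't stated above, so it's inferred here as the value that reproduces the quoted 2048-core total.

```python
# Per-unit figures quoted in the text above.
SMS_PER_GPC = 4
CUDA_CORES_PER_SM = 128
TEXTURE_UNITS_PER_SM = 8

# Assumption: the number of GPCs is not given in the text. Four is the
# value that reproduces the quoted 2048-core total.
GPC_COUNT = 4


def gm204_totals(gpcs=GPC_COUNT):
    """Return (streaming multiprocessors, CUDA cores, texture units)."""
    sms = gpcs * SMS_PER_GPC
    return sms, sms * CUDA_CORES_PER_SM, sms * TEXTURE_UNITS_PER_SM


sms, cores, texture_units = gm204_totals()
print(f"{sms} SMs, {cores} CUDA cores, {texture_units} texture units")
```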
| GPU | GTX 680 (Kepler) | GTX 780 Ti (Kepler) | GTX 980 (Maxwell) |
| Base Clock | 1006 MHz | 837 MHz | 1126 MHz |
| GPU Boost Clock | 1058 MHz | 876 MHz | 1216 MHz |
| Memory Clock | 6000 MHz | 7000 MHz | 7000 MHz |
| Memory Bandwidth | 192 GB/sec | 336 GB/sec | 224 GB/sec |
| Memory Bus Width | 256-bit | 384-bit | 256-bit |
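The bandwidth figures for the two 256-bit cards in the table follow directly from bus width and memory clock. A minimal sketch, assuming the "Memory Clock" row is the effective data rate (transfers per second per pin) rather than the physical clock; the function name is made up for illustration.

```python
def memory_bandwidth_gb_per_sec(bus_width_bits, effective_clock_mhz):
    """Peak bandwidth = bytes moved per transfer * transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_sec = effective_clock_mhz * 1e6
    return bytes_per_transfer * transfers_per_sec / 1e9


# Bus width (bits) and memory clock (MHz) taken from the table above.
for name, bus_bits, clock_mhz in [("GTX 680", 256, 6000), ("GTX 980", 256, 7000)]:
    bandwidth = memory_bandwidth_gb_per_sec(bus_bits, clock_mhz)
    print(f"{name}: {bandwidth:.0f} GB/sec")
```

A wider bus or a faster memory clock both raise the result, which is why the narrower bus on the 980 costs it bandwidth against the 384-bit cards.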
The eagle-eyed among you may have noticed that the 980's 2048 CUDA cores and 128 texture units are a significant decrease from the 2880 CUDA cores and 240 texture units of the 780 Ti, or even the 2304 CUDA cores and 192 texture units of the 780, although they're a significant boost over the 680 many potential 980 customers will be upgrading from. Memory bandwidth has taken a hit too, with the 980 featuring a 256-bit memory bus compared to the 384-bit bus of the 780 Ti and 780. But why compare GM204 to the fully unlocked GK110 chips of the 780 Ti and 780? The 980 matches and usually bests the performance of the 780 Ti, and yet it does so using far less power. That's a mighty impressive feat of engineering, particularly given the 980 hasn't relied on a new production process for its power-saving prowess.
Those power savings come from a set of underlying improvements to the architecture, most notably giving each of the four warp schedulers (which distribute threads/work across the GPU) its own pool of CUDA cores and execution resources, rather than having the schedulers share them. This--along with a larger 2MB cache that Nvidia says results in fewer requests to the GPU's DRAM, and improved memory compression techniques--has resulted in a highly power-efficient architecture. There is a performance hit to this new setup, which has largely been negated by the 980's much higher base clock of 1126 MHz, and a boost clock of 1216 MHz.
Design and Features
Housing all that silicon goodness (in the reference design at least) is Nvidia's excellent and rather attractive aluminum cooler. Whether manufacturing partners will adopt it, though, remains to be seen. It's certainly not the cheapest cooler to produce, and with such a low TDP on the card, partners could get away with being a little thriftier. Still, let's hope some adopt it, because it remains a fine performer, particularly for a blower-style design. And, with only the 980's low TDP to deal with, it's very quiet and cool, even under load.
On the output side, the reference board comes with one DVI port, one HDMI 2.0-compliant port, and three full-size DisplayPort outputs. Power comes from two 6-pin inputs. The GTX 980 will fully support DirectX 12 when it's released, although Nvidia has announced that all Fermi, Kepler, and Maxwell GPUs will support the API, so don't feel you need to buy a new GPU just for DX12.
There's also support for Nvidia's G-Sync technology, which syncs the refresh rate of the monitor to the frame rate of your game. If you've not had a chance to check it out yet, I thoroughly recommend you head down to a shop that's got one of the coveted G-Sync monitors in stock and give it a look. No matter how much your frame rate bounces around (within reason: anything less than 30fps is still going to look a bit ropey), games look incredibly smooth, without any of the tearing and artifacts you might see from a normal monitor, and without the input lag you'd get from using V-sync.
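To see why that matters, here's a rough timing sketch. It assumes a fixed 60Hz monitor for the V-sync case and a variable-refresh monitor that can show every frame the moment it's ready; the render times are made up, and the model ignores the upper and lower refresh limits a real G-Sync monitor has.

```python
import math

REFRESH_MS = 1000 / 60  # assumption: a fixed-refresh 60Hz monitor


def vsync_display_times(render_times_ms):
    """Fixed refresh with V-sync: a finished frame waits for the next tick."""
    shown, t = [], 0.0
    for render_ms in render_times_ms:
        t += render_ms                              # frame finishes rendering
        t = math.ceil(t / REFRESH_MS) * REFRESH_MS  # held until the next refresh
        shown.append(t)
    return shown


def variable_refresh_display_times(render_times_ms):
    """Variable refresh: the monitor refreshes as soon as a frame is ready."""
    shown, t = [], 0.0
    for render_ms in render_times_ms:
        t += render_ms
        shown.append(t)
    return shown


def frame_intervals(display_times):
    """Gap between consecutive frames reaching the screen, in ms."""
    previous = [0.0] + display_times[:-1]
    return [round(now - before, 1) for before, now in zip(previous, display_times)]


render_times = [14, 20, 25, 18]  # hypothetical per-frame render times in ms
print("V-sync:  ", frame_intervals(vsync_display_times(render_times)))
print("Variable:", frame_intervals(variable_refresh_display_times(render_times)))
```

With V-sync, any frame that misses a refresh tick is held back a whole extra tick, which is where the judder and added lag come from. With variable refresh the on-screen intervals simply track the render times.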
Backing up the power-efficient hardware is a suite of additions to Nvidia's GameWorks middleware, and its GeForce Experience software for end-users. The most interesting, if not the most practical, is Nvidia's new Voxel Global Illumination technology, or VXGI for short. In essence, it picks up where Epic's now defunct voxel cone ray tracing technology left off, by trying to create a realistic lighting model without giving the GPU too much work to do. As Tony Tamasi, one of Nvidia's own VPs, told me, Epic's implementation "was very expensive" for the GPU, but Nvidia thinks it's got it right with VXGI.
It works by essentially translating a scene into voxels, and then computing how much light those voxels emit either directly or indirectly by bouncing light. VXGI then approximates the effect of secondary rays--the light that a light source emits and that then bounces off surfaces--by using cone tracing. This way, rather than calculating the thousands of rays that would normally be needed, they're bunched up into a number of cones that approximate them.
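For the curious, the general idea behind voxel cone tracing can be sketched in a few lines. This is a toy illustration of the technique as described above, not Nvidia's VXGI code: the grid layout, the function names, and the stepping rule are all assumptions made for the example. The scene is a small voxel grid, coarser copies of it are pre-averaged, and each cone samples coarser and coarser copies as it widens.

```python
import math


def _avg_block(fine, x, y, z):
    """Average the 2x2x2 block of fine voxels under one coarse voxel."""
    cells = [fine[2 * x + dx][2 * y + dy][2 * z + dz]
             for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)]
    return (sum(c[0] for c in cells) / 8, sum(c[1] for c in cells) / 8)


def build_mips(grid):
    """grid: N x N x N nested lists of (light, opacity), N a power of two.
    Returns the grid plus progressively coarser, pre-averaged copies."""
    mips = [grid]
    while len(mips[-1]) > 1:
        fine, n = mips[-1], len(mips[-1]) // 2
        mips.append([[[_avg_block(fine, x, y, z) for z in range(n)]
                      for y in range(n)] for x in range(n)])
    return mips


def trace_cone(mips, origin, direction, half_angle_rad, max_dist):
    """March one cone front to back, sampling coarser voxels as it widens."""
    light, occlusion = 0.0, 0.0
    dist = 1.0  # start one voxel out so the cone skips its own origin voxel
    while dist < max_dist and occlusion < 0.99:
        diameter = max(1.0, 2.0 * dist * math.tan(half_angle_rad))
        level = min(len(mips) - 1, int(math.log2(diameter)))
        size = 2 ** level  # edge length of one voxel at this level
        idx = [int((o + d * dist) // size) for o, d in zip(origin, direction)]
        n = len(mips[level])
        if any(i < 0 or i >= n for i in idx):
            break  # the cone has left the voxelised scene
        voxel_light, opacity = mips[level][idx[0]][idx[1]][idx[2]]
        light += (1.0 - occlusion) * voxel_light  # dimmed by what's in front
        occlusion += (1.0 - occlusion) * opacity
        dist += diameter * 0.5  # wider cone, bigger steps
    return light


def gather(mips, origin, directions, half_angle_rad, max_dist):
    """Estimate bounced light from a handful of cones instead of many rays."""
    cones = [trace_cone(mips, origin, d, half_angle_rad, max_dist) for d in directions]
    return sum(cones) / len(cones)


N = 8
grid = [[[(0.0, 0.0) for _ in range(N)] for _ in range(N)] for _ in range(N)]
grid[4][4][6] = (1.0, 1.0)  # one lit, fully opaque voxel in an empty scene
mips = build_mips(grid)

start = (4.5, 4.5, 0.5)
toward = trace_cone(mips, start, (0.0, 0.0, 1.0), 0.3, 8.0)
away = trace_cone(mips, start, (0.0, 0.0, -1.0), 0.3, 8.0)
print("cone toward the light:", toward)
print("cone away from it:   ", away)
print("two-cone estimate:   ", gather(mips, start, [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)], 0.3, 8.0))
```

The saving comes from the step size growing with the cone: each cone takes a handful of samples from pre-averaged voxels, where a ray-based approach would need many individual rays to cover the same spread.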
It creates some great realistic-looking lighting effects, with less computational overhead than Epic's solution. Indeed, support for VXGI is coming to third-party engines like Unreal and CryEngine, but the jury's still out on whether the likes of PS4 and Xbox One have the computational chops to make it work. With consoles still the lead platform for many games, don't expect to see much of VXGI unless the consoles can take the strain.
Other features include Dynamic Super Resolution, which essentially renders a game at 4K resolution and downsamples it to a 1080p display using a 13-tap Gaussian filter. The feature works with most games thanks to its integration into GFE, and the results are good, although they still don't match a native 4K experience; there's also a similar performance hit to running the game at native 4K.
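The principle is easy to sketch: render more pixels than the display has, then blend neighbouring pixels down with Gaussian weights. The example below halves a grayscale image on each axis (the same 2x-per-axis ratio as 4K to 1080p). The 4x4 footprint and the sigma value are assumptions chosen for illustration, not Nvidia's actual 13-tap layout.

```python
import math


def gaussian_downsample_2x(image, sigma=1.0):
    """Halve a grayscale image (list of rows) on each axis, weighting
    nearby source pixels with a Gaussian centred on each output pixel."""
    src_h, src_w = len(image), len(image[0])
    out = []
    for oy in range(src_h // 2):
        row = []
        for ox in range(src_w // 2):
            # Centre of the output pixel, measured in source-pixel indices.
            cx, cy = 2 * ox + 0.5, 2 * oy + 0.5
            total = weight_sum = 0.0
            # Assumed 4x4 footprint around that centre, clipped at the edges.
            for sy in range(2 * oy - 1, 2 * oy + 3):
                for sx in range(2 * ox - 1, 2 * ox + 3):
                    if 0 <= sx < src_w and 0 <= sy < src_h:
                        dist_sq = (sx - cx) ** 2 + (sy - cy) ** 2
                        w = math.exp(-dist_sq / (2 * sigma ** 2))
                        total += w * image[sy][sx]
                        weight_sum += w
            row.append(total / weight_sum)  # normalise so brightness is kept
        out.append(row)
    return out


# A hard vertical edge in the "rendered" image: black on the left, white on the right.
rendered = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0] for _ in range(4)]
for row in gaussian_downsample_2x(rendered):
    print([round(v, 2) for v in row])
```

The output pixel that straddles the edge lands on an in-between grey rather than snapping to black or white, which is the smoothing effect you see on screen.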
Finally, there's MFAA, which is a new type of anti-aliasing that promises the quality of 4XMSAA at the performance cost of 2XMSAA by using alternating AA sample patterns. Unfortunately, we couldn't test the feature out for launch, but it looked promising in demos, and Nvidia says support is coming to a range of games later in the year.
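A toy example shows why alternating patterns can work. It assumes a static scene, a single pixel crossed by a vertical edge, and a straight average of two consecutive frames; the sample positions are invented for illustration and aren't Nvidia's actual patterns or blend filter.

```python
# Sub-pixel sample positions (x, y) inside a unit pixel. Illustrative only.
PATTERN_EVEN = [(0.125, 0.375), (0.625, 0.875)]  # used on even frames
PATTERN_ODD = [(0.375, 0.125), (0.875, 0.625)]   # used on odd frames


def coverage(samples, edge_x):
    """Fraction of samples that fall left of a vertical edge at edge_x."""
    return sum(1 for x, _ in samples if x < edge_x) / len(samples)


def mfaa_blend(edge_x):
    """Average two consecutive 2-sample frames that use different patterns."""
    return (coverage(PATTERN_EVEN, edge_x) + coverage(PATTERN_ODD, edge_x)) / 2


edge_x = 0.3  # the edge covers the left 30% of the pixel
print("2 samples, even frame:", coverage(PATTERN_EVEN, edge_x))
print("2 samples, odd frame: ", coverage(PATTERN_ODD, edge_x))
print("blended over 2 frames:", mfaa_blend(edge_x))
print("4 samples, one frame: ", coverage(PATTERN_EVEN + PATTERN_ODD, edge_x))
```

Each frame only pays for two samples, but the blend of the two frames matches what four samples would give in a single frame. Once the scene moves, the two frames no longer see the same edge, which is the hard part a real implementation has to handle.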
Granted, our current test rig isn't as sprightly as it used to be, but it's a good indicator of the kind of hardware still used in many gaming PCs today. Powering it all is an Intel Core i5-3570K processor overclocked to 4.2GHz, an Intel Z77 DZ77GA-70K motherboard, 16GB of 1866 MHz Corsair Dominator GT RAM, a 120GB Corsair Force LS SSD, and a Corsair HX 850 PSU. And the results? Very impressive, particularly when you consider how little power it's pulling down compared to the competition.
| GPU | Ultra @1080p, 8XAA FPS | Ultra @1440p, 8XAA, Extreme Tessellation FPS | Ultra @4K, 8XAA, Extreme Tessellation FPS |
| GTX 780 Ti | 55 | 35 | 21 |
| GPU | Ultra @1080p, TressFX, FXAA FPS | Ultra @1440p, TressFX, FXAA FPS | Ultra @4K, TressFX, No AA FPS |
| GTX 780 Ti | 74 | 49 | 27 |
Metro: Last Light
| GPU | Ultra @1080p, Tessellation Normal, 2XSSAA, Advanced PhysX Off FPS | Ultra @1440p, Tessellation Normal, 2XSSAA, Advanced PhysX Off FPS | Ultra @4K, Tessellation Normal, No AA, Advanced PhysX Off FPS |
| GTX 780 Ti | 77 | 47 | 41 |
| GPU | Ultra @1080p, 2XMSAA, HBAO FPS | Ultra @1440p, 2XMSAA, HBAO FPS | Ultra @4K, No AA, HBAO FPS |
| GTX 780 Ti | 82 | 60 | 38 |
| GPU | Very High @1080p, 2XMSAA FPS | Very High @1440p, 2XMSAA FPS | Very High @4K, No AA FPS |
| GTX 780 Ti | 54 | 33 | 19 |
| GPU | Ultra @1080p, AO, AA FPS | Ultra @1440p, AO, AA FPS | Ultra @4K, AO, AA FPS |
| GTX 780 Ti | 134 | 92 | 49 |
With the exception of the Unigine Valley benchmark at 4K, the GTX 980 bested the 780 Ti, or at least matched it, while sailing past the frame rates of the GTX 780 and AMD's R9 290X--and it did so while remaining very quiet in use, and at its peak boost clock. There's definitely plenty of headroom in there for those inclined to overclock it; Nvidia reckons 1400 MHz on the boost clock is easily achievable.
The GTX 980 puts on an impressive show, but it doesn't come cheap. The 980 is set to retail for $549 in the US, and £429 in the UK. That's quite a bit cheaper than the $700 launch price of the 780 Ti and the $649 launch price of the 780, but both cards go for quite a bit less money these days. While the 980 costs substantially more than an AMD R9 290X in the UK, in the US at least, it's around the same price. Frankly, the 980 is a superior product to the R9 290X, and if you've got the choice, it's the GPU to go for. It's fast, cool, elegantly designed, and as quiet as you like under load.
If you're interested in Maxwell, but don't fancy splashing so much cash, another card based on the same Maxwell GM204 chip as the 980 is the 970. It features fewer CUDA cores and a slower clock speed, but will be available for a more palatable $329 in the US, and £259 in the UK. We'll have a review of that card up for you soon. The likes of the GTX 780 Ti, 780, and 770 are being discontinued, so you might be able to pick up a bargain as retailers clear stock. I'd take the 980 in a heartbeat if I were building a new system, though.
The GTX 980 might not be a generational leap in frame-pushing performance, but it's certainly a generational leap in power efficiency. AMD might be making some great GPUs like the R9 295X2, and great value GPUs like the R9 280, but they lack flexibility thanks to their hefty power requirements. The GTX 980 is just as at home in a small form factor mini-ITX PC as it is in a great tower, and without any noticeable drop in performance. And just think, if Nvidia can do this with a 165W TDP, then if it ups the power with future cards, we might just get a generational leap in performance soon too.