The Fiji GPU: Go Big or Go Home

Now that we’ve had a chance to take a look at the architecture backing Fiji, let’s talk about the Fiji GPU itself.

Fiji’s inclusion of High Bandwidth Memory (HBM) technology complicates the picture somewhat when talking about GPUs. Whereas past GPUs were defined by the GPU die itself and then the organic substrate package it sits on, the inclusion of HBM requires a third layer, the silicon interposer. The job of the interposer is to sit between the package and the GPU, serving as the layer that connects the on-package HBM memory stacks with the GPU. Essentially a very large chip without any expensive logic on it, the silicon interposer allows for finer, denser signal routing than organic packaging is capable of, making the ultra-wide 4096-bit HBM bus viable for the first time.
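To put the width of that bus in perspective, peak memory bandwidth is simply bus width times per-pin data rate. A minimal sketch of the arithmetic (the 1Gbps/pin HBM1 rate and the 512-bit, 5Gbps GDDR5 comparison point are taken from Fiji's and Hawaii's shipping configurations, and are assumptions not stated in the text above):

```python
# Peak memory bandwidth in GB/s: bus width (bits) x per-pin data rate (Gbps) / 8 bits per byte
def peak_bandwidth_gbs(bus_width_bits, gbps_per_pin):
    return bus_width_bits * gbps_per_pin / 8

# First-generation HBM: 4096-bit bus, 500MHz DDR -> 1Gbps per pin
hbm1 = peak_bandwidth_gbs(4096, 1.0)   # 512.0 GB/s
# Hawaii-style GDDR5 for comparison: 512-bit bus at 5Gbps per pin
gddr5 = peak_bandwidth_gbs(512, 5.0)   # 320.0 GB/s
```

The ultra-wide, low-clocked bus is the whole trade-off: eight times the width of Hawaii's bus at one-fifth the per-pin rate still comes out well ahead.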

We’ll get to HBM in detail in a bit, but it’s important to call out the impact of HBM and the interposer early, since they have a distinct impact on how Fiji was designed and what its capabilities are.

As for Fiji itself, Fiji is unlike any GPU AMD has built before, and not only due to the use of HBM. More than anything else, it’s simply huge: 596mm2, to be precise. As we mentioned in our introduction, AMD has traditionally shied away from big chips, even after the “small die” era ended, and for good reason. Big chips are expensive to develop and produce, take longer to bring to market, and yield worse than small chips (this being especially the case early on at 40nm). Altogether they’re riskier than smaller chips, and while there are times when they are necessary, AMD had never reached that point until now.

The end result is that for the first time since the unified shader era began, AMD has gone toe-to-toe with NVIDIA on die size. Fiji’s 596mm2 die size is just 5mm2 (<1%) smaller than NVIDIA’s GM200, and more notably still hits TSMC’s 28nm reticle limit. TSMC can’t build chips any bigger than this; Fiji is as big a chip as AMD can order.

AMD Big GPUs
GPU               Die Size   Native FP64 Rate
Fiji (GCN 1.2)    596mm2     1/16
Hawaii (GCN 1.1)  438mm2     1/2
Tahiti (GCN 1.0)  352mm2     1/4
Cayman (VLIW4)    389mm2     1/4
Cypress (VLIW5)   334mm2     1/5
RV790 (VLIW5)     282mm2     N/A

Looking at Fiji relative to AMD’s other big GPUs, it becomes very clear very quickly just how significant this change is for AMD. When Hawaii was released in 2013 at 438mm2, it was AMD’s biggest GPU ever. And yet Fiji dwarfs it, coming in 158mm2 (36%) larger. The fact that Fiji arrives in the latter half of the 28nm process’s lifetime means that such a large GPU is not nearly as risky now as it would have been in 2011/2012 (NVIDIA surely took some licks internally on GK110), but still, nothing else we can show you today can really sell the significance of Fiji to AMD as much as the die size can.

And the fun doesn’t stop there. Along with producing the biggest die they could, AMD has also more or less gone in the same direction as NVIDIA and Maxwell in the case of Fiji, building what is unambiguously the most gaming/FP32-centric GPU the company could build. With GCN supporting power-of-two FP64 rates between 1/2 and 1/16, AMD has gone for the bare minimum in FP64 performance that their architecture allows, leading to a 1/16 FP64 rate on Fiji. This is a significant departure from Hawaii, which implemented native support for 1/2 rate, and on consumer parts offered a handicapped 1/8 rate. Fiji will not be an FP64 powerhouse – its 4GB of VRAM is already perhaps too large a handicap for the HPC market – so instead we get AMD’s best FP32 GPU going against NVIDIA’s best FP32 GPU.
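To see what those rates mean in practice, theoretical shader throughput is just ALU count × 2 FLOPs (one FMA) × clock, with FP64 divided down by the rate. A quick sketch, assuming Fiji's shipping configuration of 4096 stream processors at 1050MHz and Hawaii's 2816 at 1000MHz (figures not stated above):

```python
# Theoretical throughput: each ALU retires one fused multiply-add (2 FLOPs) per clock
def tflops_fp32(alus, clock_ghz):
    return alus * 2 * clock_ghz / 1000.0

# GCN runs FP64 at a power-of-two fraction of the FP32 rate
def tflops_fp64(alus, clock_ghz, rate_divisor):
    return tflops_fp32(alus, clock_ghz) / rate_divisor

fiji_fp32   = tflops_fp32(4096, 1.05)       # ~8.6 TFLOPS FP32
fiji_fp64   = tflops_fp64(4096, 1.05, 16)   # ~0.54 TFLOPS at Fiji's 1/16 rate
hawaii_fp64 = tflops_fp64(2816, 1.0, 8)     # ~0.70 TFLOPS at Hawaii's consumer 1/8 rate
```

Note the consequence: even at its handicapped consumer 1/8 rate, Hawaii out-muscles Fiji in FP64 despite its much smaller shader array.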

AMD’s final ace up their sleeve on die size is HBM. Along with its bandwidth and power benefits, HBM is also much simpler to implement, requiring less GPU space for PHYs than GDDR5 does. This is in part because HBM stacks have their own logic layer, distributing some of the logic onto each stack, and in part because the signaling logic that remains doesn’t have to be nearly as complex, since the frequencies are so much lower. The PHYs for a 4096-bit HBM bus still take up a fair bit of space – though AMD won’t tell us how much – but it’s notably less than the amount of space AMD was losing to Hawaii’s GDDR5 memory controllers.

The end result is that not only has AMD built their biggest GPU ever, but they have done virtually everything they can to maximize the amount of die space they get to allocate to FP32 and rendering resources. Simply put, AMD has never reached so high and aimed for parity with NVIDIA in this manner.

Ultimately this puts Fiji’s transistor count at 8.9 billion, even more than the 8 billion found in NVIDIA’s GM200, and, as expected, significantly more than Hawaii’s 6.2 billion. Interestingly enough, on a relative basis this is almost exactly the same increase we saw with Hawaii: Fiji packs in 43.5% more transistors than Hawaii, and Hawaii packed in 43.9% more than Tahiti. So going by transistors alone, Fiji is very much to Hawaii what Hawaii was to Tahiti.
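Those ratios fall straight out of the quoted counts. A quick check (Tahiti's 4.31 billion transistors is taken from its launch specifications, as it isn't listed above):

```python
# Relative growth from one generation to the next, as a percentage
def pct_increase(new, old):
    return (new - old) / old * 100

fiji_vs_hawaii   = pct_increase(8.9, 6.2)    # transistors, billions: ~43.5%
hawaii_vs_tahiti = pct_increase(6.2, 4.31)   # ~43.9%
die_growth       = pct_increase(596, 438)    # Fiji's die vs Hawaii's, mm2: ~36.1%
```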

Finally, as large as the Fiji GPU is, the silicon interposer it sits on is even larger. The interposer measures 1011mm2, nearly twice the size of Fiji. Since Fiji and its HBM stacks need to fit on top of it, the interposer must be very large to do its job, and in the process it pushes its own limits. The actual interposer die is believed to exceed the reticle limit of the 65nm process AMD is using to have it built, and as a result the interposer is carefully constructed so that only the areas that need connectivity receive metal layers. This allows AMD to put down such a large interposer without actually needing a fab capable of reaching such a large reticle limit.

What’s interesting from a design perspective is that the interposer and everything on it is essentially the heart and soul of the GPU. There is plenty of power regulation circuitry on the organic package and even more on the board itself, but within the 1011mm2 floorplan of the interposer, all of Fiji’s logic and memory is located. By mobile standards it’s very nearly an SoC in and of itself; it needs little more than external power and I/O to operate.


458 Comments


  • chizow - Friday, July 3, 2015 - link

    No silverblue, you contributed just as much to the unrealistic expectations during the Rebrandeon run-up along with unrealistic expectations for HBM and Fury X. But in the end it doesn't really matter, AMD failed to meet their goal even though Nvidia handed it to them on a silver platter by launching the 980Ti 3 weeks ahead of AMD.

    And spate of returns for the 970 memory fiasco? Have any proof of that? Because I have plenty of proof that shows Nvidia rode the strength of the 970 to record revenues, near-record market share, and a 3:1 ownership ratio on Steam compared to the entire R9 200 series.

    If Fury X is an experiment as you claim, it was certainly a bigger failure than what was documented here at a time AMD could least afford it, being the only new GPU they will be launching in 2015 to combat Nvidia's onslaught of Maxwell chips.
  • mapesdhs - Friday, July 3, 2015 - link

    A lot of the 970 hate reminded me of the way some people carried on dumping on OCZ long after any trace of their old issues were remotely relevant. Sites did say that the 970 RAM issue made no difference to how it behaved in games, but of course people choose to believe what suits them; I even read comments from some saying they wanted it to be all deliberate as that would more closely match their existing biased opinions of NVIDIA.

    I would have loved to have seen the Fury X be a proper rival to the 980 Ti, the market needs the competition, but AMD has goofed on this one. It's not as big a fiasco as BD, but it's bad enough given the end goal is to make money and further the tech.

    Fan boys will buy the card of course, but they'll never post honestly about CF issues, build issues, VRAM limits, etc.

    It's not as if AMD didn't know NV could chuck out a 6GB card, remember NV was originally going to do that with the 780 Ti but didn't bother in the end because they didn't have to. Releasing the 980 Ti before the Fury X was very clever, it completely took the wind out of AMD's sails. I was expecting it to be at least level with a 980 Ti if it didn't have a price advantage, but it loses on all counts (for all the 4K hype, 1440 is far more relevant atm).
  • silverblue - Friday, July 3, 2015 - link

    How about you present proof of such indiscretions? I believe my words contained a heavy dose of IF and WAIT AND SEE. Speculation instead of presenting facts when none existed at the time. Didn't you say Tahiti was going to be a part of the 300 series when in fact it never was? I also don't recall saying Fury X would do this or do that, so the burden of proof is indeed upon you.

    Returns?
    http://www.techpowerup.com/209409/perfectly-functi...
    http://www.kitguru.net/components/graphic-cards/an...
    http://www.guru3d.com/news-story/return-rates-less...

    I can provide more if you like. The number of returns wasn't exactly a big issue for NVIDIA, but it still happened. A minor factor which may have resulted in a low number of returns was the readiness for firms such as Amazon and NewEgg to offer 20-30% rebates, though I imagine that wasn't a common occurrence.

    Fury X isn't a failure as an experiment, the idea was to integrate a brand new memory architecture into a GPU and that worked, thus paving the way for more cards to incorporate it or something similar in the near future (and showing NVIDIA that they can go ahead with their plans to do the exact same thing). The only failure is marketing it as a 4K card when it clearly isn't. An 8GB card would've been ideal and I'd imagine that the next flagship will correct that, but once the cost drops, throwing 2GB HBM at a mid-range card or an APU could be feasible.
  • chizow - Sunday, July 5, 2015 - link

    I've already posted the links and you clearly state you don't think AMD would Rebrandeon their entire 300 desktop retail series when they clearly did. I'm sure I didn't say anything about Tahiti being rebranded either, since it was obvious Tonga was being rebranded and basically the same thing as Tahiti, but you were clearly skeptical the x90 part would just be a Hawaii rebrand when indeed that became the case.

    And lmao at your links, you do realize that just corroborates my point the "spate of 970 returns" you claimed was a non-issue right? 5% is within range of typical RMA rates so to claim Nvidia experienced higher than normal return rates due to the 3.5GB memory fiasco is nonsense plain and simple.

    And how isn't Fury X a failed experiment when AMD clearly had to make a number of concessions to accommodate HBM, which ultimately led to 4GB limitations on their flagship part that is meant to go up against 6GB and 12GB and even falls short of its own 8GB rebranded siblings?
  • silverblue - Monday, July 6, 2015 - link

    No, this is what was said in the comments for http://www.anandtech.com/comments/9241/amd-announc...

    You: "And what if the desktop line-up follows suit? We can ignore all of them too? No, not a fanboy at all, defend/deflect at all costs!"
    Myself: "What if?

    Nobody knows yet. Patience, grasshopper."

    Dated 15th May. You'll note that this was a month prior to the launch date of the 300 series. Now, unless you had insider information, there wasn't actually any proof of what the 300 series was at that time. You'll also note the "Nobody knows yet." in my post in response to yours. That is an accurate reflection of the situation at that time. I think you're going to need to point out the exact statement that I made. I did say that I expected the 380 to be the 290, which was indeed incorrect, but again without inside information, and without me stating that these would indeed be the retail products, there was no instance of me stating my opinions as fact. I think that should be clear.

    RMA return rates: https://www.pugetsystems.com/labs/articles/Video-C...

    Fury X may or may not seem like a failed experiment to you - I'm unsure as to what classifies as such in your eyes - but even with the extra RAM on its competitors, the gap between them and Fury X at 4K isn't exactly large, so... does Titan X need 12GB? I doubt it very much, and in my opinion it wouldn't have the horsepower to drive playable performance at that level.
  • chizow - Monday, July 6, 2015 - link

    There's plenty of other posts from you stating similar Silverblue, hinting at tweaks to silicon and GCN level when none of that actually happened. And there was actually plenty of additional proof besides what AMD already provided with their OEM and mobile rebrand stacks. The driver INFs I mentioned have always been a solid indicator of upcoming GPUs and they clearly pointed to a full stack of R300 Rebrandeons.

    As for RMA rates lol yep, 5% is well within expected RMA return rates, so "spate" is not only overstated, it's an inaccurate characterization when most 970 users would not notice or not care to return a card that still functions without issue to this day.

    And how do you know the gap between them isn't large? We've already seen numerous reports of lower min FPS, massive frame drops/stutters, and hitching on Fury X as it hits its VRAM limit. It's a gap that will only grow in newer games that use more VRAM or in multi-GPU solutions that haven't been tested yet that allow the end-user to crank up settings even higher. How do you know 12GB is or isn't needed if you haven't tested the hardware yourself? While 1x Titan X isn't enough to drive the settings that will exceed 6GB, 2x in SLI certainly is, and already I've seen a number of games such as AC: Unity, GTA5, and SoM use more than 6GB at just 1440p. I fully expect "next-gen" games to pressure VRAM even further.
  • 5150Joker - Thursday, July 2, 2015 - link

    If you visit the Anandtech forums, there's still a few AMD hardcore fanboys like Silverforce and RussianSensation making up excuses for Fury X and AMD. Those guys live in a fantasy land and honestly, the impact of Fury X's failure wouldn't have been as significant if stupid fanboys like the two I mentioned hadn't hyped Fury X so much.

    To AMD's credit, they did force NVIDIA to price 980 Ti at $650 and release it earlier, I guess that means something to those that wanted Titan X performance for $350 less. Unfortunately for them, their fanboys are more of a cancer than help.
  • chizow - Friday, July 3, 2015 - link

    Hahah yeah I don't visit the forums much anymore, mods tried getting all heavy-handed in moderation a few years back with some of the mods being the biggest AMD fanboys/trolls around. They also allowed daily random new accounts to accuse people like myself of shilling and when I retaliated, they again threatened action so yeah, simpler this way. :)

    I've seen some of RS's postings in the article comments sections though, he used to be a lot more even-keeled back then but yeah at some point his mindset turned into best bang for the buck (basically devolving into 2-gen old FS/FT prices) trumping anything new without considering the reality, what he advocates just isn't fast enough for those looking for an UPGRADE. I also got a big chuckle out of his claims 7970 is some kind of god card when it was literally the worst price:performance increase in the history of GPUs, causing this entire 28nm price escalation to begin with.

    But yeah, can't say I remember Silverforce, not surprising though they overhyped Fury X and the benefits of HBM to the moon, there's a handful of those out there and then they wonder why everyone is down on AMD after none of what they hoped/hyped for actually happens.
  • mapesdhs - Friday, July 3, 2015 - link

    I eventually obtained a couple of 7970s to bench; sure it was quick, but I was shocked how loud the cards were (despite having big aftermarket coolers, really no better than the equivalent 3GB 580s), and the CF issues were a nightmare.
  • D. Lister - Thursday, July 2, 2015 - link

    @chizow

    Personally I think the reason behind the current FX shortages is that Fury X was originally meant to be air-cooled, trouncing the 980 by 5-10% and priced at $650 - but then NV rather sneakily launched the Ti, a much more potent GPU compared to an air-cooled FX, at the same price, screwing up AMD's plan to launch at Computex. So to reach some performance parity at the given price point, AMD had to hurriedly put CLCs on some of the FXs and then OC the heck out of them (that's why the original "overclockers' dream" is now an OC nightmare - no more headroom left) and push their launch to E3.

    So I guess once AMD finishes respeccing their original air-cooled stock, supplies should gradually improve.
