The Adreno 225 GPU

Qualcomm has historically been pretty silent about its GPU architectures. You'll notice that specific details of Adreno GPU execution resources have been absent from most of our SoC comparisons. Starting with MSM8960, however, that is beginning to change.

The MSM8960 uses a current generation Adreno GPU with a couple of changes. Qualcomm calls this GPU the Adreno 225, a follow-on to Adreno 220. Subsequent Krait designs will use Adreno 3xx GPUs based on a brand new architecture.

As we discussed in our Samsung Galaxy S 2 review, Qualcomm's Adreno architecture is a tile-based immediate mode renderer with early-z rejection. By Qualcomm's own admission, Adreno is somewhere in the middle of the rendering spectrum between IMRs and Imagination Technologies' TBDR architectures. One key difference is that Adreno's tiling isn't as fine-grained as IMG's.

Architecturally the Adreno 225 and 220 are identical. Adreno 2xx is a DX9-class unified shader design. There's a ton of compute on-board with eight 4-wide vector units and eight scalar units. Each 4-wide vector unit is capable of a maximum of 8 MADs per clock, while each scalar unit is similarly capable of 2 MADs per clock. That works out to 160 floating point operations per clock, or 32 GFLOPS at 200MHz.

Update: Qualcomm has clarified the capabilities of its 4-wide vector ALUs. Similar to the PowerVR SGX 543, each 4-wide vector ALU is capable of four MADs per clock (one per component). The scalar units cannot be combined to perform any additional MADs. They are helpful, but we haven't been tracking scalar units in this table (IMG has something similar), so we've excluded them for now.

Mobile SoC GPU Comparison

| | Adreno 225 | PowerVR SGX 540 | PowerVR SGX 543 | PowerVR SGX 543MP2 | Mali-400 MP4 | GeForce ULP | Kal-El GeForce |
|---|---|---|---|---|---|---|---|
| SIMD Name | - | USSE | USSE2 | USSE2 | Core | Core | Core |
| # of SIMDs | 8 | 4 | 4 | 8 | 4 + 1 | 8 | 12 |
| MADs per SIMD | 4 | 2 | 4 | 4 | 4 / 2 | 1 | ? |
| Total MADs | 32 | 8 | 16 | 32 | 18 | 8 | ? |
| GFLOPS @ 200MHz | 12.8 | 3.2 | 6.4 | 12.8 | 7.2 | 3.2 | ? |
| GFLOPS @ 300MHz | 19.2 | 4.8 | 9.6 | 19.2 | 10.8 | 4.8 | ? |
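
The GFLOPS rows follow directly from the MAD counts. Assuming each MAD counts as two floating point operations (one multiply, one add), theoretical peak throughput is total MADs x 2 x clock. Here is a minimal Python sketch of that arithmetic using the figures from the table. The function and variable names are illustrative and don't come from any vendor tool, and the results are paper peaks, not measured performance.

```python
# Peak GFLOPS from MAD counts, assuming 1 MAD = 2 FLOPs (multiply + add).
# Total MADs per clock are copied from the comparison table; Kal-El is
# omitted because its MAD count is unknown. All names are illustrative.

TOTAL_MADS = {
    "Adreno 225": 32,
    "PowerVR SGX 540": 8,
    "PowerVR SGX 543": 16,
    "PowerVR SGX 543MP2": 32,
    "Mali-400 MP4": 18,
    "GeForce ULP": 8,
}


def peak_gflops(total_mads: int, clock_mhz: float) -> float:
    """Theoretical peak: MADs per clock * 2 FLOPs per MAD * clock in GHz."""
    return total_mads * 2 * clock_mhz / 1000.0


if __name__ == "__main__":
    for gpu, mads in TOTAL_MADS.items():
        at_200 = peak_gflops(mads, 200)
        at_300 = peak_gflops(mads, 300)
        print(f"{gpu:20s} {at_200:5.1f} @ 200MHz  {at_300:5.1f} @ 300MHz")
```

Normalizing every GPU to the same clock, as the table does, isolates the width of the shader array from whatever frequency a given SoC actually ships at.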

Looking at the table above, you'll see that Adreno 225 offers the same amount of peak compute as IMG's PowerVR SGX 543MP2. However, as we've already seen in our tests, Adreno 220 isn't anywhere near as quick.

Shader compiler efficiency and data requirements to actually populate a Vec4+1 array are both unknowns, and I suspect both significantly gate overall Adreno performance. There's also the fact that the Adreno 22x family only has two TMUs compared to four in the 543MP2, limiting texturing performance. Combine that with the fact that most Adreno 220 GPUs have been designed into single-channel memory controller systems and you've got a recipe for tons of compute potential limited by other bottlenecks.

With Adreno 225, Qualcomm improves performance along two vectors, the first being clock speed. While Adreno 220 (used in the MSM8660) ran at 266MHz, Adreno 225 runs at 400MHz thanks to 28nm. Secondly, Qualcomm tells us Adreno 225 is accompanied by "significant driver improvements". Keeping in mind the sheer amount of compute potential of the Adreno 22x family, it only makes sense that driver improvements could unlock a lot of performance. Qualcomm expects the 225 to be 50% faster than the outgoing 220.
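
As a rough sanity check on that 50% figure, the clock increase alone works out to about 1.5x. The sketch below computes the ratio and the theoretical peak at each clock, assuming the 32 vector MADs per clock from Qualcomm's clarification and 2 FLOPs per MAD. The constant and function names are illustrative, and these are paper numbers rather than measurements.

```python
# Clock scaling from Adreno 220 to Adreno 225, using the clocks quoted above.
# Assumes 32 vector MADs per clock (the two GPUs are architecturally
# identical) and 2 FLOPs per MAD. Theoretical peak only.

ADRENO_220_MHZ = 266
ADRENO_225_MHZ = 400
VECTOR_MADS_PER_CLOCK = 32


def peak_gflops(mads_per_clock: int, clock_mhz: float) -> float:
    """Theoretical peak: MADs per clock * 2 FLOPs per MAD * clock in GHz."""
    return mads_per_clock * 2 * clock_mhz / 1000.0


clock_ratio = ADRENO_225_MHZ / ADRENO_220_MHZ
peak_220 = peak_gflops(VECTOR_MADS_PER_CLOCK, ADRENO_220_MHZ)
peak_225 = peak_gflops(VECTOR_MADS_PER_CLOCK, ADRENO_225_MHZ)

print(f"Clock ratio:     {clock_ratio:.2f}x")
print(f"Adreno 220 peak: {peak_220:.1f} GFLOPS @ {ADRENO_220_MHZ}MHz")
print(f"Adreno 225 peak: {peak_225:.1f} GFLOPS @ {ADRENO_225_MHZ}MHz")
```

Since the shader array is unchanged, peak compute scales linearly with clock. How any driver gains factor into Qualcomm's estimate isn't something this arithmetic can show.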

Qualcomm claims that MSM8960 will be able to outperform Apple's A5 in GLBenchmark 2.x at qHD resolutions. We'll have to wait until we have shipping devices in hand to really put that claim to the test, but if true it's good news for Krait as the A5 continues to be the high end benchmark for mobile GPU performance.

While Adreno 225 is only Direct3D feature level 9_3 compliant, Qualcomm insisted that when the time is right it will have a D3D11 capable GPU using its own IP - putting to rest rumors of Qualcomm looking to license a third party GPU in order to be competitive in Windows 8 designs. Although Qualcomm committed to delivering D3D11 support, it didn't commit to a timeframe.


108 Comments


  • skydrome1 - Monday, October 10, 2011

    I'm pretty sure they said it would sample by the end of this year and ship late 2012. If they were to delay it any more, they would be in serious trouble. ST Ericsson has quite a lot against them recently, and if they can't keep to their promises, TI is going to beat them quite badly. I'd estimate the Rogue to show up in an OMAP 5 in H2 2012 or H1 2013.

    All in all I'm just really excited by the PowerVR Rogue. Seeing the specifications of the Nova A9600 and what the Rogue can do is quite amazing. It's almost on par with the PS3.

    Could an article on that be done once information is available?

    I would love to have a portable gaming console :)
  • Haserath - Saturday, October 8, 2011

    Metafor is right about the curve having to do with the process. His explanation kinda makes it seem like a temp increase causes the power increase though. It's the power increase that causes the temp increase, and "G" transistors are designed to handle more power without wasted heat (temperature increase) compared to "LP" transistors. There's also a second reason why 28nm is hotter than 40nm.

    If you have a certain amount of heat energy being produced at a certain power level, the 40nm transistors will be a certain temperature.

    Now take that same amount of heat energy being produced, and shrink the transistors to half their size. This increases their temperature within the same power envelope.

    Of course they labeled a thermal limit on the power side, because the holder of whatever phone this chip goes into is going to feel the heat coming from the chip due to how much power it's using (how much heat energy is put out), not just due to the temperature of the transistors.
  • ViRGE - Saturday, October 8, 2011

    The graph is conceptually correct. While it's true that consuming more power produces more heat, the inverse is also true. The temperature of a transistor affects its leakage characteristics because resistance increases with heat. So at higher temperatures a CPU is going to consume more power to maintain its performance, compared to the same CPU at a lower temperature.

    You're basically looking at the principles of a superconductor applied in reverse.
  • JohnWH - Saturday, October 8, 2011

    The number of MADs per 4-way SIMD is 4, not 8 as stated (plus 1 for the scalar channel), so total FLOPs per clock is (4+1) * 2 * 8 = 80, or 16 GFLOPS @ 200MHz and 24 GFLOPS @ 300MHz.
  • Zingam - Saturday, October 8, 2011

    According to this article we'll have to wait for at least another 3 years or maybe more until we get tablets with enough power and good battery life that would be actually useful.
    Yeah, maybe at 14nm and with tri-gate transistors sometime in 2016 we'll be able to enjoy true mobile computing all day long (at least 16 hours without a recharge).

    Yeah, progress is good but way too slow sometimes. Too bad, I was hoping for an ultracool and powerful e-book reader that delivers a more tablet-like experience than what is currently available.
  • dagamer34 - Saturday, October 8, 2011

    Define "useful". I'd argue that a lot of CPU cycles are wasted doing meaningless background tasks in apps that you can't see when it would be better to just pause and resume them later when the user brings them back into focus (aka Windows 8).
  • bengildenstein - Saturday, October 8, 2011

    I'm trying to post a relevant comment but it's being flagged as spam. Can anyone offer any insight into why this may be the case?
  • bengildenstein - Saturday, October 8, 2011

    The post centers around a SIGGRAPH 2011 talk that touches on Adreno 205's fragment shader performance.

    The gist is that the Adreno 205 (Xperia Play) showed faster performance with complex shaders than the SGX543MP2 (iPad 2).

    It seems I cannot post a link to the paper, but you can find it titled "Fast Mobile Shaders" at: aras-p [dot] info
  • Ryan Smith - Saturday, October 8, 2011

    The spam filter is pretty aggressive against links.

    http://www.aras-p.info/texts/files/FastMobileShade...
  • s44 - Saturday, October 8, 2011

    You sure it's the same Mali? I couldn't find it specified in any of Samsung's press releases.
