Yesterday AMD revealed that in 2014 it would begin production of its first ARMv8-based 64-bit Opteron CPUs. At the time we didn't know what core AMD would use, but today ARM helped fill in that blank for us with two new 64-bit core announcements: the ARM Cortex-A57 and Cortex-A53.

You may have heard of ARM's Cortex-A57 under the codename Atlas, while A53 was referred to internally as Apollo. The two are 64-bit successors to the Cortex A15 and A7, respectively. Similar to their 32-bit counterparts, the A57 and A53 can be used independently or in a big.LITTLE configuration. As a recap, big.LITTLE uses a combination of big (read: power hungry, high performance) and little (read: low power, lower performance) ARM cores on a single SoC. 

By ensuring that both the big and little cores support the same ISA, the OS can dynamically swap the cores in and out of the scheduling pool depending on the workload. For example, when playing a game or browsing the web on a smartphone, a pair of A57s could be active, delivering great performance at a high power cost. On the other hand, while just navigating through your phone's UI or checking email, a pair of A53s could deliver adequate performance while saving a lot of power. A hypothetical SoC with two Cortex A57s and two Cortex A53s would still only appear to the OS as a dual-core system, but it would alternate between performance levels depending on workload.
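
To make the switching idea concrete, here is a minimal sketch of the hypothetical 2x A57 + 2x A53 SoC described above. The utilization thresholds, the hysteresis policy, and every name in it (`ClusterSwitcher`, `UP_THRESHOLD`, `DOWN_THRESHOLD`, `simulate`) are assumptions made for illustration; ARM has not described its switching logic in these terms.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
UP_THRESHOLD = 0.80    # move to the big cluster above this utilization
DOWN_THRESHOLD = 0.30  # move back to the little cluster below this


@dataclass
class ClusterSwitcher:
    """Toy model of a 2x A57 + 2x A53 SoC exposed as two logical cores."""
    active: str = "little"  # physical cluster currently backing the logical cores
    logical_cores: int = 2  # what the OS sees, whichever cluster is active

    def update(self, utilization: float) -> str:
        """Choose the cluster for the next interval from utilization in [0, 1]."""
        if self.active == "little" and utilization > UP_THRESHOLD:
            self.active = "big"
        elif self.active == "big" and utilization < DOWN_THRESHOLD:
            self.active = "little"
        return self.active


def simulate(loads):
    """Run a sequence of utilization samples through a fresh switcher."""
    switcher = ClusterSwitcher()
    return [switcher.update(u) for u in loads]


if __name__ == "__main__":
    # Idle UI, then a game, then moderate load, then idle again.
    print(simulate([0.1, 0.9, 0.5, 0.2]))
```

The gap between the two thresholds is a design choice in this sketch: it keeps a moderate load from bouncing the system between clusters on every sample. The number of logical cores never changes, which matches the point that the OS still sees a dual-core system.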

ARM's Cortex A57

Architecturally, the Cortex A57 is much like a tweaked Cortex A15 with 64-bit support. The CPU is still a 3-wide/3-issue machine with a 15+ stage pipeline. ARM has increased the width of the NEON execution units in the Cortex A57 (128 bits wide now?) and enabled support for IEEE-754 double-precision floating point. There have been some other minor pipeline enhancements as well. The end result is a performance increase of up to 20-30% over the Cortex A15 while running 32-bit code. Running 64-bit code, you'll see an additional performance advantage, as the 64-bit register file is far simpler than the 32-bit RF.

The Cortex A57 will support configurations of up to (and beyond) 16 cores for use in server environments. Based on ARM's presentation it looks like groups of four A57 cores will share a single L2 cache.


ARM's Cortex A53

Similarly, the Cortex A53 is a tweaked version of the Cortex A7 with 64-bit support. ARM didn't provide as many details here, other than to confirm that we're still looking at a simple, in-order architecture with an 8-stage pipeline. The A53 can be used in server environments as well, since it's ISA-compatible with the A57.

ARM claims that on the same process node (32nm) the Cortex A53 is able to deliver the same performance as a Cortex A9 at roughly 60% of the die area. The performance claims apply to both integer and floating point workloads. ARM tells me that it simply reduced a lot of the buffering and data structure sizes, while improving performance in more efficient ways. From looking at Apple's Swift it's very obvious that a lot can be done simply by improving the memory interface of ARM's Cortex A9. It's possible that ARM addressed that shortcoming while balancing out the gains by removing other performance enhancing elements of the core.
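
As a back-of-the-envelope check on what that claim implies, the sketch below turns "same performance at roughly 60% of the die area" into a performance-per-area ratio. The only inputs are ARM's two figures; the function name is made up for illustration, and the result is only as good as ARM's rounded numbers.

```python
def relative_perf_per_area(relative_perf: float, relative_area: float) -> float:
    """Performance density of one core relative to a baseline core."""
    return relative_perf / relative_area


# ARM's claim: A53 matches A9 performance (1.0x) at roughly 60% of the area.
a53_vs_a9 = relative_perf_per_area(relative_perf=1.0, relative_area=0.60)

if __name__ == "__main__":
    print(f"A53 performance per unit area vs. A9: {a53_vs_a9:.2f}x")  # 1.67x
```

In other words, if the claim holds, the A53 delivers roughly two thirds more performance per unit of die area than the A9 on the same process.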

Both CPU cores are able to run 32-bit and 64-bit ARM code, as well as a mix of both so long as the OS is 64-bit.

Completed Cortex A57 and A53 core designs will be delivered to partners (including AMD and Samsung) by the middle of next year. Silicon based on these cores should be ready by late 2013/early 2014, with production following 6-12 months after that. AMD claimed it would have an ARMv8-based Opteron in production in 2014, which seems possible (although aggressive) based on what ARM told me.

ARM expects the first designs to appear at 28nm and 20nm. There's an obvious path to 14nm as well.

It's interesting to note ARM's commitment to big.LITTLE as a strategy for pushing mobile SoC performance forward. I'm curious to see how the first A15/A7 designs work out. It's also good to see ARM not letting up on pushing its architectures forward.

117 Comments

  • name99 - Tuesday, October 30, 2012

    To understand the actual issue against Intel, look at what is being shown here. It is not a new CPU, it is a new architecture. Intel is still on the same µarch for Atom that they had five years ago --- all they've done since then is process improvements and moving more parts onto the CPU die.

    In that same time, ARM have been able to make substantial arch changes --- over say the last two years we've seen A4 to A5 to A6/Swift in the Apple space, along with A15 and now these early ARM64 designs.

    THIS is, and always has been, ARM's strength and Intel's weakness. Intel takes five to seven years to spin a new design because their cores are so complex. The only way they can run the design machine faster is multiple teams. You can get some idea of the expense of that (and remember, they can only charge ARM-like prices for these chips, not desktop prices --- AND if they push too hard on these chips they will start to cannibalize the low-end desktop market) by seeing that they have not done this for Atom, even though it's obvious how important it is.

    We'll get a new Intel design fairly soon. OK, big step forward. And then what --- five more years of stasis while Apple, nVidia, Qualcomm, ARM are all changing their architectures almost annually?

    In the past, Intel had complexity against it; but had the compensation of a huge market. Now they have a smaller market, while ARM has the huge market, AND their complexity is that much worse than it was in the days of the P6.
  • andrewaggb - Tuesday, October 30, 2012

    I think you're forgetting that ARM is about to hit the IPC wall. These new chips have essentially the same execution abilities as modern Intel and AMD chips have had for a long time. It's not that Intel and AMD are stupid, they just can't get more parallelism out of code. So they focus on caches, latency, buffers, MMX, SSE, etc. Adding hyper-threading, more cores, etc.

    ARM is just getting to the hard part. Remember Itanium? PowerPC? Alpha? All of these other advanced processors that in practice aren't much different/better than a Xeon?

    The threat to Intel is on power consumption; absolute performance has never been a problem for them. If ARM can power a CPU at 70-100% of the speed at 50% of the power, that's a threat.

    It's killing Intel in the mobile space. In the server space it's a race to see if Intel can get power consumption down faster than ARM can get performance up.
  • MadMan007 - Tuesday, October 30, 2012

    Intel is going to a 2-year cadence for Atom. They had originally used a 5-year cadence, which is their classic pre-tick/tock schedule, which is why we are finally about to see a new Atom architecture.

    In short, to answer "And then what --- five more years of stasis...": no.
  • A5 - Tuesday, October 30, 2012

    Eh. I'm guessing that if we had a slide deck about the 2014 Atom chips, it'd seem pretty exciting, too.
  • Krysto - Tuesday, October 30, 2012

    Actually Intel tends to show them for 5 years or something. ARM announces them 2 years before, because that's how their business works, considering they only make the IP for the chips, so they can't announce it just one year earlier. Plus, one ARM CPU generation lasts 2 years.
  • Matias - Tuesday, October 30, 2012

    Wow, looks like a big disruption is on the horizon. Now let's hope they deliver on all these promises, and we may even see some desktop use in the future with Win8 ported. Beware, Intel...
  • powerarmour - Tuesday, October 30, 2012

    Windows 8 has already been ported somewhat = Windows RT

    If WinRT apps take off, and legacy x86 slowly simmers off the boil, then these CPUs should be the basis for some nice tablets and mini-desktops in future.
  • Krysto - Tuesday, October 30, 2012

    Too bad Windows RT doesn't work like Linux (and, I expect, Mac OS soon too) and just transition smoothly from x86 to ARM, allowing all previous apps to work on ARM.

    Do we know what clock speed the A57 will start at? 2.5 GHz maybe? The A53 I assume at 1.0-1.2 GHz.
  • Krysto - Tuesday, October 30, 2012

    Answered my own question. It sounds like the A53 will start at 1.3 GHz, and the A57 will top out at 3 GHz.

    "For those who are still looking for gigahertz performance numbers Hurley said that new A-50 family will deliver performance ranging from 1.3 gigahertz to 3 Gigahertz depending on how the ARM licensees tweak their designs."

    From Gigaom:
    http://gigaom.com/2012/10/30/meet-arms-two-newest-...
  • aicom64 - Wednesday, October 31, 2012

    Linux doesn't allow x86 binaries to run on ARM systems. You have to recompile from source or try to obtain a properly compiled package.

    I don't know if Mac OS will, because it usually only makes sense to pay the emulation penalty when you're moving from a lower performance platform to a higher performance one (68k -> PPC, PPC -> Intel).
