Last week, Apple made industry news by announcing new Mac products based upon the company’s new Apple Silicon M1 SoC, marking the first move of a planned 2-year roadmap to transition from Intel-based x86 CPUs to the company’s own in-house designed microprocessors running on the Arm instruction set.

During the launch we had prepared an extensive article based on the closely related Apple A14 chip, found in the new-generation iPhone 12 phones. It includes a rather extensive microarchitectural deep-dive into Apple’s new Firestorm cores, which power both the A14 and the new Apple Silicon M1. I would recommend a read if you haven’t had the opportunity yet.

For the past few days, we’ve had our hands on one of the first Apple Silicon M1 devices: the new Mac mini 2020 edition. While our analysis article last week based its numbers on the A14, this time around we’ve measured the real performance of the actual, higher-power design. We haven’t had much time, but we’ll be bringing you the key datapoints relevant to the new Apple Silicon M1.

Apple Silicon M1: Firestorm cores at 3.2GHz & ~20-24W TDP?

During the launch event, one thing that was, in typical Apple fashion, missing from the presentation was any actual detail on the clock frequencies of the design, as well as the TDP it can sustain at maximum performance.

We can confirm that in single-threaded workloads, Apple’s Firestorm cores now clock in at 3.2GHz, a roughly 6.7% increase over the 3GHz frequency of the Apple A14. As long as there's thermal headroom, this clock also applies to all-core loads, where in addition to the 4x 3.2GHz performance cores we also see the 4x Icestorm efficiency cores at 2064MHz, also quite a lot higher than the 1823MHz on the A14.
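The clock uplifts quoted above can be checked with simple arithmetic. The sketch below is only an illustration: the helper name `uplift_percent` is made up here, and the A14 performance-core clock is taken as exactly 3000MHz, which is an assumption based on the rounded "3GHz" figure in the text.

```python
def uplift_percent(new_mhz: float, old_mhz: float) -> float:
    """Relative clock increase of new_mhz over old_mhz, in percent."""
    return (new_mhz / old_mhz - 1.0) * 100.0


# Frequencies as quoted in the text; 3GHz is assumed to mean 3000MHz.
perf_uplift = uplift_percent(3200, 3000)  # performance cores, M1 vs A14
eff_uplift = uplift_percent(2064, 1823)   # efficiency cores, M1 vs A14

print(f"performance cores: +{perf_uplift:.2f}%")
print(f"efficiency cores:  +{eff_uplift:.2f}%")
```

On these inputs the efficiency cores see the larger relative clock increase, at a little over 13%.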

Alongside the four performance Firestorm cores, the M1 also includes four Icestorm cores which are aimed at low idle power and increased power efficiency for battery-powered operation. Both the 4 performance cores and 4 efficiency cores can be active in tandem, meaning that this is an 8-core SoC, although performance throughput across all the cores isn’t identical.

The biggest question during the announcement event was the power consumption of these designs. Apple had presented several charts including performance and power axes; however, we lacked the comparison data needed to come to any proper conclusion.

As we had access to the Mac mini rather than a MacBook, power measurement was rather simple, as we can just hook up a meter to the AC input of the device. This comes with a huge disclaimer: because we are measuring AC wall power here, the power figures aren’t directly comparable to those of battery-powered devices, as the Mac mini’s power supply will incur an efficiency loss greater than what such mobile devices see, nor to the TDP figures that vendors such as Intel or AMD publish.

It’s especially important to keep in mind that what we usually think of as TDP in processors is actually only a subset of the figures presented here, as beyond just the SoC we’re also measuring DRAM and voltage regulation overhead, something which is included neither in TDP figures nor in your typical package power readout on a laptop.

Apple Mac mini (Apple Silicon M1) AC Device Power

Starting off with the Mac mini sitting idle in its default powered-on state, connected via HDMI to a 2560p144 monitor, with Wi-Fi 6 and a mouse and keyboard, we’re seeing total device power at 4.2W. Given that we’re measuring AC power into the device, and that power supplies can be quite inefficient at low loads, this makes quite a lot of sense and represents an excellent figure.

This idle figure also serves as a baseline for following measurements where we calculate “active power”, meaning our usual methodology of taking total power measured and subtracting the idle power.

During average single-threaded workloads on the 3.2GHz Firestorm cores, such as GCC code compilation, we’re seeing device power go up to 10.5W with active power at around 6.3W. The active power figure is very much in line with what we would expect from a higher-clocked Firestorm core, and is extremely promising for Apple and the M1.
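The "active power" methodology described above is just a subtraction of the idle baseline from the measured wall power. A minimal sketch, using the two figures quoted in the text (the function and constant names are illustrative, not part of any measurement tool):

```python
IDLE_W = 4.2  # idle AC wall power of the Mac mini, as quoted in the text


def active_power(total_w: float, idle_w: float = IDLE_W) -> float:
    """Active power: measured total device power minus the idle baseline."""
    return total_w - idle_w


# Single-threaded GCC compilation: 10.5W measured at the wall.
single_thread_active = active_power(10.5)
print(f"active power: {single_thread_active:.1f}W")
```

Note that the result is still an AC-side figure, so it carries the power supply, DRAM and voltage regulation overhead discussed earlier.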

In workloads which are more DRAM-heavy and thus incur a larger power penalty on the LPDDR4X-class 128-bit 16GB of DRAM on the Mac mini, we’re seeing active power go up to 10.5W. Already with these figures the new M1 is mighty impressive, drawing less than a third of the power of a high-end Intel mobile CPU.

In multi-threaded scenarios, power highly depends on the workload. In memory-heavy workloads where the CPU utilisation isn’t as high, we’re seeing 18W active power, going up to around 22W in average workloads, and peaking around 27W in compute heavy workloads. These figures are generally what you’d like to compare to “TDPs” of other platforms, although again to get an apples-to-apples comparison you’d need to further subtract some of the overhead as measured on the Mac mini here – my best guess would be a 20 to 24W range.

Finally, on the GPU side, we’re seeing a lower power consumption figure of 17.3W in GFXBench Aztec High. This figure would contain a larger amount of DRAM power, so Apple’s GPU itself is definitely extremely low-power, drawing far less than the peak power that the CPUs can reach.

Memory Differences

Besides the additional cores on the part of the CPUs and GPU, one main performance factor of the M1 that differs from the A14 is the fact that it’s running on a 128-bit memory bus rather than the mobile 64-bit bus. Across 8x 16-bit memory channels and at LPDDR4X-4266-class memory, this means the M1 hits a peak of 68.25GB/s memory bandwidth.
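The peak bandwidth figure follows directly from bus width and data rate. The sketch below works it through, assuming a nominal 4266 MT/s for "LPDDR4X-4266-class" memory and decimal gigabytes; the function name is illustrative, and the 64-bit line is a hypothetical comparison at the same data rate rather than a measured A14 figure.

```python
def peak_bandwidth_gbs(channels: int, channel_bits: int, mega_transfers: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bus_bytes = channels * channel_bits / 8      # bytes moved per transfer
    return bus_bytes * mega_transfers * 1e6 / 1e9


# M1: 8 channels x 16 bits = 128-bit bus, assumed 4266 MT/s.
m1_peak = peak_bandwidth_gbs(8, 16, 4266)

# Hypothetical 64-bit bus (4 x 16-bit channels) at the same data rate.
bus64_peak = peak_bandwidth_gbs(4, 16, 4266)

print(f"128-bit bus: {m1_peak:.3f} GB/s")
print(f" 64-bit bus: {bus64_peak:.3f} GB/s")
```

With these inputs the 128-bit result comes out at about 68.26GB/s, in line with the 68.25GB/s quoted above, and doubling the bus width doubles the theoretical peak.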

In terms of memory latency, we’re seeing a (rather expected) reduction compared to the A14, measuring 96ns at 128MB full random test depth, compared to 102ns on the A14.

Of further note is the 12MB L2 cache of the performance cores, although here it seems that Apple continues to do some partitioning of how much a single core can use, as we’re still seeing some latency uptick after 8MB.

The M1 also contains a large SLC cache which should be accessible by all IP blocks on the chip. We’re not exactly certain, but the test results do behave a lot like on the A14 and thus we assume this is a similar 16MB chunk of cache on the SoC, as some access patterns extend beyond that of the A14, which makes sense given the larger L2.

One aspect we’ve never really had the opportunity to test is exactly how good Apple’s cores are in terms of memory bandwidth. Inside of the M1, the results are ground-breaking: a single Firestorm core achieves memory reads of up to around 58GB/s, with memory writes coming in at 33-36GB/s. Most importantly, memory copies land at 60 to 62GB/s depending on whether you’re using scalar or vector instructions. The fact that a single Firestorm core can almost saturate the memory controllers is astounding and something we’ve never seen in a design before.
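To put "almost saturate" into numbers, the single-core figures can be expressed as a fraction of the 68.25GB/s theoretical peak quoted earlier. This is only a back-of-the-envelope sketch using the values from the text; the names below are illustrative.

```python
PEAK_GBS = 68.25  # theoretical peak memory bandwidth quoted in the text


def utilisation(achieved_gbs: float, peak_gbs: float = PEAK_GBS) -> float:
    """Fraction of theoretical peak bandwidth actually achieved."""
    return achieved_gbs / peak_gbs


# Single-core Firestorm figures from the text, as (low, high) in GB/s.
single_core = {
    "read": (58, 58),
    "write": (33, 36),
    "copy": (60, 62),
}

for name, (low, high) in single_core.items():
    print(f"{name:5s}: {utilisation(low):.0%} to {utilisation(high):.0%} of peak")
```

On these figures, reads and copies from one core reach roughly 85% to 91% of the theoretical peak, while writes sit at around half of it.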

Because one core is able to make use of almost the whole memory bandwidth, having multiple cores access memory at the same time doesn’t actually increase the system bandwidth; instead, due to congestion, it lowers the effective aggregate bandwidth achieved. Nevertheless, this 59GB/s peak bandwidth of one core is essentially also the speed at which memory copies happen, no matter the number of active cores in the system, which is again a great feat for Apple.

Beyond the clock speed and L2 increases, this memory boost is also very likely to help the M1 differentiate its performance from that of the A14, and offer up tough competition against the x86 incumbents.


682 Comments


  • Spunjji - Monday, November 23, 2020 - link

    @Kangal - I have a few disagreements with what you've written here.

    Firstly, I'm a little confused about why you see the Rosetta-based benchmarks as most relevant. I doubt that anyone buying an M1 device today will be getting rid of it before the majority of apps are converted across, so that performance is going to become increasingly *less* relevant as time passes.

    Secondly, this quote: "In short, Apple played it safe and didn't really do their best. That means they purposely left performance on the table, it was artificial and it was deliberate." - I just don't see how you could draw that conclusion. They used their highest-performing cores in the largest chip yet produced on 5nm. It would be bizarre for them to begin such a grand experiment from the top-down - it would produce an odd situation where their most demanding users, who are most likely to be using applications that currently need translation, would be expected to transition to an incomplete ecosystem with performance that doesn't exceed existing systems.

    To me, it makes perfect sense from both an engineering and a product perspective. They begin the transition with a relatively small (and thus high-yielding, despite the new process) chip as part of a platform for users who are relatively performance-insensitive, but who will still appreciate the immediate benefits of reduced heat and increased battery life.

    I'm also a bit confused about your perspective on their GPU. AFAIK the most modern low-profile low-power GPU out there is Nvidia's 1650 - and in terms of performance-per-watt, this iGPU thrashes it, with absolute performance being not far behind. Perf/Watt appears to be Apple's primary concern (for a given degree of absolute performance), so I see it as a resounding (and surprising) success. It's down to AMD and Nvidia to respond now.
  • Kangal - Wednesday, November 25, 2020 - link

    @Spunjji
    Thanks for the read, sorry it's quite long.

    I mean, the Apple Silicon M1 as it is, it's very good for the new Macbook Air. I guess for the cheap/budget Mac Mini it is also decent. However, it's kind of out of place on the Pro. Perhaps they will launch more Macs in the next 6 months, something beefy for their larger MacBook Pro, and maybe something desktop-worthy in an iMac and Mac Pro. I completely agree with your points. Apple now has the best chipset in the world, their large cores are highly competitive, and their GPU tech is the most efficient. In fact, their medium-cores are the best, they're an Out-of-order processor which sucks slightly less power than a Cortex A53 (or slightly more than A55 ?), but they're slightly faster than a Cortex A73 (or slightly slower than A72 ?). Either way, that's stupidly impressive.

    But as it stands, Apple has done the works but on the last yard, pulled its punches.... and I state that since they're saving money on the SoC by sourcing it themselves, and not paying those exorbitant Intel prices. So there's definitely (money and silicon) budget there to go more ambitious. I just wanted to see more competitive/better product segmentation, eg:

    Apple M10, ~10W, 8 large cores, 8cu GPU... for 11in laptop, ultra thin, fanless
    Apple M13, ~15W, 8 large cores, 16cu GPU... for 14in laptop, thin, active cooled
    Apple M15, ~25W, 8 large cores, 32cu GPU... for 17in laptop, thick, active cooled
    Apple M17, ~45W, 16 large cores, 32cu GPU... for 29in iMac, thick, AC power
    Apple M19, ~95W, 16 large cores, 64cu GPU.... for Mac Pro, desktop, strong cooling

    ...and after 1.5 years, they can move onto the next refined architecture/node (ex Apple M20, M23, M25, M27, M29 etc etc).
  • Sherlock - Monday, November 30, 2020 - link

    I believe the iPad Pros (if not all iPads) will move to the M1 chip and run the MacOS with the ability to run iPadOS/iOS Apps. With the detachable keyboards and Apple Pen support - they will become the ultimate Portable workstation. Knowing Apple's penchant for a limited product line - they may even drop the Apple Macbook Air.
  • BushLin - Saturday, November 21, 2020 - link

    "To be honest, a lot of comparisons of the Apple Silicon M1 are vague, misrepresentative or blatantly off..."
    <proceeds to list unattributed benchmark results with incorrect power labels>
  • Spunjji - Thursday, November 19, 2020 - link

    @vlad24 - I'm aware of how process node can affect voltage requirements and power draw, and the various TDP differences.

    I wasn't arguing that TSMC 5nm wouldn't help AMD's power efficiency, I was arguing with the nonsensical statement that it's the *sole reason* for Apple's good showing in that area. lilmoe's salty opinions aren't supported by the facts.

    You're correct that AMD at 5nm would probably regain an advantage over M1 in mobile devices, but that will be in a year's time, and Apple aren't standing still. It's likely we'll be seeing them leapfrog each other. In the meantime, it'll be interesting to see how competitive Cezanne ends up being with M1 and/or whatever Apple's next-largest chip will end up being.
  • vlad42 - Saturday, November 21, 2020 - link

    But if shrinking Zen 3 to the same 5nm process would make its mobile variant more energy efficient, then that would imply that Zen 3 is a more efficient architecture. It just happens that the architecture is held back in this specific comparison by the manufacturing process.

    We do not know if AMD will bother to port Zen 3 to 5nm, they could skip straight to Zen 4. Who knows what process Apple will be using by the time AMD moves to 5nm. 3nm could still be too expensive for chips larger than those used for phones.

    Granted if the energy efficiency of Zen 3 equals M1 when both are on 5nm, then the M1's efficiency cannot be solely due to 5nm unless that were also true for Zen 3.
  • mdriftmeyer - Saturday, November 21, 2020 - link

    Zen 4 is scheduled to have samples Q1 2021 on 5nm advanced node TSMC. The fact you don't know this tells me you don't follow AMD.
  • Spunjji - Monday, November 23, 2020 - link

    @mdriftmeyer - You'd be wrong in both assuming that I don't know and that I don't "follow AMD". Samples in Q1 2021 does not equal released product in Q1 2021, does it? I'm talking about product availability, and you're moving the goalposts for reasons that aren't clear to me.
  • magreen - Tuesday, November 24, 2020 - link

    @Spunjji - Thanks for your insightful responses, as usual. Sometimes I'm tempted to just hit Ctrl-F to find your comments and ignore the rest.
  • haghands - Tuesday, November 17, 2020 - link

    Cope
