Core-to-Core Latency

As the core counts of modern CPUs grow, the time it takes for one core to reach another is no longer a constant. Even before the advent of heterogeneous SoC designs, processors built on large rings or meshes could have different latencies when accessing the nearest core compared to the furthest core. This is especially true in multi-socket server environments.

But modern CPUs, even desktop and consumer CPUs, can have variable access latency to get to another core. For example, in the first generation Threadripper CPUs, we had four chips on the package, each with 8 threads, and each with a different core-to-core latency depending on whether the access was on-die or off-die. This gets more complex with products like Lakefield, which has two different communication buses depending on which core is talking to which.

If you are a regular reader of AnandTech’s CPU reviews, you will recognize our Core-to-Core latency test. It’s a great way to show exactly how groups of cores are laid out on the silicon. This is a custom in-house test, and we know there are competing tests out there, but we feel ours most accurately reflects how quickly an access between two cores can happen.



In analyzing core-to-core latencies on the AMD Ryzen Threadripper 7980X (64C/128T), our test is limited to probing the first 64 threads, although the behavior when scaling out to 128 threads would be identical. Each CCD on the Threadripper 7980X has 8 x Zen 4 cores with 32 MB of L3 cache. Within a CCD, we see latencies of between 7 and 20 ns, which increase to between 89 and 96 ns when a core communicates with a core on a different CCD.
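To illustrate how a latency matrix like this separates into two bands, here is a minimal Python sketch that groups core pairs by CCD and reports the range for each group. It is not our in-house test. The helper names (`ccd_of`, `summarize`) are hypothetical, the matrix values are placeholders rather than measured data, and the sketch assumes cores are numbered contiguously so that cores 0-7 sit on the first CCD, 8-15 on the second, and so on. It also assumes the matrix is symmetric, so only one direction of each pair is read.

```python
from itertools import combinations

CORES_PER_CCD = 8  # from the article: each CCD has 8 x Zen 4 cores


def ccd_of(core, cores_per_ccd=CORES_PER_CCD):
    """Return the CCD index of a core, ASSUMING contiguous core numbering."""
    return core // cores_per_ccd


def summarize(latency_ns):
    """Split a square core-to-core latency matrix into same-CCD and
    cross-CCD pairs and return the (min, max) latency of each group."""
    groups = {"same_ccd": [], "cross_ccd": []}
    for a, b in combinations(range(len(latency_ns)), 2):
        key = "same_ccd" if ccd_of(a) == ccd_of(b) else "cross_ccd"
        groups[key].append(latency_ns[a][b])
    # Skip a group that has no pairs (e.g. a matrix covering a single CCD).
    return {k: (min(v), max(v)) for k, v in groups.items() if v}


# Placeholder matrix for 16 cores (two CCDs). The values 18 and 92 are
# made up for illustration; they are NOT measurements.
toy = [
    [0 if a == b else (18 if ccd_of(a) == ccd_of(b) else 92) for b in range(16)]
    for a in range(16)
]

print(summarize(toy))
```

With a real 64 x 64 matrix in place of `toy`, the same function would show whether the measured pairs fall into two distinct ranges, provided the operating system's core numbering matches the contiguous layout assumed here.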


A visual render of the AMD Ryzen Threadripper 7980X with 8 x CCDs and IOD

Having already reviewed the AMD Ryzen 9 7950X, which uses the same Zen 4 cores and the same CCD-based approach to communication between cores, we see relatively similar latencies in both Threadripper 7000 and Ryzen 7000. A quad-channel DDR5 memory controller integrated within the large IOD, with PCIe 5.0 lanes as the primary pathway, is important in enhancing the Infinity Fabric interconnect to reduce latencies and help counteract any penalties.


66 Comments


  • thestryker - Monday, November 20, 2023 - link

    Forgot to add: these are just the lower SKU workstation parts not a resurrection of HEDT
  • wujj123456 - Monday, November 20, 2023 - link

    > the AMD Ryzen Threadripper 7980X ($4999), despite having eight fewer cores than the W9-3495X ($5889), half the memory channels (4 vs. 8) and being ultimately cheaper, it is the better option.

    Am I reading it wrong? 7980X has eight more cores than W9-3495X not fewer. Don't think it changes the conclusion though.
  • rUmX - Tuesday, November 21, 2023 - link

    You're right
  • Gavin Bonshor - Tuesday, November 21, 2023 - link

    Thanks for highlighting that obvious error, edited!
  • bernstein - Monday, November 20, 2023 - link

    It remains true, what has been true for every threadripper: if your software allows for computing on more than one node, using 5-10 ryzen servers for the same money gives you more performance, redundancy, more io-bandwith & for many usecases even more total ram.
  • vfridman - Monday, November 20, 2023 - link

    There is a lot of so called "professional" use cases that require a lot of RAM on a single machine. It often possible to split calculations across a cluster of machines, but not so with RAM.
  • quorm - Monday, November 20, 2023 - link

    A nice increase in performance, but seems like almost everyone would be better off with either desktop ryzen or pro/epyc.
  • Thunder 57 - Monday, November 20, 2023 - link

    You should either use bar graphs that show the 14900K's performance when limited to 125W, or you should just change the graphs and list the 14900K as 428W.

    AMD doesn't get a pass either but at least they are more honest. With these new Threadrippers they are actually spot on. Meanwhile the "350W" Xeon uses just over 500W. At the very least maybe include some efficiency charts?
  • thestryker - Monday, November 20, 2023 - link

    Not that the power consumption is good, but these represent the absolute maximum power draw number seen they do not represent workload power draw. If they were to pick "real" power numbers they would have to measure power consumption for every single test and show that.
  • Oxford Guy - Tuesday, November 21, 2023 - link

    Deceptive power usage needs to be stopped.
