There's been a lot of focus on how both Intel and AMD are planning for the future in packaging their dies to increase overall performance and mitigate higher manufacturing costs. For AMD, that next step has been V-Cache, an additional L3 cache (SRAM) chiplet that's designed to be 3D die stacked on top of an existing Zen 3 chiplet, tripling the total amount of L3 cache available. Today, AMD's V-Cache technology is finally available to the wider market, as AMD is announcing that their EPYC 7003X "Milan-X" server CPUs have now reached general availability.

As first announced late last year, AMD is bringing its 3D V-Cache technology to the enterprise market through Milan-X, an advanced variant of its current-generation 3rd Gen Milan-based EPYC 7003 processors. AMD is launching four new processors ranging from 16 to 64 cores, all of them with Zen 3 cores and 768 MB of L3 cache via 3D stacked V-Cache.

Adding to the preexisting Milan-based EPYC 7003 line-up, which we reviewed back in June of last year, the most significant advancement in Milan-X is its large 768 MB of L3 cache, enabled by AMD's 3D V-Cache stacking technology. The V-Cache die is built on TSMC's N7 process node – the same node Milan's Zen 3 chiplets are built upon – and measures 36 mm², stacking 64 MB of SRAM on top of the existing 32 MB found on each Zen 3 chiplet.

Focusing on the key specifications and technologies, the latest Milan-X AMD EPYC 7003-X processors offer 128 PCIe 4.0 lanes, which motherboard and server vendors can allocate across full-length PCIe 4.0 slots and onboard controllers as they see fit. The chips also feature eight DDR4 memory channels, with support for two DIMMs per channel.

The overall chip configuration for Milan-X is a giant, nine-chiplet MCM, with eight CCDs and a large I/O die, and this goes for all of the Milan-X SKUs. Critically, AMD has opted to equip all of their new V-Cache EPYC chips with the maximum 768 MB of L3 cache, which in turn means all 8 CCDs must be present, from the top SKU (EPYC 7773X) down to the bottom SKU (EPYC 7373X). Instead of dropping CCDs, AMD will be varying the number of CPU cores enabled in each CCD. Drilling down, each CCD includes 32 MB of L3 cache, with a further 64 MB of 3D V-Cache layered on top for a total of 96 MB of L3 cache per CCD (8 x 96 MB = 768 MB).
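The cache arithmetic above is simple enough to sanity-check in a few lines; all figures are taken straight from AMD's published configuration:

```python
# Per-CCD L3 on Milan-X: 32 MB native cache plus 64 MB of stacked V-Cache.
NATIVE_L3_MB = 32
VCACHE_MB = 64
CCDS = 8  # every Milan-X SKU ships with all eight CCDs present

per_ccd_mb = NATIVE_L3_MB + VCACHE_MB  # 96 MB of L3 per CCD
total_mb = per_ccd_mb * CCDS           # 768 MB of L3 per package

print(per_ccd_mb, total_mb)  # 96 768
```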

In terms of memory compatibility, nothing has changed from the previous Milan chips. Each EPYC 7003-X chip supports eight DDR4-3200 memory modules per socket, with capacities of up to 4 TB per chip and 8 TB across a 2P system. It's worth noting that the new Milan-X EPYC 7003-X chips share the same SP3 socket as the existing line-up and, as such, are compatible with current LGA 4094 motherboards through a firmware update.
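The 4 TB per-socket figure follows directly from the DIMM topology. As a quick sketch, assuming 256 GB modules in every slot (the article doesn't state a DIMM size, so that capacity is an assumption):

```python
# SP3 platform: eight DDR4 channels, two DIMMs per channel.
channels = 8
dimms_per_channel = 2
dimm_gb = 256  # assumed 256 GB modules; not specified in the article

per_socket_tb = channels * dimms_per_channel * dimm_gb / 1024  # 4.0 TB
two_socket_tb = per_socket_tb * 2                              # 8.0 TB in a 2P system

print(per_socket_tb, two_socket_tb)  # 4.0 8.0
```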

AMD EPYC 7003 Milan/Milan-X Processors
AnandTech    Cores/Threads  Base Freq (MHz)  1T Freq (MHz)  L3 Cache  PCIe       Memory         TDP (W)  Price (1KU)
EPYC 7773X   64/128         2200             3500           768 MB    128 x 4.0  8 x DDR4-3200  280      $8800
EPYC 7763    64/128         2450             3400           256 MB    128 x 4.0  8 x DDR4-3200  280      $7890
EPYC 7573X   32/64          2800             3600           768 MB    128 x 4.0  8 x DDR4-3200  280      $5590
EPYC 75F3    32/64          2950             4000           256 MB    128 x 4.0  8 x DDR4-3200  280      $4860
EPYC 7473X   24/48          2800             3700           768 MB    128 x 4.0  8 x DDR4-3200  240      $3900
EPYC 74F3    24/48          3200             4000           256 MB    128 x 4.0  8 x DDR4-3200  240      $2900
EPYC 7373X   16/32          3050             3800           768 MB    128 x 4.0  8 x DDR4-3200  240      $4185
EPYC 73F3    16/32          3500             4000           256 MB    128 x 4.0  8 x DDR4-3200  240      $3521

Looking at the new EPYC 7003 stack with 3D V-Cache technology, the top SKU is the EPYC 7773X. It features 64 Zen 3 cores with 128 threads, and has a base frequency of 2.2 GHz and a maximum boost frequency of 3.5 GHz. The EPYC 7573X has 32 cores and 64 threads, with a higher base frequency of 2.8 GHz and a boost frequency of up to 3.6 GHz. Both the EPYC 7773X and 7573X have a base TDP of 280 W, although AMD specifies that all four EPYC 7003-X chips have a configurable TDP of between 225 and 280 W.

The lowest spec chip in the new line-up is the EPYC 7373X, which has 16 cores and 32 threads, a base frequency of 3.05 GHz, and a boost frequency of 3.8 GHz. Moving up the stack, the 24-core/48-thread EPYC 7473X has a base frequency of 2.8 GHz and a boost frequency of up to 3.7 GHz. Both have a TDP of 240 W, though like the bigger parts, AMD has confirmed that the 16-core and 24-core models will have a configurable TDP of between 225 W and 280 W.

Notably, all of these new Milan-X chips have some kind of clockspeed regression compared to their regular Milan (max per-core performance) counterparts. In the case of the 7773X this is just the base clockspeed, while the other SKUs drop a bit on both base and boost clockspeeds. The drop is necessitated by the V-Cache, which at roughly 26 billion extra transistors for a full Milan-X configuration, eats into the chips' power budget. So with AMD opting to keep TDPs consistent, clockspeeds have been dialed back a bit to compensate. As always, AMD's CPUs will run as fast as heat and TDP headroom allow, but the V-Cache equipped chips are going to reach those limits a bit sooner.
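That ~26 billion transistor figure lines up with a standard 6T SRAM cell. A rough back-of-the-envelope check (ignoring tag arrays, ECC, and peripheral logic, so this is an approximation rather than AMD's exact accounting):

```python
# 64 MB of V-Cache per CCD across eight CCDs = 512 MB of extra SRAM.
extra_mb = 64 * 8
bits = extra_mb * 1024 * 1024 * 8  # megabytes -> bits
transistors = bits * 6             # 6 transistors per bit in a classic 6T SRAM cell

print(f"{transistors / 1e9:.1f} billion")  # 25.8 billion, close to the ~26B cited
```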

AMD's target market for the new Milan-X chips is customers who need to maximize per-core performance; specifically, the subset of workloads that benefit from the extra cache. This is why the Milan-X chips aren't replacing the EPYC 70F3 chips entirely, as not all workloads are going to respond to the extra cache. So both lineups will be sharing the top spot as AMD's fastest-per-core EPYC SKUs.

For their part, AMD is particularly pitching the new chips at the CAD/CAM market, for tasks such as finite element analysis and electronic design automation. According to the company, they've seen upwards of a 66% increase in RTL verification speeds on Synopsys' VCS verification software in an apples-to-apples comparison between Milan processors with and without V-Cache. As with other chips that incorporate larger caches, the greatest benefits are going to be found in workloads that spill out of contemporary-sized caches but fit neatly into the larger cache. Minimizing expensive trips to main memory means the CPU cores can keep working that much more often.
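The benefit of a working set staying resident in L3 rather than spilling to DRAM can be sketched with the textbook average-memory-access-time (AMAT) model. The latencies and miss rates below are purely illustrative numbers, not AMD's:

```python
def amat_ns(hit_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time: hit latency plus the amortized miss penalty."""
    return hit_ns + miss_rate * miss_penalty_ns

# Hypothetical figures: ~12 ns L3 hit, ~100 ns extra for a DRAM round trip.
spills = amat_ns(12, 0.40, 100)  # working set spills out of L3: 40% miss to DRAM
fits = amat_ns(12, 0.02, 100)    # tripled L3 captures it: 2% miss rate

print(spills, fits)  # 52.0 14.0
```

Under these toy numbers the average access drops from 52 ns to 14 ns, which is why cache-sensitive workloads can see outsized gains from V-Cache while others barely move.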

Microsoft found something similar late last year, when it unveiled a public preview of its Azure HBv3 virtual machines in November. At the time, the company published performance figures from its in-house testing, mainly on HPC workloads. Comparing Milan-X directly to Milan, Microsoft used data from both EPYC 7003 and EPYC 7003-X processors inside its HBv3 VM platforms. It's also worth noting that the testing was done on dual-socket systems; all of the EPYC 7003-X processors announced today can be used in both 1P and 2P deployments.

The performance data published by Microsoft Azure is encouraging, and judging from its in-house testing, the extra L3 cache is playing a big part. In Computational Fluid Dynamics, Microsoft noted a better speed-up with fewer elements per core, which has to be taken into consideration. The company stated that with the updated HBv3 series, its customers can expect gains of up to 80% in Computational Fluid Dynamics performance compared to the previous Milan-based HBv3 VMs.

Wrapping things up, AMD's EPYC 7003-X processors are now generally available to the public. With prices listed on a 1K unit order basis, AMD says the 64C/128T EPYC 7773X will be available for around $8800, while the 32C/64T EPYC 7573X will cost about $5590. Moving down, the 24C/48T EPYC 7473X will cost $3900, and the entry-level 16C/32T EPYC 7373X will cost slightly more, at $4185.

Given the large order sizes required, the overall retail price is likely to be slightly higher for one unit. Though with the majority of AMD's customers being server and cloud providers, no doubt AMD will have some customers buying in bulk. Many of AMD's major server OEM partners are also slated to begin offering systems using the new chips, including Dell, Supermicro, Lenovo, and HPE.

Finally, consumers will get their own chance at a V-Cache enabled CPU next month, when AMD's second V-Cache product, the Ryzen 7 5800X3D, is released. The desktop processor is based around a single CCD with a whopping 96 MB of L3 cache available, which contrasts nicely with the much bigger EPYC chips.

Comments Locked

58 Comments


  • Spunjji - Monday, March 21, 2022 - link

    There's probably some non-zero thermal penalty, but the disabled overclocking on the 5800X3D may have more to do with not wanting any possible issues where differing thermal expansion damages the connections between the CPU die and the cache die. It's probably also a good idea to keep voltages within tighter tolerances.
  • Dolda2000 - Monday, March 21, 2022 - link

    I don't remember where I heard it any longer, but I heard it somewhere through the grapevine that the reason for disabling overclocking was that the V-cache die has significantly tighter voltage tolerances, basically requiring a single specified voltage.
  • Slash3 - Monday, March 21, 2022 - link

    It also shares a common voltage plane with the underlying compute die.

    The L3 stack seems to be hard limited to 1.35v, rather than the 1.5v of the standard CCD, so the whole package is reined in and locked down to the lower ceiling as a consequence.
  • Ryan Smith - Monday, March 21, 2022 - link

    The extra SRAM dies only cover the existing SRAM and associated plumbing on the CCD. The Zen 3 cores themselves are not covered by any silicon (though AMD does use a thermal spacer to keep the chip height consistent).
  • Iketh - Monday, March 21, 2022 - link

    I think the correct perspective is 2x transistors in the same surface area, rather than the problem of passing heat through additional silicon. That's the real issue.
  • nandnandnand - Monday, March 21, 2022 - link

    The SRAM should use less power than cores. But there was "structural silicon" added to the package.
  • Dolda2000 - Monday, March 21, 2022 - link

    Given that the base die has been shaved down to preserve the total Z-height, I'm not sure I'd expect any significant difference from that. I'm sure there's *something* due to the additional interface between the dies, but that has to be extremely minimal. I think Iketh's perspective that there's simply more transistors in the same volume is far more relevant, then.
  • ballsystemlord - Monday, March 21, 2022 - link

    They cut down on the base and boost frequencies quite a bit. I wonder if the chips are harder to cool, or there's some latency hiding involved in accessing the additional cache.
  • Wereweeb - Monday, March 21, 2022 - link

    SRAM is a notoriously power-hungry and transistor-inefficient type of memory, hence why it's restricted to CPU caches. I doubt there's much of a latency issue, since it's 3D stacked.
  • Jimbo123 - Monday, March 21, 2022 - link

    AMD chips' performance is still behind Intel's Alder Lake or Sapphire Rapids; as Intel's CEO explained, AMD is in Intel's rear-view mirror. It will stay that way from here on, especially once Intel gets ahead of TSMC at 3nm or below in 2023 or 2024, when it will have the most advanced technology for making chips. I cannot see AMD having any bright future; it will always play second fiddle from here on, as it always has, and it may sink Xilinx along with it too.
