Details on AMD Bulldozer: Opterons to Feature Configurable TDP
by Johan De Gelas & Kristian Vättö on July 15, 2011 12:00 AM EST
Overview of Bulldozer Lineup
AMD’s new Bulldozer-based CPUs are just around the corner. AMD has said that Zambezi CPUs will be released in Q3, which means any time from now, though the latest word on the street suggests an October release. We know quite a lot about these CPUs already, but there is at least one thing we didn't know until now, and it may end up being a big deal in the server market. AMD’s John Fruehe has published an interesting blog post in which he reveals that AMD’s upcoming server CPUs, Opterons, will feature a user-configurable TDP.
|AMD Bulldozer lineup|
|Market|High-end consumers|Low-end servers|High-end servers|
|Core count|8, 6 or 4|8 or 6|16, 12 or 8|
|Supported CPU configurations|Single CPU|Up to dual CPU|Up to quad CPU|
Let's start with a brief overview of Bulldozer. If we ignore Bobcat, it’s AMD’s first new micro-architecture since K10, which was released in late 2007, and frankly it’s long overdue. It will be manufactured using GlobalFoundries’ 32nm SOI process, just like Llano. Some of the architectural changes are covered here, so let's not get into that.
The regular desktop CPUs are codenamed Zambezi and will feature up to eight cores. They will use the AM3+ socket and some AM3 boards will also support the new Zambezi CPUs after a BIOS update. These CPUs will not feature an integrated GPU (unlike Llano and Ontario/Zacate) and will support up to 1866MHz DDR3 in dual-channel configuration.
Bulldozer actually gets more interesting when talking about the server parts, Opterons. For low-end and power-efficient servers, AMD will offer CPUs codenamed Valencia. Specification-wise, these CPUs are pretty similar to Zambezi, with 8-core and 6-core variants. The memory support is also dual-channel, just like in Zambezi, but will be limited to 1600MHz. Valencia will be released under the Opteron 4200 Series brand and will support single- and dual-CPU configurations. It will also be compatible with AMD's current San Marino and Adelaide platforms (Opteron 4000 Series) for socket C32.
For high-end servers, AMD’s answer is Interlagos. It will feature up to 16 cores, which is achieved by combining two 8-core dies into one package, similar to AMD’s current 12-core Magny-Cours. There will also be 12-core and 8-core variants. Interlagos has up to four HyperTransport 3.0 links, meaning that quad-CPU configurations are supported. Apparently, there will also be CPUs with only two links, aimed at dual-CPU configurations. Memory support will be quad-channel 1600MHz DDR3; Intel’s Sandy Bridge-E is also quad-channel, although we don’t know the speed of DDR3 that SB-E supports. Interlagos will be branded as the Opteron 6200 Series and will retain support for the Maranello platform (Opteron 6000 Series), which utilizes socket G34.
ltcommanderdata - Friday, July 15, 2011
"According to leaked product positioning slides, Zambezi is aimed to fight against Intel's Core i5 and i7 lineups. Zambezi will feature up to eight cores, which is twice as many as i7-2600(K)'s four cores. AMD said that they won't join the Hyper-Threading club and they will deliver as many physical cores as Intel delivers physical and virtual cores combined. It looks like AMD is keeping their word, though they're only delivering half as many "FP/SSE cores". "
With hyperthreading and now Bulldozer's double integer core/shared FPU design, core counts are becoming an increasingly difficult metric to compare. It's important to note that while Bulldozer has doubled the number of integer cores compared to Istanbul, each integer core is actually weaker since Bulldozer only uses 2 non-symmetric ALUs and 2 AGUs compared to 3 symmetric ALUs and 3 AGUs in Istanbul. Perhaps other architectural efficiencies can make up the difference, but I wouldn't be surprised if clock-for-clock each of Bulldozer's integer cores is slightly slower than Istanbul's. I believe Sandy Bridge's integer performance is clock-for-clock better than Istanbul's, so Bulldozer likely needs very well threaded code for its doubled integer cores to shine.
FPU resources look to be beefed up from 3 units in Istanbul to 4 units in Bulldozer. Compared to Sandy Bridge, Intel's big advantage is native 256-bit AVX units, whereas Bulldozer only has 128-bit FP/SSE resources and needs to split 256-bit AVX instructions, halving performance. So if Intel can convince developers to quickly adopt 256-bit AVX, Sandy Bridge should have a pretty large SIMD advantage.
duploxxx - Friday, July 15, 2011
dude, you just sound like a horrified Intel fanboy. "convince developers to adopt 256bit AVX". Then what about FMA3 and FMA4 which intel doesn't even have.....
A single BD Module can handle a 256bit AVX or can decide to split into 2 x 128 for each core. It is a decision from AMD to go that way just like intel decides to have a 256bit full for a PH + HT core..... 2 x 256 logic would just need more die space without usage, just like the choice to go for 2 ALU/AGU while the usage of 3 is almost no gain in server loads besides benchmarking....
While the FPU 128+128 might be a bit slower we are talking here about perhaps 2-3% since all other parts like cache and memory are shared for a single module and very negligible difference unless you are a fanboy which is obvious.
ltcommanderdata - Friday, July 15, 2011
"Then what about FMA3 and FMA4 which intel doesn't even have....."
I believe Bulldozer supports FMA4, but not FMA3 due to Intel flip-flopping on which one they'll support at the last minute breaking commonality. While FMA4 is a great capability to have, you pointing out that Intel doesn't have it is the concern. AVX could see faster adoption because it's supported by both Bulldozer and Sandy Bridge.
"While the FPU 128+128 might be a bit slower we are talking here about perhaps 2-3% since all other parts like cache and memory are shared for a single module and very negligible difference unless you are a fanboy which is obvious."
I mention AVX performance, because I'm under the impression that Bulldozer gangs its two 128-bit FMACs together to do 1 AVX per module per cycle while Sandy Bridge has 3x256-bit AVX units per physical core. Sandy Bridge's AVX units are non-symmetric and there are no doubt other factors that will impact performance so it won't be a 3x performance difference, but I'd think it'd be more than 2-3% given the big difference in raw processing resources.
duploxxx - Friday, July 15, 2011
my 2-3% was only the difference between a single 256 vs 2 x 128, not against the intel part... let's see first how much AVX will be really used and how much will end up being 128 bit... doesn't mean something which is 256bit is always better than 128bit.
silverblue - Friday, July 15, 2011
I believe I heard once that Intel's implementation can execute either one 128-bit or one 256-bit instruction per clock. Bulldozer's fused implementation may give up on AVX throughput, but only AVX.
rnssr71 - Friday, July 15, 2011
'It's important to note that while Bulldozer has doubled the number of integer cores compared to Istanbul, each integer core is actually weaker since Bulldozer only uses 2 non-symmetric ALUs and 2 AGUs compared to 3 symmetric ALUs and 3 AGUs in Istanbul.'
why does everyone get hung up on this? yes, phenom had 3 ALUs and 3 AGUs. big deal! it could only complete 3 instructions per clock- any combination of ALU and AGU instructions but no more than 3. so how often could it process 3 ALUs consecutively?
AMD has said that removing the 3rd AGU won't hurt performance and core 2, nehalem, and sandy bridge all have 2 AGU's.
Bulldozer can complete 4 instructions per clock- same as core 2, nehalem and sandy bridge. granted, they all have 3 ALU's available, but how often is the extra one used?
SanX - Friday, July 15, 2011
Got the kids a Phenom II X6 1055T based PC for their games like GTA and just for fun ran some scientific FP-oriented tests on it - parallel algebra codes and some single-core ones.
Was shocked that at its 2.8GHz stock clock it is twice as fast as my Intel processors overclocked to 4GHz. Is this what you guys get too? Kind of contradicts all these game- and office-oriented benchmarks where Intel is always on top.
So I'm waiting for these 8-core 32nm chips in the hope of driving them to 4.5 GHz and getting an additional factor of 2
Anyone wants to repeat them ?
cosminmcm - Friday, July 15, 2011
You mean compared to your Intel Pentium 4 @ 4 GHz?
GaMEChld - Friday, July 15, 2011
I too am curious as to what Intel chip was used in that comparison.
beginner99 - Friday, July 15, 2011
Most certainly a dual core with 1/3 of the cores or one of the slowest Core 2 Quads. Sure not a nehalem or sb Quad