Original Link: https://www.anandtech.com/show/4109/amd-and-globalfoundries-ces-2011
AMD and GlobalFoundries, CES 2011
by Jarred Walton on January 7, 2011 3:30 AM EST
GlobalFoundries – Expanding to Meet Demand
It’s been about a year since we last had a serious discussion with GlobalFoundries. Last CES, they were relatively recently separated from AMD and we didn’t have a whole lot to report; 12 months later, there’s a lot more going on. They are now doing full production of 45nm and 40nm chips, the latter going into many of the other electronics devices we use on a regular basis. Where AMD and Intel have lived on the 65nm, 45nm, and now 32nm nodes, most of the other IP developments are happening on the so-called half-nodes. Smartphones and their related technologies are the big one, and GF is working with many of the major players—Qualcomm, ST, ARM, and others. GlobalFoundries now has 300mm fabs in Germany, New York, and Singapore with additional 200mm fabs in Singapore; all told there are 12 locations with approximately 10,000 employees.
One of the big questions last year was how GlobalFoundries would handle the needs of other fabless semiconductor companies. The acquisition of Chartered Semiconductor increased their presence in a big way, also giving GF fabrication facilities in Singapore. While AMD continues to use nearly all of the 45nm capacity, the 40nm production has a long list of partners. The Dresden fab 1 is adding capacity and they’re pushing up to 80,000 wafers per month. The upstate New York fab 8 construction continues to progress, and demand for their services is great enough that GF has already begun working to expand the facility before the current construction is even complete! Fab 8 should come online in 2013, with production focused on 22/20nm output and capacity for around 60,000 wafers per month.
Right now, GlobalFoundries is entering full production mode for 32nm, with AMD’s Llano chips scheduled to be the first market solution to use the process. Later this year, AMD will also launch their Bulldozer cores on the 32nm process. In a similar vein, the 28nm node uses the same gate stack and will leverage the learning of their x86/32nm ramp. Qualification and production of 28nm will continue over the coming year, with plenty of new 28nm devices to come. One of the interesting points GF brought up is that in the past, doing test SRAM wafers has been the common practice for ironing out the glitches in a new process. SRAM wafers produce regular grids where errors are easy to detect; the problem is that complex logic stresses the latest process technology in a different way. To further help with process refinements, GF also runs a test SoC ARM Cortex A9 microprocessor design through the pipeline. The results of this test wafer not only help them find and fix problems, but they also give a better idea of how chips on the new process perform. The following slide summarizes what they’ve seen so far.
Bumping typical clock speeds from 1.5 to 2.0GHz for the lower power designs, and from 2.0 to 2.5GHz on higher performance chips is good, but even more importantly GF saw power requirements drop up to 30%. Providing 25-33% more performance using 30% less power (and a 100% increase in standby battery life) will be key to enabling the next generation of mobile devices, and we should continue to see a roughly two-fold improvement in performance at a given power envelope every couple of years. Over the course of the year we should see a major transition from 45nm to 32nm for AMD solutions, while the other devices should experience a similar transition from 40nm to 28nm.
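As a quick sanity check on those percentages, here is a minimal sketch of the arithmetic. It assumes clock speed is a fair proxy for performance and takes the "up to 30%" power reduction at face value; the function names are ours for illustration and do not come from GF's slide.

```python
def percent_gain(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


def perf_per_watt_ratio(perf_gain_pct, power_drop_pct):
    """Ratio of new to old performance-per-watt.

    Assumes performance scales by (1 + gain) while power scales by (1 - drop).
    """
    return (1 + perf_gain_pct / 100.0) / (1 - power_drop_pct / 100.0)


# Clock speeds quoted in the text (GHz).
low_power_gain = percent_gain(1.5, 2.0)   # lower power designs
high_perf_gain = percent_gain(2.0, 2.5)   # higher performance chips

# Best case: the full 30% power reduction applies at the higher clock.
low_power_ppw = perf_per_watt_ratio(low_power_gain, 30.0)
high_perf_ppw = perf_per_watt_ratio(high_perf_gain, 30.0)

print(f"Low power: +{low_power_gain:.1f}% clock, {low_power_ppw:.2f}x perf/W")
print(f"High perf: +{high_perf_gain:.1f}% clock, {high_perf_ppw:.2f}x perf/W")
```

Under those assumptions the two clock bumps work out to the 25-33% range quoted above, and the combined performance-per-watt improvement lands a little under 2x, which is in line with the roughly two-fold improvement per generation.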
GF looks to be doing well, and they are shaping up as a viable alternative to TSMC for companies in need of fabrication facilities. They have technologies ready for all the electronics segments, from CPUs and microprocessors to smartphones and DTV. They’re also looking at opportunities in MEMS (Micro Electro Mechanical Systems, e.g. DLP, optical switches) and 3D stacking. Over the next two years, they should more than double their 300mm wafer capacity, and continue R&D into future technologies.
Moving beyond 32nm and 28nm, GlobalFoundries is well into the research and design phase on their 22nm and 20nm nodes. We asked about plans beyond 20nm, and GF says that the current planar transistor commonly used in today’s microprocessors should still be good for a few years. After that, they expect the next five or so generations of process refinements to move to more of a vertical arrangement for the transistor layout. Every so often the naysayers inevitably rear up and claim we’re reaching the end of what we can achieve with semiconductor technology. First it was getting below 1 micron, but we’ve long since smashed that barrier and are moving steadily towards the 1nm mark. How small can we go? No one is willing to call it quits yet at GF, and we expect to see additional creative solutions as we run into technological hurdles.
AMD Meetings: APUs Make a Big Splash
We also had a visit with AMD at their meeting rooms, which were filled with product demonstrations. Brazos laptops and netbooks occupied a large area just inside the door; we counted at least 20 different laptops of varying sizes and capabilities. The vast majority of these were running an AMD APU, in this case Brazos. There were 10” E-350 netbooks, 11.6” E-350 ultraportables, and even 14” to 15.6” solutions all using the power friendly APU. A few of the systems also had K10.5 CPUs with the new 6000M GPUs (we’ll get to those next). Browsing around the show floor, though, Brazos looks to be making some real waves, providing a compelling alternative to Atom in the sub-$500 netbook market. In the next couple of months, we should see a lot of Brazos systems, from small nettop/desktop systems to netbooks… and yes, tablets as well. AMD reports battery life of up to 12 hours on some of their test netbooks; the reason they’re able to get such long battery life is pretty simple:
Intel’s Atom is a fairly tiny chip, but even though it manages to sip power, it’s not a very attractive performer. Brazos is even smaller than Atom, in part thanks to the use of 40nm (Brazos) vs. 45nm (Atom), and while raw CPU performance may not be that much higher than the current Atom options, the DX11 GPU is an order of magnitude more powerful than the GMA 3150 found in Pine Trail. AMD mentioned at one point that the Brazos APU is rated at up to 90GFLOPS of compute performance; to put that in perspective, the new quad-core Sandy Bridge CPU (no word on the GPU in SNB) provides a similar 87GFLOPS of compute potential. GFLOPS isn’t the most useful of measurements, but it does help to put things in perspective: similar compute potential in a package that has an 18W TDP (E-350), where i7-2600K is specced at 95W.
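To make that comparison concrete, the sketch below divides the quoted peak GFLOPS by the quoted TDP. It rests on loud assumptions: TDP is a thermal design limit rather than measured power draw, the 90GFLOPS figure covers the whole Brazos APU while the 87GFLOPS figure is CPU-only, and peak GFLOPS says little about real workloads. Treat the result as a rough ratio, not a benchmark.

```python
def gflops_per_watt(peak_gflops, tdp_watts):
    """Peak compute per watt of rated TDP (a paper figure, not a measurement)."""
    return peak_gflops / tdp_watts


# Figures quoted in the text above.
brazos_e350 = gflops_per_watt(90.0, 18.0)    # E-350 APU, 18W TDP
sandy_bridge = gflops_per_watt(87.0, 95.0)   # i7-2600K CPU cores, 95W TDP

advantage = brazos_e350 / sandy_bridge

print(f"E-350:    {brazos_e350:.2f} GFLOPS/W")
print(f"i7-2600K: {sandy_bridge:.2f} GFLOPS/W")
print(f"Paper advantage: {advantage:.1f}x")
```

On paper that puts the E-350 at roughly five times the peak compute per rated watt, with the caveat that most of its number comes from the GPU.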
AMD is aiming the new E-series Zacate parts at Intel’s P6000 processor, while the C-series is gunning for Atom. You need to consider the source when looking at the above slides—and note also that most of the graphs don’t start at 0—but if AMD can deliver 10.5 hours with an 18W Zacate chip that puts them in the same ballpark as Atom. We’ve never been super positive about the performance of Atom netbooks, so better performance and a similar price would be a great starting point, but what will really make or break the laptops is the design. Here’s what we saw:
Sadly, not a single netbook or laptop stands out as being clearly superior to anything else out there. Performance looks good, aesthetics vary from okay to great depending on your point of view, but the LCDs are all same-old, same-old. It would be awesome to see ASUS or HP or some other manufacturer step up to the plate and deliver a Zacate ultraportable with a beautiful screen—you know, like the IPS stuff they're putting into $400 tablets? After all, the APU is now able to provide all the multimedia prowess you could ask for; why not give us a display that can make the content shine?
To drive home the point about the superiority of the Brazos platform compared to Atom, AMD had one more demonstration for us. This involved a set of four netbooks/ultraportables from several (undisclosed) manufacturers. On the far left is an Atom N550 netbook; next in line was an E-350 laptop, then C-50 and last C-30. All four netbooks were running a looping 1080p H.264 video with no apparent problems. Then AMD pulled out a $6000 thermal imaging device—and yes, I really want one! You can see the results in the gallery above, for the Atom N550, C-50, and C-30 (we didn’t get a good shot of the E-350 top temp, but it was ~97F I think). The bottom of the netbooks was even warmer, hitting ~97 on E-350 and ~98 on C-50, compared to 112F on N550. The results weren’t too much of a surprise, as the Atom CPU lacks any form of HD video decoding acceleration and thus ends up hitting the CPU quite hard. Mostly it was a confirmation of the fact that decoding H.264 on a GPU is a lot more efficient than doing it on a CPU, even if the CPU is a low power Atom dual-core.
More AMD Demos and Future Roadmap
One thing we didn’t see at AMD is Bulldozer, the CPU architecture intended to bridge the gap between the current K10.5 solutions and Intel’s Nehalem and Sandy Bridge offerings. We’ve discussed some of the specifics of Bulldozer in the past, but we still don’t have anything concrete to report in terms of performance. GF reports that 32nm production of Orochi is going well, and Bulldozer will show up later this year, but there was no hands-on time with BD at CES to report on. Estimates however are that it should provide a drop-in replacement on existing AMD servers that should boost performance by around 50%. If the desktop processors can get a similar performance boost, that ought to put Bulldozer into close competition with Sandy Bridge, and there’s no doubt that a 500GFLOPS GPU core (i.e. something similar to the HD 5600 series) will put paid to Intel’s HD Graphics 3000.
Also present was a single "Llano-like" laptop, but it was only used for a software demonstration from another company. That demonstration consisted of a 3D camera and video camera recording a scene, similar to the Xbox Kinect. The difference here is that the Presentation demo used OpenCL code to process the video signal, analyze the 3D information, and remove the background from the video stream in real time. The result was a sort of blue-screen effect without the use of a blue screen, and the software additionally interacted with a PowerPoint presentation to integrate the presenter with the content—useful for putting the human element into a webcast. The resolution of the 3D signal was such that the outline of the human was a little fuzzy, and the demonstration still tells us very little about Llano performance, but it was still a cool demo.
Brazos is certainly showing uptake at the show, and netbooks should become quite a bit more capable thanks to the design. Going forward, AMD has the Trinity APU that will meld 2-4 Bulldozer cores with a fast GPU core, providing even better performance and flexibility. Where the “Stars” CPUs releasing this year and the Trinity core next year will both use 32nm process technology, it’s interesting that AMD is using 40nm TSMC for production of the Brazos core right now. (This apparently is due to the amount of IP that AMD already has with 40nm GPUs.) Next year, Krishna and Wichita will drop 1-4 Bobcat cores into an APU, and they’ll make the shift to 28nm. We suspect that these chips will shift over to GlobalFoundries’ 28nm node, though it’s possible AMD could source such chips from both TSMC and GF. Also coming at the top of the CPU performance pile is Zambezi (4-8 Bulldozer cores), roughly in the middle of 2011. That will be followed by Komodo, sporting a full eight Bulldozer cores; neither offering will include an IGP, on the assumption that these high-end CPUs will be paired with discrete GPUs.
(Belatedly) Examining AMD’s Mobility 6000M
Last but not least, we have AMD’s new mobile GPUs. We already discussed NVIDIA’s new 500M lineup, but somehow we slipped through the cracks and didn’t get briefed on AMD’s 6000M lineup in advance of the Tuesday unveiling. There was a bit of miscommunication between us and AMD, where we thought we were being briefed in person today on products that would be announced post-CES. AMD meanwhile thought we already had the basic information and we’d just get some additional detail and hands-on experience at the show. Well, that didn’t quite happen. We don’t have the depth of information available that we did with the 500M, but we did get the important details like shader counts, clock speeds, etc. As with the GeForce 500M launch, the Radeon 6000M also has some rebranding going on, but there are some completely new chips as well. Here’s the rundown.
AMD Radeon 6000M Specifications
Series | 6900M | 6800M | 6700M/6600M | 6500M | 6400M | 6300M |
Target Market | Ultra Enthusiast | Enthusiast | Performance | Performance Thin | Mainstream | Value |
Stream Processors | 960 | 800 | 480 | 400 | 160 | 80 |
Transistors | 1.7 Billion | 1.04 Billion | 715M | 626M | 370M | 242M |
Core Clock (MHz) | 560-680 | 575-675 | 500-725 | 500-650 | 480-800 | 500-750 |
RAM Clock (MHz) | 900 (3.6GHz) | 900-1000 (3.6-4.0GHz) | 800-900 (3.2-3.6GHz) | 900 (3.6GHz) | 800-900 (3.2-3.6GHz) | 800-900 (1.6-1.8GHz) |
RAM Type | GDDR5 / DDR3 | GDDR5 / DDR3 | GDDR5 / DDR3 | GDDR5 / DDR3 | GDDR5 / DDR3 | DDR3 |
Bus Width | 256-bit | 128-bit | 128-bit | 128-bit | 64-bit | 64-bit |
Compute Performance | ~1.31 TFLOPS | ~1.12 TFLOPS | 696 GFLOPS | 520 GFLOPS | 256 GFLOPS | 120 GFLOPS |
Bandwidth (GB/s) | 115.2 | 57.6-64 | 51.2-57.6 GDDR5 or 25.6-28.8 DDR3 | 57.6 GDDR5 or 28.8 DDR3 | 25.6 GDDR5 or 12.8-14.4 DDR3 | 12.8-14.4 DDR3 |
ROPs | 32 | 16 | 8 | 8 | 4 | 4 |
UVD Version | UVD3 | UVD2 | UVD3 | UVD2 | UVD3 | UVD2 |
Eyefinity | Up to 6 | Up to 6 | Up to 6 | Up to 6 | Up to 4 | Up to 4 |
HDMI 1.4a | Yes | Via Software | Yes | Via Software | Yes | Via Software |
DisplayPort 1.2 | Yes | No | Yes | No | Yes | No |
All of the chips are still on 40nm, but the 6900M, 6700M, and 6400M use new designs based off the Barts architecture. You’ll note that they all include UVD3, HDMI 1.4a, and DisplayPort 1.2. On the rebranding side of things, 6800M, 6500M, and 6300M are all clock speed bumps of the existing 5000M series, which means they’re still the mobile variants of the Redwood architecture. AMD has apparently enabled a software “hack” that lets them do HDMI 1.4a, but they don’t support DP1.2, and they also don’t support Blu-ray 3D. (The HD 6430M also lacks 3D Blu-ray support.) We’ve previously covered the architectural enhancements in the Barts chips, so we won’t dwell on that much here. Clock for clock, Barts should be slightly faster than the previous generation Redwood series, it’s more power efficient, and it has a better video processing engine. One thing that sadly isn’t showing up in mobile GPUs just yet is the Cayman PowerTune technology; we’ll probably have to wait for the next generation mobile chips to get PowerTune as an option, and we’re hopeful that it can do for mobile GPUs what Intel’s Turbo Boost is doing for Sandy Bridge.
As with the NVIDIA hardware, the jury is still out on performance of the various solutions, but on paper everything looks reasonable. Starting at the bottom we have the 6300M, which looks to be a faster clocked HD 5470. That’s not going to win many awards for raw computational prowess, but as with NVIDIA’s 410M/520M it does provide an inexpensive option that will have AMD’s Catalyst drivers, so until Intel can get their Sandy Bridge IGP drivers to the same level we like having alternatives. Of course, we wouldn’t want switchable graphics with something as slow as the 6300M, as the goal should be noticeably better performance. The new 6400M should handle that role nicely. Sporting twice as many stream processors, 6400M should already offer a marked improvement over 6300M/HD 5470. Any configurations that get GDDR5 should reach the point where the GPU core is the sole limiting factor on performance, and while we’re not too fond of the 64-bit interface it should still be a good match for this “mainstream” offering.
Moving up to the next tier, we have the 6500M replacing the HD 5650, with the 6700M using the new architecture. The previous generation HD 5650 at 550MHz generally outperforms the NVIDIA GT 425M, so increasing the bandwidth and clock speeds (i.e. 6500M) should keep the series competitive with (or ahead of) the 525M/535M. The 6700M takes things a step further with 20% more stream processors, and provided the manufacturer uses GDDR5 you’ll get more than enough bandwidth—the 57.6GB/s figure makes the typical DDR3 configurations look archaic, but we worry there will be plenty of slower/cheaper DDR3 models on the market.
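The stream processor counts and core clocks in the specifications table relate to the compute figures in a simple way. The sketch below assumes each stream processor retires one multiply-add (two floating-point operations) per clock, which is the usual convention for peak figures on these parts; the function name and the two-FLOPs default are our assumptions, not something AMD provided in the briefing.

```python
def peak_gflops(stream_processors, core_mhz, flops_per_sp_per_clock=2):
    """Theoretical peak single-precision GFLOPS.

    Assumes every stream processor issues one multiply-add per clock,
    counted as flops_per_sp_per_clock floating-point operations.
    """
    return stream_processors * flops_per_sp_per_clock * core_mhz / 1000.0


# (stream processors, maximum listed core clock in MHz) from the table.
parts = {
    "6900M": (960, 680),
    "6800M": (800, 675),
    "6700M": (480, 725),
    "6500M": (400, 650),
    "6400M": (160, 800),
    "6300M": (80, 750),
}

results = {name: peak_gflops(sps, mhz) for name, (sps, mhz) in parts.items()}

for name, gflops in results.items():
    print(f"{name}: {gflops:.1f} GFLOPS")

# Stream processor advantage of 6700M over 6500M, in percent.
sp_advantage = (480 - 400) / 400 * 100.0
print(f"6700M vs 6500M stream processors: +{sp_advantage:.0f}%")
```

With the maximum listed clocks this reproduces the table's figures for every part except the 6800M, where it gives 1080 GFLOPS rather than ~1.12 TFLOPS. The listed figure would imply a clock of about 700MHz, so either the clock range or the compute number in the table is slightly off for that part, or our assumption doesn't hold there.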
Finally, at the top we have the enthusiast and ultra-enthusiast offerings. 6800M is once more a higher clocked version of the existing HD 5850/5870. The 6900M is the potentially killer product. Total computation performance is up 17%, which is nothing special, but the memory interface is specced at 256-bit and 900MHz, yielding a whopping 115.2GB/s of bandwidth. We’ve seen quite a few games in the past where memory bandwidth appears to be a limiting factor, and the 6900M addresses this in a big way. Bandwidth is 80% higher than the previous generation 5870 and the 6800M, and it’s also 20% higher than what NVIDIA is offering with the GTX 485M. Of course, if the games/applications you’re running aren’t bandwidth limited, all that extra headroom might go to waste.
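For reference, the bandwidth numbers follow directly from bus width and effective memory data rate. This is a minimal sketch: the multipliers (4x the listed clock for GDDR5, 2x for DDR3) are inferred from the RAM Clock row of the specifications table, and the helper names are ours.

```python
# Effective transfers per listed memory clock, inferred from the table
# (900MHz GDDR5 is listed as 3.6GHz; 800MHz DDR3 is listed as 1.6GHz).
RATE_MULTIPLIER = {"GDDR5": 4, "DDR3": 2}


def effective_rate_gtps(mem_mhz, mem_type):
    """Effective data rate in gigatransfers per second."""
    return mem_mhz * RATE_MULTIPLIER[mem_type] / 1000.0


def bandwidth_gbs(bus_width_bits, mem_mhz, mem_type):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * effective_rate_gtps(mem_mhz, mem_type)


bw_6900m = bandwidth_gbs(256, 900, "GDDR5")    # 256-bit, 900MHz GDDR5
bw_6800m = bandwidth_gbs(128, 1000, "GDDR5")   # 128-bit, top 1000MHz GDDR5
bw_6300m = bandwidth_gbs(64, 800, "DDR3")      # 64-bit, 800MHz DDR3

# How much more bandwidth the 6900M has than the fastest 6800M, in percent.
uplift_pct = (bw_6900m / bw_6800m - 1) * 100.0

print(f"6900M: {bw_6900m:.1f} GB/s")
print(f"6800M: {bw_6800m:.1f} GB/s")
print(f"6300M: {bw_6300m:.1f} GB/s")
print(f"6900M over 6800M: +{uplift_pct:.0f}%")
```

Doubling the bus width is what does the heavy lifting here: even against the fastest 6800M memory configuration, the 6900M comes out 80% ahead.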
As we stated in the NVIDIA 500M announcement, NVIDIA has a very compelling platform with Optimus Technology allowing them to work seamlessly with integrated graphics and give you performance or power savings as appropriate. Okay, so there are occasional bugs to work out with Optimus, but I’d put it at roughly the same level of teething pain as the current SLI support. Since NVIDIA lets you create custom profiles (for SLI as well as Optimus), most of the time things work out fine. The alternatives both involve compromises, namely: lack of regular driver updates in the case of switchable graphics, and lowered battery life with discrete-only.
AMD did inform us that they’re working on some updates to their switchable graphics design, which will involve putting a driver between the OS and the IGP/GPU drivers. They say it will allow users to update drivers for Intel’s IGP separate from the AMD GPU, and that it will address the concerns we’ve mentioned here and provide some needed competition to Optimus. When exactly will this new technology arrive and how will it work? That remains to be seen.
While I still think a good Optimus-enabled GPU with a quad-core Sandy Bridge processor is the best option for a balanced notebook, we need to see what AMD can do in terms of performance and battery life. Idle GPU power draw has been getting better with each generation, and we might not have to give up too much battery life. Certainly it’s less complex to only deal with a single GPU inside a system. There will also be plenty of AMD IGP + GPU designs that can use switchable graphics with AMD drivers, and since both sets of hardware use the same driver you don’t have to worry about lack of support. With Llano APUs later this year, we should see such configurations, but it’s hard to imagine Llano keeping up with Sandy Bridge on the CPU side. That means Trinity in 2012 will be the real alternative to the current “fast CPU + fast GPU + IGP” ecosystem NVIDIA and Intel are pushing.
Wrapping things up, there are a lot of laptops at CES using Brazos, plenty of AMD and Intel CPUs paired with AMD 6000M GPUs, and of course the Intel CPU + NVIDIA GPU combinations we mentioned earlier in the week. The mobile market just keeps growing, and we look forward to seeing how these new NVIDIA and AMD GPUs stack up. The proof will be in the pudding as usual.