NVIDIA Demonstrates Logan SoC: < 1W Kepler, Shipping in 1H 2014, More Energy Efficient than A6X?
by Anand Lal Shimpi on July 24, 2013 9:00 AM EST
Ever since its arrival in the ultra mobile space, NVIDIA hasn't really flexed its GPU muscle. The Tegra GPUs we've seen thus far have been ok at best, and in serious need of improvement at worst. NVIDIA often blamed an immature OEM ecosystem unwilling to pay for the sort of large die SoCs necessary in order to bring a high-performance GPU to market. Thankfully, that's all changing. Earlier this year NVIDIA laid out its mobile SoC roadmap through 2015, including the 2014 release of Project Logan - the first NVIDIA ultra mobile SoC to feature a Kepler GPU. Yesterday in a private event at Siggraph, NVIDIA demonstrated functional Logan silicon for the very first time.
NVIDIA got Logan silicon back from the fabs around 3 weeks ago, making it almost certain that we're dealing with some form of 28nm silicon here and not early 20nm samples.
NVIDIA isn't talking about CPU cores, but it's safe to assume that Logan will be another 4+1 arrangement of cores - likely still based on ARM's Cortex A15 IP (but perhaps a newer revision of the core). On the GPU front, NVIDIA confirmed our earlier speculation that Logan includes a single Kepler SMX:
One Kepler SMX features 192 CUDA cores. NVIDIA isn't talking about shipping GPU frequencies either, but it did provide this chart to put Logan's GPU capabilities into perspective:
Don't get too excited as we're looking at a comparison of GFLOPS and not game performance, but the peak theoretical ALU bound performance of mobile Kepler should exceed that of a Playstation 3 or GeForce 8800 GTX (memory bandwidth is another story however). If we look closely at NVIDIA's chart and compare mobile Kepler to the iPad 4, we get a better idea of what sort of clock speeds NVIDIA would need to attain this level of performance. Doing some quick Photoshop estimation, it looks like NVIDIA is claiming mobile Kepler has somewhere around 5.2x the FP power of the iPad 4's PowerVR SGX 554MP4, which peaks at 76.8 GFLOPS. That works out to be right around 400 GFLOPS. With a 192 core implementation of Kepler, you get 2 FLOPS per core per clock, or 384 FLOPS per cycle in total. To hit 400 GFLOPS you'd need to clock the mobile Kepler GPU at roughly 1GHz. That's certainly doable from an architectural standpoint (although we've never seen it done on any low power 28nm process), but it's probably a bit too high for something like a smartphone.
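The back-of-the-envelope clock estimate above can be reproduced in a few lines. This is only a sketch of the arithmetic, using the figures quoted in the text; the 5.2x ratio is an eyeballed reading of NVIDIA's chart, and the function and constant names are made up for illustration. The 2 FLOPS per core per clock figure is assumed to follow the usual convention of counting one fused multiply-add as two floating point operations.

```python
# Figures taken from the article; names are illustrative only.
IPAD4_GFLOPS = 76.8           # PowerVR SGX 554MP4 peak, as quoted
KEPLER_VS_IPAD4 = 5.2         # rough ratio estimated from NVIDIA's chart
CUDA_CORES = 192              # one Kepler SMX
FLOPS_PER_CORE_PER_CLOCK = 2  # assumption: one FMA counted as 2 FLOPs


def peak_gflops(clock_ghz: float) -> float:
    """Peak theoretical GFLOPS of a single SMX at the given clock."""
    return CUDA_CORES * FLOPS_PER_CORE_PER_CLOCK * clock_ghz


def required_clock_ghz(target_gflops: float) -> float:
    """Clock needed for a single SMX to reach the target peak GFLOPS."""
    return target_gflops / (CUDA_CORES * FLOPS_PER_CORE_PER_CLOCK)


target = IPAD4_GFLOPS * KEPLER_VS_IPAD4  # implied mobile Kepler peak
clock = required_clock_ghz(target)       # clock needed to hit that peak

print(f"Implied target: {target:.1f} GFLOPS")
print(f"Required clock: {clock:.2f} GHz")
```

Because the ratio is only an estimate, the resulting clock should be read as "roughly 1GHz" rather than as a precise figure.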
NVIDIA didn't want to talk frequencies but they did tell me that we might see something this fast in some sort of a tablet. I suspect that most implementations will be clocked significantly lower. Even at half the frequency though, we're still talking about roughly Playstation 3 levels of FP power out of a mobile SoC. We know nothing of Logan's memory subsystem, which obviously plays a major role in real world gaming performance, but there's no getting around the fact that Logan's Kepler implementation means serious business. For years we've lamented NVIDIA's mobile GPUs; Logan looks like it's finally going to change that.
API Support and Live Demos
Unlike previous Tegra GPUs, Kepler is a fully unified architecture and OpenGL ES 3.0, OpenGL 4.4 and DirectX 11 compliant. The API compliance alone is a huge step forward for NVIDIA. It's also a big one for game developers looking to move more seriously into mobile. Epic's Tim Sweeney even did a blog post for NVIDIA talking about Logan's implementation of Kepler and how it brings feature parity between PCs, next-gen consoles and mobile platforms. NVIDIA responded in kind by running some Unreal Engine 4 demos on Android on a Logan test platform. That's really the big story behind all of this. With Logan, NVIDIA will bring its mobile GPUs up to feature parity with what it's shipping in the PC market. Game developers looking to port games between console, PC, tablet and smartphone should have an easier job of doing that if all platforms support the same APIs. Logan will take NVIDIA from being very behind in API support (with no OpenGL ES 3.0 support) to the head of the class.
NVIDIA took its Ira demo, originally run on a Titan at GTC 2013, and got it up and running on a Logan development board. Ira did need some work to make the transition to mobile. The skin shaders were simplified, smaller textures were used and the rendering resolution was dropped to 1080p. NVIDIA claims this demo was done in a 2 - 3W power envelope.
The next demo is called Island and was originally shown on a Fermi desktop part. Running on Logan/mobile Kepler, this demo shows OpenGL 4.3 and hardware tessellation working.
The development board does feature a large heatspreader, but that's not too unusual for early silicon just out of bring up. Logan's package size should be comparable to Tegra 4, although the die size will clearly be larger. The dev board is running Android and is connected to a 10.1-inch 1920 x 1200 touchscreen.
fteoath64 - Thursday, July 25, 2013
"Do Intel and PowerVR have similar tricks up their sleeve?". Not Intel but PVR likely with Rogue. The key competitor here is Qualcomm with their Adreno 330 powerhouse. That has time to evolve to better efficiency and tuning. But competition on the mobile side is heating up with this announcement and potentially Nvidia's shift to quickly make this part available. Bigger tablets and UltraLight notebooks can benefit from it rather than settle with Intel's junk iGP.
The disappointment of the market is no real Tegra4 and Tegra4i products in the market after such delays while Qualcomm churns out model after model of products for OEMs.
ocre - Thursday, July 25, 2013
huh? I really don't know what you people are talking about. Tegra 4's closest competition is the snapdragon 800 which is almost nonexistent in the market. Tegra 4 is beating the SD800 in almost every way. Including design wins!!!
"While some will argue that Nvidia's design wins are on the weak side, they have more announced design wins than their competitor, Qualcomm with the Snapdragon 800"
so while you guys might always have negative views towards nvidia, it's not nearly as bad as you guys try to make it out to be
phoenix_rizzen - Thursday, December 19, 2013
4 months later, and there's dozens of phones and tablets out with Snapdragon S800 SoCs inside; while there are ... how many with Tegra4?
Doesn't matter how many "design wins" you have on paper if none of them ever reach the marketplace.
michael2k - Thursday, July 25, 2013
Yes. PowerVR has had similar performance available to licensees since last January. Their first chips are expected to be out this year.
lmcd - Wednesday, July 24, 2013
We won't necessarily see the PowerVR implementations to beat this but the fact is that they exist and likely could beat this.
PowerVR needs reference implementations like all hell.
Qualcomm will beat this.
Don't 1-2 GCN cores do well against this, too?
Finally, Intel is already close, and will probably get this performance soon after T5 comes out, with lower power thanks to process advantage.
TheJian - Monday, August 5, 2013
If we won't see a powervr to "necessarily" beat this why discuss it? They don't exist until...umm...They EXIST. :) If I can't BUY it, it doesn't exist :)
I have a dream in my head of a 50000 gpu core soc that is less than .1w and 10000Tflops gpu power. Do I now have the fastest core on the planet? No, it doesn't EXIST until you can BUY it. Your first sentence really made me laugh. The fact is they don't exist. I'll give you powervr chips that beat this don't exist YET, but that's still just a MAYBE they will...ONE DAY. They'll face bankruptcy if nobody goes for mips tech. They clearly don't make enough from gpus on every apple device to survive the next few gens. TI exited due to not buying icera (or any modem), and Imagination may exit due to mips meaning nothing (could be wrong, we'll see - deving for mips is a tough sell I'd think vs arm/x86). I suspect they'll be bought or die in under 5yrs. They have a market cap of 1B if memory serves and make about 30mil/yr (only 2012, less previously, we'll see for 2013). Good luck keeping up with everyone else making 12x or far MORE than that in the same business. You're dead or bought. Apple could buy them for the equivalent of a song (not sure why they haven't) but maybe they're making their own to dump them eventually. They bought PA Semi for the inhouse cpu. Do they have a gpu coming? Otherwise why not purchase IMG.L? They are already in everything you make. Odd, whatever.
The process advantage is disappearing quickly for Intel. Shortly Intel will have to out CHIP the competition, not out fab them. It's a new ballgame from here on out for Intel and the competition's profits blow Intel's away (IE Samsung making almost Intel's yearly profits in a Q). You can't outspend the competition on fabs when they make 3.5x what you do per year and can spend it all on FAB R&D until you go broke. Intel raised 6B in bonds to fund a buyback. It's a sign of weakness IMHO. They will have to cut dividends soon also (last I checked they can't fund them for more than a few years and even less with dropping profits). Again a sign of weakness.
Let me know when powervr (in something other than Apple products) beats a T4, let alone T5. I can't see a product yet :) Anything COULD beat NV. But until something DOES in a PRODUCT who cares? I'll need to see shipping product doing it first. Just like people ranting on lack of T4 wins (well duh, it just came out), I don't see S800 announcements right and left either. Why? It's NOT out either (not shipping in enough volume for massive announcements anyway, not yet). T4 just started shipping in July to ODM's.
I can predict everything will beat a T4 in 5 yrs...LOL. But none of it is shipping NOW. I can buy a toshiba tablet with T4 now (and HP also), though I wouldn't want the toshiba until quality is better. They appear to have design problems with their tablet. But it's shipping now. Qcom will beat this, but WHEN? And how do you know? S800 won't do it vs. T5 (possibly T4 but we don't know in what or if at all yet). Have you seen benchmarks for the next rev past S800? You are special. You work for Qcom or something and have a T5 in your hands? I'll reserve anything regarding AMD for when they actually SHIP a soc.
Also note Intel has hit 14nm snags recently and is already delaying chips. I agree with your PowerVR needs something for reference out now though ;) Where is this stuff? Are they that far away from shipping something? No xmas for them then, nor S800 if it doesn't get out the door in volume soon. We see T4 announcements from Toshiba, HP (both have T4's for sale), Asus, etc so they will be in xmas stuff for sure. How many S800 devices will make it for xmas? Are they shipping to ODM's yet? I know xperia Z is supposedly coming Sept so it has to be shipping but if anyone has some links to verify that...Then again when they ship devices with 3 different chips they can claim shipping but not really ship your desired chip for months. Exynos 5420 is ramping in Aug this month, but I guess we'll have to wait for Note3 to see T628 MP6 perf. They are claiming it's 2x faster than 5410 PowerVR544MP3. But I'll wait for benchmarks of a T628 before I believe this and battery life to go with that benchmark. How long before it is in a product? Note3 is expected Sept, but it has a long list of supposed chips, so not sure which has the S800 or 5420 or when these come specifically (or if I even have the correct chips, I saw a claim of S600 for one model - who knows at this point). I wish companies would just ship ONE soc per product or name them something different like note3.1 for s800, note 3.2 for 5420 etc etc. Something to differentiate better for customers who don't read every review on the planet. Would the REAL note3 please stand up...LOL. Granted its really about the included modem for specific regions but it's still confusing for a lot of people.
Intel is moving to their own soc gpus. I'll believe they beat people when I see it in anything gpu :) NV had samples of 20nm socs ages ago, so Intel will need 14nm to do any damage here and broadwell is slipping to 2015. When will we see a 14nm soc if they are having problems? All socs will be 20nm next year, so Intel's 22nm will be facing these (T5 etc) not far after silvermont/valleyview/baytrail release (tablet by xmas, phone next year?). All have sped up 14nm plans, so they won't be too far behind a DELAYED intel 14nm process especially samsung swimming in billions to dump into their fab tech. T6 is supposed to be 16nm finfet early 2015, so not much of a process advantage for Intel at 14nm then and T6 has maxwell in it. Intel better have a great gpu and this won't be a basic arm port for cpu either it is in house DENVER :)
So 20nm risk production started 2013 Q1, and 16nm risk starts by end of year.
"Chang indicated that TSMC already moved its 20nm process to risk production in the first quarter of 2013. As for 16nm FinFET, the node will be ready for risk production by the year-end, Chang said."
So depending on who T6 comes from it's either 16nm (tsmc, it's at least this) or 14nm (sammy?)? Either way Intel faces stiff competition from here out. The fab party is over. Time to make some unbeatable chips or pay the price. You won't outFAB the competition any more. Intel is about to release 22nm tablets/phones and everyone else is about to do 20nm shortly after that. Not sure how that's a real victory. I don't see one at 14nm either. We may actually see Intel get beaten to 10nm if samsung keeps pounding out 7.9B+ in profits per Q (Q1 was 7.9B, 8.3B i think for Q2). This kind of profit can kill Intel's fab lead quickly. Unless they seriously boost profits samsung has 10nm in the bag and that's assuming Intel wins to 14nm which has yet to be proven and is showing problems or broadwell wouldn't be delayed right? Will apple be fabbing for people shortly? They certainly have the cash to build 3-4 fabs today easily and laugh in 3yrs. I'm surprised it hasn't already happened (though they did buy one, why not start some new ones while you're rich?). Well rumor is they bought UMC, but it's probably true. I'd be doing that and building 3 450mm fabs for the future (what's that 20bil for 3 of them? Even at 25-30bil who cares, apple has ~130). Buy IMG.L and start pumping out gpus in one fab, socs in another and memory in a 3rd for all your devices in 3yrs. Samsung makes as much as they do on a device because 60+% of a phone/tablet comes from IN HOUSE.
As far as I can see this xmas for tablets is owned by S800/T4. Xmas Phones are owned by S600 I think, with maybe enough volume for S800 to make it into some things. T4i is Q1 and will miss xmas, and I don't think 5420 will make it into anything volume wise for xmas as it's just ramping this month.
T5 will be out before July 2014 on 20nm (so will a 20nm S800 etc). Intel won't be seeing 14nm until 2015 in volume if they crack 2014 at all. I don't call that SOON after T5. Unless you want to say VERY soon after Intel puts out a 14nm soc it will face T6 etc.
phoenix_rizzen - Thursday, December 19, 2013
Here it is, Christmas time, and your predictions are a little off. There are many S800 SoCs in phones, many S600 SoCs in phones, nearly all the good tablets have S800 SoCs, and no Tegra4 to be seen.
If you can't even predict the next 4 months (July-Dec), why should we listen to your predictions for the next 3 years? :)
Tegra4 is a flop. Tegra4i is an admission by nVidia that Cortex-A15 can't cut it in phones. And Tegra5 won't be much better (if we even see it in anything by this time next year).
Zoolookuk - Sunday, July 28, 2013
I don't know why people are picking on your comment - faster than A6? I'd hope so, given A6 is from 2012, and this is arriving in 2014. The question is whether it'll be faster than A8, not something from 18 months ago.
karasaj - Wednesday, July 24, 2013
Wow. Even if we account for a new iPad delivering 2x the performance of the last one (unlikely?) Kepler is still ahead, potentially in both power and performance. This is crazy!
michael2k - Wednesday, July 24, 2013
Except it will be first among equals. PowerVR 6 is designed to scale up to 1TF, and realistically ship in 200GF to 600GF configurations, making the Logan nothing special.