Last month I toured Oak Ridge National Laboratory and saw the final stages of assembly of the Titan supercomputer. Titan brings together 18,688 compute nodes, each with a 16-core AMD Opteron 6274 CPU and an NVIDIA Tesla K20X GPU. That adds up to 299,008 AMD x86 CPU cores and 50,233,344 NVIDIA GPU cores. Each CPU gets 32GB of DDR3 and each GPU is paired with 6GB of GDDR5, for a total of 710TB of memory across Titan. The entire machine will use as much as 9 megawatts of power under full load.
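For readers who want to check the arithmetic, here is a minimal sketch that rebuilds the system-wide totals from the per-node figures. The per-GPU core count of 2,688 is an assumption on my part: it is not quoted above, but it is the value implied by dividing the GPU core total by the node count. The function and constant names are purely illustrative.

```python
# Per-node figures quoted in the article.
NODES = 18_688
CPU_CORES_PER_NODE = 16   # one 16-core Opteron 6274 per node
CPU_MEM_GB = 32           # DDR3 per CPU
GPU_MEM_GB = 6            # GDDR5 per GPU

# Assumption: 2,688 CUDA cores per Tesla K20X. The article does not state
# this directly; it is the per-GPU value implied by the quoted GPU total.
GPU_CORES_PER_NODE = 2_688


def titan_totals(nodes: int = NODES) -> dict:
    """Return system-wide totals derived from the per-node figures."""
    return {
        "cpu_cores": nodes * CPU_CORES_PER_NODE,
        "gpu_cores": nodes * GPU_CORES_PER_NODE,
        "memory_gb": nodes * (CPU_MEM_GB + GPU_MEM_GB),
    }


if __name__ == "__main__":
    totals = titan_totals()
    print(f"CPU cores: {totals['cpu_cores']:,}")
    print(f"GPU cores: {totals['gpu_cores']:,}")
    # Decimal terabytes (1 TB = 1,000 GB), matching the rounded 710TB figure.
    print(f"Memory:    {totals['memory_gb'] / 1000:.0f} TB")
```

The memory total works out to 710,144GB, which rounds to the 710TB quoted above when counted in decimal terabytes.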

I described some of the types of applications that will run on Titan in our earlier article, but one of the first applications Titan was tuned for was the LINPACK benchmark. The Top500 list of the world's fastest supercomputers, as measured by LINPACK, is updated twice a year, in June and November. The Titan upgrade was completed just in time to tune and run LINPACK on the machine and submit an official score. As with any new system there's always the threat of hardware or software issues, but luckily the teams working on Titan were able to upgrade all 18,688 compute nodes and get the system stable and running in time to meet the November submission deadline.

Rows of Titan cabinets

The result is very impressive: a score of 17.59 petaflops in the LINPACK benchmark while drawing 8.21MW of power. Titan's first LINPACK score puts it in first place on the Top500 list of supercomputing sites. Number two on the list is IBM's BlueGene/Q system, which uses PowerPC A2 CPUs and delivers 16.33 petaflops.
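To put the score and the power draw on one scale, the two numbers can be combined into a performance-per-watt figure. The sketch below is only a back-of-the-envelope conversion using the LINPACK score and power draw quoted above; the helper name is hypothetical, and the result is not an official efficiency ranking.

```python
def mflops_per_watt(petaflops: float, megawatts: float) -> float:
    """Convert a sustained score (PFLOPS) and power draw (MW) to MFLOPS/W."""
    flops = petaflops * 1e15   # 1 petaflop = 10^15 floating point ops per second
    watts = megawatts * 1e6    # 1 megawatt = 10^6 watts
    return flops / watts / 1e6  # FLOPS/W expressed in millions


if __name__ == "__main__":
    # Titan's LINPACK run: 17.59 petaflops at 8.21MW.
    print(f"{mflops_per_watt(17.59, 8.21):,.0f} MFLOPS/W")
```

With Titan's figures this comes out to a little over 2,140 MFLOPS/W.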


  • Ktracho - Monday, November 12, 2012 - link

    My wild guess is that it is popular because Intel is providing grants to institutions willing to harbor these machines. Note, however, that they are not nearly as power efficient as IBM or Cray. I would guess they will argue that power efficiency will improve with each generation of their accelerators.
  • cjl - Wednesday, November 14, 2012 - link

    The latest Green500 list (power efficiency rankings of supercomputers) would disagree with you there - the top system is an Intel Xeon Phi, at 2500 MFLOPS/W.
  • TheJian - Wednesday, November 14, 2012 - link

    Where do you get that it's easy to code for (other than Intel engineers saying it)? Larrabee failed on the desktop and never came out (if memory serves) because (partially) it was thought to be complicated to keep it running loaded and wouldn't be optimized for by coders who'd prefer to stay with old lazy tech coding they already knew. It would have been highly programmable, but I never thought that meant easy.

    ""Larrabee silicon and software development are behind where we hoped to be at this point in the project," stated Intel in a email to DailyTech."

    "Intel recognized the importance of software and drivers to Larrabee's success, leading to the creating of the Intel Visual Computing Institute at Saarland University in Saarbrücken, Germany. The lab conducts basic and applied research in visual and parallel computing."

    If it was easy to code for I'd think they shouldn't have been behind with their own software efforts. Intel has huge resources compared to NV/AMD. But was beaten by both to 2TFlops and then 5TF etc.

    I agree with Ktracho...I think they're seeding these, not people investing in them (yet? could change). If Intel offers some free chips, you use them unless you're stupid :) On the other hand, I think CRAY pays for Nvidia's ;)
  • eddman - Monday, November 12, 2012 - link

    That's not it.

    Jaguar XT5-HE, no GPU: 75.4%
    Jaguar XK6, Tesla 2090: 73.8%
    Titan: 64.8%
    Stampede, Xeon Phi: 67.1%

    Titan and Stampede rate about the same in efficiency.

    The interesting thing here is that Jaguar XK6 was almost as efficient as the CPU only XT5.

    Why did Titan drop 9 percentage points? Could it be that the Gemini interconnect isn't able to cope with such a load? Is it because of unoptimized software?

    If so, does it mean that Titan can achieve a higher rate later on?
  • MrSpadge - Monday, November 12, 2012 - link

    I wouldn't rule out that they'll be able to squeeze a few more PFlops out of it via software tuning. We don't have much experience yet with such systems ;)
  • biostud - Tuesday, November 13, 2012 - link

    I wonder how efficient their PSUs are?
  • Arbie - Thursday, November 15, 2012 - link

    Actually, what I really wonder is why we are paying for this project. What is ORNL going to do with it other than set records and boost the chipmakers' bottom lines? What simulations are we relying on that just have to run ten times faster than they were before? The only obvious one is weather prediction, where real-time could matter a lot. So I would suppose that the weather service would own the computer - not ORNL.

    At least it appears to be a success as a project, which is all too rare in government agencies.
  • cjl - Thursday, November 15, 2012 - link

    Lots of scientific research can benefit from as much computing power as you can throw at it. You already mentioned one of the big ones: weather. In addition to that however, aerodynamics research can use a huge amount of CPU power to run CFD, which will help future aircraft fly more efficiently and possibly faster. Coupled multiphysics simulations (aerodynamics, thermal, and structural simulations rolled into one) also require enormous computational power, and they allow for improved design of high speed aircraft, spacecraft, and reentry vehicles for the space program. Nuclear simulations can both help simulate nuclear detonations, including how aging has affected the nuclear arsenal (which is probably going to happen more on LLNL's Sequoia supercomputer rather than this one, but it is a comparably powerful machine) and simulate nuclear reactors, which could be a substantial portion of our move away from fossil fuels in the future.

    More in the realm of basic, rather than applied research, astronomical simulations also require a huge amount of computing power. Everything from galactic and universal structure formation in the early universe to stellar evolution, including simulations of the last moments of a massive star going supernova requires hugely more computational power than even the most powerful computers today.

    They can even be used for simulations of things we take for granted - one good example of this is combustion simulation in internal combustion engines. Yes, we have perfectly functional engines already, but to properly simulate the combustion takes an enormous amount of computing power, and once it can be properly simulated, it can be improved and optimized, helping reduce fuel usage and emissions from new cars.
  • Arbie - Friday, November 16, 2012 - link

    Thanks for the thorough reply. Some of your examples are areas that I never thought of.
  • hmoobphajej - Friday, November 16, 2012 - link

    Anand went into great detail about the capabilities of ORNL's supercomputer. You should really read it and watch the videos he posted because they are pretty interesting. Even with the supercomputer as powerful as it is, it isn't powerful enough for many things scientists still want to do.
