Conclusion

The first impression that the Xeon 7500 series made on the world was seriously blurred. Part of the reason is that the testing platform had a firmware bug that decreased the memory bandwidth by 20% or more. Another reason was the weird benchmarking choices of reviewers. Lightwave, Folding@home and Cinebench were somehow popular measuring sticks, portraying the Xeon X7560 as the more expensive and at the same time slower brother of the Xeon X5670. That kind of software is run mostly on sub-$4000 workstations and cheap 1U server farms, and we seriously doubt that anyone in their right mind would spend $30,000 on a server to run this kind of workload.

Our own benchmarking was not complete either, as our virtualization benchmarking fell short of giving 32 threads, let alone 64, enough work. Still, the impressive numbers in the SAP S&D benchmark, one of the most reliable and relevant industry-standard benchmarks out there, made it clear to us that we should give the Xeon X7560 another chance to prove itself.

Our new virtualization benchmark, vApus Mark II, shows that we should give credit where it is due: servers based on the X7560 are really impressive when consolidating services using virtualization. A quad Xeon X7560 can offer 2.3 times better performance than the best dual socket systems today! You might even call the performance numbers historic: for the first time, Intel’s multi-socket servers run circles around the dual socket servers. Remember how the quad Xeon 7200 hardly outperformed the dual Xeon 5300 at the end of 2006, and how the quad 7400 was humiliated by the dual Xeon X5500 in 2009? Even if we go further back in history, the Xeon MP never outperformed the dual socket offerings by a large margin. Memory capacity and RAS features were almost always the main selling points. For the first time, scalability is more than just a hollow phrase: a Xeon X7560 server can replace two or more smaller servers in terms of memory capacity and processing power.

The end result is that these servers can be attractive for people who are not the traditional high-end server buyers. Using a few quad Xeon X7560 servers instead of a lot of dual socket servers to consolidate your software services may turn out to be a very healthy strategy. Based on our current data, two quad Xeon X7560 servers ($65k-$70k) are worth about five Xeon 5600 servers ($50k-$65k). The acquisition costs are slightly higher, but you need fewer physical servers, and that lowers the management costs somewhat. Two questions remain:

1) How bad or good is the power/performance ratio?

2) If RAS is not your top priority, does a quad Opteron 6174 make more sense?

A Dell R815 with four twelve-core Opteron 6174 processors has arrived in our labs. So our search for the best virtualization building block continues.
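
The consolidation arithmetic behind the "two quads for five duals" estimate can be sketched in a few lines of Python. This is only a rough illustration: it assumes that "2.3 times better" means one quad X7560 server does the work of 2.3 dual socket servers, and that the Xeon 5600 servers in the price comparison match the dual socket baseline of the benchmark. The function and constant names are made up for this sketch; only the 2.3 factor and the price ranges come from the text.

```python
# Figures quoted in the conclusion above.
QUAD_SPEEDUP = 2.3                     # assumed: one quad X7560 = 2.3 dual socket servers
TWO_QUADS_COST_USD = (65_000, 70_000)  # price range for two quad Xeon X7560 servers
FIVE_DUALS_COST_USD = (50_000, 65_000) # price range for five Xeon 5600 servers


def dual_equivalents(quad_servers, speedup=QUAD_SPEEDUP):
    """Express a number of quad servers in units of one dual socket server."""
    return quad_servers * speedup


def cost_per_dual_equivalent(cost_range, equivalents):
    """Return the (low, high) acquisition cost per unit of dual socket performance."""
    low, high = cost_range
    return low / equivalents, high / equivalents


quad_units = dual_equivalents(2)   # 2 * 2.3 = 4.6 dual socket equivalents
dual_units = 5.0                   # five dual socket servers, by definition

quad_cost = cost_per_dual_equivalent(TWO_QUADS_COST_USD, quad_units)
dual_cost = cost_per_dual_equivalent(FIVE_DUALS_COST_USD, dual_units)

print(f"Two quads deliver about {quad_units:.1f} dual socket equivalents")
print(f"Quad: ${quad_cost[0]:,.0f} to ${quad_cost[1]:,.0f} per equivalent")
print(f"Dual: ${dual_cost[0]:,.0f} to ${dual_cost[1]:,.0f} per equivalent")
```

Under these assumptions, two quad servers land a little below five dual socket equivalents at a somewhat higher price per equivalent, which matches the "slightly higher acquisition costs" remark. Power and management costs are deliberately left out.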

 

A big thanks to Tijl Deneut and Dieter Vandroemme.


51 Comments

  • fynamo - Wednesday, August 11, 2010 - link

    WHERE ARE THE POWER CONSUMPTION CHARTS??????

    Awesome article, but complete FAIL because of lack of power consumption charts. This is only half the picture -- and I dare to say it's the less important half.
  • davegraham - Wednesday, August 11, 2010 - link

    +1 on this.
  • JohanAnandtech - Thursday, August 12, 2010 - link

    Agreed. But it wasn't until a few days before I was going to post this article that we got a system that is comparable. So I kept the power consumption numbers for the next article.
  • watersb - Wednesday, August 11, 2010 - link

    Wow, you IT Guys are a cranky bunch! :-)

    I am impressed with the vApus client-simulation testing, and I'm humbled by the complexity of enterprise-server testing.

    A former sysadmin, I've been an ignorant programmer for lo these past 10 years. Reading all these comments makes me feel like I'm hanging out on the bench in front of the general store.

    Yeah, I'm getting off your lawn now...
  • Scy7ale - Wednesday, August 11, 2010 - link

    Does this also apply to consumer HDDs? If so is it a bad idea to have an intake fan in front of the drives to cool them as many consumer/gaming cases have now?
  • JohanAnandtech - Thursday, August 12, 2010 - link

    Cold air comes from the bottom of the server aisle, sometimes as low as 20°C (68°F), and gets blown at high speed over the disks. Several studies now show that this is not optimal for an HDD. In your desktop, the temperature of the air that is blown over the HDD should be higher, as the fans are normally slower. But yes, it is not good to keep your hard disk at temperatures lower than 30°C. Use Hard Disk Sentinel or SpeedFan to check on this. 30-45°C is acceptable.
  • Scy7ale - Monday, August 16, 2010 - link

    Good to know, thanks! I don't think this is widely understood.
  • brenozan - Thursday, August 12, 2010 - link

    http://en.wikipedia.org/wiki/UltraSPARC_T2
    2 sockets =~ 153GHz
    4 sockets =~ 306GHz
    Like the T1, the T2 supports the Hyper-Privileged execution mode. The SPARC Hypervisor runs in this mode and can partition a T2 system into 64 Logical Domains, and a two-way SMP T2 Plus system into 128 Logical Domains, each of which can run an independent operating system instance.

    why SUN did not dominate the world in 2007 when it launched the T2? Besides the two 10G Ethernet ports built into the processor, it had the most advanced architecture that I know of; see
    http://www.opensparc.net/opensparc-t2/download.htm...
  • don_k - Thursday, August 12, 2010 - link

    "why SUN did not dominate the world in 2007 when it launched the T2?"

    Because it's not actually that good :) My company bought a few T2s and after about a week of benchmarking and testing it was obvious that they are very very slow. Sure you get lots and lots of threads but each of those threads is oh so very slow. You would not _want_ to run 128 instances of solaris, one on each thread, because each of those instances would be virtually unusable.

    We used them as webservers.. good for that. Or file servers that you don't need to do any cpu intensive work.

    The theory is fine and all but you obviously have never used a T2 or you would not be wondering why it failed.
  • JohanAnandtech - Thursday, August 12, 2010 - link

    "http://en.wikipedia.org/wiki/UltraSPARC_T2
    2 sockets =~ 153GHz
    4 sockets =~ 306GHz"

    You are multiplying threads times clock speed. IIRC, the T2 is a fine-grained multithreaded CPU where 8 (!!) threads share two pipelines of *one* core.

    Compare that with the Nehalem core, where 2 threads share 4 "pipelines" (sustained decode/issue/execution/retire) per cycle. So basically, a dual socket T2 is nothing more than 16 relatively weak cores, each of which can execute 2 instructions per clock cycle at most, for 32 instructions per cycle in total. The only advantage of having 8 threads per core is that (with enough independent software threads) the T2 is able to come relatively close to that kind of throughput.

    A dual six-core Xeon has a maximum throughput of 12 cores x 4 instructions, or 48 instructions per cycle. As the Xeon has only 2 threads per core, it is less likely that the CPU will ever come close to that kind of output (in business apps). On the other hand, it performs very well when you have some amount of dependent threads, or simply not enough threads in parallel. The T2 will only perform well if you have enough independent threads.
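
The difference between the "threads times clock speed" figure and the peak issue rate described in the last comment can be sketched in Python. This is a minimal illustration, not a performance model. The core counts, threads per core and per-core issue widths are the ones quoted in the comment. The 1.2 GHz clock is an assumption, inferred only because it reproduces the quoted 153 GHz figure. The `Cpu` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    sockets: int
    cores_per_socket: int
    threads_per_core: int
    issue_width: int  # peak instructions per cycle per core, as quoted in the comment

    @property
    def cores(self):
        return self.sockets * self.cores_per_socket

    @property
    def hardware_threads(self):
        return self.cores * self.threads_per_core

    @property
    def peak_instructions_per_cycle(self):
        # Threads share a core's pipelines, so the peak scales with cores, not threads.
        return self.cores * self.issue_width

    def naive_thread_ghz(self, clock_ghz):
        # The misleading metric: hardware threads multiplied by clock speed.
        return self.hardware_threads * clock_ghz


dual_t2 = Cpu("dual socket T2", sockets=2, cores_per_socket=8,
              threads_per_core=8, issue_width=2)
dual_xeon = Cpu("dual six-core Xeon", sockets=2, cores_per_socket=6,
                threads_per_core=2, issue_width=4)

ASSUMED_T2_CLOCK_GHZ = 1.2  # assumption, not stated in the thread

for cpu in (dual_t2, dual_xeon):
    print(f"{cpu.name}: {cpu.cores} cores, {cpu.hardware_threads} threads, "
          f"peak {cpu.peak_instructions_per_cycle} instructions per cycle")

print(f"Naive figure for the dual T2: "
      f"{dual_t2.naive_thread_ghz(ASSUMED_T2_CLOCK_GHZ):.1f} 'GHz'")
```

The naive figure grows with the thread count, while the peak issue rate depends only on the number of cores and their width. That is why 128 hardware threads do not translate into 128 cores' worth of work.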
