Dynamic Power Management: A Quantitative Approach
by Johan De Gelas on January 18, 2010 2:00 AM EST
Posted in: IT Computing
The Hardware
One of the latest low power Xeons is the Intel Xeon L3426. It is the Xeon version of the "Lynnfield" architecture. What really makes this CPU special is the ability to boost its clock speed from 1.86GHz (default) to 3.2GHz without dissipating more than 45W. The Turbo Boosted clock speed of 3.2GHz can only be reached if just one core is under load. Combine this with the fact that the CPU can cope with eight threads at once (four cores + Hyper-Threading) and you will see why this CPU deserves special attention. It should offer very low power when the load on the server is low, as in that case the CPU steps back to a 1.2GHz clock. At the same time, the CPU can scale up to 3.2GHz to provide excellent single threaded performance. And it also holds the promise that the CPU will never consume more than 45W, even under heavy load. Performance will be "midrange" in those situations as the CPU cannot clock higher than 1.86GHz, but eight threads will still be running simultaneously.
At $284 this CPU looks like the best Intel offering ever in the entry level server space... but it does have drawbacks of course. In the desktop space, the affordable Core i7 860 "Lynnfield" CPU relegated the expensive Core i7 900 series "Bloomfield" CPUs to the shrinking high-end desktop CPU market. This is not going to happen in the server world: the "Lynnfield" CPU has no QPI links, so it cannot be used in multi-socket servers. The triple channel Xeon X55xx series ("Gainestown") will continue to be Intel's dual socket Xeons until they are followed up by quad- and six-core 32nm Westmere CPUs. So the first drawback is that you will be limited to four physical cores per server (plus four additional logical cores via SMT). For those of us running non-virtualized workloads, this is probably more than enough CPU power.
Speaking of virtualization, that is the second drawback: each of the two memory channels supports up to three DIMMs. As a result the CPU does not support more than 24GB. This is another reason why the cheap "Lynnfield" Xeon is not going to threaten the Xeon 5500 series anytime soon: a dual Xeon "Gainestown" server supports up to 144GB.
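The two capacity figures can be reproduced with simple slot arithmetic. The sketch below is only an illustration: the module sizes (4GB for the single-socket Xeon, 8GB for the Xeon 5500) are assumptions that are not stated in the article, and `max_memory_gb` is a hypothetical helper name.

```python
def max_memory_gb(sockets, channels_per_socket, dimms_per_channel, dimm_size_gb):
    """Upper bound on memory: every DIMM slot filled with the largest module."""
    return sockets * channels_per_socket * dimms_per_channel * dimm_size_gb

# Single-socket "Lynnfield" Xeon: two channels, three DIMMs per channel,
# 4GB modules (module size is an assumption).
lynnfield_gb = max_memory_gb(sockets=1, channels_per_socket=2,
                             dimms_per_channel=3, dimm_size_gb=4)

# Dual-socket Xeon 5500: three channels per socket, three DIMMs per channel,
# 8GB modules (module size is an assumption).
xeon5500_gb = max_memory_gb(sockets=2, channels_per_socket=3,
                            dimms_per_channel=3, dimm_size_gb=8)

print(lynnfield_gb, xeon5500_gb)
```

Under these assumptions the numbers match the 24GB and 144GB limits quoted above. Different module sizes or ranks would change the result.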
We also have the Xeon X3470 in the lab. When running close to idle, this CPU can also throttle back to 1.2GHz and save a lot of power. For the occasional single threaded task, the X3470 offers the best single threaded performance in the whole server CPU industry: one core can speed up to 3.6GHz. Yes, this is the "Xeonized" Core i7 870. Why does this CPU interest us? Well, the Xeon X3470 (2.93GHz) is only one speed bin faster than the X3460 (2.8GHz). The X3460 costs around the same price ($316) as the L3426 and can Turbo Boost up to 3.46GHz. And the X3460 brings up an interesting question: is the L3426 really the most interesting choice if you want a decent balance between performance and power? The L3426 has the advantage that you are sure you will never break a certain power limit. The X3460, however, offers low power at low load too, and has more headroom to handle rare peaks. Below you can find the specs of our Intel server:
Intel SC5650UP
Single Xeon X3470 2.93GHz or Xeon L3426 1.86GHz
S3420GPLC, BIOS version August 24, 2009
Intel 3420 chipset
8GB (4 x 2GB) 1066MHz DDR-3
PSU: 400W HIPro HP-D4001E0
While we did not have a comparable AMD based server yet, this article would not be complete without a look at an AMD based system. AMD promised to send us a low power server, but after some back and forth correspondence it became clear the system would not be able to meet our deadline. Rest assured that we will update you once we get the new low power system from AMD. At the moment AMD looks a bit weak in the low cost server arena as its honor is defended by the 2.5GHz - 2.9GHz Opteron "Suzuka" CPU. That is a single CPU solution based on "Shanghai": four K10 cores and a 6MB L3 cache. This platform is almost EOL: we expect the San Marino and Adelaide platform around CeBIT 2010. Servers based on these new AMD platforms will save quite a bit of power compared to their older siblings. The six-core Lisbon that will find a home in these servers will be a slightly improved version of the current six-core AMD Opteron. Below are the specs of our AMD server:
Supermicro A+ Server 1021M-UR+V
Dual Opteron 2435 "Istanbul" 2.6GHz or Opteron 2389 2.9GHz
Supermicro H8DMU+, BIOS version June 18, 2009
8GB (4 x 2GB) 800MHz DDR-2
PSU: 650W Cold Watt HE Power Solutions CWA2-0650-10-SM01-1
AMD picked the components in this server for our previous low power comparison of servers. We went with the Opteron 2435 as the six-core Opteron offers a very decent performance/watt ratio on paper. We will update the numbers once the Opteron 2419 EE arrives.
Making the comparison reasonably fair
So we had to work with two different servers. While the AMD versus Intel side of things is not our main focus, how can we make a reasonably fair comparison? The difference in power supplies is hardly a problem: both AMD and Intel feel that these power supplies are among the best available as they were chosen for their low power platforms. Both power supplies are 90% efficient over a very wide range of power load. The problem is the fans.
The fans in the AMD machine are small and fast, with speeds up to 11900 rpm! We disabled fan speed control to keep the power consumption of the fans constant. There are four fans and we measured the fan power consumption by taking out the fans that blow over the memory while keeping the two fans that cool the CPUs. This way we were sure that our CPU would not overheat and leak more power. We carefully measured the temperature of the CPU and jotted down the power measurements in all of our tests. We found out that each fan consumes about 8W. We did the same thing for the Intel machine: the power consumption of each fan was measured at the electrical outlet. The memory DIMMs were also checked: there was no significant difference between DDR2-800 and DDR3-1066, both in idle as well as under load. By taking the fans out of the equation, we can get a very reasonable comparison of both platforms. So how well do the current CPUs manage power?
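As a rough sketch of what "taking the fans out of the equation" means in practice, the fan draw can be subtracted from each wall reading. The 8W per fan and the four fans come from the AMD machine described above; the 200W wall reading and the function name `power_without_fans` are made up for illustration and are not measurements from this article.

```python
FAN_POWER_W = 8.0  # per-fan draw measured at the outlet (AMD machine, from the text)

def power_without_fans(wall_power_w, fans_installed, fan_power_w=FAN_POWER_W):
    """Subtract the constant fan draw from a power reading taken at the outlet."""
    if fans_installed < 0:
        raise ValueError("fans_installed must not be negative")
    return wall_power_w - fans_installed * fan_power_w

# Hypothetical wall reading, for illustration only.
example_wall_reading_w = 200.0
adjusted_w = power_without_fans(example_wall_reading_w, fans_installed=4)
print(f"{example_wall_reading_w:.0f}W at the wall, {adjusted_w:.0f}W without fans")
```

Because both the fan draw and the system draw are measured at the outlet, a plain subtraction is enough here; this only holds while fan speed control is disabled and the fan draw stays constant. The Intel machine would need its own measured per-fan value.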
35 Comments
JohanAnandtech - Tuesday, January 19, 2010 - link
Well, Oracle has a few downsides when it comes to this kind of testing. It is not very popular in small and medium businesses AFAIK (our main target), and we still haven't worked out why it performs much worse on Linux than on Windows. So choosing Oracle is a sure way to make the project time explode... IMHO.
ChristopherRice - Thursday, January 21, 2010 - link
Works worse on Linux than Windows? You likely have a setup issue with the kernel parameters or within Oracle itself. I actually don't know of any enterprise location that uses Oracle on Windows anymore. "Generally all Rhel4/Rhel5/Sun".
TeXWiller - Monday, January 18, 2010 - link
The 34xx series supports four quad rank modules, giving today a maximum supported amount of 32GB per CPU (and board). The 24GB limit is that of the three channel controller with unbuffered memory modules.
pablo906 - Monday, January 18, 2010 - link
I love Johan's articles. I think this has some implications in how virtualization solutions may be the most cost effective. When you're running at 75% capacity on every server I think the AMD solution could have possibly become more attractive. I think I'm going to have to do some independent testing in my datacenter with this.
I'd like to mention that focusing on VMware is a disservice to Vt technology as a whole. It would be like not having benchmarked the K6-3+ just because P2's and Celerons were the mainstream and SS7 boards weren't quite up to par. There are situations, primarily virtualizing Linux, where Citrix XenServer is a better solution. Also many people who are buying Server '08 licenses are getting Hyper-V licenses bundled in for "free."
I've known several IT Directors in very large Health Care organizations who are deploying a mixed Hyper-V XenServer environment because of the "integration" between the two. Many of the people I've talked to at events around the country are using this model for at least part of their Virtualization deployments. I believe it would be important to publish to the industry what kind of performance you can expect from such deployments.
You can do some really interesting HomeBrew SAN deployments with OpenFiler or OpeniSCSI that can compete with the performance of EMC, Clarion, NetApp, LeftHand, etc. NFS deployments I've found can bring you better performance and manageability. I would love to see some articles about the strengths and weaknesses of the storage subsystem used and how it affects each type of deployment. I would absolutely be willing to devote some datacenter time and experience with helping put something like this together.
I think this article really lends itself well to tying in with the Virtualization talks and I would love to see more comments on what you think this means to someone with a small, medium, and large datacenter.
maveric7911 - Tuesday, January 19, 2010 - link
I'd personally prefer to see kvm over xenserver. Even redhat is ditching xen for kvm. In the environments I work in, xen is actually being decommissioned for VMware.
JohanAnandtech - Tuesday, January 19, 2010 - link
I can see the theoretical reasons why some people are excited about KVM, but I still don't see the practical ones. Who is using this in production? Getting Xen, VMware or Hyper-V to do their job is pretty easy; KVM does not seem to be even close to being beta. It is hard to get working, and it is nowhere near Xen when it comes to reliability. Admittedly, those are our first impressions, but we are no virtualization rookies.
Why do you prefer KVM?
VJ - Wednesday, January 20, 2010 - link
"It is hard to get working, and it is nowhere near Xen when it comes to reliability."
I found Xen (separate kernel boot at the time) more difficult to work with than KVM (kernel module) so I'm thinking that the particular (host) platform you're using (windows?) may be geared towards one platform.
If you had to set it up yourself then that may explain reliability issues you've had?
On Fedora linux, it shouldn't be more difficult than Xen.
Toadster - Monday, January 18, 2010 - link
One of the new technologies released with Xeon 5500 (Nehalem) is Intel Intelligent Power Node Manager which controls P/T states within the server CPU. This is a good article on existing P/C states, but will you guys be doing a review of newer control technologies as well?
http://communities.intel.com/community/openportit/...
JohanAnandtech - Tuesday, January 19, 2010 - link
I don't think it is "newer". Going to C6 for idle cores is less than a year old remember :-).
It seems to be a sort of manager which monitors the electrical input (PDU based?) and then lowers the p-states to keep the power at certain level. Did I miss something? (quickly glanced)
I think personally that HP is more onto something by capping the power inside their server management software. But I still have to evaluate both. We will look into that.
n0nsense - Monday, January 18, 2010 - link
Maybe I missed something in the article, but from what I see at home C2Q (and C2D) can manage frequencies per core.
I'm not sure it is possible under Windows, but in Linux it just works this way. You can actually see each core at its own frequency.
Moreover, you can select for each core which frequency it should run.
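For readers who want to check this on their own Linux machine, here is a minimal sketch that reads the per-core frequency the kernel reports through the cpufreq sysfs files. It assumes a kernel with a cpufreq driver loaded, so that `/sys/devices/system/cpu/cpuN/cpufreq/scaling_cur_freq` exists (values are in kHz); the function name `read_core_frequencies_mhz` is just an illustrative choice.

```python
import glob
import os
import re

def read_core_frequencies_mhz(sysfs_root="/sys/devices/system/cpu"):
    """Return {core_number: frequency_in_MHz} from the cpufreq sysfs files."""
    freqs = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_cur_freq")
    for path in glob.glob(pattern):
        # Path layout is <root>/cpuN/cpufreq/scaling_cur_freq.
        core_dir = path.split(os.sep)[-3]
        match = re.fullmatch(r"cpu(\d+)", core_dir)
        if not match:
            continue
        try:
            with open(path) as f:
                khz = int(f.read().strip())
        except (OSError, ValueError):
            continue  # skip cores whose value is unreadable or malformed
        freqs[int(match.group(1))] = khz / 1000.0
    return dict(sorted(freqs.items()))

if __name__ == "__main__":
    for core, mhz in read_core_frequencies_mhz().items():
        print(f"cpu{core}: {mhz:.0f} MHz")
```

This only reads the current values. Pinning a core to a chosen frequency typically requires root and a governor that accepts a fixed speed, and whether the hardware really clocks cores independently depends on the CPU.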