Factual 'Big Mac' Results
danigiri writes "Finally Varadarajan has put some hard facts on the speed of the VT 'Big Mac' G5 cluster. Undoubtedly after some weeks of tuning and optimization, the home-brewn supercluster is happily rolling around at 9.555 TFlops in LINPACK.
The revelations were made by the parallel computing voodoo master himself at the O'Reilly Mac OS X conference. It seems they are expecting an additional 10% speed boost after some more tweaking. Srinidhi received standing ovations from the audience.
Wired News is also running a cool news piece on it. Lots of juicy technical and cost details not revealed before. Myth dispelling redux: yes, VT paid full price; yes, it's running Mac OS X Jaguar (soon Panther); yes, errors in RAM are accounted for; Varadarajan was not an Apple fanboy in the least... read the articles for more booze."
Brewn? (Score:3, Interesting)
interesting points (Score:5, Interesting)
What more do you need? Faster systems, cheaper total cost, and slick looking cases.
Re:Full price (Score:3, Interesting)
Dumb Question... (Score:4, Interesting)
1) Why can't they just shout "Let 'er rip!!" and crank the thing wide open?
2) Why all the media buzz concerning this as a `surprise' when they've already got its performance figured out, apparently?
Sorry.
Too bad some software patents will be filed (Score:4, Interesting)
What's up with that?
Used to be that work like this done at a University was considered 'open', as in available to anyone to help advance the state of the art. Not anymore...
Re:Super computer? (Score:2, Interesting)
In terms of raw processing power, the computer on your desk is more powerful than an early Cray. But if you tried to do weather modelling or finite element analysis with both, the Cray would win.
Re:Full Price? WHY?!? (Score:2, Interesting)
Re:Dumb Question... (Score:3, Interesting)
They knew in advance what they could likely achieve with this cluster, and they have surpassed what they were expecting. Now with some more tweaking they may take it a bit further. It's like a race car engine: you know the specs, but once you get it and tune it you can often surpass the specs by a wide margin.
Power PC 970 and G5 (Score:3, Interesting)
"The IBM with a PowerPC 970 was a first choice but the earliest delivery date would have been January 2004."
"On June 23 Apple announced the G5."
I was under the impression that the G5 was a Power PC 970. Is it just some derivative of the Power PC 970... or what?
Memory errors? (Score:3, Interesting)
How, pray tell, are they planning on detecting these errors? I can understand how you could reduce the frequency of errors with only a slight loss in performance, i.e. take some sort of checksum of your data after every x number of cycles, but that doesn't eliminate the errors, it only reduces their frequency. Maybe it reduces the frequency by enough that you don't need to worry about it, especially if 'x' is a sufficiently small number, but it still seems like a pretty risky prospect to me.
Anyone seen any actual TECHNICAL details on this point, i.e. not just some Mac fan yelling "Deja Vu, DEJA VU!!!"?
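For what it's worth, the "checksum every x cycles" idea can be sketched in a few lines. This is only an illustration, assuming CRC32 from Python's `zlib` as the checksum; the helper names `snapshot` and `is_corrupted` are made up here, and nothing in the articles says this is what the VT team actually does.

```python
import zlib

def snapshot(data: bytes) -> int:
    """Record a CRC32 of a block of memory at a known-good point."""
    return zlib.crc32(data)

def is_corrupted(data: bytes, expected_crc: int) -> bool:
    """Re-check the block later; a mismatch means some bit changed."""
    return zlib.crc32(data) != expected_crc

# Hypothetical block of working data.
block = bytearray(b"simulation state " * 64)
saved_crc = snapshot(bytes(block))

clean_check = is_corrupted(bytes(block), saved_crc)

# Simulate a single flipped bit in RAM.
block[10] ^= 0x04
flipped_check = is_corrupted(bytes(block), saved_crc)

print("before flip:", clean_check)
print("after flip: ", flipped_check)
```

Note that this only detects the error after the fact. You still have to redo the work since the last good checkpoint, which is where the performance cost comes from.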
Re:Full price? (Score:5, Interesting)
You'd think Apple would at least sell G5s to VT without SuperDrives.
OTOH, five years from now, when they have the world's 65,000th fastest supercomputer, they could just pull the thing apart and give/sell complete computers to their students. Then it's back to the Apple Store to order up a whole lot of G7s.
Re:Anyone find the efficiency of this thing? (Score:5, Interesting)
For comparison, ASCI Q (#2 on Top500) reaches 68% efficiency, MCR Linux Cluster (currently #3, but to be pushed down by this new Mac cluster) reaches 69% efficiency, and the #1 spot, Earth Simulator, reaches a quite impressive 88% efficiency.
Of course, there are other ways to measure efficiency. When it comes to performance/price, this Mac cluster does very well, even if you do take into account the real costs (i.e. MUCH more than just the $5.2 million up-front cost). For performance/power consumption it seems reasonable, but not outstanding. 10TFlops/1.5MW of power is OK, and not too far off the Earth Simulator's 35TFlops/3.5MW of power, but it's certainly nothing to write home about. Cray's next big cluster, Red Storm, is likely to get over 30TFlops when it's released, but will consume only 2.0MW of power.
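To put the power figures side by side, here is a rough sketch using only the TFlops and MW numbers quoted above. The LINPACK efficiency part at the end rests on an assumption that is not from this thread: 4 floating-point operations per cycle per CPU, on 1100 dual 2GHz nodes, which is how the theoretical peak is estimated here.

```python
# (TFlops, megawatts) as quoted in the comment above.
systems = {
    "Big Mac (VT)": (10.0, 1.5),
    "Earth Simulator": (35.0, 3.5),
    "Red Storm (projected)": (30.0, 2.0),
}

def tflops_per_megawatt(tflops, megawatts):
    """Performance per unit of power; higher is better."""
    return tflops / megawatts

perf_per_mw = {
    name: tflops_per_megawatt(tflops, mw)
    for name, (tflops, mw) in systems.items()
}

for name, ratio in perf_per_mw.items():
    print(f"{name}: {ratio:.2f} TFlops/MW")

# ASSUMPTION (not from the thread): each CPU retires 4 flops per cycle.
FLOPS_PER_CYCLE = 4
NODES = 1100          # tower count used elsewhere in this thread
CPUS_PER_NODE = 2     # dual-processor towers
CLOCK_GHZ = 2.0

peak_tflops = NODES * CPUS_PER_NODE * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0
linpack_tflops = 9.555  # figure from the story summary

# Efficiency = measured LINPACK result / theoretical peak.
efficiency = linpack_tflops / peak_tflops
print(f"Estimated efficiency: {efficiency:.1%}")
```

If the flops-per-cycle assumption is off, the efficiency estimate moves with it, so treat that last number as a ballpark only.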
Re:Anyone find the efficiency of this thing? (Score:1, Interesting)
No.
If you're going to measure the gigaflops per dollar of a computing system and use that to compare one computing system to another, you have to normalize all variables. If you're going to count the cost of the building, then you have to count the cost of the building the Earth Simulator is in, too.
Either way, the Virginia cluster is the most cost-effective supercomputer ever constructed.
Run the numbers for yourself.
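One way to run them, as a sketch: the point about normalizing is that the same cost items have to be counted for every system being compared. The function name `dollars_per_gflops` and the cost-item keys are made up for illustration; only the $5.2M and 9.555 TFlops figures come from this thread, and no Earth Simulator costs are filled in because none are given here.

```python
def dollars_per_gflops(gflops, costs, include):
    """Cost per GFlops, counting only the cost items named in `include`.

    Use the same `include` list for every system so the comparison
    is apples to apples.
    """
    total = sum(costs[item] for item in include)
    return total / gflops

# Figures from this thread: $5.2M up front, 9.555 TFlops on LINPACK.
big_mac_costs = {"hardware": 5.2e6}
big_mac_gflops = 9555.0

# Only hardware is counted here. To count the building too, add a
# "building" entry to EVERY system's cost dict and to this list.
included_items = ["hardware"]

big_mac = dollars_per_gflops(big_mac_gflops, big_mac_costs, included_items)
print(f"Big Mac: ${big_mac:.2f} per GFlops (hardware only)")
```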
Re:Memory errors? (Score:2, Interesting)
I'm just guessing, but you'd probably implement the same ECC mechanism in software that ECC memory does in hardware.
A quick Google search shows that ECC memory typically uses Hamming codes (or similar variations), which is pretty much what you'd expect. Skimming a few of the links, it would appear that most ECC memory is designed to correct a 1-bit error in a word. It is entirely possible to have the right combination of bit errors that will slip past the ECC, regardless of whether it was implemented in hardware or software.
It does seem a bit tedious to implement it in software, though. Each read and write to memory would have to be wrapped in the code that reads/detects or generates/writes the ECC bits to another location in memory.
For the curious, you can learn more about Hamming codes here [rad.com].
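To make that concrete, here is a toy Hamming(7,4) encoder and decoder. It is a sketch of the general technique only: real ECC memory works on much wider words, and this is a guess at the flavor of a software scheme, not the actual VT implementation. The function names are made up for the example.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the low bit
    code = [0] * 8                             # index 0 is unused
    code[3], code[5], code[6], code[7] = d
    # Each parity bit covers the positions whose index has that bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]

def hamming74_decode(bits):
    """Return (nibble, syndrome). A nonzero syndrome is the 1-based
    position that was flipped back, assuming a single-bit error."""
    code = [0] + list(bits)
    s1 = code[1] ^ code[3] ^ code[5] ^ code[7]
    s2 = code[2] ^ code[3] ^ code[6] ^ code[7]
    s4 = code[4] ^ code[5] ^ code[6] ^ code[7]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        code[syndrome] ^= 1  # correct the suspected bad bit
    d = [code[3], code[5], code[6], code[7]]
    nibble = sum(bit << i for i, bit in enumerate(d))
    return nibble, syndrome

word = hamming74_encode(0b1011)

# One flipped bit: corrected.
one_error = list(word)
one_error[4] ^= 1
fixed, where = hamming74_decode(one_error)
print("single error corrected:", fixed == 0b1011, "at position", where)

# Two flipped bits: the decoder "corrects" the wrong bit and
# silently returns bad data.
two_errors = list(word)
two_errors[0] ^= 1
two_errors[1] ^= 1
wrong, _ = hamming74_decode(two_errors)
print("double error slipped past:", wrong != 0b1011)
```

The double-error case is the "slips past the ECC" scenario: the code still produces an answer, it is just the wrong one. Hardware ECC usually adds an extra parity bit to at least detect that case.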
Re:building supercomputer with desktops sucks (Score:3, Interesting)
Also, I've heard that the system controller supports 16GB of RAM but that Apple has only certified 1GB DIMMs so far. This seems likely, as a lot of Macs can accept more memory than initially advertised, only because larger memory modules became common later. (I put 1GB of RAM in an old Wallstreet G3 PowerBook for someone and got it running even though it's officially rated at 512MB. I've got a Sony from the same period here that absolutely won't take more than 256MB in two slots.)
Re:Favorite Quote - Correction About Apple (Score:3, Interesting)
The $5.2M figure seems to just be the towers (dual 2GHz + 4GB RAM is $4814 with the standard educational discount; multiply by 1100 and you get $5,295,400). What was the additional cost of the InfiniBand cards and switches, the Cisco switches, the racks, and the cooling equipment? Were any modifications necessary for the building (more power, etc.)?
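The tower arithmetic is easy to check. A quick sketch using only the per-tower price and tower count quoted above; the variable names are just for illustration, and the networking, rack, and cooling costs are left out because no figures for them appear in this thread.

```python
PRICE_PER_TOWER = 4814   # dual 2GHz + 4GB RAM, educational price quoted above
TOWER_COUNT = 1100

towers_total = PRICE_PER_TOWER * TOWER_COUNT
quoted_total = 5_200_000  # the "$5.2M" figure

# How far the towers alone overshoot the rounded headline number.
difference = towers_total - quoted_total

print(f"Towers alone: ${towers_total:,}")
print(f"Over the quoted $5.2M by: ${difference:,}")
```

So the towers by themselves already account for the whole headline figure, which is why the networking and facilities costs look like they must be on top of it.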
Re:interesting points (Score:3, Interesting)
Speaking of cache, somewhat under-reported in the technical press was IBM's revelations of its upcoming Power5 server architecture. [theinquirer.net] Yup, that's four dual-core processors each with 2MB of L2 cache, and four 36MB L3 cache chips all in the same package. IBM is leveraging its packaging advantages against Intel's process advantages. Well, that, and making each processor die dual-core multithreaded.
Re:interesting points (Score:4, Interesting)
If by "about 1.3GHz" you mean 1.5GHz, then, yes, Itanium only goes up to 1.5GHz. But at 1.5GHz it is faster than the fastest 3.2GHz Pentium 4. With a decent process and less cache, it could easily scale to 2+ GHz.
" but the Itanium is neither cheap nor cool (130W!)"
This has to do with the fact that the CPU has 3MB of cache on it. That makes the die huge which makes the CPU expensive. It also makes it heat up like a toaster. As a comparison, the latest Pentium 4s are ~90W, and they only have 512K of cache.
"In the performance arena, Moore's law is useless unless chip designers figure out how to use MORE transistors to compute more quickly."
My statement was that, for a given performance level, Itanium uses fewer transistors than RISC. Itanium was *designed* to use more transistors. That's why the instruction set is designed to produce code that runs well in parallel. RISC CPUs have to figure out in hardware what can be run in parallel; Itanium does it in the compiler.
Answers (Score:3, Interesting)