Apple Wins VT in Cost vs. Performance

Posted by pudge
from the i-knew-it-all-along dept.
danigiri writes "Detailed notes about a presentation at Virginia Tech are posted by an attending student. He copied most of the slides of the presentation and wrote down their comments. He wrote some insightful notes and info snippets, like the fact that Apple gave the cheapest deal of machines with chassis, beating Dell, IBM, HP. They are definitely going to use some in-house fault-tolerance software to prevent the odd memory-bit error on such a bunch of non-error-tolerant RAM and any other hard or soft glitches. The G5 cluster will be accepting first apps around November." mfago adds, "Apple beat Dell, IBM and others based on Cost vs. Performance alone, and it will run Mac OS X because 'there is not enough support for Linux.'"

  • Interesting (Score:3, Interesting)

    by Isbiten (597220) <isbiten&gmail,com> on Monday September 08, 2003 @01:41PM (#6901612) Homepage
    Dell - too expensive [one of the reasons for the project being so "hush hush" was that dell was exploring pricing options during bidding]

    Who could have guessed? ;)
    • > Dell - too expensive [one of the reasons for the project being so "hush hush" was that dell was exploring pricing options during bidding]

      Here's my question, and no, I'm not trying to start a flame war...if they were going purely on a price/performance ratio, why didn't they let Dell "explore pricing options"? Presumably, Dell would have given them a better deal due to educational and/or prestige factors, if Dell really wanted the deal. If that was the case, the uni may have been better served with the
      • Re:Interesting (Score:3, Insightful)

        by Ster (556540)
        You misunderstand: it was hush-hush while Dell was exploring pricing options. Only after they came back with their lowest price did Apple win the contract.

        At least, that's the way I've been parsing it.

        -Ster
      • Re:Interesting (Score:2, Interesting)

        by Killigan (704699)
        I've heard (from a fairly reliable source) that the project was actually postponed about 5 months or so while Dell worked on lowering its prices for VT. Finally Dell gave its lowest possible price, which still wasn't good enough for VT, so they did indeed give Dell plenty of time to try and beat Apple.
      • >> Presumably, Dell would have given them a better deal due to educational and/or prestige factors, if Dell really wanted the deal. If that was the case, the uni may have been better served with the Dell option.

        Oh, really? Does Dell have a competitive 64-bit solution? I don't think so. Even the 32-bit dual Xeon Dell is more expensive than the dual G5 Power Mac. Don't bother mentioning Itanium 2, because it's too hot and expensive, and there are hardly any native apps, which might be why people are not buying t
    • by reporter (666905) on Monday September 08, 2003 @10:21PM (#6906355) Homepage
      Even if Apple computers were to cost slightly more than Dell computers, we should consistently buy the former instead of the latter. Price is only one aspect of any product. There are also ethical considerations. They do not matter much outside of Western society, but they matter a great deal in Western society.

      As an American company, Dell is a huge disgrace. Please read the "Environmental Report Card [svtc.org]" produced by the Silicon Valley Toxics Coalition [svtc.org]. Dell received a failing grade and is little better than Taiwanese companies, which are notorious for destroying the environment and the health of workers. Dell even resorted to prison labor [svtc.org] to implement its pathetic recycling program.

      ... from the desk of the reporter [geocities.com]

      • My problem with Dell is that they tend to send you less than you paid for. I ordered a Dimension 8250 last December, which I am using to write this post. The sound card was not what they claimed and wouldn't accept Creative drivers, and I recently discovered that they didn't send a backup copy of Norton Antivirus as claimed.
  • by Anonymous Coward
    How does that work?? How does software even KNOW if there was a glitch? Can I get this on my non-ECC Linux box????

    IMHO the lack of ECC RAM is the only flaw in an otherwise perfect machine (well that, and the massive HEAT).
    • Mmmmm... I bet that to know how that works one needs at least a PhD in CS, or a bunch of them. Believe me, these academics are SMART.

      In my humble ignorance, I can devise a simple stratagem (surely far simpler, less efficient and dumber than the one used by VT). Just duplicate all calculations (effectively halving processing power) on different machines; the chances of the same error hitting both machines would be vanishingly small. If a discrepancy in results is found, just recalculate.
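
      A minimal sketch of that duplicate-and-compare stratagem, in Python. This is not what VT's in-house software does (nobody in this thread knows that); the node functions and the injected bit flip below are made up purely for illustration.

```python
def run_redundantly(task, nodes, max_retries=3):
    """Run the same task on two nodes; accept the result only if both agree."""
    node_a, node_b = nodes
    for attempt in range(1, max_retries + 1):
        result_a = node_a(task)
        result_b = node_b(task)
        if result_a == result_b:
            return result_a, attempt
        # Discrepancy: one node probably hit a glitch, so recalculate.
    raise RuntimeError("nodes never agreed; giving up")


def healthy_node(task):
    """Hypothetical node that always computes the right sum."""
    return sum(task)


def make_glitchy_node(bad_calls):
    """Hypothetical node that flips one result bit on the given call numbers."""
    state = {"calls": 0}

    def node(task):
        state["calls"] += 1
        result = sum(task)
        if state["calls"] in bad_calls:
            result ^= 1 << 3  # simulate a single flipped memory bit
        return result

    return node


if __name__ == "__main__":
    nodes = (healthy_node, make_glitchy_node({1}))
    result, attempts = run_redundantly([1, 2, 3, 4], nodes)
    print(f"{result} after {attempts} attempt(s)")
```

      With one glitch injected on the first call, the two nodes disagree once and the second attempt is accepted. This halves throughput, as said above, and it only catches errors that actually change the final result.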

    • >> IMHO the lack of ECC RAM is the only flaw in an otherwise perfect machine (well that, and the massive HEAT).

      You have to look a little further, and don't let the numbers fool you. The reason for the huge heat sink and 9 individually controlled fans in the G5 is to reduce the noise level.

      The G5 consumes about 40 W at 1.8 GHz, which is much more efficient than either the 1.5 GHz Itanium 2 (130 W) or the 3 GHz P4 (75 W ?).
  • whew! (Score:4, Informative)

    by Anonymous Coward on Monday September 08, 2003 @01:57PM (#6901833)
    3 MW power, double redundant with backups - UPS and diesel

    1.5 MW reserved for the TCF

    2+ million BTUs of cooling capacity using Liebert's extreme density cooling (rack mounted cooling via liquid refrigerant)

    traditional methods [fans] would have produced windspeeds of 60+ MPH
  • by Anonymous Coward
    Does it render?

    Yes, incidentally, it does. The units came with high end graphics cards


    Aside from games, when is a high end graphics card needed for rendering and not just displaying a rendering?
    • Aside from games, when is a high end graphics card needed for rendering and not just displaying a rendering?

      IANAS, but:
      • Graphical representation of turbulence systems?
      • Weather analysis?
      • Any graphical representation of Chaotic systems?
      Like I said, IANAS, but there HAS to be a reason, methinks.
      • by selderrr (523988) on Monday September 08, 2003 @02:52PM (#6902584) Journal
        For any representation, you need only 1 graphics card: the one the monitor is attached to. Parallelizing realtime display-only stuff is not much good since you'd lose too much time in data transmission.

        So they could equip one G5 with a Radeon 9800 and let that one display the results. No need to buy another 1099 Radeons.
        • Actually, you're not totally right. You can spread the rendering job across multiple radeon chips, each handling only a portion of the display. The performance and depth of the rendering could be greatly enhanced. SGI does something like this...
          • That is useful only if those multiple cards are IN THE SAME MACHINE. In the cluster case, those cards are spread over multiple computers, requiring that you transfer the rendered result over the network to the "master" video card which sends it to the monitor. I seriously doubt the efficiency of such a solution for realtime display.
            • Well, Ok, they're not planning on using the machines in this manner. I have seen it done where a cluster was used to display the data in portions. All had access to the shared memory containing the real-time event data. Each machine was tasked with doing some processing and displaying one aspect of the data. Not exactly real-time video rendering; but, this was in '88 using Sun Ultrasparc's
            • That's not necessarily true. Over at Stanford for the project they built a graphics system with 32 PCs that render to a tiled display. Imagine a display made of 1000 monitors in a 40x20 grid. That would be pretty freaking cool.
              • One of the Stanford grad students (Greg Humphreys) who was on that distributed GL system project came to UVA last year. WireGL they call it. And it is really awesome.

                Johann

              • Imagine a display made of 1000 monitors in a 40x20 grid. That would be pretty freaking cool.


                Or better yet, a display made of 1000 monitors in a 50x20 grid, so that it would make sense.

        • Unless you wanted to use the GPU on the Radeon for instructions it would handle well, which is quite probable.
      • graphics in science (Score:3, Interesting)

        by trillian42 (674714)
        I am a scientist, and lots of money gets put into transforming the tons of numbers that supercomputers produce into images that make sense to the human brain.

        The system doesn't have to be chaotic, just complex:

        Watching protein folding simulations.
        Watching full 3-D seismic waves propagate through the Earth.
        Watching, in general, any kind of 3-D model or simulation of a complex process evolving over time.

        A couple links:

        The Scripps Institute of Oceanography Visualization Center:
        http://siovizcenter.ucsd
    • by WasterDave (20047) <(moc.pekdez) (ta) (pevad)> on Monday September 08, 2003 @08:07PM (#6905563)
      Not all renders are real time, not all renders are onto a screen.

      Now that "consumer" graphics cards run in floating point and have comparatively complex shader engines, it's quite possible to start working on rendering movies etc. with the substantial quantity of hardware acceleration possible on these things. You don't have to hit 60fps, and you can have as many passes as you like.

      Mind you, with 1100 nodes if you can render a frame in 45 seconds .... on a twin G5 with a Radeon 9800 ... then you can render 24fps in real time. Real time lord of the rings, anyone?
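
      The arithmetic behind that, as a rough Python sketch. It assumes every node renders its own frame fully in parallel and ignores the time to ship frames over the network, which is an idealization; the function names are just for illustration.

```python
import math


def sustained_fps(nodes, seconds_per_frame):
    """Average frames per second if each node renders one frame at a time."""
    return nodes / seconds_per_frame


def nodes_needed(target_fps, seconds_per_frame):
    """Smallest node count that sustains the target frame rate."""
    return math.ceil(target_fps * seconds_per_frame)


if __name__ == "__main__":
    print(f"1100 nodes at 45 s/frame: {sustained_fps(1100, 45):.1f} fps")
    print(f"nodes needed for 24 fps:  {nodes_needed(24, 45)}")
```

      So 1100 nodes clear the 24fps bar with a few nodes to spare, though the first frame still only shows up 45 seconds after you hit play.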

      Dave
      • That would be really funny to see. I'd bet you'd get an average 24fps over some period of time, but that doesn't mean you'd get a constant 24 frames * Hz (one frame every 1/24th second). It could be rather jerky unless sophisticated timing stuff is going on.
  • by dhall (1252) on Monday September 08, 2003 @02:32PM (#6902364)
    One of the primary concerns for a multi-node cluster is assured latency among all components within the cluster. It doesn't have to be the fastest, it just needs to ensure exact timing for latency across all nodes. IBM can do this with their "wormhole" switch routing on SP and has done this with Myrinet on their Intel X-series clusters.

    From most of my reading on Infiniband, it was designed from the ground up as a NAS style solution, rather than for large multi-node cluster computing. I'm curious as to whether they have any issues with cluster latency.

    http://www.nwfusion.com/news/2002/1211sandia.html

    The primary timings and white papers I've seen published for Infiniband have been for small clustered filesystem access. Although its burst rate is much higher than Myrinet's, it's hard to find any raw details for their multiple node latency normalization.

    I hope it scales, since Intel's solution appears to be less cost prohibitive than some of the other solutions offered on the market, and would really open up the market even for smaller clusters (16-36 node) for business use.

  • by gnuadam (612852) * on Monday September 08, 2003 @02:35PM (#6902392) Journal

    I wonder if by "lack of support in Linux," they're referring to the fact that the fans are controlled by the operating system in the Power Mac? Or the fact that there are relatively few support companies for PPC Linux?

    Any insiders care to comment?

  • ECC FUD (Score:5, Informative)

    by J0ey4 (233385) on Monday September 08, 2003 @02:35PM (#6902395)
    Okay, before we get going with the same discussion about ECC vs. non-ECC, and all the flames start from people perusing Slashdot who think they are more in the know than the PhDs at VT who have been working on this for months, I want to point a few things out.

    1. The majority if not all of the bit errors that ECC corrects are caused by thermal noise. Thermal noise is an issue in a cluster of rack mounted 1U units due to the difficulty of cooling such tightly spaced units generating so much heat in so small a space. It is not an issue in a cluster of DESKTOP machines utilizing a Liebert system with way more cooling capacity than is needed.

    2. Even if somehow a non-thermal bit error occurs, each node has 4GB RAM. The probability of it being in an OS or application critical (especially given the converging nature of many long running calculations) piece of RAM as opposed to an empty piece of RAM is small.

    How many of you are reading this from a desktop without ECC RAM that has an obnoxiously huge uptime? ECC is a non-issue in a well-cooled cluster of desktop cased machines.
    • Re:ECC FUD (Score:1, Interesting)

      by Anonymous Coward
      The probability of it being in an OS or application critical (especially given the converging nature of many long running calculations) piece of RAM as opposed to an empty piece of RAM is small.

      Errr, what is the point of putting 4+ GB into your cluster nodes if you're not going to use it? This isn't a SETI@home cluster. Seems to me that "long running converging apps" tend to have large datasets associated with them. The higher the data density per node the less network bandwidth needed except for "em

    • by dsb (52083)
      Quote
      1. The majority if not all of the bit errors that ECC corrects are caused by thermal noise. Thermal noise is an issue in a cluster of rack mounted 1U units due to the difficulty of cooling such tightly spaced units generating so much heat in so small a space. It is not an issue in a cluster of DESKTOP machines utilizing a Liebert system with way more cooling capacity than is needed. /Quote

      Why is it necessary then to jam 1U units stacked on each other? If you can get the same performance and storage c
    • Re:ECC FUD (Score:4, Interesting)

      by Anonymous Coward on Tuesday September 09, 2003 @01:22AM (#6907313)
      2. Even if somehow a non-thermal bit error occurs, each node has 4GB RAM. The probability of it being in an OS or application critical (especially given the converging nature of many long running calculations) piece of RAM as opposed to an empty piece of RAM is small.

      Think before you post. The failure rate is constant in each memory chip (actually it goes up a bit with higher capacity due to higher density). Unless you set up the memory to be redundant (which the G5 can't do either...) you will experience MORE errors since a good OS tries to use the empty memory for things like file buffers.

      How many of you are reading this from a desktop without ECC RAM that has an obnoxiously huge uptime? ECC is a non-issue in a well-cooled cluster of desktop cased machines.

      Sigh... this is a 2200-cpu *cluster*. Here's a primer on statistics. Assume the probability of a memory error is 0.01% for some time interval (say a week or month). The likelihood of a perfect run is then 99.99% on your single CPU, which is just fine. Running on 2200 CPUs, the probability of not having any errors is 0.9999^2200=0.8, or 20% probability of getting memory-related errors somewhere in the cluster.

      The actual numbers aren't important - it might very well be 0.01% probability for an error per year, but the point is that when you run things in parallel the chance of getting a memory error *somewhere* is suddenly far from negligible.
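
      For anyone who wants to play with the numbers, here is the same calculation as a small Python sketch. The 0.01% per-CPU figure is the assumed value from above, not a measured error rate, and errors are assumed independent across CPUs.

```python
def p_any_error(p_single, units):
    """Probability that at least one of `units` independent units sees an error."""
    return 1.0 - (1.0 - p_single) ** units


if __name__ == "__main__":
    # Assumed per-CPU error probability for one time interval (0.01%).
    p_single = 0.0001
    for units in (1, 100, 2200):
        print(f"{units:5d} CPUs: {p_any_error(p_single, units):.2%}")
```

      Plug in whatever per-CPU probability you believe; the cluster-wide figure grows roughly linearly with the CPU count as long as the per-CPU probability stays small.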

      ECC is a cheap and effective solution that almost eliminates the problem. Incidentally, one of the challenges for IBM with "Blue Gene" is that with their super-high memory density even normal single-bit ECC might not be enough.

      But, what do I know - I've only got a PhD from Stanford and not VT....

    • That attitude towards ECC, and other forms of hardware error detection and correction, has led people into building supercomputers that were expensive disasters, like the ILLIAC IV. What's the point of having a fast supercomputer if you have to run a job two or three times to have some confidence in the results?

      There is nothing worse than having a computer without ECC or parity memory, and trying to detect and diagnose subtle pattern sensitivity memory problems.

      Besides thermal noise, you also have to co

  • neat. (Score:4, Interesting)

    by pb (1020) on Monday September 08, 2003 @02:37PM (#6902408)
    Looks like the costs come out to $23,636 per node, or $4727 per machine. According to the Apple Store, an equivalently specced machine (dual proc G5, 160GB HD, 1GB RAM) comes out to just a little over $3,000. I suppose you might want a display on the management machine in each node, but that won't raise the price that much (say, $3,200 per machine instead). So that leaves ~$1,500 per machine for the networking hardware and whatever other expenses.
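
    The back-of-the-envelope numbers above, as a quick Python sketch. All dollar figures are the ones quoted in this post; the "machines per node" value is only what the ratio of the two prices implies, not a confirmed fact about how the cluster is organized.

```python
COST_PER_NODE = 23_636        # dollars, as quoted above
COST_PER_MACHINE = 4_727      # dollars, as quoted above
RETAIL_WITH_DISPLAY = 3_200   # dollars, rough Apple Store guess from above


def implied_machines_per_node():
    """Ratio of the two quoted prices."""
    return COST_PER_NODE / COST_PER_MACHINE


def leftover_per_machine():
    """What remains per machine for networking and other expenses."""
    return COST_PER_MACHINE - RETAIL_WITH_DISPLAY


if __name__ == "__main__":
    print(f"machines per node (implied): {implied_machines_per_node():.2f}")
    print(f"leftover per machine: ${leftover_per_machine():,}")
```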
    • Re:neat. (Score:1, Informative)

      by Anonymous Coward
      And keep in mind that the RAM on each machine is 4 GB.
      • by pb (1020)
        I wasn't sure how they were using the word 'node' there. That would raise the price to... $5,120.00 per machine! Their consumer prices for RAM must be hugely inflated, seeing as how you could get a 1U dual 2GHz Opteron with 4GB RAM for $4,500.00...
    • ECC memory too.
      And never forget the costs of installing these puppies. Cooling systems, power busses, cable harnesses, UPS, Diesel backups, Air filtering, locks, redundant parts.
      and what about the disk servers....
      • by pb (1020)
        I don't know if the cost of installing them was included in that estimate, but maybe it was.

        As for the disk 'servers', I figured they were just sharing all of the 160GB HDs over the network, seeing as how 160GB x 1100 ~= 176TB (ok, it's more like 172TB, but who's counting...)
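
        Where the 176 vs. 172 comes from, as a small Python sketch. It assumes the drives are labeled in decimal gigabytes (10^9 bytes), which is the usual convention for disk vendors but is an assumption here.

```python
DRIVE_GB = 160     # per-machine disk size, decimal gigabytes assumed
MACHINES = 1100


def total_gb():
    return DRIVE_GB * MACHINES


def decimal_tb():
    """Terabytes at 1000 GB per TB."""
    return total_gb() / 1000


def mixed_tb():
    """Dividing decimal GB by 1024, which gives the ~172 figure."""
    return total_gb() / 1024


def tebibytes():
    """True binary tebibytes (2**40 bytes) from decimal gigabytes."""
    return total_gb() * 10**9 / 2**40


if __name__ == "__main__":
    print(f"decimal TB: {decimal_tb():.1f}")
    print(f"GB / 1024:  {mixed_tb():.1f}")
    print(f"TiB:        {tebibytes():.1f}")
```

        If you convert all the way down to bytes, the binary figure comes out lower still, at around 160 TiB.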
      • No ECC...G5's don't support it.
    • Re:neat. (Score:5, Informative)

      by confused one (671304) on Monday September 08, 2003 @04:39PM (#6903702)
      Read on. They're putting 8GB of RAM in each machine.
      • Actually, Srinidhi Varadarajan (who gave the first portion of the presentation) said that there would only be 4GB of RAM in each machine. Why not 8GB, I don't know.

        -Waldo Jaquith
  • Dude... (Score:5, Funny)

    by yoshi1013 (674815) on Monday September 08, 2003 @02:39PM (#6902433) Homepage
    At this point all I really want to know is what the hell does 1100 G5s look like???

    Certain things are easy to imagine in large quantities, but dude.

    Just....dude....

  • by BortQ (468164) on Monday September 08, 2003 @02:47PM (#6902523) Homepage Journal
    The very last slide states that
    Current facility will be followed with a second in 2006
    It will be very interesting to see if they also use Macs for any followup cluster. If it works out well this could be the start of a Macintosh push into clustered supercomputers.

    • I caught that too. Use of Macs in 2006 no doubt depends on 2 factors: 1) how well the 2003 cluster works out, and 2) how the Mac compares to competitors in 2006. Could be a nice win for Apple, again, if they manage to keep both 1 and 2 competitive. Which remains to be seen, and I'm holding my breath.
      • by eweu (213081) on Monday September 08, 2003 @04:09PM (#6903382)
        I caught that too. Use of Macs in 2006 no doubt depends on 2 factors: 1) how well the 2003 cluster works out, and 2) how the Mac compares to competitors in 2006. Could be a nice win for Apple, again, if they manage to keep both 1 and 2 competitive. Which remains to be seen, and I'm holding my breath.

        I don't know. Holding your breath until 2006 sounds... dangerous.
  • Nice rack! (Score:2, Interesting)

    by Alex Reynolds (102024) *
    If they do not fit into a standard rack enclosure, I would be curious to learn what customization was required to rack the G5s.

    (Especially seeing as a G5 Xserve will probably be at least several months away -- at least until most of the desktop orders can be filled.)

    -Alex
  • by mTor (18585) on Monday September 08, 2003 @03:42PM (#6903101)
    Could someone please shed some light on this:
    Why so secret? Project started back in February; secret with Dell because of the pricing issues; dealt with vendors individually because bidding wars do not drive the prices down in this case.
    Why exactly is that? Is there a collusion between the vendors since there's so few of them? Does anyone have any experience with this sector?
    • Probably due in small part to the G5 not being public at the time.
      • by mTor (18585)
        I was actually referring to the last sentence:

        "dealt with vendors individually because bidding wars do not drive the prices down in this case."

        I don't think they'd even dealt with Apple until Apple's G5 announcement, but they did deal with other vendors. I'm interested in why VT dealt with all of them individually and why prices do not come down when you deal with them in this way. This is why I was alluding to collusion.
        • Because when dealing with an open bidding process, often times you end up with Guy #2 coming in with a bid just SLIGHTLY under Guy #1, and so on, until you're left at the end with a price that, while cheaper than Guy #1's quote - is still fscking expensive.

          Silent bidding with all the potential vendors knowing that you are getting bids from other vendors means they don't just fudge numbers to come in lower than the other guy.

          And before you start - 3 fixed rounds of bidding just means round after round of f
  • by dpbsmith (263124) on Monday September 08, 2003 @04:00PM (#6903289) Homepage
    I believe you can get a VT [utk.edu] for well under $1000, and I've even heard that some of them now support advanced "sixel" graphics.

    And they scroll MUCH more smoothly than OS X.
  • Sorry, I couldn't help myself. Really I am. Go ahead and mod this to the deep bowels of /.. I'm so sorry I did this.
  • I want to know how VT was able to do its cost analysis so fast.
    From what I've heard, VT ordered the G5s the day they came out, or shortly after. But if one were to perform a cost vs. performance analysis, they would need background data. Also, they should have been hesitant to accept Apple's specs on the machine, and hoped for some real world test, or maybe some in-house testing of a few machines.
    I find it hard to believe that VT was able to truly compare the G5 to competitor products, without prior data of the ma
    • Re:Cost Analysis (Score:5, Insightful)

      by 2nd Post! (213333) <gundbear&pacbell,net> on Monday September 08, 2003 @11:55PM (#6906872) Homepage
      I bet at the time of initial consideration of vendors, there were no competitive Opteron or Itanium solutions (none with chassis, the slides say), and I am also willing to bet that Apple had at least a hardware prototype they could demonstrate, at least a motherboard + dual CPU setup, even if the chassis was incomplete and not all the major subsystems were 100%

      Just enough to demonstrate that Apple *would* have a solution, and enough that VT could narrow down the decision to a possible, pending the actual production and purchase of a single machine... then, the contract being 99% complete, they just had to sign a couple papers and purchase, overnight, 1,100 dual G5s.

      On the flip side I bet they had a similar contract in the wings with other vendors, all pending on 'simple' bottlenecks.
  • And someone decides I'm a guy. Thanks. Not that any of that matters of course... as per the cluster, they've been assembling at an amazing pace - such that the shifts I volunteered for have all been cancelled as they will not be needed after all. They even cabled the sucker without me... sadness! -Myuuchan

