Virginia Tech Supercomputer Up To 12.25 Teraflops 215

gonknet writes "According to CNET News and various other news outlets, the 1150-node Hokie supercomputer, rebuilt with new 2.3 GHz Xserves, now runs at 12.25 teraflops. The machine, the fastest computer owned by an academic institution, should still be in the top 5 when the new rankings come out in November."
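(A back-of-the-envelope check, assuming the usual dual-processor Xserve G5 nodes and that each 2.3 GHz PowerPC 970 can retire 4 double-precision flops per cycle: 1,150 nodes × 2 processors × 2.3 GHz × 4 flops/cycle is roughly 21 teraflops of theoretical peak, so the 12.25-teraflop Linpack figure works out to just under 60% efficiency.)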

Comments Filter:
  • by millahtime ( 710421 ) on Tuesday October 26, 2004 @07:14AM (#10629568) Homepage Journal
    Currently they aren't doing anything with them except getting them up and running. Status is listed at...
    Assembly - Completed!
    System Stabilization - In Progress
    Benchmarking - In Progress

    When up and running, the system will probably do some high-end scientific calculations.
  • by ehmdjii ( 622451 ) on Tuesday October 26, 2004 @07:20AM (#10629590) Homepage Journal
    This is the official homepage of the listing:

    http://www.top500.org/
  • Re:2.3GHz? (Score:5, Informative)

    by mmkkbb ( 816035 ) on Tuesday October 26, 2004 @07:21AM (#10629593) Homepage Journal
    They were sold off by MacMall at a slight discount around 6 months ago, along with a certificate of authenticity and a "Property of Virginia Tech" sticker.
  • by joib ( 70841 ) on Tuesday October 26, 2004 @07:36AM (#10629638)

    It would be interesting to know exactly what these machines do. Maybe they would even be able to share some code so that people can fiddle around with optimizing it


    I don't know about the VT cluster specifically, but here's a couple of typical supercomputer applications that happen to be open source:

    ABINIT [abinit.org], a DFT code.

    CP2K [berlios.de], another DFT code, focused more on Car-Parrinello MD.

    Gromacs [gromacs.org], a molecular dynamics program.


    (should be fun)


    Well, if optimizing 200,000-line Fortran programs parallelized using MPI sounds like fun to you, jump right in! ;-)

    Note: the above applies to ABINIT and CP2K only; I don't know anything about Gromacs except that it's written in C, not Fortran (though the inner loops are in Fortran for speed).

    Oh, and then there's MM5 [ucar.edu], a weather prediction code which I think is also open source. I don't know anything about it, though.
  • Re:Density (Score:5, Informative)

    by UnknowingFool ( 672806 ) on Tuesday October 26, 2004 @07:36AM (#10629639)
    Not necessarily. Processing power doesn't really scale linearly like that. Adding 4 times as many processors doesn't mean the speed will increase 4x.

    First, as they try to increase the speed of the system, the bottlenecks start becoming more of a factor. Interconnects are one big obstacle. While the new System X may use the latest and greatest interconnects between the nodes, they still run at a fraction of the speed of the processors themselves.

    Also, the computing problems they are trying to solve may not scale with more processors either. For example, clusters like this can be used to predict and simulate weather. To do so, the target area (Europe, for example) is divided into small parts called cells. Each node takes a cell and handles the computations for that cell.

    In this case, adding more processors does not necessarily mean that each cell is processed faster. Getting 4 processors to do one task may hurt performance, as they may interfere with each other. More likely, the cell is further subdivided into 4 smaller cells, increasing the detail of the information rather than the speed. So adding 4x the processors increases the data 4x, but it doesn't mean the data is solved any faster.
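    To make the cell idea concrete, here is a minimal MPI sketch in C (mine, not the parent poster's; the grid size and per-cell work are made up). Each rank owns a block of cells, so adding ranks shrinks each block and buys more detail rather than making any single cell finish sooner:

    /* Hypothetical sketch: split a weather grid of cells across MPI ranks. */
    #include <mpi.h>
    #include <stdio.h>

    #define TOTAL_CELLS 1024            /* made-up grid size */

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Divide the cells evenly; the last rank picks up the remainder. */
        int per_rank = TOTAL_CELLS / size;
        int first = rank * per_rank;
        int last  = (rank == size - 1) ? TOTAL_CELLS : first + per_rank;

        /* Stand-in for the real per-cell physics. */
        double local_sum = 0.0;
        for (int cell = first; cell < last; cell++)
            local_sum += cell * 0.001;

        /* Combine the per-rank results on rank 0. */
        double global_sum = 0.0;
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("combined result over %d ranks: %f\n", size, global_sum);

        MPI_Finalize();
        return 0;
    }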

  • Re:Wow! (Score:1, Informative)

    by Knx ( 743893 ) on Tuesday October 26, 2004 @07:45AM (#10629661) Homepage
    Correct me if I'm wrong, but it actually looks an awful lot like a Beowulf cluster by nature.

    Oh, and btw: here [vt.edu] are some pictures.
  • Re:Density (Score:4, Informative)

    by luvirini ( 753157 ) on Tuesday October 26, 2004 @07:55AM (#10629691)
    Indeed, breaking up computational tasks into smaller pieces that can be processed by these architectures is one of the biggest challenges in high-end computing.

    Many processes are indeed easy to divide into parts. Take ray tracing, for example: you can have one processor handle each ray if you want, getting huge benefits compared to single-processor designs. But many tasks are such that the normal way of calculating them requires you to know the previous result. Trying to break up these tasks is one of the focuses of research around supercomputing.
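    As a rough illustration of the two cases (my own sketch, not the parent's; OpenMP is just one convenient way to express the parallel loop): the ray-style loop has independent iterations, while the second loop carries a dependency from one step to the next and can't be split up the same way:

    /* Hypothetical sketch: independent iterations vs. a loop-carried dependency.
       Compile with e.g. gcc -fopenmp sketch.c */
    #include <stdio.h>

    #define N 1000000

    static double shade[N];     /* one result per "ray"               */
    static double series[N];    /* each term depends on the previous  */

    int main(void) {
        /* Ray-tracing style: iteration i never reads another iteration's
           result, so the iterations can be handed to different processors. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            shade[i] = (i % 255) / 255.0;       /* stand-in for tracing ray i */

        /* Recurrence style: step i needs step i-1, so this loop cannot be
           split across processors without rethinking the algorithm. */
        series[0] = 1.0;
        for (int i = 1; i < N; i++)
            series[i] = 0.5 * series[i - 1] + 1.0;

        printf("%f %f\n", shade[N - 1], series[N - 1]);
        return 0;
    }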

  • by erick99 ( 743982 ) <homerun@gmail.com> on Tuesday October 26, 2004 @08:00AM (#10629702)
    If power were equated only with speed, you would be correct. However, as other posters have pointed out, there are several reasons why a Cray is a more powerful system besides sheer speed.
  • by TAGmclaren ( 820485 ) on Tuesday October 26, 2004 @08:12AM (#10629741)
    Currently they aren't doing anything with them except getting them up and running


    Their site is out of date then: http://www.wired.com/news/mac/0,2125,65476,00.html?tw=newsletter_topstories_html [wired.com]
    Now that the upgrade is complete, System X is being used for scientific research. Varadarajan said Virginia Tech researchers and several outside groups are using it for research into weather and molecular modeling. Typically, System X runs several projects simultaneously, each tying up 400 to 500 processors.


    If there's a Wired article and a CNET article, go with the Wired article every time. It's written by people who love tech.
  • Re:hrm (Score:2, Informative)

    by TimothyTimothyTimoth ( 805771 ) on Tuesday October 26, 2004 @08:51AM (#10629901)
    If you are thinking along these lines you might already be aware of this link, but if not, might I recommend:

    http://singinst.org/index.html [singinst.org]

  • Re:Speed at top (Score:3, Informative)

    by Anonymous Coward on Tuesday October 26, 2004 @09:05AM (#10629986)
    The system isn't just in the top 5 (or at least top 10), but it's the cheapest by a factor of at least 2.

    The $5.8M number is how much the computers (and maybe racks) cost, not the whole system. AFAICT, that number leaves out US$2-3M worth of InfiniBand hardware that somebody (probably Apple) must've "donated" so it wouldn't show up as part of the purchase price. IB gear costs ~US$2k/node in bulk, on top of the cost of the node itself. It's highly unlikely someone else could build this exact configuration for US$5.8M without serious underwriting or hardware donations. Heck, I can't even get the Apple online store to give me a price on a G5 Xserve that includes an education discount, and I work for a fairly large public university.
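    (For what it's worth, the arithmetic roughly checks out: ~1,150 nodes × ~US$2,000 of IB gear per node comes to about US$2.3M, squarely in that US$2-3M range.)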

  • There were AI CPUs (Score:2, Informative)

    by scattol ( 577179 ) on Tuesday October 26, 2004 @09:10AM (#10630017)
    For a while there were CPUs specifically designed to run LISP [andromeda.com], aka AI. Symbolics was one of the better-known ones.

    It ended in bankruptcy. My vague understanding is that designing dedicated LISP processors was hard and slow, and with few resources they could not keep up. Essentially, the Symbolics computers ran LISP pretty quickly given the MHz, but Sun and Intel kept pushing clock speeds up faster than Symbolics could match. In the end there was no speed advantage to a dedicated LISP machine, just an increase in price. Economics might change eventually. Who knows.
  • Re:hrm (Score:3, Informative)

    by Glock27 ( 446276 ) on Tuesday October 26, 2004 @09:38AM (#10630211)
    By the way, IBM BlueGene/L is going to produce 360 teraflops by the end of 2004, so if the report of Moravec's estimate is correct, and he is correct, that AI Overlord welcome could come pretty soon.

    If you read the article (I know, I know) you'll find that the peak performance of this Cray system is 144 teraflops with 30,000 processors.

  • Simulations (Score:5, Informative)

    by Ian_Bailey ( 469273 ) on Tuesday October 26, 2004 @09:51AM (#10630300) Homepage Journal
    The vast majority of clusters are for simulating very complex systems that require lots and lots of calculations.

    You can get a few hints by looking just at their names.

    The number one machine, the "Earth Simulator Centre" [top500.org], is fairly self-explanatory; going to their website [jamstec.go.jp] shows they create a variety of models for things such as weather, tectonic plate movement, etc.

    The number 3 LANL supercomputer [com.com] "is a key part of DOE's plan to simulate nuclear weapons tests in the absence of actual explosions. The more powerful computers are designed to model explosions in three dimensions, a far more complex task than the two-dimensional models used in weapons design years ago." I imagine that most US government simulations would be doing something similar.
  • Re:hrm (Score:2, Informative)

    by fitten ( 521191 ) on Tuesday October 26, 2004 @10:10AM (#10630450)
    Depends on the site and their main focus. The Earth Simulator in Japan (#1 on the list), for example, is used to simulate and predict weather. Various machines at some of the national labs in the USA are used to simulate nuclear events. Some other machines in the biotech industry are used to do protein folding and things like attempting to simulate a human cell. Financial institutions use them to try to predict the economy, the stock market, and the like. Automobile manufacturers use them to simulate crash tests. Aeronautics firms use them to simulate new vehicles.

    In the past, there has been talk about companies that exist solely to supply compute power. Such a company would have a warehouse full of computers, control them through schedulers (batch, etc.), and sell time on the machines to anyone who wanted it. So far, I don't think anyone has been successful with the idea.
  • Re:hrm (Score:5, Informative)

    by autophile ( 640621 ) on Tuesday October 26, 2004 @10:14AM (#10630483)
    According to Wired [wired.com]...
    Now that the upgrade is complete, System X is being used for scientific research. Varadarajan said Virginia Tech researchers and several outside groups are using it for research into weather and molecular modeling. Typically, System X runs several projects simultaneously, each tying up 400 to 500 processors.

    "At the end of the day, the goal is good science," he said. "We're just building the tools. The top 500 is nice, but the goal is science."

    --Rob

  • by fitten ( 521191 ) on Tuesday October 26, 2004 @10:42AM (#10630746)
    The reason is this: more and more of these 'supercomputer' entries appear to be many machines hooked up together, possibly doing a distributed calculation.

    However, would projects such as SETI, GRID, and UD qualify with their many thousands of computers all hooked up and performing a distributed calculation?

    If not, then what about the WETA/Pixar/ILM/Digital Domain/Blur/you-name-it render farms? Any one machine on those render farms could be put to use for only a single purpose: to render a movie sequence. Any one machine could be working on a single frame of that sequence. Does that count?


    Yes, all of these belong to a class of supercomputer applications called "embarrassingly parallel". These types of algorithms are (by far) the easiest to implement, since their calculations don't depend on anything being calculated by other nodes in the system. They are characterized by minimal or no communication among nodes (often just the communication to hand the node the data on which it is to compute, and one communication at the end to submit the results back to the central node) and lots of compute resources working on the local data. So they *are* supercomputing of a type, just one that isn't that interesting from a computer science point of view.

    There are many problems that require much more communication between the nodes. The calculations performed by one node are dependent upon the results generated by other nodes in the system. Some known solutions require so much communication/synchronization between nodes that it isn't practical to parallelize the problem, and a serial solution is faster. There has been lots of work on various problems to create algorithms that are more "parallel friendly" in order to speed up the solutions. There are some problems that carry prizes for anyone who can invent a way to make them more parallel.

    Anyway, there are many "grains" of parallel computing. The "granularity" of a problem is the ratio of communication to computation in the solution algorithm. Coarse-grained problems have low communication-to-computation requirements; embarrassingly parallel problems are an example. The amount of communication required by a distributed node running SETI@home is very tiny compared to the amount of computation required for each WU. The same can be said for a render farm: each node receives the frame to be rendered, goes off for a while, renders that frame of the movie, and hands the result back to the coordinator, which gives the node the next frame to render. Fine-grained problems are the other side of the spectrum and require more communication for each computation. Solutions to systems of equations are an example: a problem that has to communicate its results to each of its "neighbors", and receive the results from each of its neighbors, on every iteration so that the next iteration can be calculated is more finely grained.

    Beowulf clusters, with their slow interconnects, are good at solving coarse-grained problems. Other systems, like the new Cray with its high-bandwidth, low-latency interconnect, work better for fine-grained problems. On a fine-grained problem, a machine with "slow" processors but a fast interconnect may outperform a Beowulf-type cluster that has the fastest commodity CPUs available but a slow interconnect.

    By the way, LINPACK (the benchmark used for the Top500) is a rather coarse-grained problem. That's why Beowulf-style clusters appear on the list. There are plenty of other benchmarks that could be used where these clusters would have a hard time.
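    For anyone curious what "fine grained" looks like in code, here is a minimal MPI sketch in C (mine, not the parent poster's; the 1-D array size, update rule, and step count are made up). Every rank must swap boundary values with its neighbors on every iteration, which is exactly the communication pattern a SETI@home-style job avoids:

    /* Hypothetical sketch: a 1-D iterative update with per-iteration halo exchange. */
    #include <mpi.h>
    #include <stdio.h>

    #define LOCAL_N 64      /* cells owned by each rank (made up) */
    #define STEPS   100

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* u[0] and u[LOCAL_N+1] are ghost cells holding neighbors' boundary values. */
        double u[LOCAL_N + 2], next[LOCAL_N + 2];
        for (int i = 0; i < LOCAL_N + 2; i++)
            u[i] = (rank == 0 && i == 1) ? 100.0 : 0.0;

        int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
        int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

        for (int step = 0; step < STEPS; step++) {
            /* Communication happens on every iteration: that is what makes
               the problem fine grained. A coarse-grained job would compute
               for a long time and talk to a central node only once. */
            MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left, 0,
                         &u[LOCAL_N + 1], 1, MPI_DOUBLE, right, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(&u[LOCAL_N], 1, MPI_DOUBLE, right, 1,
                         &u[0], 1, MPI_DOUBLE, left, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);

            for (int i = 1; i <= LOCAL_N; i++)      /* simple averaging update */
                next[i] = 0.5 * (u[i - 1] + u[i + 1]);
            for (int i = 1; i <= LOCAL_N; i++)
                u[i] = next[i];
        }

        if (rank == 0)
            printf("u[1] after %d steps: %f\n", STEPS, u[1]);
        MPI_Finalize();
        return 0;
    }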
  • by Anonymous Coward on Tuesday October 26, 2004 @11:00AM (#10630905)
    If you look at the latest list, VT is already out of the top 5. They are in 7th. The new list is here:

    http://www.netlib.org/benchmark/performance.pdf (Page 54)

    IBM is first with BlueGene (PowerPC 440), but is also 3rd with their 3564 CPU PowerPC 970 2.2 GHz JS20 system.

    1) 36010 - BlueGene/L DD2 - 16384 0.7 GHz PowerPC 440
    2) 35860 - Earth Simulator - 5120 NEC processors
    3) 20530 - IBM eServer BladeCenter JS20 - 3564 2.2 GHz PowerPC 970 G5
    4) 19940 - QsNetII Intel Tiger4 - 4096 Itanium 2 1.4 GHz
    5) 19564 - NASA Project Columbia SGI Altix 3000 - 4032 Itanium 2 1.5 GHz
    6) 13880 - ASCI Q AlphaServer EV-68 - 8160 Alpha 1.25 GHz
    7) 12250 - Virginia Tech Apple Xserve - 2200 2.3 GHz PowerPC 970 G5
    8) 11680 - BlueGene/L DD1 - 8192 0.5 GHz PowerPC 440
    9) 10310 - IBM eServer pSeries 655 - 2880 1.7 GHz POWER4+
    10) 9819 - Dell PowerEdge 1750 - 2500 3.06 GHz Xeon

    BTW, I wouldn't be surprised to see the 2.3 GHz Xserves announced by Apple in January for general consumption.
