Inside the PowerPC 970
daveschroeder writes "Jon "Hannibal" Stokes has posted a long-awaited, very detailed analysis of the IBM PowerPC 970 at Ars Technica. Notable quote: 'The 970 was made for Apple'."
DUPE (Score:4, Informative)
Re:DUPE (Score:1, Offtopic)
It's a bug. Patch is available (Score:2)
Link here. [slashdot.org] In your browser, find "CmdrTaco", click on the checkbox next to it, and then go to the bottom and click "submit" (rough translation from Swahili: submit = "apply patch").
[JUUUUST kidding, don't do this or you won't see any more of CmdrTaco's articles.]
Actually, a real suggestion (Score:2)
Nonsense (Score:3, Interesting)
Bullshit. When I worked for the University Daily Paper we had no problem avoiding duplicate stories all over the paper... And we ran FAR MORE THAN 30 STORIES A DAY.
In my example it was a bunch of drunk/high/rushing-out-to-get-laid coward students. Can't professionals who are being paid to do their damn job do AT LEAST as well as the wasted college kids?
deja vu (Score:5, Funny)
Re:deja vu (Score:1, Offtopic)
Dupe? (Score:5, Funny)
One long read... (Score:1, Interesting)
Re:One long read... (Score:1, Interesting)
Re:One long read... (Score:1)
btw people, how is my 1st post a troll?
In the market for a 64-bit workstation? (Score:5, Insightful)
Sun: Nice hardware, very expensive, CDE.
AMD: Commodity hardware, cheap, WinXP.
HP: Intel hardware, very expensive, CDE or WinXP.
I think I know what I'd buy.
Of course, the Athlon64/Opteron would get quite a bit of consideration due to my hobbies.
But I think it'd end up being the Mac.
Re:In the market for a 64-bit workstation? (Score:4, Insightful)
Re:In the market for a 64-bit workstation? (Score:3, Informative)
http://wwws.sun.com/software/star/gnome/ [sun.com]. Also, you take a shot at CDE rather than Solaris? Wow.
Earth-to-poster: Linux runs in 64-bit [com.com]. Thank you.
Re:In the market for a 64-bit workstation? (Score:2)
Well the original post was fair by saying "Nice hardware," and, really, CDE is still the default offering in Solaris 9. Sun's GNOME 2 is very promising and looks great, but they are still refining it until it is worthy of being the default desktop.
Sun's workstations really are great machines, in spite of what SPEC zealots say. For example, few people mention that typical PC graphics cards look like crap relative to even the elderly Creator 3D (trans
Re:In the market for a 64-bit workstation? (Score:5, Interesting)
It's a dupe (Score:1, Redundant)
Re:It's a dupe (Score:2)
On-call doesn't mean "I'm at The Matrix Reloaded and can check e-mail when I get back." And if it did, I certainly wouldn't be here at work right now.
Re:It's a dupe (Score:2)
There has been a change ... (Score:4, Funny)
Is this the G5? (Score:2)
Re:Is this the G5? (Score:4, Informative)
Mot actually had a G5 on the roadmap. They apparently got all the way to samples, but then ditched the effort. There never was a competition per se wrt the G5 name. There was a bit of friction over AltiVec, as IBM wanted to focus on clock speed and didn't think AV was worth the complexity (and hence why Mot came out with the G4 while IBM stuck with the G3). Motorola hasn't been serious about the mainstream cpu market for a while as they've been losing money on it. They'd rather focus on things like embedded proccies and cell phones (and related chips).
I don't know which came first: Mot ditching the G5, so Apple pleaded with IBM to come out with the 970, or Mot getting a whiff of the 970 and seeing a way out of doing the G5. Perhaps others more "in the know" can chime in?
Re:Is this the G5? (Score:2)
Re:Is this the G5? (Score:2)
Wow, ditching a whole CPU has got to hurt. At least Motorola, however, knows how to steer around the iceberg to focus on their core business.
Re:Is this the G5? (Score:4, Informative)
Re:Is this the G5? (Score:2, Insightful)
Re:Is this the G5? (Score:2)
Re:Is this the G5? (Score:2)
The Motorola G5 is a much smaller and lower-power chip than it was originally going to be. As a result, Apple obviously asked IBM to make a next-generation PowerPC to be the heart of their G5 computers.
The G designations are just code-names, or marketing names. The Power Mac G4 has had a number of different CPU
Re:Is this the G5? (Score:2)
Inaccuracy, Part I (Score:4, Informative)
Apple'd be putting DDR400 on the G4 right now if they could. None of this (well, except the decision to go Moto) was their fault.
My real problem with the current G4e situation, aside from the 167 SDR FSB, is the fact that it's a shared bus topology, which is just ridiculous. To my knowledge, there's nothing stopping Apple from putting out a chipset that gives each G4e a dedicated FSB (even if it's still 167MHz SDR) to the chipset.
As far as the low MHz and SDR situation, I've also never been totally convinced that Apple wasn't partially to blame for this either, unless they just have zero clout with Moto SPS.
Re:Inaccuracy, Part I (Score:2, Interesting)
Re:Inaccuracy, Part I (Score:1)
Isn't it obvious... (Score:5, Funny)
Tierce
Dual FPUs! (Score:4, Insightful)
Yeah, yeah, they are hog-tied because you can't easily re-compile the entire Windows platform to use new instruction sets. Linux users, of course, don't have this problem (muhahahah).
Did anyone else catch the bit on the twin FPUs? I'm just imagining what this thing is going to do with vector operations and frequency transforms.
For most of you non-engineers:
-Most 3D vector operations are affine transformations. Using a 4x4 array of floating point numbers you can translate, rotate, and scale. Works beautifully, but it's a lot of calculations.
-The Fast Fourier Transform (FFT) is used a lot in signal processing. It's a floating point monster.
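To make the 4x4 point concrete, here is a minimal sketch in plain Python. The helper names (`translate`, `scale`, `rotate_z`, `mat_mul`, `apply`) are made up for illustration and don't come from any particular graphics library. It composes a scale, a rotation, and a translation into one matrix and pushes a point through it.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translate(tx, ty, tz):
    """Affine matrix that moves a point by (tx, ty, tz)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def scale(sx, sy, sz):
    """Affine matrix that scales each axis independently."""
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotate_z(theta):
    """Affine matrix that rotates by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(m, point):
    """Transform a 3D point by treating it as the column vector (x, y, z, 1)."""
    v = (point[0], point[1], point[2], 1.0)
    out = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return (out[0], out[1], out[2])

# The rightmost matrix is applied first: scale by 2, rotate 90 degrees
# about z, then translate by (10, 0, 0).
m = mat_mul(translate(10.0, 0.0, 0.0),
            mat_mul(rotate_z(math.pi / 2), scale(2.0, 2.0, 2.0)))
p = apply(m, (1.0, 0.0, 0.0))  # approximately (10.0, 2.0, 0.0)
```

Written out naively like this, every point costs 16 multiplies and 12 adds, all floating point, which is where the "lot of calculations" comes from once you have thousands of vertices per frame.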
Turning the FFT into an integer monster. (Score:3, Interesting)
That, where phi is any angle. That being the case, it seems to me that you could pick your values phi to co
integer FFTs aren't uncommon (Score:4, Insightful)
Some really old C code doing something along these lines is available here [www.jjj.de].
Re:Dual FPUs! (Score:3, Funny)
Yeah, but they only work when the aftermarket mini-turbochargers are attached and a fiberglass spoiler is added to the heatsink. The resulting turbo lag adds latency that really defeats any advantage of the second FPU. It's really too bad, because the 970 could have pushed Photoshop easily into the 12s.
Re:Dual FPUs! (Score:2)
drop AltiVec (Score:2, Insightful)
This is probably true and rather unfortunate. AltiVec is important for Apple marketing because it lets them claim impressive performance figures without actually needing to push the state of the art in terms of processor design further th
nope. (Score:4, Interesting)
You're pretty thankful for your Altivec then...
I saw such an insane improvement in Reaktor when it got Altivec enhanced...
Re:nope. (Score:3, Insightful)
Re:nope. (Score:2)
If you're thinking of an Intel/AMD SIMD that gives you a 1% improvement then it would make sense to drop it from a next-gen chip that is 100% faster.
Altivec often gives double or more performance, though. Since the CPU-intensive Mac apps all use it, it makes sense to keep it in there and just make it faster, too.
Altivec is also exactly what you'd want if you are doing DSP and encryption and encoding MPEG. We do a lot of that on the Mac. All you have
Re:nope. (Score:1, Flamebait)
But there's like three people in the world who actually use altivec. Hardly optimizing for the common case... (that said, I do run mplayer which I think relies on similar SIMD instructions on the x86 to provide realtime img post-processing).
Wouldn't it be cheaper for all concerned for apple to seriously subsidize a PCI photoshop accelerator (basically a card full of DSPs and RAM with a wide bus)?
I'm thinking back to the days when I would read BYTE and drool over this black (really cool, in those da
Re:nope. (Score:3, Insightful)
More than 3 people have ripped music in iTunes. Then there's the tremendous acceleration it provides for encoding DVDs, Final Cut Pro's real-time effects, BLAST, and plenty more. It's not even close to just Photoshop.
yep (Score:3, Insightful)
Re:nope. (Score:2, Informative)
Re:nope. (Score:2, Interesting)
They actually made two: the Quadra 660AV and the Quadra 840AV (there was also a Centris line without the fancy stuff). The 660 used a 25MHz 68040, and the 840 used a 40MHz 68040, and had a separate DSP that you had to write specifically for.
Apple's thought was kinda ahead of the curve at the time, in that they were
Re:nope. (Score:2, Insightful)
But if you spend the same $3800 on x86 hardware, you get a small compute cluster that runs a lot of software faster and without AltiVec optimizations. For most scientific applications, as well as most video and audio applications, that's probably a better deal in terms of bang-for-the-buck, but, admittedly, it's probably
Re:nope. (Score:2)
Clustering multiple machines together is still problematic for stuff like this.
It makes writing music with software instruments/fx bloody annoying as well.
Scientific stuff is a whole nother ballpark, just pointing out that I know I'm not the only musician who uses the Altivec stuff a hell of a lot...
Re:nope. (Score:4, Insightful)
Coding for a cluster introduces all kinds of communication and synchronisation headaches, especially since it takes such a long time to communicate between nodes (1ms is a very long time in terms of a CPU).
Re:nope. (Score:2, Interesting)
They are just different. Clustering allows program-level parallelism, which gives you nearly linear scaling for throughput for arbitrary programs with no programming effort. SIMD and vector processors are very specialized and require a lot of effort to use well, and that effort is usually only worth it if you need to lower processing latencies.
(Note, incidentally, that one of the most important SIMD machines, the Connection Machine, was built as a
troll? (Score:3, Interesting)
I do agree with you that clustering could be far more useful than it currently is, but as you say, anything that requires low latency is kind of problematic...
As far as clustering goes, you know you're able to put together a PC processing monster and use VST System Link [steinberg.net] ?
Been considering this to add to my TiBook...
Re:troll? (Score:2)
On the Mac side, Apple's Shake sends jobs out over the network using Rendezvous to find more compute power. Logic and Final Cut Pro will start doing this soon. This kind of stuff is easy to do right on Mac OS X.
Re:troll? (Score:2)
Re:nope. (Score:2)
You benefit especially in PowerBooks where you do these massive computations on a low-power CPU.
Altivec apps (off the top of my head): Pro Tools, Logic, Cubase, Performer, iTunes, Final Cut, Avid, iMovie, iDVD, iPhoto, QuickTime, Mac OS X (Quartz, CoreAudio, Disk Copy, more), Photoshop, Illustrator, Dreamweaver, Fireworks, Flash, FreeHand.
You have to run an Intel chip at 70 watts to get it to brute force these computations as fast as a 15 watt chip with Altivec. Get
Re:drop AltiVec (Score:2, Insightful)
(A) SIMD is really bloody fast if you use it. And Apple does. Heavily. Would you want to rewrite OSX, significantly slowing it down, to create an altivecless version?
(B) Apple has gone through two major transitions: 68k->PPC, Mac OS Kernel->BSD kernel. Another rewrite requiring transition is possible, but over something this small? That seems unlikely. And I bet users would be THRILLED when some apps just stop working.
(C) The other option is to just crack those 128 bit instructions dow
Re:drop AltiVec (Score:2, Insightful)
Er ... OS X doesn't need to be rewritten. It runs on Altivec-less G3s, and probably
Re:drop AltiVec (Score:5, Interesting)
ATLAS [sourceforge.net] is a BLAS implementation that is tuned for each system that it runs on. The people at Mathworks use this as the underlying BLAS system in Matlab. Mathematica, Maple, etc. [sourceforge.net] use this as well. There is even a G4/AltiVec optimized version available here [sourceforge.net]. This is the whole point of layered software.
Re:drop AltiVec (Score:3, Insightful)
IIRC, it will partially evaluate the code against the known size of the input, and I think also do some data-driven special-casing.
Basically, it beats the pants off standard-library FFTs.
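For anyone wondering what "evaluating against the known size of the input" might look like, here is a rough Python sketch. It is not how FFTW (or any real library) is implemented, and `make_fft_plan` is a made-up name. It only shows the general idea of doing all the size-dependent work once (the bit-reversal order and the twiddle factors for a power-of-two length) and returning a function specialized for that length.

```python
import cmath

def make_fft_plan(n):
    """Build a radix-2 FFT specialized for length n (n must be a power of two).

    Everything that depends only on n is computed here, once, so the
    returned function does nothing but butterflies.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1

    # Bit-reversal permutation for indices 0..n-1.
    rev = [0] * n
    for i in range(1, n):
        rev[i] = (rev[i >> 1] >> 1) | ((i & 1) << (bits - 1))

    # Twiddle factors exp(-2*pi*i*k/n) for k in 0..n/2-1.
    twiddles = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

    def fft(samples):
        if len(samples) != n:
            raise ValueError("this plan only handles length %d" % n)
        # Reorder the input, then combine in stages of size 2, 4, ..., n.
        data = [complex(samples[rev[i]]) for i in range(n)]
        size = 2
        while size <= n:
            half = size // 2
            step = n // size  # stride into the precomputed twiddle table
            for start in range(0, n, size):
                for k in range(half):
                    lo = data[start + k]
                    hi = data[start + k + half] * twiddles[k * step]
                    data[start + k] = lo + hi
                    data[start + k + half] = lo - hi
            size *= 2
        return data

    return fft

# Plan once for length 8, then reuse the plan for every 8-sample input.
fft8 = make_fft_plan(8)
spectrum = fft8([1.0] * 8)  # a constant signal: all energy lands in bin 0
```

The payoff only shows up when you transform many inputs of the same length, since the setup cost is paid once per plan instead of once per call.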
While I'm at it, responding to grandparent:
Would you care to elaborate? I mean, if you're not writing against known ultra-optimized libraries, what business do you have expecting your software to run fast? Th
FFTW speed claims (Score:2)
Re:drop AltiVec (Score:2)
You simply do not necessarily need to work directly with BLAS (and LAPACK for that matter) to enjoy the fruits of ex
Re:drop AltiVec (Score:5, Interesting)
My iTunes MP3 ripping speed nearly tripled when I went from a 466 MHz G3 to a 400 MHz G4, due to iTunes being optimized for AltiVec.
Some Photoshop actions and filters see up to 800% improvements.
Running iMovie exports on a 600 MHz G3 iMac takes 200-300% longer than on a 400 or 500 MHz G4.
Altivec - Logic Audio; value of Mac platform (Score:2)
Re:drop AltiVec (Score:3, Informative)
Re:drop AltiVec (Score:5, Informative)
No, AltiVec is important for Apple full stop - in the short term to make up for the anemic bus speeds allowed by the G4, and in the longer term because a SIMD unit is now as expected a component of modern desktop CPUs as an FPU is.
And even something like a hand-coded vectorized BLAS library doesn't help because most scientific software still doesn't use such libraries
The only thing you can really be sure about "most" scientific software is that it needs an FPU. Scientists and engineers do a huge variety of simulations, some of which are vectorizable and some of which aren't.
If AltiVec has a weakness in the scientific field, it's the lack of support for double precision. And there's nothing in the instruction set which precludes this, so I wouldn't be surprised to see it appear in some future CPU.
Imagine how much better it would be if Apple could ship systems based on the 970 today, rather than after a few months additional delay due to AltiVec.
If it didn't have AltiVec, it wouldn't be what Apple needs in a desktop CPU - not much point in getting what you don't need a few months early (not like that would happen anyway: this isn't lego: you can't unplug "the AltiVec bits" without any impact on the rest of the design).
And every dollar and watt that is shaved off the AltiVec price makes it a much more viable processor for servers and blades, which would get volume up and prices down.
Except that Apple aren't currently in the blade market at all, and have a fairly small presence in the more general server market. If they can sell a few boxes there, fine, but getting the volume up means targeting consumers - not server farms.
Re:drop AltiVec (Score:5, Interesting)
Don't confuse "new" with "state of the art". The former is just something that hasn't been done before. The latter is something that yields "impressive performance figures". If Altivec is competitive with Intel, then it is state of the art, by definition, even if it's 20 years old. The CPU cache is a decades old concept, yet CPUs with caches are still state of the art.
Imagine how much better it would be if Apple could ship systems based on the 970 today, rather than after a few months additional delay due to AltiVec.
Don't underestimate the cost of software. Your idea is expensive, because it requires software vendors to maintain two different versions of their code. This can lead to buggier or more expensive products, or it can lead to the "abandonment" of the G4 installed base. That could easily be worth the few months for Apple.
Re:drop AltiVec (Score:5, Insightful)
Re:drop AltiVec (Score:4, Informative)
Re:drop AltiVec (Score:2)
Yet you keep saying they have nothing better to do than recompile their code whenever Intel blesses them with a new processor, like SPEC tells them to.
Re:drop AltiVec (Score:2)
I have actually consistently argued against using Intel's Pentium 4 compiler (and Itanium as well), for pretty much the same reasons I think AltiVec is a bad idea.
But, hey, don't let little details like facts get in the way. After all, you are on a crusade, and if the cause is as worthy as Holy Apple, anybody who doesn't agree with you is a mortal enemy,
Re:drop AltiVec (Score:2)
Re:drop AltiVec (Score:2)
And what does that have to do with anything? SPEC scores can be measured with many different compilers, and, obviously, I prefer to use SPEC scores that don't rely on weird compilers like the Intel compiler.
But even if people use Intel compilers for benchmarking, that still tells us a lot. For the comparisons you seem to be obsessed with, the P3/G4, the gcc and Intel scores for the P3 are similar. P4 systems beat G4 systems both in terms of top
Re:drop AltiVec (Score:2)
Sure, if you think that 10% higher scores (ICC 7.0 against gcc 3.2.2) or even 20% (against gcc 2.95) are "similar" - and that includes "evil" SSE support on gcc. And if you happen to use MS C...
Last but not least: even if all you care about is raw, non-SIMD performance - unless you used a 604 based Mac over a Pentium+ PC, why should I (or in fact anybody) listen to you?
Re:drop AltiVec (Score:2)
When an app is updated to use Altivec, you typically see double the performance. So you can run 40 effects in real-time and then you get the Altivec version of your app and you can run 80 effects in real-time.
Photoshop filters are the least of Altivec's uses. Encoding MPEG-2 and MPEG-4, encryption of any kind, Digital Signal Processing
Mac users are doing heavy audio, video, encodin
Imagine a Beowulf clu... (Score:1, Funny)
Eh, nevermind.
Friend's friend. (Score:2)
Ah frig'..
Apple is going along
Up to marketing, not technology (Score:4, Interesting)
For me, the most interesting part of the article is its point that the pricing of the new machines is the real question. According to the author, the chip will make Apple machines technologically competitive. The question is, will Apple price them to gain market share, or continue to sell to a disappearing niche of luxury computer buyers?
Maybe Apple's concentration on developing software, and selling that software (rather than giving it away), along with its new business ventures, such as .Mac and the new iTunes online music store, point to a new business model that can afford to cut the margins on hardware.
If they don't lower the price of their machines -- the top ones, namely -- they will suffer, long-term. I don't think they need to be on par with PCs; I just think they cannot be too much more expensive than the PCs.
Apple's done fine with this pricing for decades (Score:2)
And I agree in theory on your pricing opinion, but it's just that in reality Apple have been pricing their machines in pretty much this way for 20 years, and they have made a very successful business out of it. They've also continuously been pronounced on the verge of death for all those 20 years.
So I don't expect either Apple pricing or their good fortunes to chang
Re:What the heck is 'Altivec' anyway? (Score:1, Informative)
Re:What the heck is 'Altivec' anyway? (Score:1)
for instance?
Re:What the heck is 'Altivec' anyway? (Score:5, Informative)
Re:What the heck is 'Altivec' anyway? (Score:5, Informative)
"The CPU does important stuff."
For all of your "What is AltiVec?" needs, check this out instead:
http://www.motorola.com/SPS/PowerPC/AltiVec/
Re:What the heck is 'Altivec' anyway? (Score:1)
Re:What the heck is 'Altivec' anyway? (Score:3, Funny)
The hue of the sky is determined by a phenomenon known as the "Tyndall Effect", the scattering of light through a colloid by dust or molecules suspended in a transparent medium.
Note that the light scattering that determines what color you see isn't due to dust in the air, as some think, but rather oxygen and nitrogen molecules.
However, all we are, as Bill and Ted once pointed out, is dust in the wind, dude.
</t-i-c>
Re:What the heck is 'Altivec' anyway? (Score:5, Informative)
In a much simplified analogy, it's like lighting 200 candles with a flame thrower instead of one by one with a match.
Re:What the heck is 'Altivec' anyway? (Score:4, Informative)
Well, not really, but you're close. You can't just pass the Altivec unit an array of numbers and tell it to do some operation on them. Altivec (and MMX, etc) simply allows you to process the data in bigger chunks than normal.
Altivec can process 128 bits of data at a time. For example, it can add 16 8-bit integers to another 16 8-bit integers, resulting in yet another vector of 16 8-bit integers with a single instruction, rather than doing them one at a time.
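As a rough model of that 16-lane add, here is a pure-Python sketch. It only imitates the semantics of one 128-bit vector add; the Python loop still visits the lanes one by one, so it says nothing about speed. The function names are invented for illustration, and the two variants assume (as far as I recall) that AltiVec offers both a wrap-around and a saturating flavor of byte add.

```python
LANES = 16        # 128 bits / 8 bits per lane
LANE_MAX = 0xFF   # largest value an unsigned 8-bit lane can hold

def _check(vec):
    """Reject anything that would not fit in a 16 x 8-bit vector register."""
    if len(vec) != LANES or any(not 0 <= x <= LANE_MAX for x in vec):
        raise ValueError("expected 16 unsigned 8-bit values")

def vec_add_u8_modulo(a, b):
    """Lane-wise add where each lane wraps around on overflow (mod 256)."""
    _check(a)
    _check(b)
    return [(x + y) & LANE_MAX for x, y in zip(a, b)]

def vec_add_u8_saturate(a, b):
    """Lane-wise add where each lane clamps at 255 instead of wrapping."""
    _check(a)
    _check(b)
    return [min(x + y, LANE_MAX) for x, y in zip(a, b)]

a = list(range(16))   # 0, 1, ..., 15
b = [250] * 16
wrapped = vec_add_u8_modulo(a, b)    # lanes 6..15 wrap around to 0..9
clamped = vec_add_u8_saturate(a, b)  # lanes 5..15 all end up at 255
```

The point of the hardware version is that the whole list comprehension above is a single instruction; nothing carries from one lane into its neighbor, which is what makes it different from one big 128-bit integer add.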
Re:What the heck is 'Altivec' anyway? (Score:1, Informative)
Re:What the heck is 'Altivec' anyway? (Score:2, Funny)
Re:What the heck is 'Altivec' anyway? (Score:2, Informative)
You've not been looking at the distributed.net results, have you? The Altivec/VMX technology currently used by Moto and soon to be used by IBM is LEAGUES ahead.
Re:What the heck is 'Altivec' anyway? (Score:2)
With RC5-64 that was true. Unfortunately for RC5-72, no one has written an optimized Mac core yet so the PC versions are way faster.
Re:What the heck is 'Altivec' anyway? (Score:2)
Re:What the heck is 'Altivec' anyway? (Score:2)
Re:What the heck is 'Altivec' anyway? (Score:3, Informative)
What ARE you blathering about? Pentium 4 has SSE2, PowerPC has Altivec - here's a clue for you: when people code for x86 SIMD, they choose MMX, SSE and SSE2; they don't choose 3D Now!. When people code for SIMD under the PowerPC ISA, they choose Altivec. Both SSE2 and Altivec are available today, and both are used in "commodity" CPU families. I think you'll find that it's "x87" FPU strength that typically marks out AMD's current CPUs, not their patchy implementati
Re:Another Dupe (Score:1)
But this time in the apple section (Score:2)
If you haven't seen it, it's new to you.
Floggings will continue until moral improves! (Score:3, Funny)
"But Mom, I don't want to go to France!" "Shut up and keep rowing!"
Re:Floggings will continue until moral improves! (Score:2)
Good question (Score:2)
Whining-by-proxy, substitute whining, pitch-whiners, designated whiners, ghost whiners and stand-in whiners are all permitted (first-round whiners, all levels), but only for Rhode Island, New Jersey, some part
Re:I've never done this before but... (Score:1)
These guys get paid? I always assumed it was a run-from-my-mom's-basement operation - what else could explain the poor spelling, frequent dupes (god help you if it's +-5 days from April Fool's), and the ho'ing out of the site to Microsoft advertisements?
Yet amazingly... (Score:2)
There are also 7th graders who can spell and edit better than these guys. It's really embarrassing.
In spite of all these horrible shortcomings, we get all of the good things Slashdot provides for free. It's not perfect. They make mistakes.
Get over it. Anyone can be a sharpshooter, waiting for someone else to screw up. But it takes a lot of hard work and dedication to put something like Slashdot together. Cut the guys some slack, or create your own website and call it Gripeslash.