Apple M3 Pro Chip Has 25% Less Memory Bandwidth Than M1/M2 Pro (macrumors.com) 78
Apple's latest M3 Pro chip in the new 14-inch and 16-inch MacBook Pro has 25% less memory bandwidth than the M1 Pro and M2 Pro chips used in equivalent models from the two previous generations. From a report: Based on the latest 3-nanometer technology and featuring an all-new GPU architecture, the M3 series of chips is said to represent the fastest and most power-efficient evolution of Apple silicon thus far. For example, the 14-inch and 16-inch MacBook Pro with the M3 Pro chip are up to 40% faster than the 16-inch model with M1 Pro, according to Apple.
However, looking at Apple's own hardware specifications, the M3 Pro system on a chip (SoC) features 150GB/s memory bandwidth, compared to 200GB/s on the earlier M1 Pro and M2 Pro. As for the M3 Max, Apple says it is capable of "up to 400GB/s." This wording is because the less pricey scaled-down M3 Max with 14-core CPU and 30-core GPU has only 300GB/s of memory bandwidth, whereas the equivalent scaled-down M2 Max with 12-core CPU and 30-core GPU featured 400GB/s bandwidth, just like its more powerful 12-core CPU, 38-core GPU version.
Notably, Apple has also changed the core ratios of the higher-tier M3 Pro chip compared to its direct predecessor. The M3 Pro with 12-core CPU has 6 performance cores (versus 8 performance cores on the 12-core M2 Pro) and 6 efficiency cores (versus 4 efficiency cores on the 12-core M2 Pro), while the GPU has 18 cores (versus 19 on the equivalent M2 Pro chip).
Ok (Score:1)
But if the overall performance is still better than the previous chips, then who cares? Like all the morons here talking about needing more PCIe lanes who can never actually say they've hit a bandwidth limit.
Re: (Score:2, Insightful)
The onus is on you to define "better performance". We can discuss your point once we know what you actually mean.
Re: (Score:1)
Is the M3 faster than the M1/M2 in every benchmark? This isn't complicated.
Re: Ok (Score:1)
Re: (Score:2)
I have found it very difficult to hit 400GB/s on my M1 Max.
It can do it, but only in pretty unrealistic workloads.
That's not a fault of the chip; it's just the usual gap between quoted hardware bandwidth and what software can ever actually achieve.
The CPU block (on the M1 Max anyway) has 200GB/s of bandwidth, and the GPU has about 300GB/s.
If the GPU is unloaded, the CPU can hit 200 pretty easily; when the GPU is loaded, however, the bottleneck on the CPU makes it fairly
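For anyone curious where their own machine lands, a minimal STREAM-style scale loop gets surprisingly close to the CPU-side figure. This is a generic sketch, not Apple's or the poster's tooling; the 256 MiB array size is arbitrary and it assumes a compiler with OpenMP available:

/*
 * Minimal STREAM-style scale kernel (a sketch, not official tooling).
 * Each iteration reads one double and writes one double, so it moves
 * 16 bytes; reported GB/s is total bytes moved divided by elapsed time.
 * Drop the pragma for a single-threaded number.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define N (256u * 1024u * 1024u / sizeof(double))   /* 256 MiB per array */

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    if (!a || !b) return 1;

    /* Touch both arrays first so page faults don't pollute the timing. */
    memset(a, 0, N * sizeof *a);
    memset(b, 1, N * sizeof *b);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    #pragma omp parallel for
    for (size_t i = 0; i < N; i++)
        a[i] = 2.0 * b[i];                           /* one read + one write */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("~%.1f GB/s\n", 2.0 * N * sizeof(double) / 1e9 / secs);
    free(a);
    free(b);
    return 0;
}

Built with something like cc -O2 -fopenmp, on an M1 Max you'd expect the printed figure to sit near the ~200GB/s CPU block described above rather than the headline 400GB/s.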
Re:Ok (Score:5, Insightful)
I don't have a dog in this fight because I'm not looking to buy a new laptop from anyone in the next 18 to 24 months, but really this isn't hard. Consider the following postulated decision flow:
1. M2 has X memory bandwidth, but usage statistics show it's almost never used because performance is bottlenecked elsewhere in the system. Thus, each part is more expensive because you're including capacity that is never used.
2. M3 development possibly made two different prototypes due to #1; one with an equal X memory bandwidth, and one with X - n memory bandwidth. These two parts were then tested to see what the difference, if any, is.
3. They discovered that only the highest core count M3 actually required X memory bandwidth, and chips with fewer active cores never came close to using it, so having two parts with different memory bandwidth capacities makes sense.
This is how this kind of thing usually comes about: the design has been refined, and instead of just bolting on a massive memory controller and memory regardless of cost, they've tuned it to reduce part and fab costs and complexity, delivering what's needed to make the whole system perform better without paying for expensive capacity that never actually gets used.
Really, it's no different than putting really fast RAM on your Ryzen CPU that doesn't get you any more performance than buying cheaper RAM that clocks 1:1 with the memory controller. Yes, technically there is more bandwidth there because the clock is higher, but it won't ever be used by the system during real operation.
I don't know why people would be bitching about having a better tuned system, other than the fact that Apple will never pass on the cost savings being realized by having a better tuned configuration in the products.
Re: Ok (Score:1)
Re: (Score:2)
"I don't know why people would be bitching about having a better tuned system..."
Are people bitching? It seems more a problem for Apple apologists who are losing a talking point. Most people wouldn't know or care.
"Yes, technically there is more bandwidth there because the clock is higher, but it won't ever be used by the system during real operation."
But it can be used by fanboys for the claims of superiority.
Re: (Score:2)
Apple users spend a lot less time worrying about justifying themselves than you think they do.
Re: (Score:2)
and I have never justified using a non apple product in my entire life
Re: (Score:2)
and I have never justified using a non apple product in my entire life
To be fair, the only justifications you need are: it cost 1/3 the price and plays all the games a MacBook Pro can't.
Re: (Score:2)
Re: (Score:2)
Apple users spend a lot less time worrying about justifying themselves than you think they do.
Yeah, they only spend about 98% of their time doing it, rather than 99%.
Re: (Score:2)
The main limitation in most M3 systems is going to be RAM. 8GB shared with the GPU is not a lot. My phone has more than that.
Re: (Score:2)
8 GB even not shared with the GPU isn't a lot. My 13 year old (admittedly high end) laptop came with 16 (rocking 32 after a recent upgrade), and the obsolete PC I rescued from my in-laws after they replaced theirs had 6. It was a cheap desktop, about a decade old. It now has 16.
8G is low end these days. Great on a Raspberry Pi, pathetic on a premium machine.
Re: (Score:2)
At work the M1 with 8GB RAM I use outperforms the 2013 Mac Pro Trash Can we have with 128GB RAM and many more cores in the CPU (I forget the chip). Needless to say, both far outperform my 8GB RAM Raspberry Pi.
I'm no EE so I'm not going to make claims about the technical details that make this the case, but from observation I know it's true. The best I can liken it to is that just because a BMW Z4 and a Honda Civic both have a 2.0 liter 4 cylinder engine doesn't mean you ought to expect the same performance.
Re: (Score:2)
So I don't really know how to parse what you said.
Your 8GB Air will absolutely perform worse than the Mac Pro Trash Can if you try to do something that has a 16GB working set.
My M1 MBA has 16, my coworker's has 8. He came to regret the 8 pretty quickly, since it's pretty easy to end up with ~20GB of actively used memory in macOS with a busy desktop, turning your poor air into a swapping nightmare.
Even 16 ended up b
Re: (Score:2)
If you're doing work which requires less than 8G then why on earth did you buy a 128G machine a decade ago? That must have cost a fortune.
Extra RAM doesn't directly help performance until you'd otherwise run out of it: if your data in RAM approaches the limit of memory, your performance will tank. Short of that point, though, RAM is faster than any SSD, so spare capacity used for file caches or RAM disks can hugely boost performance.
Re: (Score:2)
8 GB even not shared with the GPU isn't a lot. My 13 year old (admittedly high end) laptop came with 16 (rocking 32 after a recent upgrade), and the obsolete PC I rescued from my in-laws after they replaced theirs had 6. It was a cheap desktop, about a decade old. It now has 16.
8G is low end these days. Great on a Raspberry Pi, pathetic on a premium machine.
Memory requirement generalities on one Platform have little to do with those on another Platform.
macOS just seems to do better (much better!) than, say Windows. It's just a fact.
And people buying a Budget laptop are probably not editing 8 streams of 4k Video as part of their Job.
Re: (Score:2)
Memory requirement generalities on one Platform have little to do with those on another Platform.
That's just not true. The memory floor varies from platform to platform, but a browser will be the same level of hoggyness on any platform.
macOS just seems to do better (much better!) than, say Windows. It's just a fact.
Lol. Sure, but I'm comparing Linux and OSX, not Windows and OSX.
And people buying a Budget laptop are probably not editing 8 streams of 4k Video as part of their Job.
They're also not buying a mac
Re: (Score:2)
Memory requirement generalities on one Platform have little to do with those on another Platform.
That's just not true. The memory floor varies from platform to platform, but a browser will be the same level of hoggyness on any platform.
macOS just seems to do better (much better!) than, say Windows. It's just a fact.
Lol. Sure, but I'm comparing Linux and OSX, not Windows and OSX.
And people buying a Budget laptop are probably not editing 8 streams of 4k Video as part of their Job.
They're also not buying a mac since it's not a budget laptop. It's the brand that gives you budget specs for premium prices.
1. You never mentioned Linux.
2. Why do you bring up Browsers?
3. Your In-Laws ran Linux? Sure. . .
4. Budget Specs? Not what reviewers and owners are saying about Apple Silicon Macs.
Re: (Score:2)
1. I never mentioned Windows. Why did you assume?
2. They are memory hogs and everyone uses one
3. Eh?
4. Yeah budget specs. 8G is very much a budget machine level of ram.
Re:Die Size (Score:1)
I took note of the small die size for the base M3 chip. That's the one they are going to ship the most of. That one needs to have the lowest cost to maintain margins in all the products that have it (laptops, mini, iPad pro, etc).
Re: (Score:2)
Utter trash reasoning. They have a "Pro" model with 8GB of ram? They realized a lot of people needing a Pro laptop can do with 3GB ram, sometimes 6GB? We are in Chromebook territory. It is an insult to buyers. Not that they haven't been insulting buyers for decades, but now feels more like spitting in their faces while calling them names and making middle finger signs, all the while trimming their bank accounts. Is Apple bad? No, but they have become gutless, despicable, and they didn't need to do it.
Why not go try one, and see if you can make it laggy with a workload appropriate for an entry-level machine of any Platform?
Re: (Score:2)
Re: (Score:1)
You do, considering you're posting about it. Worried?
Re: (Score:2)
My workstation has video cards with nVidia and Intel chipsets, a QDR Infiniband HBA, a 4x U.2 HBA, and a couple of extra 1x M.2-to-PCIe adapters. I don't think I've ever hit the LIMIT of any of the PCIe 4.0 connectors on my motherboard, but I'm certainly happy that I have enough lanes to have that much I/O available to me.
(The Intel GPU is present because it offers hardware video compression support that nVidia and AMD don't provide; Infiniband is far and away the cheapest way to get a fast connection to my file
Re: (Score:2)
Why would you care that the board has that much bandwidth available to you if a) you're not using it, b) you can't possibly use it?
That's the scenario here. This isn't a nerf, it's an architectural optimization. There is next to zero chance they did something so stupid as to decrease performance by decreasing memory bandwidth, after all the time and effort they put into chip development.
Re: (Score:2)
Why would you care that the board has that much bandwidth available to you if a) you're not using it, b) you can't possibly use it?
That's the scenario here. This isn't a nerf, it's an architectural optimization. There is next to zero chance they did something so stupid as to decrease performance by decreasing memory bandwidth, after all the time and effort they put into chip development.
This.
Re: (Score:2)
Accessing memory over PCIe is very slow and high-latency. You likely don't hit bandwidth limits because high-bandwidth devices like GPUs naturally tend to avoid accesses that cross between the CPU's memory controller and the PCIe bus.
Re: (Score:2)
You're the moron. Here is a fine example of what happens when you don't have enough PCIe bandwidth. The card loses about 8-10% of its performance when going from PCIe Gen 4 to Gen 3.
https://www.tomshardware.com/r... [tomshardware.com]
Re: (Score:2)
Re: (Score:2)
That is an example of what happens when you limit the bandwidth of a system that is using that bandwidth. Yes, its performance drops.
However, if you limit the bandwidth on a system that does not use it, there will be no performance cost.
This isn't rocket science. Try not to look like so much of a moron while calling people a moron in the future.
Re: (Score:2)
Wow, whoosh.
That is an example of what happens when you limit the bandwidth of a system that is using that bandwidth. Yes, its performance drops.
However, if you limit the bandwidth on a system that does not use it, there will be no performance cost.
This isn't rocket science. Try not to look like so much of a moron while calling people a moron in the future.
Exactly!
For an alleged tech-site, Slashdot sure seems to have a dearth of meatsacks that have even a modicum of actual Engineering savvy. . .
Re: Ok (Score:2)
Re: Ok (Score:2)
M3 is one better than M2, so of course the performance is better! /sarcasm
Re: (Score:2)
You seem awfully upset that Apple is selling a $7k laptop.
Re: (Score:1, Troll)
Upset? I am overcome with laughter.
In another thread, SuperKendall felt it was important to point out that Apple offered a $7200 configuration. Why would he be motivated to determine that?
Rather pathetic troll, though. Try harder.
Re: architectural superiority (Score:2)
Re: (Score:2)
I bought my $7200 MBP all by myself ;)
And yes, one of the first things I tried to do was max out its bandwidth. It's exceptionally difficult to do, and no actual real task I've seen comes anywhere close, so it really does make sense to reduce it if it frees up more die space for other things.
Re: (Score:2)
I bought my $7200 MBP all by myself ;)
And yes, one of the first things I tried to do was max out its bandwidth. It's exceptionally difficult to do, and no actual real task I've seen comes anywhere close, so it really does make sense to reduce it if it frees up more die space for other things.
Wow!
How did you do that? Write an application specifically designed to hammer memory?
Re: (Score:2)
OpenCL kernel + small tool to talk to it and hammer memory itself, as well as exhaust the SLC at need for accurate tests.
The trick is that bandwidth is unevenly shared between the GPU and the CPU. If you're using the GPU too hard, you will starve the CPU, and hurt the ability of your software to hit the maximum actual bandwidth.
Re: (Score:2)
OpenCL kernel + small tool to talk to it and hammer memory itself, as well as exhaust the SLC at need for accurate tests.
The trick is that bandwidth is unevenly shared between the GPU and the CPU. If you're using the GPU too hard, you will starve the CPU, and hurt the ability of your software to hit the maximum actual bandwidth.
As far as user Experience goes, prioritizing the GPU probably makes sense.
Re: (Score:2)
For example, CPU alone, I can hit 200GB/s. GPU alone, I can hit 300GB/s.
Combined would be 500, which is obviously more than I have.
But when I have the GPU pegged at 300 and the CPU pegged at 100 (trying to do 200), they fight over that last 100 very hard. The GPU tends to win I think simply because it can issue bus requests faster than the CPU can.
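For reference, the GPU side of a test like that doesn't need to be elaborate. Here's a generic OpenCL C sketch (not the poster's actual kernel; names and buffer sizes are illustrative) of a grid-stride streaming loop that is memory-bound rather than compute-bound:

/* Hypothetical bandwidth-hammer kernel: each work-item strides through a
 * large float4 buffer, so total traffic is limited by memory bandwidth
 * rather than ALU throughput. dst must hold at least get_global_size(0)
 * elements. */
__kernel void hammer(__global const float4 *src,
                     __global float4 *dst,
                     const uint n)                 /* n = element count of src */
{
    const uint gid = get_global_id(0);
    const uint stride = get_global_size(0);
    float4 acc = (float4)(0.0f);

    for (uint i = gid; i < n; i += stride)
        acc += src[i];                             /* streaming reads */

    dst[gid] = acc;                                /* one write so the reads survive optimization */
}

Achieved bandwidth is then roughly n * sizeof(float4) divided by the kernel time, and running something like this alongside a CPU loop is exactly what exposes the contention described above.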
From the rest of the article (Score:4, Informative)
"Taken together, it's presently unclear what real-world difference these changes make to M3 performance when pitted against Apple's equivalent precursor chips in various usage scenarios, especially given that the latest processors include new Dynamic Caching memory allocation technology which ensures that only the exact amount of memory needed is used for each task. "
Maybe that Dynamic Caching makes it hard to compare, and probably:
- The performance might be very close
- You might not notice any difference
Re:From the rest of the article (Score:4, Insightful)
It will probably be like the nVidia 4000 series cards: it will do really well at low resolutions but fall flat on its face compared to the 3000-series equivalent at high resolutions. The 3060 Ti vs. 4060 Ti is a great example.
Binning? (Score:4, Interesting)
I wonder if we'll see an M3 MacBook Air at some point. Sticking the M3 only in the iMac and 14" Pro will ensure there isn't a huge demand for it and it will also keep margins higher. Maybe the MBA will skip the M3 generation altogether, or maybe if yields of Pro and Max binned chips are bad enough and demand for the iMac and 14" MBP w/ vanilla M3 are low enough, Apple will throw an M3 refresh MBA out there.
The only thing about the M3 that currently looks appealing is AV1 decode support. My guess is that the M4 generation, still on "3nm" but maybe a better yielding version, will probably "fix" a lot of these oddities with core count and memory bandwidth on the M3 generation.
Re: (Score:2)
Re: (Score:2)
Form factor wise, I am envious of his.
But sometimes I do find myself happy that I got mine. The big screen is nice. It's just not very portable.
Re: (Score:2)
I got the 16". My CEO has the 14".
Form factor wise, I am envious of his.
But sometimes I do find myself happy that I got mine. The big screen is nice. It's just not very portable.
It's a bigger screen in virtually the same size machine as yesteryear's 15" MBP.
What's not to like?
Re: (Score:2)
Depends on how much lifting the word 'virtually' is doing there, really.
Re: (Score:2)
Depends on how much lifting the word 'virtually' is doing there, really.
This comparison is from 2019, but that 16" is close to the current 16" MBP.
14.09" x 9.68" for 2019 16" MBP. 14.01" x 9.77" for the 2023 16" MBP. And the 2019 15" MBP was 13.75" x 9.48".
As I said, virtually the same size.
https://9to5mac.com/2019/11/14... [9to5mac.com]
https://www.apple.com/macbook-... [apple.com]
Re: (Score:2)
It's big, and it's heavy. I still use my air for field work.
Re: (Score:2)
I thought that was pretty clear: it's not very portable.
It's big, and it's heavy. I still use my air for field work.
Whatever.
Re: (Score:2)
The smaller machine is infinitely more portable.
I won't deny that I love the big screen on the 16" when I'm not lugging it around, though.
Re: (Score:2)
I suspect Apple will settle into some sort of tick-tock-boom cadence with chip releases, similar to this:
* introduce a new hardware platform at the low end for the air, etc.
* release the 'tock', a linear improvement on the original (e.g. the M2)
* release the 'boom' - aka the "maximum capability" for the platform
Run that out for 6-8 years, then introduce a new architecture and start over...
Re: (Score:2)
I saw a YouTube video from Snazzy Labs that discussed these points and speculated that it's due to yields from the TSMC N3B process, which is presumably what these chips are produced on (like the A17 Pro in the new iPhone 15 Pro). I'm guessing that due to poor yields, Apple finds themselves having to disable cores in quite a few of the M3-based chips, which is why we see decreased memory bandwidth and core counts along with so many more variations of the chip than in prior generations. Does that have a performance impact? I'm sure it does. Does it make the chips perform worse than the M1 and/or M2 in any real-world benchmarks? I guess we'll find out soon.
I wonder if we'll see an M3 MacBook Air at some point. Sticking the M3 only in the iMac and 14" Pro will ensure there isn't a huge demand for it and it will also keep margins higher. Maybe the MBA will skip the M3 generation altogether, or maybe if yields of Pro and Max binned chips are bad enough and demand for the iMac and 14" MBP w/ vanilla M3 are low enough, Apple will throw an M3 refresh MBA out there.
The only thing about the M3 that currently looks appealing is AV1 decode support. My guess is that the M4 generation, still on "3nm" but maybe a better yielding version, will probably "fix" a lot of these oddities with core count and memory bandwidth on the M3 generation.
WTF, over?!?
There are no more variants of the M3 than of the M2 (actually fewer, since the M3 Ultra has yet to drop).
There is no M3 Air because Apple is either burning through remaining inventory of M2 MBAs and wanted to avoid the Osborne effect, or they wanted to showcase the MacBook Pros, which reportedly suffered from slower sales recently, before everyone jumps on an M3-powered, lower-margin-but-still-more-than-enough-computer-for-most-people MacBook Air.
Maybe 200 GB/s wasn't real? (Score:4, Interesting)
Re:Maybe 200 GB/s wasn't real? (Score:4, Interesting)
Depends on what exactly you're measuring and whether the system has enough power on both ends of the bus to use the bandwidth in the first place. It's perfectly reasonable to think that Apple may have over-provisioned in their first iterations and after testing found the cores or memory chips can't keep up anyway. Or found other ways to boost the performance.
We see the same issue with DDR5 and PCIe 5.0 on the PC side: if all you're doing is putting a GeForce GTX 1080 on a PCIe 5.0 bus, it wouldn't matter if the manufacturer took out half the lanes, which is what you often see on budget motherboards (x16 slots operating at x4 or x8).
Re: (Score:2)
Except that Apple specifically added this feature, presumably because it did actually make a difference.
"Or found other ways to boost the performance."
You seem confused. Apple found THIS way to "boost the performance".
Memory Bandwidth (Score:1, Troll)
Apple chips are basically the most easily accessible, high-performance hardware for at-home/open-source LLM usage. Memory bandwidth and size are the key constraints on inference speed. Maybe it's intended, maybe it's a happy coincidence for Apple that anyone who wants the best LLM performance on a Mac is going to be pushed into buying the most expensive tier(s). I suspect they knew what they were doing.
Re: (Score:3)
For all those masses implementing their own LLMs at home on consumer hardware. I mean, who doesn't keep their own personal few hundred terabytes of training data around to push through their MacBook Pro?
Re: (Score:3)
For all those masses implementing their own LLMs at home on consumer hardware. I mean, who doesn't keep their own personal few hundred terabytes of training data around to push through their MacBook Pro?
Hundreds of terabytes? The Pile is less than a TB... World-class models are trained on something on the order of 4T tokens. Besides, I'm pretty sure everyone is talking about inference rather than training.
I only have about 150GB/s of memory bandwidth on my workstation, so even taking full advantage of AMX tile registers I'm lucky to see a token a second (non-batched) on something like a Falcon 180B model, perhaps closer to 2 if heavily quantized. With a Mac you can get over twice that, which for the price and
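Back-of-the-envelope, assuming each generated token has to stream essentially the full set of weights once: a heavily quantized (roughly 4-bit) Falcon 180B is on the order of 90-100GB of weights, so 150GB/s of bandwidth caps out somewhere under 2 tokens/s, while the 300-400GB/s of an M2/M3 Max lands around 3-4 tokens/s before any other overhead, which lines up with the figures above.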
Re: (Score:2)
From the outside of the argument... (Score:5, Insightful)
Apple haters spend an awful lot of time imagining things that apple fans must be saying to justify themselves and refuting them.
Apple users spend little to no time worrying about justifying themselves.
It's a debate where one side is frothing at the mouth, and the chairs on the other side of the stage are empty.
Re: (Score:3)
Re: (Score:1)
Oh, hey. There you are. :)
Re: (Score:2)
There're also just trolls looking to get a reaction. Being unhinged and mouth-frothy is a good way to get a reaction. A well-reasoned and mild-mannered comment can be consumed an
Re: (Score:2)
Re: (Score:2)
Apple haters spend an awful lot of time imagining things that apple fans must be saying to justify themselves and refuting them.
Apple users spend little to no time worrying about justifying themselves.
It's a debate where one side is frothing at the mouth, and the chairs on the other side of the stage are empty.
Well said, bravo!
Re: (Score:3)
Apple haters spend an awful lot of time imagining things that apple fans must be saying to justify themselves and refuting them.
Apple users spend little to no time worrying about justifying themselves.
It's a debate where one side is frothing at the mouth, and the chairs on the other side of the stage are empty.
Irony is not your strong suit.
You've made at least two posts on this article saying the same thing.
You seem to spend an inordinate amount of time trying to justify why you've bought Apple products and trying to dismiss anyone who disagrees with you as "haters".
You did get one thing right, only one side of this argument is "frothing".
Good to see (Score:1)
"Apple M3 Pro Chip Has 25% Less Memory Bandwidth Than M1/M2 Pro"
It's good to see Apple conserving bandwidth like this, I bet Lenovo and Dell wished they'd thought of it. Brilliant.