Thanks for letting me know, I'm not sure why the numbers for the one I happened to click on are so far above the norm but I've now made an edit with a screenshot of more average numbers.
Intel's latest desktop parts are fantastic. They're the undisputed performance champions, the best value for gaming and consumer CPUs, and AMD has yet to catch up. The mid-level parts are even relatively power efficient (for desktop parts).
The pro-AMD anti-Intel commentary on the internet got completely out of control for a while. Intel is still very good at what they do, despite a few stumbles in recent history. It's just not fashionable to say good things about Intel right now.
The Intel i9-12900k has a TDP of something like 240 watts (I don't remember the exact number).
The AMD 5950x has a TDP of 105 watts.
The 5950x does roughly 10-15% worse single core and 10% worse all-core.
On performance per watt, Intel is far behind AMD.
Overclock the 5950 a bit and you're back to where the i9 is stock, but still with less power. (I haven't actually done any research into that last sentence, but that seems plausible)
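A quick back-of-the-envelope on performance per TDP-watt, using the numbers above (TDP is a loose proxy for real power draw at best, so treat this as illustrative only):

```python
# Rough performance-per-TDP-watt comparison using the figures quoted above.
# TDP is only a loose proxy for real power draw, so this is illustrative.
intel_perf, intel_tdp = 1.00, 241   # i9-12900K (241 W rated maximum turbo power)
amd_perf, amd_tdp = 0.90, 105       # 5950X, taking "~10% worse all-core" at face value

intel_ppw = intel_perf / intel_tdp
amd_ppw = amd_perf / amd_tdp
print(f"Intel: {intel_ppw:.4f} perf/W, AMD: {amd_ppw:.4f} perf/W")
print(f"AMD advantage: {amd_ppw / intel_ppw:.1f}x")   # ~2.1x on these numbers
```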
The AMD customer experience is also much better. I can upgrade - and soon will be upgrading - my 2 year old 3900x system to a 5950x. I don't have to buy a new motherboard like I would have had to with Intel. Plus my system has had PCIe 4.0 for a few years (which I have been utilizing with a super fast SSD) and also supports ECC RAM - Intel will price gouge you for those by making you go to their enterprise targeted CPUs/motherboards (to be fair, they do now support PCIe 4 on the i9).
Plus Intel withholding ECC from their consumer CPUs - even the high end ones - is just a major dick move and hurts all of the people like us on HN who want to have good hardware. And considering all the scummy anticompetitive stuff Intel has done over the last few decades, I'd say it's better to go with AMD even if their chips had a slight perf/$ disadvantage compared to Intel (but they don't). Of course, I'm not saying AMD is amazing and wouldn't do evil stuff if they got the opportunity, like Intel had, but AMD is still the underdog for just a bit longer, and in duopolies it's generally good to support the underdog if it doesn't cost you much (and here it doesn't really cost anything).
> The Intel i9-12900k has a TDP of something like 240 watts (I don't remember the exact number).
> The AMD 5950x has a TDP of 105 watts.
I have experience with both. The 5950X has significantly higher idle power consumption. It's a known downside of AMD parts. Unless you're running the CPUs at 100% all the time, the Intel platform will probably consume less power overall.
I know it defies all of the headlines and such, but it's true. Idle power consumption matters more than peak power consumption for most of us whose CPUs sit idle most of the time.
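To make that concrete, here's a rough back-of-the-envelope with assumed (not measured) package-power numbers; the idle figures especially are guesses that will swing a lot with the board and configuration:

```python
# Back-of-the-envelope daily CPU-package energy for a machine that idles
# most of the day. All wattages here are assumptions for illustration only,
# not measurements; real numbers depend on the board, RAM, PSU, etc.
HOURS_IDLE, HOURS_LOAD = 21, 3

def daily_wh(idle_w, load_w):
    return idle_w * HOURS_IDLE + load_w * HOURS_LOAD

amd_wh = daily_wh(idle_w=30, load_w=140)    # chiplet I/O die keeps idle draw up
intel_wh = daily_wh(idle_w=10, load_w=220)  # monolithic die idles low, boosts high

print(f"AMD:   {amd_wh} Wh/day")    # 1050 Wh
print(f"Intel: {intel_wh} Wh/day")  # 870 Wh
```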
If power is a concern, you get the 12600K instead of the hot rodded 12900K.
> Overclock the 5950 a bit and you're back to where the i9 is stock, but still with less power. (I haven't actually done any research into that last sentence, but that seems plausible)
I can tell you haven't done any research because this isn't true. The 5950X doesn't overclock well at all. You can try to force higher all-core speeds with a lot of voltage, but it's going to become a power hungry monster of a CPU for very little gain.
The Intel really is the superior CPU.
> Plus Intel withholding ECC from their consumer CPUs - even the high end ones -
This is what I was talking about when I said the pro-AMD anti-Intel rhetoric was out of control. It's like basic facts don't matter any more. People just want to hate Intel.
The AMD parts are already running close to max. You can squeeze a few percent more out of them, but overclocking like in the old days is out of the question.
>It's just not fashionable to say good things about Intel right now.
Well, being single-handedly responsible for the nearly complete stagnation in processor performance for 10 out of the last 12 years probably has something to do with that.
That said, Intel's products and their performance are indeed quite underrated, especially once you consider that they're being produced on a relatively old process node; it will be interesting to see what kind of performance they'll be able to wring out at 5 or 7nm.
I've always assumed that Intel's terrible corporate culture was responsible for that stagnation (told to me by former Intel employees). What upset me was the pointlessness of having their employees suffer that culture and then the rest of the world suffer the lack of progress.
>The pro-AMD anti-Intel commentary on the internet got completely out of control for a while.
Not really. It was entirely justified, but things have changed. Prior to Alder Lake Intel really was just plain worse than AMD in most respects.
Processor vendors have leapfrogged each other many, many times before. A new architecture is supposed to beat everything else on the market. Recent years have been weird because Intel bungled 10nm at about the same time AMD bungled Bulldozer. We are getting back to normal.
It is clocked about 1.6x higher than the M1 chips (single-core), but the CPU architecture also isn't from 2015, like Intel's desktop chips were from 6th through 10th gen.
A single core at that 5.2GHz turbo can consume ~40W alone, however.
Geekbench doesn't load ATM. Are these numbers legit (both for the 5950X and 12900K)? Does Intel really outperform AMD single-core by 62%? Sniff test says that's way too much, but what do I know. I also doubt the multi-core difference. Yes, the 12900K seems to be a well performing CPU, but then again 5950X is a 16 "P" core part...
Intel has always had a huge lead over AMD on single core. AMD did manage to compete on multi core with the Zen chips but still hasn't gotten close on single core
I don't know why you're being downvoted. What you're saying is, in my experience, true. AMD is the multi core king. Intel is the single core king.
Multi core CPU performance benchmarks push your CPU and all its cores to the limit, but that doesn't reflect the typical real-world use case because the typical real-world use case involves programs that aren't able to effectively utilize all CPU cores. On top of that, gaming is the only typical use case where you are going to be pushing your CPU to the limit (i.e. the place where you actually need your CPU to be fast) and games don't CPU-parallelize well, meaning higher single core performance is generally best in the case of gaming.
Macs aren't made for gaming. They're made for productivity, multitasking, and creative work (the one major area where a multi core CPU can be fully utilized) so it makes complete sense for Apple to go the multi core route, but I can't say the same for AMD and their 8+ core gaming CPUs.
Intel had a dwindling lead in single core perf over the past few years, but when AMD came out with Zen 3, AMD actually had a non negligible lead over Intel. But Intel is now winning again with the 12th gen chips. So they are trading blows these days.
Gotcha. Is the cost efficiency comparable as well? Could I buy an AMD gaming CPU with similar single core performance to an Intel gaming CPU for about the same price?
Out of pure curiosity, this is the comparison of that M1 Ultra Geekbench score against my setup which is VERY CLOSE to comparable: AMD 5950/3090/128gb workstation. Same version of Geekbench. What I would describe as well-thought out hardware configurations (no bling or neon, built with a vague sense of budget/performance in mind for work) and only "reasonable" stock settings (i.e. no manual overclocking, but XMP enabled, RAM chosen for Ryzen).
I think you'd expect the M1 to outperform a bit on the multicore because it has more cores, but the comparison suggests a lot of the current benchmark score is being driven by exceptionally high scores in a handful of benchmark sub-tests.
On single-core I'd describe them as very very comparable at the moment.
There's another issue that was made apparent when I benchmarked my setup, which suggests that a significant portion of people running 5950s are leaving some performance on the table (I presume that's some mix of not setting things up right or not having the right mix of hardware).
I'd say this is an issue that doesn't happen with Macs, but... I'm aware that their past laptops have run really, really hot (and dropped performance below their apparent specs due to insufficient cooling?), although everything I've seen shows the M1 family to be a relatively great performer in this regard (to the point where I recommended my wife pick up the new M1 MacBook Air).
It depends a bit on how you account for other machines and peripherals (given that you can reuse/replace parts when doing it yourself), but the general pricing off the top of my head was:
- CPU $1100 AUD
- GPU $3000 AUD
- RAM $1000 AUD
- Motherboard $279 AUD
- NVMe Drive $300 AUD
- SSDs $150 - 300 each AUD
- SATA drives $100 - 300 each AUD
- Power supply $240 AUD
- Case $250 AUD
- CPU cooler $170 AUD
That prices the full thing as new at about $6000 to $7000 AUD, which is vaguely in line with about what I recall my budget being, though I've had it for a year or so now.
That would put it at roughly 4k - 5k USD by my napkin math, but as a general rule US consumer tech prices receive a bit of a discount compared to us over here, so you could probably come in on the lower end if you could physically get everything (I don't know what part supplies are like at the moment in the US).
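As a rough sanity check on those totals (the drive counts and the AUD-to-USD rate below are assumptions on my part; the other figures come from the parts list above):

```python
# Rough sanity check on the build cost above. Drive counts and the AUD->USD
# rate are assumptions; the other figures come from the parts list.
parts_aud = [
    1100,     # CPU
    3000,     # GPU
    1000,     # RAM
    279,      # Motherboard
    300,      # NVMe drive
    2 * 225,  # two SSDs at the $150-300 midpoint
    1 * 200,  # one SATA drive at the $100-300 midpoint
    240,      # Power supply
    250,      # Case
    170,      # CPU cooler
]

total_aud = sum(parts_aud)   # 6989
aud_to_usd = 0.72            # roughly the early-2022 exchange rate
print(f"~${total_aud} AUD  (~${total_aud * aud_to_usd:.0f} USD)")
# -> ~$6989 AUD (~$5032 USD), in line with the $6000-7000 AUD / 4k-5k USD figures
```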
Taking the AU Apple store price, accepting that their CPU probably outperforms mine in some multithreaded performance, and going for the top GPU option, I'd price the Apple M1 Ultra rough equivalent between $7000 to $10,000 AUD depending on where you want to cut it.
So like for like, I'd say 4k-5k USD for mine vs 5k-7k USD for the M1 in my locality.
The thing that kills me with PC builds though is longevity and when they don’t work. I’m getting too old to play RMA when something doesn’t work. I want one single point of contact. I moved from a Ryzen 3700X running windows in 2020 to a bottom end Mac mini after spending two months debugging a random crash.
Yes, I hold a communistic point of view in that regard: I believe people should earn a reasonable wage and not charge anything beyond that. Extracting super-profits is criminal, even if you're making the best computer in the world.
Talking of the new chips, it seems interesting to me to note the difference in cache sizes. I know it is not a 1:1 comparison because of architecture difference but the 16 core 5955WX has
L1: 512 KB, L2: 8 MB, L3: 64 MB
compared to M1 Ultra's from the Geekbench data
L1: 128 KB instruction, 64 KB data, L2: 4 MB
I have mostly forgotten my microprocessor architecture lectures but it seems interesting that even after being able to cache more data near a core, AMD is not gaining much. Maybe packing too much cache increases latency of access or the gains simply go away beyond a certain size.
Edit:
Maybe even it is coming down to the cache layout. Does anyone know if the cache fetch times for the named levels are roughly the same across architectures?
For x86, I remember L1 being a single cycle fetch and L2 being 10-20x slower than L1
I'm almost certain that the L1 caches on the M1 ultra are being reported incorrectly. That 512 KiB on the AMD CPU is the sum of data and instruction cache across all cores. M1 performance cores have 192 KiB of instruction cache and 128 KiB of data cache per core; and the efficiency cores have 128/64 KiB of instruction and data caches, respectively.
16*(192+128)+4*(128+64) = 5888 KiB of L1 cache on the M1 ultra.
Wow, if that's true then that is such a massive advantage. L1 is a single cycle fetch cache if it is like x86. So, individual cores can do compute so much better by fitting more data at once.
Hmm, thanks; that seems interesting. I guess I need to read up more. On searching in Google someone[1] was quoting Xeon L1 fetch as approximately 4 cycles. I don't know if this is an average across branch prediction hit/miss, I will look for source of these numbers and try to read what has changed.
The cycle count is largely irrelevant in a speculative, pipelined core, as long as it can speculate far enough ahead that the delay is masked by other activity. 4-5 cycles is nothing in the scheme of things.
Emphasis on 'largely'. If you have e.g. multiply indirect pointers, then you care. That said, I didn't mean to imply that moving to bigger, slower caches was the wrong tradeoff.
This is part of what makes Apple's design so incredible. Normally, increasing the cache size dramatically increases latency, but they paid the cost and their 128KB D-cache and 192KB I-cache still retain 2-3 cycles of latency.
That doesn't quite match up with what we know about the two M1 Max chips that make up the Ultra.
Here are the resources per M1 Max, so multiply by two.
>On the core and L2 side of things, there haven’t been any changes and we consequently don’t see much alterations in terms of the results – it’s still a 3.2GHz peak core with 128KB of L1D at 3 cycles load-load latencies, [192k L1 instruction cache], and a 12MB L2 cache.
> Where things are quite different is when we enter the system cache; instead of 8MB, on the M1 Max it's now 48MB large.
The M1 Ultra has a significant memory bandwidth and latency advantage due to the type of memory used and the way the memory is connected. They use the same memory for the CPU and GPU, so the memory interface was optimized more like a GPU and the CPU benefits in a few memory-constrained benchmarks (machine learning, AES-XT streaming)
The flip side is that you're limited to 128GB combined memory for the CPU and GPU on the M1 Ultra, whereas a comparable Threadripper Pro system will take up to 16 times as much (2TB) and you can upgrade it whenever you feel like.
It will be interesting to see how much RAM Apple offers on the upcoming M1 Mac Pro parts.
Wiring out 32 channels of DDR5 to 16 slots might not be feasible, but latency-wise Anandtech's measurements suggest the M1 Max actually has a bit higher latency to memory than e.g. Icelake-SP
> significant memory bandwidth and latency advantage
This is to cache or the main memory?
I remember x86 based PCs taking 100ns for RAM access. Is it faster in ARM?
> They use the same memory for the CPU and GPU, so the memory interface was optimized more like a GPU and the CPU benefits in a few memory-constrained benchmarks (machine learning, AES-XT streaming)
This would mean each core having some dedicated RAM section better connected. Wouldn't this be more like L3 section in x86 but bigger? Maybe this is the advantage of having everything on the same die.
Beyond a certain size, gains should also flatten out, no?
Maybe 70% the speed of light. That's a handful of cycles at a few GHz for any SDRAM channel. The latency limitation is baked into the scanning, strobed nature of SDRAM. It's dense and cheap, but it will never respond faster than ~20 ns (100+ cycles).
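For context, converting those latencies into core clock cycles is just a multiplication (figures taken from the comments above):

```python
# Latency in nanoseconds -> core clock cycles (ns * GHz = cycles).
def cycles(latency_ns, clock_ghz):
    return latency_ns * clock_ghz

print(cycles(20, 5.0))    # 100.0 cycles: the ~20 ns DRAM array access at 5 GHz
print(cycles(100, 3.2))   # 320.0 cycles: a full ~100 ns trip to main memory at 3.2 GHz
```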
I wonder what stops AMD and Intel from trying this, other than the loss of modularity. It's not like the architecture will change by bringing it onto the die. As far as I remember, the CPU is the only thing that does RAM access even in x86.
I would be surprised if they’re not evaluating this. The only difference I can see is that it complicates the number of SKUs they’d need to provide to their customers. It’s a little easier for Apple in this regard because they’re also building the final machines, whereas Intel/AMD are making chips that are going in a wider range of devices.
Doing this would make no sense, knowing how many (more) transistors the M1 has. In very simple terms, one bit of SRAM is 6 transistors (+ some extra for addressing).
Also, comparing the M1 Ultra to consumer-grade chips is quite pointless; x86-64 consumer-grade ones have just 2 memory channels.
M1 Max P-cores have 128KB L1 D-cache, 192KB L1 I-cache, and 24MB of L2. E-cores have 64KB L1 D-cache, 128KB L1 I-cache, and 4MB of L2. There is also a shared 48MB system-level cache that serves as L3.
AMD Zen 3 has 32KB D-cache and 32KB I-cache per core.
That'll depend on the usecase and benchmark. The M1 does really unreasonably well on SPECfp, for instance, which is heavily memory bandwidth dependent.
I'd take the GPU comparisons with a huge grain of salt.
If it's anything like the M1 Max and M1 Pro, the real-world GPU performance doesn't quite extrapolate the same way as the synthetic benchmarks.
From what I've gathered, the performance per watt of the M1 GPU isn't actually that much different than nVidia's 3000-series performance per watt in real-world applications. I'd love to be wrong and discover that Apple has also cracked the code on making GPUs more efficient than established industry players, but I think it's not leaps and bounds better like they've claimed. At least not in real-world applications.
It barely hits 3060 Ti performance. That's a $400 GPU.
Also, while the 3090 is $3000 because of shitcoin miners and scalpers profiting off the demand, the MSRP is half that (and you can still get it at this price provided a bit of patience, which, admittedly, sucks to have to do).
The score was 1793 single core / 24055 multi-core. This puts it in the realm of Threadrippers/EPYCs with 24-40 cores, or some of the similar core count Xeon Gold processors.
I'd love to see if Apple can post solid numbers that compete with 128 core EPYC CPUs in a Mac Pro, if they truly go for broke on it.
The interesting thing is that the 24-core threadripper is posting a score of 20k and the 64-core one is posting 25k. Whatever Geekbench is doing doesn't appear to be something you'd normally run on one of these monster core count CPUs.
High-end Threadrippers are severely limited by their thermal envelope. Looking at the latest 5000-series products, the 64-core CPU has the same TDP (280W) as the 24-core CPU. Guess which one can sustain higher clock speeds for longer before it gets throttled.
Indeed, it is a little faster single core and a little slower in multi-core on average compared to my 24-core Zen 3 EPYC (https://browser.geekbench.com/v5/cpu/compare/13330272?baseli... - which I expect to be slightly slower than the new 24-core Threadripper Pro 5965WX) - certainly comparable in terms of performance. But no word yet on if the M1 Ultra supports ECC RAM or how many PCIe lanes.
What? You can't buy the chips separately, but the base system prices for a mac studio with a m1 max is $1999, and the version with the m1 ultra is $3999. You also get 2x the ram, and 2x the disk space when upgrading.
Single-thread performance is barely better than a M1 MacBook Air. It takes apps that are heavily parallelized or need the memory bandwidth to actually make use of the Ultra. Few creative pro apps are that optimized, unfortunately.
Of course it has the same single core performance, they use the same cores. We’re not gonna see single core move until the A15 based Mx comes out, presumably later this year.
Honestly says a lot more about the Air than it does the Studio.
Unfortunately the M1 is limited in professional usage by the lack of a 32 GB option, and the fact that the laptops can only handle one external display. The M1 Pro is really a better option since it provides a lot of flexibility that the M1 doesn't have, without being obscene like the M1 Ultra.
I'm really hoping that the M2 air refresh allows 32GB. I bump the limits with 16GB a lot of the time. I could buy the 14", but I like the portability of the air.
Is there any M1 Ultra GPU benchmark out in the wild? If anyone knows, please share, I'm very curious.
I have a feeling the Ultra could cater to some niches really well.
For now we just have the typical nebulous Apple claims (they have a graph of "relative performance" vs "Highest end discrete GPU"[1]; what does that even mean? LOL).
I think when the new M1 Pro/Max based laptops came out it was deemed that much of the performance Apple was claiming was due to the media engine helping accelerate the hard stuff in FCPX (and Premiere, IIRC). So it’s really good compared to the 3080 for per watt performance in creative work, but if you aren’t transcoding and especially if you aren’t optimized for using Metal, it’s not as great (but still good, 1050 ish performance with no fan is magical).
If you read the very fine print on those graphs, they compared with an i9-12900K and an RTX 3090. "Relative performance" is extremely questionable though I agree. I guess I'll wait for the Anandtech review to get graphs with actual units.
I saw the RTX 3090 configuration, but that is in footnote n. 5, which is referenced from another paragraph, whereas the paragraph about the GPU is more towards the beginning of that page. I'm not clear if that same configuration was used as a comparison for all the performance claims on the page. If the GPU used for that graph is an RTX 3090, the result is impressive, especially considering power consumption. But yeah, "relative performance" means nothing. Luckily, I think we will have to wait just a few days for "real world" benchmarking like Anandtech's etc.
This is one thing I really love about M1 macs actually. I spend a lot of my time with my face in service manuals in PDFs and the experience is spot on with Preview. Adobe Reader on windows is absolutely horrible to use in comparison.
Does that support document signatures, as that's a reason I've needed to use the bloatware in the past? That was a while back though. All online IME nowadays.
Not really. Exact power consumption will vary wildly based on cache state, the instructions executed, the state of the branch predictor, state of the pipeline, and so forth.
The best you could do is track power during the time an application is scheduled to run on the processor but even that is an approximation because the power consumption of a single core depends on the state of other cores as well. Then you get into arbitrary territory (do you count the CPU time spent inside the filesystem driver or network minifilter as application time? is all called kernel code, which may execute on another core, taken into account? what about data transfers between the CPU and GPU? what about the power spent on cooling the CPU while the code runs?) and you end up with a very arbitrary and difficult approximation.
I think simpler approximations are more than enough, but it's near impossible to track power usage for a single application. Windows has such an approximation based on various factors, not just the CPU, and that's not much more than a "low-medium-high" scale.
That makes sense, but I see a lot of places where they estimate power consumption in W or mW. How do they do it? Is there maybe a window of estimation possible in a constrained environment? No network or file I/O is, I guess, about all the restriction we can put on a process without compromising functionality.
Alternatively, how do I measure the power consumption of a random process, assuming I only want to get a number and nothing else?
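For a crude number on Linux, you can read the package energy counter that RAPL exposes through the powercap sysfs interface (usually root-only) before and after running the process. It's package-wide rather than per-process, so it only approximates a single program's draw on an otherwise idle machine; a minimal sketch under that assumption:

```python
# Crude per-run energy estimate via Linux RAPL (powercap sysfs).
# Measures the whole CPU package, not just the target process, so run it on an
# otherwise idle machine and treat the result as an upper bound. Needs root.
import subprocess
import sys
import time

ENERGY = "/sys/class/powercap/intel-rapl:0/energy_uj"              # package 0, microjoules
MAX_RANGE = "/sys/class/powercap/intel-rapl:0/max_energy_range_uj"

def read_uj(path):
    with open(path) as f:
        return int(f.read())

def measure(cmd):
    e0, t0 = read_uj(ENERGY), time.time()
    subprocess.run(cmd, check=True)
    e1, t1 = read_uj(ENERGY), time.time()
    delta_uj = e1 - e0
    if delta_uj < 0:                        # the counter wrapped around
        delta_uj += read_uj(MAX_RANGE)
    joules, seconds = delta_uj / 1e6, t1 - t0
    print(f"{joules:.1f} J over {seconds:.1f} s (avg {joules / seconds:.1f} W)")

if __name__ == "__main__":
    measure(sys.argv[1:])   # e.g. sudo python3 rapl_measure.py ./my_program --flag
```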
So single core performance is inferior to a $700 CPU (i9-12900K). The M1 Ultra will be beaten on multi-core within 6 months, this summer, by consumer grade AMD/Intel CPUs at a reasonable price.
It's impressive performance, but for desktops, thermals / perf per watt are less relevant.
Funny, one of the reasons I’m considering the new Mac Studio is specifically perf/watt, and I’ve been tethered-laptop for about a decade. I want a good fast computer that also stays cool without an absurd power draw. In that, the specs I have in mind are the same between the Studio and MBP (lowest CPU to get 64 GB RAM), and the Studio is about $1000 cheaper for that config.
NUCs are smaller than the Mac Studio. Desktop PCs are comically large only if you want that comically large amount of extension capability. Which you can't get with any Macs.
You seem to be getting a cold response here but I feel much the same way. I'm tired of loud hot chunky computers. I want a computer where at idle not a single fan is spinning and hardly any power is causing coil whine. I want something where long term I can re-use the box as a neat little server.
All this I/O and a nice bit of speed in this tiny quiet power-efficient little box? Sign me up!
Nope, not if you want to run something like 4x4k and you're extremely sensitive to noise and want to run in SILENCE which I did. My desk setup is like the people in Apple's press conference, I'm the target market for this computer.
The primary issue is that no matter what you do, the GPU you use is never going to run on idle and will constantly be running a bit hot. You can choose to use a professional card (cooler, quieter) or consumer (flexibility, resale, cost). I choose the latter. As for the build, I had 3 options:
1: Air-cooled -
What I went with. High end fans, thick oversized heatsinks on every component, I went for a low airflow low-noise case to minimize high pitched noises (although this increases fan speed requirements and fan hum). I tried to get the fans to stop completely but I ran into too many issues with the card just not staying quite cool enough and the fans suddenly ramping and now let the GPU fans spin at about 900rpm while the case fans run around ~750.
2: Water-cooled -
With very careful building, you can create a custom loop with large, thick rads, slow quiet fans, and a quiet pump, but that setup is hardly cheaper than the Mac Studio, and it isn't as quiet, as cool, or as small.
3: passive -
Probably the best option was something like a compulab airtop3, which is probably the closest equivalent to a mac studio before the mac studio aside from it being a bit worse in performance. It uses a 9900k and is a bit obsolete now, and was pretty pricy back in the day, but it's pretty cool engineering.
Ultimately though, regardless of the options, none of these offer everything I need in anywhere close to the power envelope of the Mac Studio, and more power usage = more heat, more size, more noise. The power efficiency of the M1 GPU is bonkers, being built on 5nm and all. I have plenty of PC builds under my belt, I spent hours on research, and I did an okay job, but I will never do as good a job as a 3 trillion dollar company can. Why would I ever fuss with any of that ever again when I can just pay a little more to grab a Mac Studio off the shelf that's a better computer?
Moore's law is about transistor density, not single-core performance. Even if a new generation of chips obeys the "law", there is no requirement that the designers dedicate the improvement to single-core performance. Alternatives include multi-core performance and miniaturisation.
Most people? If a given application isn't good at multithreading/multicore, it doesn't stop the OS from scheduling other applications on other cores, freeing up cycles. I think most people have multiple processes running at a time on their machines. Even just looking at software devs, the plugins serving your code editor probably run in their own processes.
Absolutely not. Take a look at your CPU usage right now. Pretty much all background tasks can be done on a core or two. I only have 8 cores, and the only time they're all facing high load is when I'm compiling code. Basically the M1 ultra is only useful for professional graphics/video editors, and developers compiling large codebases or training ml models.
Going from 2 to 8 cores made a massive difference to me (although admittedly I also almost doubled single core perf in that switch), probably for the somewhat obvious reason that I usually have far more than 2 programs open at once, so having each program able to run on a separate core is a huge win. With my previous machine, if I was doing something intensive (like, say, compiling in Xcode), then the performance of other tasks would be noticeably slower. On my new machine I can't even tell it's doing it.
Those benchmarks are similar, with pretty high correlation in overall scores. Looking at the individual subscores would be more useful for either benchmark.
Apple has a terrible history when it comes to cooling down their laptops. I don't think they'll be able to cool this chip without a massive design overhaul. They'd have to make something the size of those "gaming" laptops to get it cool enough and even then it would probably exceed the noise and thermal limit they've set for their brand.
Besides the MBP's thermals probably not being suited to it, and the form factor of the chip itself, the Mac Studio has a 370W power supply, which is also way more than a laptop could even legally have.
It has an RTX 3090 level GPU at 200 watts less? How is Nvidia that bad? That's insane... why does Apple not just produce a discrete GPU with 200 watts more power and straight up destroy the GPU market?
I'd take the RTX 3090 comparisons with a huge grain of salt for the moment.
First of all, Apple's performance comparisons are against the mobile Nvidia cards, not the desktop ones. For example, the M1 Max comparisons with the "RTX 3080" were against the 100W mobile part, not the 320W desktop part. The mobile version is about 40% slower than the desktop one (something that I do find irritating about Nvidia's marketing). The desktop gaming cards are typically trading power consumption for performance. They're clocked well beyond the optimal efficiency point to squeeze everything possible out of them. Cutting the power target in half may only cost you 20% of your performance.
Another thing to note is that Nvidia pretty much holds all the high-efficiency binned dies for the enterprise cards. In terms of GPGPU, the data center cards are nearly twice the performance per watt of the consumer cards by going wider and dropping the clock rate.
Small edit:
The best benchmarks I could find suggested that the M1 Max was drawing ~105 watts with the GPU under a large synthetic load and doing about 10 TFLOP/s of FP32. Let's assume the M1 Ultra is exactly double this and can do 20 TFLOP/s at 200 watts. The Nvidia A100 GPUs are about the same performance but at 250 watts. The RTX 3090 is upwards of 36 TFLOP/s but approaching 400W of power draw.
By comparison, the fastest supercomputer in the year 2000 was ASCI White, which did 4.9 TFLOP/s. It cost $168M in 2020 dollars, took up 12,000 sqft, weighed 106 tons, and used 3 megawatts (+ another 3MW for cooling... = 6,000,000 watts). Granted, it had 6TB of memory and 160TB storage...but with 7,000 disk drives. (https://www.top500.org/resources/top-systems/asci-white-lawr...)
The M1 Max itself is the size of a postage stamp, does 10 TFLOP/s, costs ~$200 to make, and uses ~100 watts (= a single incandescent lightbulb).
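Expressed as FP32 throughput per watt using those same figures (the M1 Ultra row is the doubling assumption above, and these are synthetic peaks, so read them loosely):

```python
# FP32 throughput per watt, using the rough figures quoted above.
parts = {
    "M1 Max":                (10.0, 105),        # ~10 TFLOP/s at ~105 W
    "M1 Ultra (assumed 2x)": (20.0, 200),
    "Nvidia A100":           (19.5, 250),
    "RTX 3090":              (36.0, 400),
    "ASCI White (2000)":     (4.9, 6_000_000),   # incl. 3 MW of cooling
}

for name, (tflops, watts) in parts.items():
    print(f"{name:22s} {tflops / watts * 1000:10.3f} GFLOP/s per watt")
```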
And that's assuming their "GPU" is actually performant in those scenarios. All the examples they post are for GPGPU, but how does something like Unreal perform?
I think the sweet spot with Apple's SoC is that they now control the CPU, GPU, and memory, so they can optimize the pricing. If they made a discrete GPU, the margin would not be that high and it would be less attractive.
They don't want to destroy the GPU market. They want to destroy the PC market. The gap is only going to widen now that they're on the new architecture. I'm sure AMD has more up their sleeve though. Who said Moore's law is dead?
For perspective, AMD's top consumer part is a 5950X, which scores 1686 single-core / 16565 multi-core (reference https://browser.geekbench.com/processors/amd-ryzen-9-5950x).
However, the 5950X really is a consumer CPU and can be purchased for $600 right now, so you could build 2-3 5950X PCs for the price of a single M1 Ultra Mac Studio.
A better comparison would be the Zen 3 based Threadripper Pro parts that were announced today: https://www.anandtech.com/show/17296/amd-announces-ryzen-thr...
There is supposedly a leaked AMD 5975WX (32-core Zen 3) score on Geekbench from a few months ago. It performs very similarly to the M1 Ultra: https://browser.geekbench.com/v5/cpu/compare/10531340?baseli...
The M1 Ultra is an impressive part, but the $4K price of entry is steep. On the other hand, if you need a Mac for Mac software then none of this matters and you're going to buy whatever Apple offers, so it's great to have something like this available.