
It's worth pointing out how extremely far ahead Apple seems to be in terms of CPU power and efficiency. It tends to go under the radar a bit because it's not easy to run your own software on an iPhone, and most apps on these things are not serious workhorse loads (a few specific use cases are, most are not).

The Apple A13 - even in its iPhone SE implementation - achieves single-core performance in microbenchmarks [1] on par with the Core i7 8086K [2] and the Ryzen 9 3950X [3]. That is, in principle, the highest single-core performance you can buy in a PC.

I don't have to explain how insane it is that a roughly 5-watt smartphone CPU delivers that kind of performance, even if only in bursts. By sticking to Intel, or even x86 in general, there is ample evidence Apple is leaving a lot of performance on the table. Not just in MacBooks, but for the Mac Pro too.

[1]: https://browser.geekbench.com/v5/cpu/search?utf8=%E2%9C%93&q... - the iPhone 11 with the A13 has to serve as a surrogate since the SE is only just out; they benchmark the same.

[2]: https://browser.geekbench.com/v5/cpu/search?utf8=%E2%9C%93&q...

[3]: https://browser.geekbench.com/v5/cpu/search?utf8=%E2%9C%93&q...

It's worth noting that Geekbench is a pure microbenchmark. The iPhone will not sustain performance as long as the others. The point is that Apple could solve this when moving to bigger devices.



> It's worth pointing out how extremely far ahead Apple seems to be in terms of CPU power and efficiency.

Honestly, I'm not convinced they're THAT far ahead. They don't have a lot of the legacy baggage Intel has to contend with, and they're the only company making high-end ARM chips (besides Amazon and a few other weird server implementations), but matching big Core i7s in some single-threaded benchmarks is, to a large extent, something that Intel's own low-power chips can also do, at least in bursts.

There are a lot of challenges to big many-cored chips beyond single-core performance, and we really don't know where they are with that yet, as there are no publicly-available examples of Apple desktop chips.


> and they're the only company making high end ARM chips

That's, if you'll pardon the pun, an Apples to Oranges comparison. Apple isn't making "high end ARM chips" either if your comparator is powerful servers. You need to look at what Apple is doing within their power envelope and compare that to what everyone else is doing within _their_ power envelopes. The A13 Bionic is an uncooled 6W TDP chip blowing past 95W base TDP chips that require hefty active cooling.


To add to that, if you look up the benchmarks of the i7 8500Y, which is Intel's top shipping 5W offering, you see it is vastly slower than the A13: a third slower in single-core, and less than half the performance in multi-core.

This whole thread reminds me of how passionately the PowerPC enthusiasts defended it as superior, right before Apple switched to Intel and doubled Mac performance overnight.


> This whole thread reminds me of how passionately the PowerPC enthusiasts defended it as superior, right before Apple switched to Intel and doubled Mac performance overnight.

To be fair, the PowerPCs _were_ measurably better and faster when each one was released. Apple just couldn't get a G5 CPU that would fit into a laptop, and IBM was an unreliable partner with a slow release cycle, so by the time the transition happened they had fallen behind.


The PowerPC G3 and G4 were great chips. The G3 was much better than competing x86 chips, and G4 beat contemporary ones consistently too.

The G5 wasn't great though. When Steve Jobs announced it, Apple already showed it only trading blows with the then-current Pentium 4 (a Pentium 4 - they sucked!). And a few months after the first Power Mac G5s were launched, they were already resoundingly beaten by the new Athlon 64s [1].

Add to that that the G5 was basically a POWER4 server chip, and that IBM was only going to build server chips in the future, and Apple basically had no choice. Nothing really to do with PowerPC vs x86, but more to do with what kind of processors their suppliers were willing to build.

[1]: https://web.archive.org/web/20050605023250/https://www.pcwor...


Apple's volume was not enough to warrant investing in a G5 laptop. IBM offered Jobs the Cell and Jobs went elsewhere to get a better deal from Intel.

POWER is very much alive in the high-end server space, powering IBM's p and i series of machines.


Oddly enough, shortly _after_ the Intel transition was announced, IBM announced a PPC970 that would fit in a laptop power envelope.

I don't think it was ever used for anything.


That was probably the last PPC they did and was probably ready when Apple made the announcement. It just wasn't worth it for IBM to invest in workstation-class PowerPC chips with laptop power envelopes. The only other use for PowerPCs, from their PoV, was their own workstations, and those could use the higher-end POWER chips.


IBM continued making PowerPC G3 derivatives for a while for Nintendo. Nintendo ended Wii U production in January 2017, which I guess would mark the end of IBM's production of PowerPC as well.

The Wii U, for all its flaws, is probably the most "practical" Power-based machine you can get nowadays, given relative power, availability, size and price. A 1.2GHz triple-core PowerPC G3 would probably still eke out Raspberry Pi 3-like performance. Shame the Linux port to it never really got off the ground (also partially due to IBM's hackjob of an SMP implementation for the G3).


Isn’t that, essentially, why everyone is now expecting Apple to switch away from Intel? The chipmaker is doing too little, too late?


>There are a lot of challenges to big many-cored chips beyond single-core performance, and we really don't know where they are with that yet, as there are no publicly-available examples of Apple desktop chips.

The A12X from 2018's iPad Pro is the sort of chip I'd expect to see in an ARM laptop, and its multicore scores are similar to the top end of 2018 Macbook Pros.

2020's iPad revision didn't get much in the way of processor improvements (just one more GPU core), so the new 16" MBP has pulled ahead with an 8-core i9, but when we get a new iPad based on an A13X or A14X I expect it to be back in that range again.

And these are in thin fanless tablets. With a proper cooling system, there's got to be some extra juice to be squeezed out of them.


I wonder if anyone's ever done A-series chip "overclocking", or at least manual overvolting with active cooling. It'd be cool to see what kind of performance increase might be available.


> Being able to match big core i7s in some benchmarks single threaded is to a large extent something that Intel's own low power chips can also do, at least burstily.

You're not wrong. It's true that a microbenchmark amplifies the Apple A13's strengths, because short runs sidestep its power limits. The assumption I make is that microbenchmarks indicate the true peak performance of Apple's architecture, and that as power limits become a smaller constraint when Apple uses its chips in laptops and desktops, it will make that performance available in a more sustained way.

But even low-power Intels don't compare that favourably. Intel's new i7 10510U delivers very nice single-core performance [1]. But it's worth noting that 1) it still does not quite match the A13's burst performance, 2) that chip is rated for a power profile much larger than the A13's, and 3) as always in these discussions, Intel's "TDP" is a marketing term, not a power limit. At high turbos the chip is permitted to consume quite a bit more power than the 15W it's rated for.

This particular Intel chip boosts to 4.90GHz. For Apple chips, even stuff like clock speeds is a matter of conjecture, but Wikichip claims, without a source, that the A13 tops out at 2.65GHz [2], which, if true, indicates a lot more thermal and frequency headroom in bigger form factors.

I just benchmarked my MacBook Pro+Safari in Jetstream 2.0 [3] - not a microbenchmark - and it scored nearly 145 compared to the nearly 130 the iPhone 11 scores [4]. That's with a "45W TDP" Core i7 8850H topping out at 4.3GHz. It's hard to benchmark iPhones well, but all evidence points to the fact that they are actually really fast.

[1]: https://browser.geekbench.com/v5/cpu/search?utf8=%E2%9C%93&q...

[2]: https://en.wikichip.org/wiki/apple/ax/a13 - worth noting that high-end Qualcomm SoCs also operate at comparable frequencies.

[3]: https://browserbench.org/JetStream/

[4]: https://www.anandtech.com/show/14892/the-apple-iphone-11-pro...


I tried this benchmark in Chrome on 3 computers:

* 6 year old Macbook Pro i7-4980HQ, Windows 10 - 102

* 5 year old Macbook Pro i7-5557U, OS X - 100

* Threadripper 2990WX desktop - 99

So, uh, I might have some questions about this benchmark's general validity now?! - though maybe it is some evidence in favour of my vague feeling that the Threadripper sometimes doesn't feel as fast as it seems like it ought to feel.


The Jetstream benchmark is made by the WebKit team. Its scores will vary per browser, so you have to compare browser to browser. I compared an iPhone 11 running Safari to a MacBook Pro running Safari. Apples to apples.

Moreover, while the Threadripper 2990WX is a really awesome processor, single core benchmarks (I think Jetstream is mostly limited to a single thread) aren't particularly its strength. Over multiple runs it should beat your Macbook Pro, but not by a huge amount. If not, take a look at how you're cooling that beast :)


I upgraded a six-core i7 3930K to a sixteen-core 1950X Threadripper in one of my boxes. The biggest reasons for the upgrade were disk IO (M.2 drives) and the number of cores, and I'd had the 3930K since launch day at Microcenter. On Windows, single-threaded at stock speeds, the cores were comparable despite the 6-8 year gap between the two CPUs. For hosting virtual machines, the speed did not matter as much as having an abundance of physical cores. Still, I was not expecting the 'core speed' as reported by a video game to be as close as it was.

The Zen2 (3900x) core speed on the other workstation reported as almost twice as fast, with 12 cores. Really wish that TR4 board supported the 39xx threadripper series.


iOS Safari and desktop Safari aren't identical.


Okay, thanks? Two actual apples are never identical either, but they're still more comparable than to oranges.


I don't think a threadripper would be expected to have particularly good single-threaded performance?


Zen 2 was a big uplift in single-threaded performance.

Ryzen and Threadripper 1000- and 2000-series, and Ryzen Mobile < 4000-series are all on Zen or Zen+ architecture.

The current gen Ryzen and Threadripper 3000-series and the Ryzen Mobile 4000-series are the ones running on Zen 2. This is where AMD is competitive with Intel on single-threaded workloads, largely across the board.

Parent mentioned a 2990WX, which is a Zen+ part.


Oh, right; didn't realise Zen 2 ones actually existed yet.


It's the opposite. The latest generation of Threadripper is not only ahead in multi-core performance, it is also comparable to the highest possible single-core performance from the regular Ryzen lineup.


>besides Amazon and a few other weird server implementations

One of those 'few others' was Scaleway, but they recently buckled and ended their ARM server lineup abruptly [1]. They were running Marvell ThunderX SoCs (up to 64 cores, 128GB RAM).

So Amazon might soon take over the ARM server market, at least until Tim Cook does a Satya Nadella and brings in Apple IaaS with ARM CPUs.

[1]https://news.ycombinator.com/item?id=22865925


> and they're the only company making high end ARM chips

I'm typing this on a Surface Pro X running an ARM64 CPU called the SQ1, which is a customization of the Snapdragon 8cx. It is quite high end and is not made by Apple. It might not match the amazing custom CPUs Apple has in iOS devices, but it is still a pretty good CPU.


I mean, the Surface Pro X chip is fine, but it would be hard to call it high end; it significantly lags the usual Intel chips used in tablets and small laptops on performance, especially single-core. The newest Apple ones are competitive with those or beat them.


When your cheapest smartphone is faster than the fastest competitor smartphone, I’d say you are far ahead. But your point on how well that will apply in desktop PCs is certainly valid.


> By sticking to Intel or even x86 in general, there is ample evidence Apple is leaving a lot of performance on the table.

That isn't necessarily true. Having competitive performance at lower power isn't always the same thing as having better performance, even assuming these benchmarks are representative.

Processors designed specifically for low-power make different design trade offs. One of those is to exchange maximum clock speed for IPC (because higher clocks burn watts). The A13 maxes out at 2.65GHz, the i7 8086k hits 5GHz. Chances are you can't just give the A13 a 95W power budget and see it hit 5GHz, it would have to be redesigned and the kinds of changes necessary to get there would generally lower IPC.

Apple is also riding the same advantage as AMD -- they're using TSMC's 7nm process which is better than what Intel is currently stuck with. Even AMD is still using an older process for the I/O die. We don't know what that's going to look like a year or two from now.

Meanwhile the renewed competition between Intel and AMD makes this kind of a bad time to move away. They're both going to be working hard to take the performance crown from each other and Apple would have to beat both of them to claim an advantage. And continue to do so, or they'd have a lot of pissed off customers and developers after forcing a transition to a new architecture only to have it fall behind right after the transition is over.


>Chances are you can't just give the A13 a 95W power budget and see it hit 5GHz, it would have to be redesigned and the kinds of changes necessary to get there would generally lower IPC.

It's not really a difficult concept to understand. If your CPU runs at 5GHz then the maximum time a single cycle is allowed to take is 0.2 nanoseconds. CPU designers have to make sure that this limit is never exceeded anywhere on the chip. If you make even the slightest mistake in some unimportant corner of the CPU, you will end up limiting the maximum performance of the entire CPU.

Most CPUs are optimized for a specific clock frequency and going beyond it is not possible without sacrificing stability.
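
To put numbers on that, here's a back-of-the-envelope sketch in C (the 5GHz figure is the 8086k's boost clock and 2.65GHz is the conjectured A13 clock from upthread):

    #include <stdio.h>

    /* Cycle-time budget at a given clock: every pipeline stage's critical
       path has to settle within one period, so higher clocks leave less
       slack per stage. */
    int main(void) {
        const double freqs_ghz[] = { 5.0, 2.65 };
        for (int i = 0; i < 2; i++) {
            double period_ns = 1.0 / freqs_ghz[i];
            printf("%.2f GHz -> %.3f ns per cycle\n", freqs_ghz[i], period_ns);
        }
        return 0;
    }

At 5GHz every stage has to fit in 0.2ns; at 2.65GHz the designers get nearly twice the slack, which is part of why you can't just crank the clock on a design that was never budgeted for it.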


I'm always a bit suspicious about x86 vs. non-x86 micro-benchmarks. I remember all the fun people had with ByteMark back in the day, and while I assume that Geekbench doesn't play those sorts of games with compiler optimizations, I would really like to see data from something a bit more representative of a real-world CPU-bound application (short-lived is fine, just not synthetic).


Geekbench isn't a microbenchmark; it's a comprehensive test of the system that runs a variety of programs and workloads and aggregates the results. It's not a single program that one can play compiler-optimization games with.


I should have clarified that Geekbench can be more accurately described as a set of microbenchmarks. It does test a lot of different kinds of performance, but it does not test sustained performance.
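
To illustrate the distinction, a sustained test is easy to sketch (this is just an arbitrary stand-in workload, not what Geekbench actually runs): repeat a fixed chunk of work and watch the per-iteration time creep up once thermal limits bite.

    #include <stdio.h>
    #include <time.h>

    #define WORK (1 << 26)          /* arbitrary size; scale to taste */

    static volatile double sink;    /* keep the work from being optimized away */

    /* Fixed chunk of floating-point work used as a stand-in workload. */
    static double burn(void) {
        double acc = 0.0;
        for (long i = 0; i < WORK; i++)
            acc += (double)i * 1.0000001;
        return acc;
    }

    int main(void) {
        /* On a thermally limited device the per-iteration time rises
           once boost clocks can no longer be sustained. */
        for (int iter = 0; iter < 600; iter++) {
            struct timespec t0, t1;
            clock_gettime(CLOCK_MONOTONIC, &t0);
            sink = burn();
            clock_gettime(CLOCK_MONOTONIC, &t1);
            double ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
                        (t1.tv_nsec - t0.tv_nsec) / 1e6;
            printf("iter %3d: %.2f ms\n", iter, ms);
        }
        return 0;
    }

A burst benchmark finishes before the curve turns up; a sustained one is all about where it flattens out.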


I don't like arguing semantics but I don't really think any of what Geekbench does is a "micro" benchmark. At least for me that typically refers to running a small snippet of code, like calculating a dot product or something. Geekbench tests whole program performance.

It's not aida64 but it is a pretty decent metric, and consistent.


> It's worth pointing out how extremely far ahead Apple seems to be in terms of CPU power...

I agree that Apple's ARM CPUs are very competitive on simple scalar instructions and memory latency/bandwidth. However, x86/x64 CPUs have up to 512-bit-wide vector instructions, and many programs use vector instructions somewhere deep down in the stack. I guess that the first generation of Apple ARM64 CPUs will offer only ARM NEON vector instructions, which are 128 bits wide and honestly a little pathetic at this point in time. But on the other hand I am very excited about this new competition for x86 CPUs, and I will for sure buy one of these new Macs in order to optimize my software for ARM64.


Also, vector instructions don't do that well on laptops; they get thermally throttled, which makes them less useful: https://amp.reddit.com/r/hardware/comments/6mt6nx/why_does_s...


I am more than a little naive on the subject, but is it possible that the vector instructions could be farmed out to a co-processor dedicated to that kind of workload? I suspect that the rich instruction set leads to higher transistor count and density (is that true?) and thus higher TDP?

Would love to learn more from sources if people might provide a newb an intro.


The vector instructions can't really be farmed out because they can be scattered inline with regular scalar code. A memcpy of a small to medium-sized struct might, for example, be compiled into a bunch of 128-bit movs, with the code immediately working on the moved struct afterwards. If you were to offload that to a different processor, waiting on that work to finish would stall the entire pipeline.
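
As a hypothetical illustration (the struct and its fields are made up), this is the kind of code where the copy typically gets inlined as vector moves and the scalar code right after it depends on the result:

    #include <string.h>

    /* A 64-byte struct: a compiler targeting SSE or NEON will typically
       copy this with a few 128-bit loads/stores instead of calling memcpy. */
    struct particle {
        double pos[3];
        double vel[3];
        double mass;
        double charge;
    };

    double step(const struct particle *src) {
        struct particle p;
        memcpy(&p, src, sizeof p);   /* usually inlined as vector moves */
        /* Scalar code immediately consumes the copied data, so the copy
           and the arithmetic share one instruction stream. */
        return p.pos[0] + p.vel[0] * p.mass;
    }

Splitting that across two chips would mean synchronizing on a copy that takes only a handful of cycles when done in place.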


Could the compiler create a binary that had those instructions running on multiple processors? I see now I have some googling/reading to do about how you even use multiple processors (not cores) in a program.


That's what we call the magic impossible holy grail parallelizing compiler.


Good to know before I run off looking for the answer :)


The technological knowledge to do this is years and years away.


> The vector instructions can't really be farmed out because they can be scattered inline with regular scalar code.

If you believe this, you won't believe what's in this box[1].

[1]: https://www.sonnettech.com/product/egfx-breakaway-puck.html

> A memcopy of a small to medium-sized struct might be compiled into a bunch of 128bit mov for example and then immediately working on that moved struct

I'm not sure that's true: rep movs is pretty fast these days.


> If you believe this, you won't believe what's in this box[1].

There's a fundamental difference between GPU code and vector CPU instructions, though. GPU shader instructions aren't interwoven with the CPU instructions.

Yes, if you restrict yourself to not arbitrarily mixing the vector code with the non-vector code, you can put the vector code off in a dedicated processor (GPU in this case). The GP explicitly stated that a lack of this restriction prevents efficiently farming it off to a coprocessor.


> I'm not sure that's true: rep movs is pretty fast these days.

That's only true if you target skylake and newer. If you target generic x86_64 then compilers will only emit rep mov for long copies due to some CPUs having a high baseline cost for it. There's some linker magic that might get you some optimized version when you callq memcpy, but that doesn't help with inlined copies.


I think people with computers more than five years old already know that their computer is slow.

Why exactly do you think seven-years-old is too-old, but five-years-old isn't?


That is irrelevant. The default target of compilers is some conservative minimum profile. Any binary you download is compiled for wide compatibility, not to run on your computer only.


That’s different. Rendering happens entirely on the GPU, so the only data transfer is a one-way DMA stream containing scene primitives and instructions.


There's absolutely no reason it _has_ to be one-way: It's not like the CPU intrinsically speaks x86_64 or is directly attached to memory anyway. When inventing a new ISA we can do anything.

And if we're talking about memcpy over (small) ranges that are likely still in L1 you're definitely not going to notice the difference.


By definition a co-processor won't share the L1 cache with another processor.


Exactly.


Then you will face the same problems that GPUs suffer from. Extremely high latency and constrained memory bandwidth. Sending an array with 100 elements to the GPU is rarely worth it. However, processing that array with vector instructions on the CPU is going to give you exactly the speedup you need because you can trivially mix and match scalar and vector instructions. I personally dislike GPU programming because GPUs are simply not flexible enough. Either it runs on a GPU or it doesn't. ML runs well on GPUs because graphics and ML both process big matrices. It's not like someone had an epiphany and somehow made a GPU incompatible algorithm run on a GPU (say deserializing JSON objects). They were a perfect match from the beginning.
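
To make that concrete, here's a minimal sketch (SSE2 intrinsics, so x86-only, with the 100-element array from the example above) of the kind of tiny-array work that isn't worth a GPU round trip but is nearly free as inline vector code:

    #include <emmintrin.h>  /* SSE2 */
    #include <stdio.h>

    #define LEN 100

    int main(void) {
        float data[LEN];
        for (int i = 0; i < LEN; i++)
            data[i] = (float)i;

        /* Vector part: scale 4 floats at a time. Far too little work to
           justify shipping to a GPU, but nearly free as inline SIMD. */
        __m128 scale = _mm_set1_ps(0.5f);
        int i = 0;
        for (; i + 4 <= LEN; i += 4) {
            __m128 v = _mm_loadu_ps(&data[i]);
            _mm_storeu_ps(&data[i], _mm_mul_ps(v, scale));
        }
        for (; i < LEN; i++)         /* scalar tail */
            data[i] *= 0.5f;

        /* Scalar part: branchy logic immediately consumes the results. */
        float sum = 0.0f;
        for (int j = 0; j < LEN; j++)
            if (data[j] > 10.0f)
                sum += data[j];
        printf("%f\n", sum);
        return 0;
    }

The vector and scalar halves interleave freely in the same function, which is exactly what you give up the moment the data has to cross a bus to a coprocessor.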


This is not an area of expertise for me, so is there a reason to not offload vector processing to the GPU and devote the CPU silicon to what it's good at, which is scalar instructions?


There are many reasons. The latency of getting data back and forth to the GPU is a pretty high threshold to cross before you even see benefits, and many tasks are still CPU bound because they have data dependencies and logic that benefit from good branch prediction and deep pipelines.

Many high compute tasks are CPU bound. GPUs are only good for lots of dumb math that doesn't change a lot. Turns out that only applies to a small set of problems, so you need to put in lots of effort to turn your problem into lots of dumb math instead of a little bit of smart math and justify the penalty for leaving L1.


Yes, communications overhead. SIMD instructions in the CPU have direct access to all the same registers and data as regular instructions. Moving data to a GPU and back is a very expensive operation relative to that. The chips are just physically further away and have to communicate mostly via memory.

Consider a typical use case for SIMD instructions - you just decrypted an image or bit of audio downloaded over SSL and want to process it for rendering. The data is in the CPU caches already. SIMD will munch it.


For certain professions, like media editing, vector instructions help. But for your average Facebook / Netflix / Microsoft Word user - the kind of user 95% of users are - there is less benefit to vector instructions.


Are you saying Facebook, Netflix and Microsoft Word don't require media processing? Pretty sure you'd see plenty of SIMD instructions being executed in libraries called by those applications.


AVX is widely used in things as basic as string parsing. Does your application touch XML or JSON? Odds are good that it probably uses AVX.

Does your game use Denuvo? Then it straight-up won't run without AVX.

People are stuck in a 2012 mindset that AVX is some newfangled thing. It's not, it's used everywhere now. And it will be even more widely used once AVX-512 hits the market - even if you are not using 512-bit width, AVX-512 adds a bunch of new instruction types that fill in some gaps in the existing sets and extends them with GPU-like features (lane masking).
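
For a taste of what that looks like, here's a heavily simplified sketch (plain SSE2 rather than AVX, and far cruder than what a real SIMD parser such as simdjson does) of scanning a buffer 16 bytes at a time for a structural character:

    #include <emmintrin.h>  /* SSE2 */
    #include <stdio.h>
    #include <string.h>

    /* Find the first '"' in buf, checking 16 bytes per iteration.
       Real parsers do far more per pass, but the pattern -- a wide
       compare followed by a movemask -- is the same. */
    static long find_quote(const char *buf, long len) {
        const __m128i needle = _mm_set1_epi8('"');
        long i = 0;
        for (; i + 16 <= len; i += 16) {
            __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
            int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
            if (mask)
                return i + __builtin_ctz(mask);
        }
        for (; i < len; i++)         /* scalar tail */
            if (buf[i] == '"')
                return i;
        return -1;
    }

    int main(void) {
        const char *json = "{ \"name\": \"value\" }";
        printf("first quote at %ld\n", find_quote(json, (long)strlen(json)));
        return 0;
    }

That's why even "boring" workloads like loading a config file or rendering a web page end up executing vector instructions.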


Are you saying that iPhones and iPads are bad at Facebook, Netflix, and Microsoft Word? If they are, the end user certainly can’t tell. If they aren’t, then it doesn’t really matter does it?


Phones are much more reliant on having hardware decoders for things like video while desktops can usually get away with a CPU-based implementation, yes.


Sure but the same is true about performance in general.


That's not really true. Single-threaded scalar performance is still super important for the everyday responsiveness of laptop/desktop systems. Especially for applications like web browsing which run JavaScript.


Your UI is slow because of IO and RAM and O(n^2) code, not CPU. Look at your activity monitor.


> It's worth noting that Geekbench is a pure microbenchmark.

This isn't just a note, it's an important clarification.

Microbenchmarking is used in lieu of proper benchmarking because you can't do proper benchmarking.


How was Apple able to get here? What secret sauce does Apple have in its chips that's beating out Qualcomm, Intel and AMD?


Apple is willing to trade die space (i.e. manufacturing cost) for performance in a way that their competitors are not.

Anandtech architecture reviews are helpful, as always. Worth reading for a page or two from the linked page:

https://www.anandtech.com/show/14892/the-apple-iphone-11-pro...

https://www.anandtech.com/show/13392/the-iphone-xs-xs-max-re...

The short of it is: massively wide execution units, massive amounts of SRAM, massive amounts of cache at all levels. There's no real "secret sauce", they're just willing to pay to make an incredibly fat core and Qualcomm and company are not.

They have 16 MB of system cache on A13 for 2 high-performance and 4 low-performance cores, which is as much as a 9900K gets for 8 cores and as much as Zen2 gets for 4 cores. Plus another 8MB per big core on top of that (so up to 24MB per core in single-threaded mode), and 4 MB per small core.

It helps that they're a vertically-integrated company, they don't have to sell their processors on the open market at competitive prices such that an OEM can also make a profit selling a finished product at competitive prices, they just sell the finished product.


It's understandable that Apple's A-series CPUs beat other ARM chips. But I can't see why they're competitive with big Intel/AMD chips despite lower core clocks. Is x86-64 really the problem?


I won't buy it until I see some better-done tests.

Something like Cinebench run 10 times in a row, taking the average of the results, would be more meaningful.

Also, the benchmark has to enable or disable optimizations consistently on all platforms. Some people on Reddit claim that Geekbench is highly optimized for ARM and less optimized for x86.


That is something I've been wondering too. Is such a huge difference in perf/watt due to the ISA?


I think you are misreading the L2 sizes. It looks like 1x8MB L2, 1x4MB L2, and 1x16MB system cache. So you are correct that a lone thread could get up to 24MB of cache, but that's not per core. It's a total of 28MB of cache on the die.

Zen2 has 2x16MB L3 and 2x4x512KB L2 per chiplet (36MB) so it's not like Apple is throwing down afore-unheard-of quantities of SRAM. It's true a single A13 thread has much more accessible L2 capacity, though.


Apple is arguably making some better design trade-offs. This is possible due to them making money on the whole computer/device, and not just the CPU as Intel/AMD/others need to do. So while the others are doing everything possible to both minimize die space and likely keep around a bunch of compatibility cruft that could probably go, Apple is free to go in a different direction.

The 5775c (https://wccftech.com/intel-broadwell-core-i7-5775c-128mb-l4-...) was a good example of a no-compromises (from a cache standpoint) CPU from Intel that just annihilated their other CPUs at the time... it's not that hard to do, provided you're willing to pay the price for it somewhere else.


Two reasons for x86 (IMO) and three reasons for ARM.

I've thought for years that the overhead of the extra decoder hardware and legacy cruft was non-trivial (though Intel claims that's not true). The evolution of ARMv8 (where ARM went much closer to its RISC roots) seems to disagree. This explains the performance-per-watt issue (and potentially some IPC).

That said, scaling IPC (instructions per clock) seems to have a pretty big limit. x86 has basically hit a wall and it's been lots of time and research for small gains. Additionally, the biggest challenges in large systems is that the cost to do a calculation on some piece of data is often less than the cost to move that data to and from the CPU. As Apple increases cache size, frequency, and starts dealing with bigger interconnect issues, I suspect we'll see a distinct damper on their performance gains.

Qualcomm (and ARM as the designers of the core) has a very different problem to solve. They can't make money off of software. They make money when they sell new chips and they make more money from new designs than from old ones. This means incremental changes to ensure a steady revenue stream. Since Apple having a fast, proprietary CPU doesn't actually affect Qualcomm or ARM, they most likely don't even see themselves as in direct competition. Most people buy Android or iOS phones for reasons other than peak CPU performance and Qualcomm is fairly competitive with a lot of these (esp actual power usage).

A further complication is that they also need "one design to rule them all". They can't afford to make many different designs, so they make one design that does everything. Apple doesn't need to spend loads of time and money trying to optimize the horrible aarch32 ISA. Instead, they spend all that time on their aarch64 work. ARM and Qualcomm however need to add that feature so the markets that want it still buy their chips.

Apple shipped their large 64-bit design only a couple years after the ISA was introduced. Put simply, that is impossible. It takes 4-5 years to make a new, high-performance design. It took ARM 3 years for their small design (basically upgrading the existing A9 to the new ISA) and closer to 4.5 years to actually ship their large design (A57) and another year for a "fixed design (A72, though it's actually a different design team and uarch). Though the gap has been closing, 2.5 years in the semicon business is an eternity.

A crufty ISA and non-CPU scaling problems seem to explain Intel/AMD. A late start, bigger market requirements, and perverse incentives against increasing performance seem to explain ARM/Qualcomm


It's hard to buy that the ISA really has anything to do with it. As you mention, Apple has a fairly narrow market target for their cores. Both Intel and AMD are basically building server cores (can you say threading?) and selling them as client devices, mostly because that is where the real money for them is. Apple OTOH is building a client/mobile core, and they benefit from a number of "features" that they enable, which are known performance problems in the desktop/etc space but continue for legacy reasons. Combined with Intel basically standing still for the last ~5 years, the tables have reversed as far as who is ahead on process+microarch.

Basically a lot of apples advantages are:

1: Complete vertical control of compiler+OS+hardware

2: Plenty of margin to spend on extra die

3: More advanced process at TSMC

4: Very narrow focus; Apple has only a few models of iPhone+iPad, whereas Intel has dozens of different dies they modify/sell into hundreds of product lines, so everything is a compromise.

Any of those four give them a pretty significant advantage, the fact that they benefit from all four cannot be discounted.


aarch32 decode is far less complex than x86 and aarch64 is even less complex than that. On the power consumption side, decoders definitely make a difference. They use tons of power and a huge number of instructions means having a huge, power-hungry instruction decode cache.

In addition, complex instruction decoding requires more decode stages. This isn't a trivial cost. Intel can shave off several stages if they have a decode cache hit and that's not including the ones that are required regardless (even the simple Jaguar core by AMD has 7+ decode stages possible). Whenever you have a branch miss, you get penalized. Fewer necessary decode stages reduces that penalty.


OTOH, you have x86 using what is effectively a compressed instruction encoding, and a trace cache (although it's advantageous enough that ARM designs are apparently using them now too), which reduces the size of the icache for a given hit rate. So the arch loses a bit here and gains a bit elsewhere. It's the same thing with regard to TSO: a more relaxed memory model buys you a bit in single-threaded contexts, but frequently TSO allows you to completely avoid locks/fencing in threaded workloads, which are far more expensive.

So people have been making these arguments for years, frequently with myopic views. These days what seems to be consuming power on x86 are fatter vector units, higher clock rates, and more IO/Memory lanes/channels/etc. Those are things that can't be waved away with alternative ISAs.


If it were vectors, clocks, and memory, then Atom would have been a success, but even stripping out everything resulted in a chip (Medfield) that under-performed while using way too much power.

Either the engineers at Intel and AMD are bad at their job (not likely) or the ISA actually does matter.


Atom is a success, just not where you think it is. The latest ones are quite nice for their power profile and fit into a number of low-end edge/embedded devices in the Denverton product lines. Similarly, the Gemini Lake cores are not only in a lot of fairly decent low-end products (pretty much all of Chuwi's product lines are N4100: https://www.chuwi.com/), but they also make perfectly capable, very low-cost digital signage devices/etc.

So not as sexy as phones, but the power/perf profiles are very competitive with similar ARM devices (A72). If you compare the power/perf profile of a Denverton with a part like the SolidRun MACCHIATObin, the Atom is way ahead.

Check out https://www.dfi.com/ for ideas where Intel might be doing quite well with those Atom/etc devices.


Conversely, if the instruction set was the main factor, you'd expect Qualcomm and Samsung also to have ARM processors with a similar power to performance advantage over Intel chips.

The reality is just that Apple is ahead in chip design at the moment.


They are 2 years behind Apple and slowly catching up.

When Medfield came out, Apple didn't have its own chip and x86 still lost. It was an entire 1.5 nodes smaller and only a bit faster than the A9 chips of the time (and only in single-core benches). The A15 released not too long after absolutely trounced it.


>When Medfield came out, Apple didn't have its own chip

>It was an entire 1.5 nodes smaller and only a bit faster than the A9 chips of the time

You seem to have the chronology all mixed up here. Medfield came out in 2012. The A9 came out in 2015. Apple was already designing its own chips in 2012. (The A4 came out in 2010.)


> Apple shipped their large 64-bit design only a couple years after the ISA was introduced.

Actually ARM cores were available earlier than that, just nobody wanted to license them until the elephant in the room (Samsung) forced everybody to follow.


Timeline

2011 -- ARM announces 64-bit ISA

2012 -- ARM announces they are working on A53 and A57 and AMD annouces they'll be shipping Opteron A1100 in 2014.

2013 -- The Apple A7 ships doubling performance over ARM's A15 design.

2013 -- Qualcomm employee leaks that Apple's timeline floored them and their roadmap was "nowhere close to Apple's" (Qualcomm seems to switch to A57 design around here in desperation -- probably why the 810 was so disliked and terrible).

2014 -- Apple ships the A8 improving performance 25%.

early 2015 -- Samsung and Qualcomm devices ship with A57. Anandtech accurately describes it saying "Architecturally, the Cortex A57 is much like a tweaked Cortex A15 with 64-bit support." Unsurprisingly, the performance is very similar to A15.

late 2015 -- Apple ships A9 with a 70% boost in CPU performance.

later 2015 -- Qualcomm ships the custom 64-bit Kryo architecture as the 820. It regresses in some areas, but offers massive improvements in others, for something close to a 30% performance improvement over the 810 with A57 cores.

2016 -- AMD finally launches the A1100. ARM finally ships the A72 as their first design really tailored to the new 64-bit ISA.

Final Scores

Apple -- 2 years to ship new high-performance design

ARM -- 4 years to ship high-performance design, 5 years for new design

Qualcomm -- 4.5 to 5 years to ship new high-performance design

Sorry, something's definitely fishy. Nobody can design and ship that good of a processor in less than 2 years.

https://www.hardwarezone.com.my/tech-news-qualcomm-employee-...


Isn't the current Intel Core line an evolution of the Pentium M (2003), itself an evolution of the Pentium III (1999)? Apple starting a fresh design with up to date constraints may have given them room for improvements I guess.


That doesn't explain why AMD and Qualcomm can't keep up either.


Sandy Bridge was the last really big architectural change. It seems heavily inspired by the Alpha EV8 design (Intel acquired the Alpha technology in 2001), with, of course, a very different decoder section (they wrap the x86 decoder around a RISC architecture).

https://www.realworldtech.com/alpha-ev8-wider/


Vertical integration? Even if their chips end up being a bit slower in the end, they'd probably increase their overall profit margins and get increased flexibility aligning their development cycles.


>It's worth noting that Geekbench is a pure microbenchmark. The iPhone will not sustain performance as long as the others. The point is that Apple could solve this when moving to bigger devices.

Is that inherent to the architecture or is this a self-imposed limitation by Apple since it has to sip power and run without any active cooling?

Also, the A-series chips seem to fall down in comparison against the Intel Macs on multi-core performance which seems like it would matter for anyone who needs a desktop.


> extremely far ahead Apple seems to be in terms of CPU power and efficiency

RISC finally coming into its own.

For the longest time, Intel was able to fend off much better architectures simply by being a fab generation or two ahead, more clock speed, more transistors, more brute force.

Not to belittle the engineers wringing seemingly impossible performance out of the venerable architecture, but the architectural limitations always mean extra work and extra constraints that have to be worked around.

And now that Dennard scaling is dead, and Moore's law is wheezing and not helping all that much for our mostly serial workloads, they just can't compensate for the architecture any longer, at least not against a determined, well-funded and technically competent competitor that's not beholden to Wintel.

I remember when the Archimedes came out and just offered incredibly better performance than the then prevalent code-museums, 386 and 68K variants, at incredibly lower transistor counts. The 486 and 68040 were able to compete again, but with vastly larger transistor budgets (and presumably power budgets as well, but we didn't look at that back then).

Oh, and can we have our Transputers now? Pretty please, XMOS?


I'm not sure how you can call that a victory. ARM and AMD are only ahead because of superior manufacturing processes. That's exactly the thing you accused Intel of doing.


How will this change relying on libraries like Intel's MKL library if Apple is using their own chips?


One of the general downsides of Apple development is how often they change absolutely everything. Like the time they switched from 68K to PowerPC. Or MacOS classic to Mac OS X. Or PowerPC to Intel.

Apple provides brief windows of automated compatibility, but code untouched since 2003 won’t run on a mac today, and code untouched since 1986 wouldn’t run on a 2003 mac.


Apple will say developers should have used Core Whatever instead of Intel libraries.


I'm not sure the difference is so great. It's all about the node about the node... (to the tune of "all about the bass" :)

I just got a 2020 MacBook Air with an Ice Lake 10th generation 10nm X64 chip in it.

It's a quad-core, newer core rev, has AVX2 and a bunch of other stuff.

It runs very noticeably cooler than a 2019 Air with a 14nm Amber Lake chip in it. Battery life is also noticeably better. When I read the spec differences I basically traded my 14nm Amber Lake Air in for the 2020 Air (that, and the better keyboard). It's a great machine.

The difference between the Amber Lake and Ice Lake core designs is not substantial and the Ice Lake has twice the cores and a larger on-board GPU, so how is it so noticeably cooler? I can fully max out all four cores of the 10nm Ice Lake and it doesn't get as hot as the 14nm Amber Lake did at lower loads! The answer is obviously the process node: 10nm vs 14nm. It's a more efficient chip at the physical circuit level.

ARM has some intrinsic power advantages over X64. The biggest thing is that ARM instructions come in only two or three sizes and it's easy to size and decode them, while X64 instructions come in sizes from one byte to 15 bytes and are a massive pain to decode. That decode cost comes in energy and transistors, but it's worth pointing out that this is a mostly fixed cost that shrinks as a percentage of the overall CPU power/transistor budget as the process node shrinks. In other words, the cruftiness of X64 remains the same as you go from 22nm to 14nm to 10nm.
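
A toy sketch of why decode width matters (the variable-length table below is made up; it just stands in for x86's prefix/opcode/ModRM parsing): with fixed-width instructions you can find the Nth instruction with pure arithmetic and decode many in parallel, while a variable-width encoding forces you to walk every prior instruction's length serially, or speculate.

    #include <stdio.h>
    #include <stdint.h>

    /* Fixed 4-byte instructions (AArch64-style): the Nth instruction's
       offset is just arithmetic, so decoders can fan out in parallel. */
    static size_t nth_insn_fixed(size_t n) {
        return n * 4;
    }

    /* Variable-length instructions (x86-style): the offset of instruction N
       depends on the length of every instruction before it, a serial
       dependency the decoder has to resolve or guess around. */
    static size_t nth_insn_variable(const uint8_t *lengths, size_t n) {
        size_t off = 0;
        for (size_t i = 0; i < n; i++)
            off += lengths[i];
        return off;
    }

    int main(void) {
        const uint8_t lens[] = { 1, 3, 5, 2, 7, 4, 15, 2 };
        printf("fixed:    insn 5 starts at byte %zu\n", nth_insn_fixed(5));
        printf("variable: insn 5 starts at byte %zu\n", nth_insn_variable(lens, 5));
        return 0;
    }

Real decoders of course do this in hardware with length speculation and a decoded-uop cache, but that serial dependency is the part that costs energy and pipeline stages.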

Other than the ugly decode path the ALU, FPU, vector, crypto, etc. silicon is not fundamentally different from what you'd find in a high-end ARM chip. A lot of the difference is clearly in the fabrication. ARM chips have been below 14nm for a while, while Intel X64 chips have lagged.

(Tangent: since the actual engine block is largely the same, I've wondered if Apple might not slap an X64 decoder in front of their silicon in place of the ARM64 decoder and make Apple X64 chips?! I am not a semiconductor engineer though, so I don't know how hard this would be and/or what IP issues would prevent this. Probably unlikely but not impossible. They certainly have the cash and leverage to muscle Intel into licensing anything they need licensed.)

If Intel gets its act together with process nodes and/or starts using other fabs who are at tighter lower power nodes, the advantage will shrink a lot. AMD already has mobile chips that are close to Apple's ARM chips in performance/watt, partly because they are fabbed at TSMC at 7nm.

BTW: I'm not claiming Apple's chips aren't impressive, and as long as they don't lock down MacOS and make it no longer a "real computer" I personally don't mind if they go to ARM64. Also: the fact that Apple's chips only get this great performance in bursts is mostly due to power and cooling constraints on fanless thin phones and tablets. In a laptop with better cooling or a desktop they could sustain that performance no problem.


> Core i7 8086k

TIL of this commemorative naming. Sole comment I could find on HN 2 years ago https://news.ycombinator.com/item?id=17409849


> It's worth pointing out how extremely far ahead Apple seems to be in terms of CPU power and efficiency.

Nope. But Apple put a MASSIVE amount of cache on their chips.

It's not rocket science, but it is expensive.


>> By sticking to Intel or even x86 in general, there is ample evidence Apple is leaving a lot of performance on the table. Not just in MacBooks - but for the Mac Pro too.

So we should expect lock-in on Mac Pro chips as well. Exactly what people were asking for. Another brick you can't upgrade without paying the cost of a new machine.


Having faster processors is part of what's wrong with our computing industry in general. Processors on laptops and desktops have become 1000x faster, but has the software kept up?

No - software has gotten slower over the ages. Apple, Qualcomm and Intel can make ever faster processors with however many cores, but do we have software able to utilize those, eh? Run a JS-heavy site or app and you'll see most processors heat up these days, whether mobile or desktop.

Most programming languages can't delegate work to cores as smoothly as Erlang and Elixir do. In Python, threads were a nightmare, but now with concurrent.futures or Dask at least we can utilize all cores.

tldr - we need to make faster software


It's not even "faster software" so much as eliminating the culture of Developer Convenience at the expense of User Experience. That's what got us Electron. I've actually seen comments on HN unironically describing the web as "the perfect app platform".

Unfortunately, the majority of users seem conditioned to accept software with awful performance, so there's no impetus for developers to upgrade their skills.


It's not about "developer convenience". It's about "developers are expensive".

I can build an application in electron 4-10x faster (at least) versus building the same application in C. If I'm costing a company $100-200 per hour, would they rather pay me for 4 months (500 hours and $50,000-100,000) or would they rather pay me for 1-2 years (2-4k hours) at a cost of $200,000 - $800,000?

What about when we multiply that by a team of 5-10 people? Don't forget that time to market is often incredibly important. Tell them 2 years and 8 million or 4 months at 1 million and what will they say?


You might build an application faster in electron...

OTOH, C isn't a GUI development environment. If you want to compare a C based environment you compare it with GTK/QT/winforms/etc.

In the end, as someone who has written GUIs in a wide range of tooling, I'm not sure there really is that much difference.

I've yet to see an Electron application with half the functionality of similar native applications. Electron maybe gets you bootstrapped faster, but then you bog down in basic data manipulation and functional behavior, because it turns out HTML/CSS/Javascript are absolutely terrible for building rich GUIs, even now, 20+ years after people first tried to do it. There are so many things people took for granted in the past (ex: grids with arbitrary sort, editing, and a scrollbar that represents where in the data you are) that are far more difficult in HTML than they are in more native solutions. Plus, the scalability is miserable: take your favorite framework and have it load 10k rows of data into a table. That was something you could do in VB/Delphi in the mid 1990s on a 486 in a matter of seconds. This is why pagination is so popular. Half a meg of actual data bloats up into half a gig when you try rendering it in Chrome/etc, so you're forced to leave it on the server and round trip for tiny bits.


It is because you want to hire inexpensive web developers to develop for desktop and get the check box ticked. Your average bootcamp webshit doesn't even know Big O notation. I don't expect them to be as productive as good developers either.

The web ecosystem is a big mess where trends change every month, and working in the web ecosystem requires looking things up a lot because no one bothers to master anything. It doesn't help that many web developers don't have solid foundations.

This is a trap managers generally fall into. Cheap developers aren't equivalent to competent developers, and their incompetence will cost you more than what you save by hiring them instead of a competent developer.


Don't worry, the future is wearables and more efficient mobile phones. Electron just won't run on your Apple Watch.


The only reason Electron isn't on your Apple Watch today is because of App Store Review.


"No one will ever need more than 640K of RAM." - Definitely NOT Bill Gates.


The mobile equivalents of Electron (PhoneGap/Cordova/Ionic) are already extremely popular.


Never used electron but I did see threads on how it’s very bloated. Are there other issues?


Electron is Chrome, and Chrome itself is pretty tightly optimised for what it does. The issue isn't actually Electron so much as using a platform designed for typesetting to build complex GUI apps, a task it was never designed for and isn't particularly good at.

But. That said. Whilst I'm no big fan of web apps, there are good reasons they're so prevalent. It's not merely about developer convenience. Native GUI toolkits can appear artificially performant because they're required by the OS and thus almost always resident, vs cross platform toolkits that may be used by only one app at a time. When you open the lid though the gap between an engine like Blink and something like Qt, JavaFX or Cocoa isn't that big. They're mostly doing similar things in similar ways. The big cost on the web is the DOM+CSS but CSS has proven popular with devs, so native toolkits increasingly implement it or something like it.


> Chrome itself is pretty tightly optimised for what it does.

My laptop battery would disagree. I get about an hour and a half with Chrome/Electron, four or five without.


Which laptop is that?

Note: I didn't say the web itself is highly optimal. Just that for a web browser, Chrome is pretty thoroughly optimised.


It appears to me that Chrome is perfectly optimized for speed at the cost of memory and power.


Sort of reminds me of Braess's paradox: you add a new road and then overall traffic slows. More bandwidth? Larger videos. More roads? More cars. Faster processors? More abstractions.


Also Jevons' paradox: the more efficient you make a process or resource, the more consumption of it will grow.


>tldr - we need to make faster software

Just use compiled, strongly typed languages.

No: Javascript, Python, PHP, Ruby.

Yes: Java, C#, C, C++, Rust.


Imagine one day we get a computer able to perform operations unimaginably faster than what we have now. It would process Snapchat's dog filters in femtoseconds, firing up and tearing down millions of kubernetes clusters every frame (because it would be easier to write it that way). Would it still make sense to try and optimize software? Wouldn't it be more constructive to solve real-world tasks instead?

Hardware is fast and cheap, and it's getting even faster and cheaper. It's perfectly fine to utilize this power, if it makes developing products faster, easier or cheaper.

Now, there are still cases when you need to send a machine to roam the mountains of another planet. This may justify doing some assembly.


> It would process Snapchat's dog filters in femtoseconds, firing up and tearing down millions of kubernetes clusters every frame (because it would be easier to write it that way).

No, it wouldn't. Those tasks would just become less efficient with time as developers stopped caring to optimize them, as has happened with the overwhelming majority of consumer software for the past several decades.


They would work well enough on the computers of that generation and be painfully slow on today's supercomputers. Yes, just like the software we have today.

I was trying to express that Electron (and the like) is not an inherently bad thing. It allows you to trade hardware capacity for an easier development experience. The developers who use it create useful software that works. And software that works in a given environment is exactly the point of the industry, is it not?


Not to mention that with the latest few generations of CPUs Intel has also increased burst/decreased base clock while promising unrealistic power draws.

In other words, Intel's CPUs are mostly about "burst" performance these days, too. They don't get anywhere the promised peak performance for any significant amount of time.

https://www.anandtech.com/show/13544/why-intel-processors-dr...


Oh boy, here comes the AMD peanut gallery.

The conclusion of your article is precisely about how motherboard manufacturers don't obey the official/nominal behavior and how that makes it irrelevant:

> Any modern BIOS system, particularly from the major motherboard vendors, will have options to set power limits (long power limit, short power limit) and power duration. In most cases, at default settings, the user won’t know what these are set to because it will just say ‘Auto’, which is a codeword for ‘we know what we want to set it as, don’t worry about it’. The vendors will have the values stored in memory and use them, but all the user will see is ‘Auto’. This lets them set PL2 to 4096W and Tau to something very large, such as 65535, or -1 (infinity, depending on the BIOS setup). This means the CPU will run in its turbo modes all day and all week, just as long as it doesn’t hit thermal limits.

Intel desktop processors will sustain boosts for an arbitrary amount of time. Yes, they will exceed the nominal TDP while doing so, but so do AMD processors (AMD's version of "PL2", which they call the "PPT limit", allows power consumption up to 30% higher than the nominal TDP while boosting, and there is no official limit to how long this state may last).

These limits are of course observed much more strictly on laptops since power/thermal constraints actually make a real-world difference there. But overall, Ice Lake perf/watt is competitive with Renoir and its IPC and per-core performance is actually higher than Renoir. You just get fewer cores.


While I don't know much about phones and suppose you are correct there, I have been disappointed with the hardware that they pushed to their macs the last 5 or so years. For instance, why did they ship newer macs without upgrading to the newest Intel chips?

I guess you can solve anything if you do it yourself, but that was perplexing for me that they would do that. I am not an expert on this, but this happened some time between 2016–2018.

Edit: I stuck with my upgraded 2013 model and even today it's fast and good for the job. Even my 2009 mac is still running. So I am not saying they can't pull it off, I am just wondering whether they give enough attention to their non-iPhone products.


Since I was downvoted, here are some discussions on what happened circa 2016–2018:

https://news.ycombinator.com/item?id=12816474

https://news.ycombinator.com/item?id=16130872

https://news.ycombinator.com/item?id=13364583

Don't bother with the links, the comments are more informative. In some of the comments people explain why it's Intel's fault and not Apple, in some comments people explain processor speeds. A lot of it is about the keyboard that they changed, but that is not related to this. My point was more that I think Apple should put the amount of effort into their laptops that they did in 2009, whether or not they do is subjective until we see their new processors.


"newest" might be unfair. I imagine there's a lot of product life-cycle / driver testing to ensure high quality.

Whereas other PC shops typically will just throw PCs together and assume they'll fix issues found in patches.

That's not going to fly for "just works" Apple product image - and why it's positioned and priced as a "luxury" product.


Well, I've purchased MacBook Pros, iPhones, and iPads in the past, and every single one got incrementally more sluggish with each OS update until it was unusable. This was in 2013-2015. It's been 5 years since I've owned an Apple product, but I just have a hard time believing Apple is head and shoulders above everyone else.


1. ARM-based CPUs have for years (decades) been much more power-efficient than Intel and AMD CPUs. This is regardless of Apple, and it's not really about being ahead - it's a different path in the design space.

2. If Apple were "far ahead" in terms of performance in general, they would probably have been using these chips in products which aren't smartphones.


wrt point 2: I always assumed that it's not only chip performance or efficiency, but that ecosystem and software compatibility are the larger issue.

It's not trivial to port x86 software to ARM, or even run an energy efficient emulator.

Is that correct?


I believe you are correct. I suspect a big part of Apple's increased popularity since moving to x86 has been the ability to run an alternate OS either via dual-boot or in a VM at native speeds. While it was possible to run Windows in a VM back in the PowerPC days, it was sloooow. While switching to ARM would be relatively trivial for Apple's own software and even OS X applications, the downside would be losing easy/performant/power efficient access to non-OS X applications. That said, I expect Apple to make that trade-off sooner rather than later... x86 compatibility isn't nearly as important today as it was 10 or even 5 years ago.


In most cases it's fine since Linux distros for arm have existed for a while. Getting proprietary vendors to support it is a different question.


> because it's not easy to run your own software on an iPhone

Then why do I need all this performance? Perhaps to run js on shitty websites. Well, maybe for gaming.


Alternatively, when you don't need to be running full throttle, you get better battery life. Which any consumer will appreciate. And is also perhaps an environmental win, if it allows them to get away with smaller batteries.

That said, the experience of trying to browse a local newspaper's website with NoScript turned off on my 2015 MacBook Pro tells me that, yes, JS on shitty websites is a problem. And not one that most people would find to be particularly avoidable. Heck, even GMail is getting to be noticeably slow on that computer.


The extreme slowness of the modern web honestly is a reason. I do a lot of my personal web browsing on an elderly (nearly 6 year old) iPad Air 2. For everything but the web, and for old websites, it's fine. For the "100MB of react crap" type of website, it's getting pretty painful, tho.


My personal laptop is a Lenovo X200s from 2008 or 2009 with 8GB RAM. I use it to write words, read PDFs, do some software development (most often C and "vanilla" web stuff), design electronics in Kicad, and casually browse the web... and I'd be happy with it as my only computing device forever, except for some websites it's starting to be a little slow, especially a well-known site that's meant for exchanging messages under 500 characters. There's some irony there.


Mine is an X220s, so slightly newer than yours, but still a modest i3, upgraded to 8GB RAM, 128GB SSD and dual-band WLAN. With a new 9-cell battery, I get ~6 hours of battery life, and it's fine for general browsing with a decent number of open tabs, even a few games (Darkest Dungeon and some emulated SNES games).

As long as it keeps ticking and I can get whichever spare parts I need on eBay or something, I'm not going to replace it anytime soon.

The only reason I have a more powerful desktop PC (still ~2011 vintage) and don't just use the X220s in a dock is that it struggles with a 1440p external monitor (full HD is fine, though) and I sometimes like to play more graphically intense games.


>As long as it keeps ticking and I can get whichever spare parts I need on eBay or something, I'm not going to replace it anytime soon.

Good luck with that, as the JavaScript crowd are in competition to slow down the web as much as they can.


I generally try to stay away from the worst offenders, most sites I use are relatively lightweight, like HN and various forums. Even Youtube isn't too bad, as long as you force it to not use the broken Polymer rendering on Firefox.


Do you use an ad blocker like uBlock Origin? That speeds up the web by a significant margin. I also use it to disable JS by default and enable it only when needed; that works well for me.


You will never be able to outrun bad software with good hardware. At best it's a rat race where devs target the nth-percentile machine and performance is effectively auction-priced, so you have to keep buying an above-"average" machine.

Just say No to bad software, starting with web ads.


>You will never be able to outrun bad software with good hardware.

We are doing just that. And one of the reasons is that better software costs more than faster hardware.


Note that these Apple chips won't run x86 or x64 code. Unless they bother with a Rosetta 2.0, gaming is out of the question if Apple makes this jump.


The Catalina move already killed off a lot of games where the developers didn't provide an updated (x86_64) binary. A move to ARM would kill more, of course, but it's not like there's not precedent.

I'm half-convinced that Apple killed 32bit support so early precisely to see how developers coped; if it had been really bad they could have re-introduced it in a point release of Catalina. As it was, it wasn't very bad and most developers complied, which is an argument in favour of an ARM transition being feasible.

The only other reason I can think of to so aggressively move to 64bit is security, but most of the apps that were stuck on 32bit were not that big a security concern.


Having worked at big companies, I have started to suspect that many recent Apple deprecations are detached from technical reasons or customer scenarios. They are playing internal games and don't care whether it makes sense, nor will they have any interest in reversing or revisiting the wrong call later on.

I am not just talking about 32-bit support. It shows up in a lot of random libraries that wind up deprecated and replaced with something less capable. That's a pattern I have seen a lot elsewhere and it's usually a bad sign for overall product quality.


Yes, I think the true reason Catalina is so incompatible with old programs is that they wanted to get the big compatibility break out of the way before announcing the ARM transition, which would then not look like such a big step.


This does not make sense. I can give a counter-example: the education market, especially the higher-education market. This market has existed for so long, and aged so well, that enterprise-level deployments of Macs most likely live mostly there now. The downside of this market is that it is slow and reluctant to change. The people who make research-related or educational software are never quick to make the big jump to a new architecture, and they might not be able to hire more people to do it. The customers also hate those kinds of changes, both the IT departments and the researchers; nobody wants to find that their code no longer runs properly on the new machines.


But that is already true. You cannot buy a Mac any more which runs 32-bit software. So Apple is obviously willing to make things miserable for a considerable part of their user base. Me included. I will avoid anything with Catalina, because I still have a few 32-bit programs I want to run and can't upgrade. As long as macOS runs on x86 hardware, there is not much justification for such a break. Yes, they clean up the software stack, but at a very high price.

The only good reason I can imagine is, that when Tim Cook announces macOS on ARM, he will claim "runs everything that runs on Catalina".


Doesn’t that invalidate your earlier claim that “it wasn't very bad and most developers complied, which is an argument in favour of an ARM transition being feasible”? It doesn’t matter how nice the experience is for those who upgrade if a significant number of people avoid upgrading because the experience would be terrible. That’s selection bias.

Sure, Coke sales are down 50%, but the customers who are buying New Coke say they like it just as much as the old recipe!


Which earlier claim of mine? Are you mistaking me for another poster? My point was that they already had the breaking change, so the change for Catalina users - which is certainly only a part of the Mac user base; many stayed on Mojave because of the 32-bit support - will be smooth.


Well, if you need to run diverse software, macOS isn't the best choice.

The best bet is Windows, since it will happily run software made 25 years ago, back in the Windows 95 era.

To use macOS, you have to be satisfied using the apps that run on macOS: Adobe apps, Apple apps, Microsoft Word and open source software.


Not sure what you mean by "diverse". There is a lot of legitimate macOS software which is 32-bit only and no longer updated - for example because the company went out of business or couldn't justify the effort of a port. Cutting off support for these programs is a harsh step. While I can understand that Apple doesn't want infinite backwards compatibility, it hits a lot of users. One reason for this might be preparation for the bigger switch to a new CPU architecture.


Apple's market is: creatives, iOS developers, some other developers and Mac enthusiasts.

Enterprise is Windows territory, most businesses are Windows territory, education is Windows territory, most home users / small businesses are also Windows territory.

So if the Adobe apps get an ARM build, that will satisfy a huge part of their user base. The rest would use Apple tools, which will get ARM builds, and open source tools, which already have or will have ARM builds.


> I'm half-convinced that Apple killed 32bit support so early precisely to see how developers coped; if it had been really bad they could have re-introduced it in a point release of Catalina.

Is it really so easy to introduce 32-bit support back?


I mean, it depends on exactly what they did to get rid of it. For many Linux distros you add 32bit support back by "apt-get install glibc-x86" or similar.

_If_ they were taking the approach of a deliberately early deprecation, which it seems like they were given the timing relative to the rest of the industry, it would only make sense to make it be an easily reversible decision.


Have you looked at the iOS App Store lately? There are tens of thousands of games there. Apple Arcade has a small (100+) but well-curated selection of good games that run on both iOS and MacOS.

Oh, you meant PC games? Sure, the Mac only has a small percentage of (for example) Steam games, but that percentage is steadily rising - it's now over 25%. Switching architecture is unlikely to present a major problem for most developers, especially given that they're probably using Unity or Unreal Engine.


>Switching architecture is unlikely to present a major problem for most developers, especially given that they're probably using Unity or Unreal Engine.

As a former game developer I can tell you that that is an issue. Apple is totally against cross-platform tools. They break compatibility as much as they can.

Framework change, architecture change and so on.

Instead of going with OpenGL ES, Vulkan, or OpenCL, they made Metal.

If the user base is large enough, as is the case with iOS, there's an incentive to go through the pain of releasing for that platform. But that isn't the case with macOS. Maybe for Adobe it's worth it to spend resources building software for macOS, but for other companies that might not be the case.

Anyway, you make much more money by targeting PlayStation and Xbox, and the resources needed are the same in money and man-hours, so it makes sense to target macOS last, if ever.

And no, not everybody is using Unity and Unreal. That is true mostly for indies.


Aren't apps already submitted as bitcode? They don't have to emulate, just recompile.


LLVM bitcode is still architecture-specific. That is, bitcode generated targeting x86 is not compatible with bitcode targeting 32-bit ARM, which in turn is not compatible with bitcode targeting AArch64. The reason apps are distributed as bitcode is to enable additional optimisations based on the exact chip that will run the code (i.e. the equivalent of -march=native).
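A minimal sketch of why, assuming clang (the triples, commands and file names in the comments are illustrative, not Apple's actual pipeline): the same trivial C source produces different, mutually incompatible bitcode depending on the target, because the target triple, data layout and ABI details are baked into the IR:

    /* add.c -- trivially portable C, but the bitcode emitted from it is not.
     *
     * Illustrative commands (triples/file names are just examples):
     *   clang -target x86_64-apple-macosx -emit-llvm -c add.c -o add_x86_64.bc
     *   clang -target arm64-apple-ios     -emit-llvm -c add.c -o add_arm64.bc
     *
     * Each .bc carries its own "target datalayout" and "target triple = ..."
     * lines, and arch/ABI-specific details (pointer sizes, struct layouts,
     * calling conventions) are already lowered into the IR, which is why
     * x86 bitcode can't simply be handed to an ARM backend or vice versa. */
    long add(long a, long b) { return a + b; }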


https://www.highcaffeinecontent.com/blog/20190518-Translatin...

This article appears to successfully statically translate an arm64 binary into an x86_64 binary using bitcode.


Using LLVM IR merely solves the problem of being able to compile to a specific instruction set. You still need a compatibility wrapper for all the platform-specific APIs, like Wine or Windows-on-Windows. If Apple puts in the effort, then it might work out.


On Mac? I don’t think so, no, because you can move App Store apps to different machines on USB keys.

Also, a considerable number of Mac apps don’t use the App Store for distribution.


No, the Mac compiler target has never supported bitcode. And LLVM bitcode doesn’t work like that anyway.


Or run virtual machines like Android users do.


And more future-proof, too.


A lot of people who don't need it already buy it for the image and social status. Apple doing this for workstation-type machines too just makes even more sense.



