Hacker News

Why are they comparing it against e.g. Skylake, which is 6 years old?


It's worse than that. There's literally nothing similar about these systems. One is a system designed for the supercomputer MareNostrum 4 and the other for MareNostrum 5 (a completely different system)... So old CPU, but also different network cards, topology, memory (capacity and speed), storage system, operating system (SUSE from 2016 vs Ubuntu 22)... and so on. For example, they went from 10Gb Ethernet to 200Gb InfiniBand.

And then they took all of the performance improvements that each of these contribute... and attributed them to the Nvidia CPU.


This is a misrepresentation. Included in the analyses are single-node runs, which don't care about network cards etc. This is a platform comparison, not a CPU showdown; among the questions here is whether Grace-based nodes are feasible at all for production HPC. The answer is a tentative yes, although I still have concerns about cooling at this density in a general-use (i.e. highly fluctuating) workload.

But mostly, these numbers are for their users, who are aware the system contract has been awarded but want to know what to expect when their workloads hit the new system.

Incidentally, MareNostrum 4 has a 100 Gbit Omni-Path fabric. I'm sure they'd love to test against latest Omni-Path, but Intel dumped the tech, so our choices these days are 200/400 Gbit Ethernet or similar-throughput InfiniBand.


The idealist in me says "that's what they have, and these organizations don't have the cash to buy new stuff for benchmarking".

The cynic in me says "that's the only way for ARM to even appear competitive in HPC".

Note: real HPC, not "serve-ads-or-train-ad-serving-ai-models-with-the-highest-flop-per-watt" HPC.


Bear in mind at least one of those notes the code wasn’t optimised for ARM while all the meaningful HPC code in existence has been painstakingly optimised for Intel for decades.


Right, the Arm stuff is probably in the "it runs" camp, largely because it's SVE, which is barely available, and the code written to utilize it has probably mostly been tuned for the A64FX, or maybe the Graviton V1s.

Both of which have considerably different memory and vector size/issue characteristics. So there are three different SVE variations now, and the previous two show significant uplift when given custom tuning (e.g. see gcc -mtune=neoverse-512tvb vs the custom A64FX compiler benchmarks). Arm put a bunch of effort into creating an instruction set that is microarch agnostic, but it hasn't exactly worked the first couple of tries. Maybe that will be fixed with V2 and all SVE cores going forward.


Indeed. Right now there is about 0% HPC code tuned to Grace and Grace Hopper.

I'd love it if Nvidia made reasonably priced Grace and Grace Hopper ATX boards (or a stylish Nvidia Studio desktop, priced like a Mac mini) that developers could buy, so that we can do our best to optimize code for Grace for free in our spare time.

Same goes for AMD and their MI300 family, in case AMD is listening. There is less to be gained, as the x86 side is pretty well cared for ATM, but, still, I'd love to see such a beast.


Because they are comparing it with the system they have and a new system they might eventually have.

It's their own analysis for their own benefit.


They should compare what they have vs Grace, what they have vs Genoa, and what they have vs Emerald Rapids.


How does that help? Surely they'd still need to compare against buying a new x86 system.


They already awarded the contract. The question users will be asking is "how much faster is the new computer compared to the one we've been using?" These are the answers.


If you scroll down to the Stony Brook results, they compare it to more modern CPUs.

I've had access to one of these (interested in it for its massive amounts of IO bandwidth for the power budget), and it's stunningly fast. And yes, it runs FreeBSD.


> I've had access to one of these (interested in it for its massive amounts of IO bandwidth for the power budget), and it's stunningly fast. And yes, it runs FreeBSD.

Are we going to get Serving Netflix Video Traffic at 1600Gb/s and Beyond anytime soon? :)



