Perfect fidelity isn't there but all popular browsers have a "save page" functio...

oconnor663 · on Aug 10, 2020

That's exactly what the parent is criticizing. The problem with save page is that the HTML you save still contains tons of links to server resources, particularly CSS and JS. Of course those links will work if you look at the saved page immediately after you save it. The problem is that if you come back later, sometimes even just the next day, they no longer work. A lot of JS file names are auto-generated random numbers, produced by packaging systems rather than humans, which change whenever the developers edit their JS. They aren't designed to be stable.

There are tools that try to fetch those links and update the HTML to point to the local copy. But those tools can only go so far. JS is allowed to fetch new files dynamically, and there's no reliable way to look at a piece of code and automatically figure out what it's going to fetch when you run it.
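As a rough illustration of what such tools do and where they stop, here is a minimal Python sketch. It is not any particular tool's implementation: `RESOURCE_ATTRS`, `ResourceCollector`, `localize`, and the `page_files` directory name are all made up for this example. It only finds resource URLs written literally in the markup and rewrites them to local paths; anything a script requests at runtime never shows up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import posixpath

# Hypothetical map of tag -> attribute that names a static resource.
RESOURCE_ATTRS = {"script": "src", "img": "src", "link": "href"}


class ResourceCollector(HTMLParser):
    """Collects resource URLs that appear literally in the markup."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = RESOURCE_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


def localize(html, base_url, asset_dir="page_files"):
    """Return (rewritten_html, {absolute_url: local_path}).

    The caller would still have to download each absolute URL
    to its local path; that step is left out of this sketch.
    """
    collector = ResourceCollector()
    collector.feed(html)
    downloads = {}
    for ref in collector.urls:
        absolute = urljoin(base_url, ref)
        filename = posixpath.basename(urlparse(absolute).path) or "index"
        local = f"{asset_dir}/{filename}"
        downloads[absolute] = local
        # Naive textual rewrite of the quoted attribute value.
        html = html.replace(f'"{ref}"', f'"{local}"')
    return html, downloads
```

Running `localize` on a page that contains an inline `fetch("/api/" + Date.now())` rewrites the `<link>` and `<script src>` references but leaves the `fetch` call alone, since its target only exists once the code runs.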

kindofastrawman · on Aug 10, 2020

> JS is allowed to fetch new files dynamically, and there's no reliable way to look at a piece of code and automatically figure out what it's going to fetch when you run it.

You've diverged from the context and are no longer doing an apples-to-apples comparison. The things you're describing are all opt-in and amount to having to deal with an adversarial input. There's nothing inherent to the medium that requires those things.

In other words, a person publishing a PDF is already abstaining from certain things. (Namely, the sorts of things you're bringing up that would make for a pathological case.) If the person who publishes a PDF does a straightforward translation into a web page, then you end up with something that doesn't exhibit any of the downsides you're discussing.

anoncake · on Aug 11, 2020

No, but the medium allows these things. And that's a problem.

oconnor663 · on Aug 10, 2020

Good point, and also relevant user name :)

kevincox · on Aug 10, 2020

No, most browsers will save the resources as well and rewrite the HTML to reference them. You can have problems with dynamically loaded things, but I have found that it works very well in practice. Over the years, I have had maybe one page that was significantly broken after saving from Firefox.

znpy · on Aug 13, 2020

Thanks dude, it's nice to see that there are still people who can read a text and understand the point.

spear · on Aug 10, 2020

I've found the best way to save a page in a browser is to print it ... to PDF.

_Microft · on Aug 11, 2020

Absolutely. Depending on how much I care about the content, I either print it directly from reader mode (which gives pretty bland results) or touch up the page itself with things like "column-count: 2" and a few changes to headlines, to give it the look of a proper print article. Either way, printing to PDF is a great way to archive/save web content for later.
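For a saved copy of a page, that kind of touch-up could be scripted along these lines. This is only a sketch under assumptions: `PRINT_CSS` and `add_print_style` are hypothetical names, and the `body` and `h1` selectors are guesses that would need adjusting per site. Only "column-count: 2" comes from the comment above; the other rules are illustrative.

```python
# Hypothetical print-only rules; the selectors are assumptions.
PRINT_CSS = """
@media print {
  body { column-count: 2; column-gap: 2em; }
  h1 { column-span: all; }
}
"""


def add_print_style(html, css=PRINT_CSS):
    """Insert a <style> block just before </head>, or prepend it if there is no head."""
    style = f"<style>{css}</style>"
    idx = html.lower().find("</head>")
    if idx == -1:
        return style + html
    return html[:idx] + style + html[idx:]
```

The rules sit inside `@media print`, so the page looks unchanged on screen and only the printed (or print-to-PDF) output gets the two-column layout.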

stOneskull · on Aug 10, 2020

it's quite nice this way. much better than the old .mht file even. it skips the junk.

znpy · on Aug 11, 2020

This is brilliant... I hadn't thought about it.