
I would argue that PDF is nightmarishly bad - truly one of the worst formats ever created on almost all fronts - but it's the best and most featured option for its use case. Depressingly, I find this is true about a lot of things. I don't know what the answer is.


I’ve met a few people who feel passionately the way you do, and I don’t get it.

I worked with archivists on a few projects, and until then I had never appreciated what a dumpster fire electronic documents are.

PDF is an amazing thing as you get an expressive format that preserves look, feel and content and will likely do so for the foreseeable future. Just the fact that the US Federal courts standardized on PDF for most filings will ensure that it is a viable format for decades or more.


The problem is that PDF does not preserve content in a machine-readable format. It’s a one-way street. Once converted to PDF, you can’t convert to another format without losing a lot of content and formatting.


That’s like saying that a spreadsheet is no good because it isn’t machine readable.

PDFs are often display focused and difficult to parse, but it’s certainly possible to do so.

Its success in the market compared to an edit-focused format like ODF underlines how important display consistency is.


That's exactly what I like about it. My ideal PDF is essentially a PNG file with selectable/searchable text.

It's a great WORM format. Every added feature makes it worse.


Why is it desirable for it to not be machine readable? What could possibly be the advantage in that?


Because I don't want anything to try to reflow the text, or adjust the kerning, or modify to use system fonts.

There are great systems for those already.

When I want a PDF, it's because I want a format that I know is always going to look the same.

A PDF is a great archive format. It's perfect for a scan of a document, or a printout.

I never want my viewer to add anything to it, I never want it to detect anything, I never want it to adjust anything.

Just render it exactly the same way, every time.


One thing doesn't imply the other. The format could be machine readable and still be pixel-perfect consistent. It could also allow reflowing, adjusting the kerning, or using system fonts even if it were machine unreadable.


It is machine readable, just not readily machine malleable.

I worked on a project where we were digitizing and cataloging various records. It was less challenging to do this with papers from the British colonial administration from the late 1700s than to decipher certain 1980s documents written with a defunct word processor. PDF is a compromise that helps address that issue.

I would not recommend maintaining your general ledger in a PDF. But an annual report that may be referenced for decades is a great example of why a PDF is a useful format.


This is true of practically all formats.


It’s much truer for PDF than for other formats. The only format I could think of that’s worse would be plain images.


DjVu will continue to be a better format in every conceivable way long into the future.




