For those interested in this topic, you might want to check out paged.js https://p...

bayesian_horse · on March 21, 2023

At the moment, the combo of pagedjs with a headless browser is my first recommendation when you need something to output pdf automatically.

Weasyprint looks nice, though I have yet to use it. It probably has the advantage of not using an actual browser document (with all the relayouting problems), and may use fewer resources in terms of time and memory (I'd have to check though). The disadvantage is also that it's not an actual browser document, meaning you need to render the html and css in some form before passing it to weasyprint.

Using a headless browser makes it easier in general to do some fancy stuff like music sheets or embed js chart libraries and so on.
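For illustration, here is a rough sketch of driving a headless Chromium-family browser from Python to print an HTML file to PDF. The `--headless` and `--print-to-pdf` switches are Chromium command-line flags; the executable names in `BROWSER_CANDIDATES` and the helper names are assumptions for this example and will differ between systems. Note that paged.js paginates asynchronously in the page, so a plain print like this may fire before it finishes; a browser automation library that can wait for the page is probably the safer route in practice.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional

# Assumed executable names; adjust for the local install.
BROWSER_CANDIDATES = ("chromium", "chromium-browser", "google-chrome", "chrome")


def find_browser() -> Optional[str]:
    """Return the first candidate browser found on PATH, or None."""
    for name in BROWSER_CANDIDATES:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_print_command(browser: str, html_path: str, pdf_path: str) -> List[str]:
    """Build the argument list for a headless print-to-PDF run."""
    return [
        browser,
        "--headless",
        f"--print-to-pdf={pdf_path}",
        # The browser expects a URL, so turn the local path into file://...
        Path(html_path).resolve().as_uri(),
    ]


def print_to_pdf(html_path: str, pdf_path: str) -> None:
    """Run the browser once; raises if no browser is found or it exits non-zero."""
    browser = find_browser()
    if browser is None:
        raise RuntimeError("no Chromium-family browser found on PATH")
    subprocess.run(build_print_command(browser, html_path, pdf_path), check=True)
```

Calling `print_to_pdf("report.html", "report.pdf")` would then produce the PDF, with any JS chart libraries in the page having run inside the browser first.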

yojo · on March 21, 2023

> meaning you need to render the html and css in some form before passing it to weasyprint

I'm not sure I understand this part. WeasyPrint is a rendering engine; it takes an HTML file as input and outputs a PDF.

I can't speak to resource usage, since this was very ad-hoc. For my needs, it was very simple to set up, and worked on the first try. I was testing with about five pages of content, and on my M1 MBP it rendered the PDF pretty much instantaneously, which was nice for the edit/refresh cycle. Preview on macOS actually live reloaded when the PDF changed, which was a pleasant surprise.
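Roughly what that looks like from Python, as a minimal sketch: `HTML(string=...).write_pdf(...)` is WeasyPrint's Python API, while the sample document, the page size, and the `render_pdf` helper name are made up for this example. The `@page` rule with `@top-center` and `@bottom-center` margin boxes is standard CSS Paged Media.

```python
# Hypothetical sample document; only the CSS Paged Media constructs matter here.
SAMPLE_HTML = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
  @page {
    size: A5;
    margin: 20mm;
    @top-center { content: "Sample Title"; }
    @bottom-center { content: counter(page); }
  }
  body { font-family: serif; }
</style>
</head>
<body>
<h1>Chapter 1</h1>
<p>Static text goes here.</p>
</body>
</html>
"""


def render_pdf(html_text: str, out_path: str) -> None:
    """Render an HTML string to a PDF file with WeasyPrint."""
    # Imported lazily so this module can be loaded without WeasyPrint installed.
    from weasyprint import HTML

    HTML(string=html_text).write_pdf(out_path)
```

With WeasyPrint installed, `render_pdf(SAMPLE_HTML, "book.pdf")` writes the file, and rerunning it after each edit gives the edit/refresh loop described above.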

bayesian_horse · on March 21, 2023

By "render" I meant you need to first create the html and css before handing it over to weasyprint. You can't use browser-based chart libraries unless you convert their output to svg or other image formats first.
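To make the "convert to svg first" route concrete, here is a small sketch that builds a horizontal bar chart as an SVG string on the server side using only the Python standard library, so the result can be embedded in the html before it goes to the renderer. The function name, sizes, and colour are all assumptions for illustration, and it assumes non-negative values; a real charting library would do far more.

```python
from html import escape
from typing import Sequence, Tuple


def bar_chart_svg(
    data: Sequence[Tuple[str, float]],
    width: int = 400,
    bar_height: int = 20,
    gap: int = 6,
    label_width: int = 100,
) -> str:
    """Return a static SVG bar chart for (label, value) pairs (values >= 0)."""
    if not data:
        raise ValueError("data must not be empty")
    # Avoid dividing by zero when every value is 0.
    max_value = max(value for _, value in data) or 1
    height = len(data) * (bar_height + gap)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
        f'height="{height}" viewBox="0 0 {width} {height}">'
    ]
    for i, (label, value) in enumerate(data):
        y = i * (bar_height + gap)
        # Scale each bar against the largest value in the data set.
        bar_w = (width - label_width) * value / max_value
        parts.append(
            f'<text x="0" y="{y + bar_height - 5}" font-size="12">{escape(label)}</text>'
        )
        parts.append(
            f'<rect x="{label_width}" y="{y}" width="{bar_w:.1f}" '
            f'height="{bar_height}" fill="#4a7"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

The returned string can be dropped straight into the document body, for example `f"<figure>{bar_chart_svg(rows)}</figure>"`, with no JavaScript needed at render time.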

With a headless browser you can use, for example, React to generate the document, and you can just drop in any chart library and so on.

And regarding resource usage I had scenarios of "pdf as a service" in mind, where you need to generate pdfs dynamically. Reporting, invoices, what have you.

yojo · on March 21, 2023

Ah, gotcha. I was focused on my use case of static text.

If your charting library runs server-side and outputs HTML/CSS, then it'd work fine, but most of that stuff requires javascript, and yeah, WeasyPrint doesn't have an interpreter. Once you're already paying the overhead of a browser, you're probably better off using a JS polyfill than piping through another tool, unless paged.js is an insane resource hog (which seems unlikely).

Semaphor · on March 21, 2023

And print-css.rocks [0] for a comparison of tools, with support matrices, and tutorials. We use Weasyprint [1] to create PDF ebooks with HTML and CSS.

[0]: https://www.print-css.rocks/

[1]: https://doc.courtbouillon.org/weasyprint/stable/

yojo · on March 21, 2023

Thanks for pointing this out. I came across paged.js but for some reason thought it wasn't implementing the margin boxes – I think I just searched their documentation page for @top-center and didn't find it. Looking again, I see that they definitely do.

Something to check out next time I do this.

batmaniam · on March 21, 2023

If I wanted to write a book, how does this translate to the publisher's standards when it comes time to hand in the design? Do publishers accept css files? How does that whole publishing pipeline work with this css framework?

ics · on March 21, 2023

That would depend very much on your publisher... that could be anywhere from "use this page size and send a PDF" to "manuscripts only and our designer/typesetter/etc. will work on the rest". As someone who has used CSS in the way described here, I think it's beneficial even if you end up having to redo it in InDesign. It may not be a perfect 1:1 but simply coming up with the rules will make setting up styles and master pages in publishing software much more straightforward.

yojo · on March 21, 2023

If you're self publishing (like I was), your publisher is probably Amazon, and they take a PDF, which is the output of weasyprint.

I haven't looked at wider distribution, but I believe most of the print on demand publishers accept a PDF file. I think some (all?) also take InDesign, which is Adobe's thing.

vivegi · on March 21, 2023

Publishers (traditional ones like Random House, HarperCollins) would take a Microsoft Word document manuscript. Each publisher has guidelines for their manuscript.

They will handle the typesetting themselves.

You need to do it all only if you are self-publishing.

brettermeier · on March 21, 2023

I thought the publisher gets a PDF.

yawnxyz · on March 21, 2023

I used PagedJS to print many abstract books for conferences — it's worked brilliantly!