I've been working on exactly this! After buying a book and having an awful experience in their native reader, I realized they no longer let you download your ebooks. Thus began the war on their web reader. They have some interesting defenses in how they encode the book: it's basically a bunch of SVG paths that render to letters, and those paths are mapped to IDs that change on every request to fetch book data (the most you can fetch at once is 5 pages). The SVG paths also contain micro variations, so an A on one page will never have the same path as an A on another. You have to render and fill in every unique letter in the book and compare it to every other one to create a unified mapping, and THEN compare that mapping to the letters of the actual font it's using to find out what letter each character actually is. The only thing I have left to do is get newlines working properly and package it all up nicely for release.
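To make the matching idea concrete, here is a minimal sketch in Python. It is not the actual tool. It assumes every glyph has already been rendered and filled into a same-sized black-and-white bitmap (the rendering step is left out entirely). The function names, the greedy clustering, the 0.15 threshold, and the IDs like `p1-a7` are all made up for illustration.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[Tuple[int, ...], ...]  # rows of 0/1 pixels; all bitmaps share one size


def from_rows(*rows: str) -> Bitmap:
    """Build a bitmap from strings, where '#' is a filled pixel."""
    return tuple(tuple(1 if ch == "#" else 0 for ch in row) for row in rows)


def pixel_distance(a: Bitmap, b: Bitmap) -> float:
    """Fraction of pixels that differ between two same-sized bitmaps."""
    total = sum(len(row) for row in a)
    differing = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return differing / total


def cluster_glyphs(
    glyphs: Dict[str, Bitmap], threshold: float = 0.15
) -> Tuple[Dict[str, int], List[Bitmap]]:
    """Greedily group per-request glyph IDs whose shapes are nearly identical."""
    representatives: List[Bitmap] = []
    assignment: Dict[str, int] = {}
    for glyph_id, bitmap in glyphs.items():
        for index, rep in enumerate(representatives):
            if pixel_distance(bitmap, rep) <= threshold:
                assignment[glyph_id] = index  # close enough: same letter
                break
        else:
            representatives.append(bitmap)  # unseen shape: start a new cluster
            assignment[glyph_id] = len(representatives) - 1
    return assignment, representatives


def label_clusters(
    representatives: List[Bitmap], reference: Dict[str, Bitmap]
) -> List[str]:
    """Name each cluster after the closest glyph of the reference font."""
    return [
        min(reference, key=lambda ch: pixel_distance(rep, reference[ch]))
        for rep in representatives
    ]


def decode(
    glyphs: Dict[str, Bitmap], reference: Dict[str, Bitmap], threshold: float = 0.15
) -> Dict[str, str]:
    """Map every obfuscated glyph ID to a character."""
    assignment, representatives = cluster_glyphs(glyphs, threshold)
    labels = label_clusters(representatives, reference)
    return {glyph_id: labels[cluster] for glyph_id, cluster in assignment.items()}


# Toy data: 4x4 bitmaps standing in for rendered glyphs (hypothetical IDs).
reference = {
    "I": from_rows(".#..", ".#..", ".#..", ".#.."),
    "L": from_rows("#...", "#...", "#...", "####"),
}
glyphs = {
    "p1-a7": from_rows(".#..", ".#..", ".#..", ".##."),  # an I, slightly off
    "p2-k3": from_rows(".#..", ".#..", ".#..", "##.."),  # another I, off differently
    "p2-z9": from_rows("#...", "#...", "#...", "###."),  # an L, slightly off
}
print(decode(glyphs, reference))  # {'p1-a7': 'I', 'p2-k3': 'I', 'p2-z9': 'L'}
```

Clustering first and only then comparing against the font keeps the font comparison down to one lookup per distinct shape instead of one per glyph instance. A real version would need a proper rasterizer and probably a smarter distance than raw pixel differences.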
OCR would likely work; however, I would need to render everything out with a browser and (probably) do more processing. I also wanted a way to preserve text styling exactly as the web viewer shows it, and I'm not sure OCR can detect text alignment.