The Amazon Kindle War Against Piracy

(goodereader.com)

98 points | by kozmonaut 2 days ago ago

129 comments

  • rich_sasha 2 days ago ago

    I can't blame Amazon specifically for it, but it is amazing how the wider tech industry is simultaneously pushing DRM down our throats and scraping/pirating any content they can find for training AI.

    • wuschel a day ago ago

      I did not look the matter up in detail, but I am astonished as well in regard to this development. A grab for control over basic devices and a mass robbing of the commons.

      There is no strong political pushback from the parties/voters to this large scale collective theft of IP by corporations, is there?

      I would expect some sort of taxation and redistribution to the content creators like there is with music content etc.

  • elcapitan 2 days ago ago

    The simple message is "Don't spend your money on things you don't really own, because you'll be at forever war with the lenders".

    • makeitdouble 2 days ago ago

      Last stop on this train is either "seize the means of production" or hiding in the woods

      • harvey9 2 days ago ago

        Funny you should say hiding in the woods. The closing scene in the film of Fahrenheit 451 is a group of people preserving books by memorising them and hiding in the woods.

        • wltr 2 days ago ago

          Why did you mention the film, not the book? Considering the closing scene is the same, iirc.

          • harvey9 2 days ago ago

            I haven't finished reading the book yet. I realise the irony in that.

          • halJordan 2 days ago ago

            If you have trouble letting people mention a movie, you're a part of the problem

            • wltr a day ago ago

              If you have trouble spilling uninvited diagnosis for Internet strangers, start with yourself, yeah.

              I have no issues with the author mentioning the film, not the book. I was rather curious why not the book, but the film, solely because of this ‘wood’ story told in both. The author said they haven’t finished the book yet, so that’s fine with me. I just find it ironic they did not mention the book (and the author too), considering the story it tells.

      • antman 2 days ago ago

        The first stop is seize the things you bought apparently

        • makeitdouble 2 days ago ago

          First step is to not sell anything, and let you pay for an indefinite license.

      • dmitrygr 2 days ago ago

        You already have access to means of production of ebooks... No need to steal

      • quijoteuniv 2 days ago ago

        Please expand on what you mean here

        • makeitdouble 2 days ago ago

          "things you own" is a more complex concept than most people give it credit for.

          If for instance you consider anything that can be legally and unilaterally taken away from you as not owned, that definition becomes really really small, short of you being a diplomat or some special entity.

          Many see "owning" things in a more colloquial way, but that's also how Amazon gets by with their shenanigans, as it still feels like ownership day to day.

          • dh2022 2 days ago ago

            You're on: can you explain how anyone can legally and unilaterally take away my fully paid car? After that, can you explain how anyone can legally and unilaterally take away my mortgaged home? Just curious. (If still interested in this game, can you explain how anyone can take away my books? This is closer to the topic at hand.)

            • makeitdouble a day ago ago

              > car

              As sister comment points out, civil forfeiture is one. Your car being involved in a criminal investigation and getting saved for perpetuity as evidence would be another.

              Funnier examples: let's say it's a self-driving car. As you proudly admire your brand new, freshly delivered car, its firmware is overwritten by error and it is sent to another home, which also closes the delivery. The factory management software is buggy as hell and your car got reassigned some random info of another lost car, but marked as delivered to you anyway, so the burden of proof is now on your side.

              You might be fighting that car maker in court for the rest of your life without ever seeing any compensation or getting back "your" car.

              > mortgaged home

              If it's mortgaged it's just not yours in the first place. I'm sure you have clauses in your loan contract detailing how the lender can unilaterally decide that circumstances changed, you're now too much of a risk, and request near-immediate full repayment of the rest of the loan or else they liquidate the property.

              Land laws also have their funnier clauses, where someone squatting on your property for a decade can request legal ownership, depending on the local arrangements.

            • kerningije 2 days ago ago

              Look up civil forfeiture.

  • pavel_lishin 2 days ago ago

    > The 15.18.5 update is also having an adverse reaction to sideloaded books. If you deliver a book using Send by Email or copy it to your computer via USB, a critical issue may arise, where a pop-up appears with an ‘Invalid ASIN‘ number. The new DRM system is attempting to locate the book in the Amazon store to decrypt it, but since it can’t find it, it reports that the book is invalid. Amazon claims they are working on the issue, but preventing sideloading would be downright tyrannical.

    Ouch. This is how I deliver a lot of my DRM-free purchases to the Kindle app.

  • lentil_soup 2 days ago ago

    Honest question. Where can I buy ebooks outside Amazon to use with my jailbroken Kindle? Is there a Bandcamp for books?

    I want properly created files the kindle can render with the options I want, not a pdf that forces a layout.

    • SCdF 2 days ago ago

      There are plenty of ebook stores if you google around, that have a standard range and use Adobe DRM, so off the bat wouldn't work on a kindle. In theory you can remove that DRM using Calibre, but I haven't tried.

      Other than that, not really? There are plenty of ways in which you can buy _a_ drm free book, but not some large range (or even bandcamp-quality range, where there are authors you've heard of, just not Dan Brown / Stephen King sized ones) site.

      I haven't got around to solving this problem so am also interested. I already own a kindle, I don't want to generate ewaste by changing physical device.

      • thomasuebel 2 days ago ago

        Was impacted by the invalid ASIN pop-up. Finally got fed up. Sold my Kindle Paperwhite via classifieds to someone who would embrace the Amazon walled garden (so someone new to e-reading). Then bought a PocketBook. Now all my ebooks work again. And no waste. I use beam ebooks for DRM-free books. Bought all my Expanse books there.

    • phantomathkg 2 days ago ago

      I googled, and it seems Kobo can? But I don't own a Kobo to test it.

      https://help.kobo.com/hc/en-us/articles/360019527954-Downloa...

      • fph 2 days ago ago

        I have a Kobo. Yes, you can download books you buy from them to any device. They are Epubs with DRM.

      • frm88 2 days ago ago

        I own an Onyx Poke and I buy my books at kobo. They work just fine. With a jailbroken Kindle capable of processing EPUB this should (!) work. Give it a try with an inexpensive book, maybe?

    • uncircle 2 days ago ago

      I buy ebooks outside Amazon with my non-jailbroken Kindle. They’re usually DRM’d EPUB files, which I unDRM with Calibre and transfer to my device, which I have kept in airplane mode for the better part of a year.

      The bonus part is Calibre keeps a local copy on your PC, so now all my book purchases are backed up and without DRM.

      • 1317 2 days ago ago

        well ok so can you answer the actual question then

    • sotix 2 days ago ago

      Borrow from the library! Kobo has library borrowing built right into the device if you're in the US and can use Overdrive/Libby. Otherwise, you can use the Libby app to send the book to your Kindle.

    • nunez 2 days ago ago

      ebooks.com is what I use. Their books use Adobe DRM, but that's easy to remove. I'll go to the publisher's website directly otherwise.

    • ivanjermakov a day ago ago

      Individual authors often provide alternative purchase options

    • surgical_fire 2 days ago ago

      I switched to Pocketbook. Removing DRM from e-books bought outside of Amazon was trivial.

  • jasonvorhe a day ago ago

    I bought a Kindle Oasis 1st gen when it came out, got it replaced by Amazon with gen 2 and I liked it as long as I could sideload books and strip DRM off of my Kindle "license purchases" but I anticipated the recent moves by Amazon and looked for alternatives, knowing that I'd probably have to live with some compromises, especially in terms of build quality and feel. Looked at Kobo, Onyx, etc but none really convinced me, especially when I saw most of them being stuck on Android 11. When I heard of the Daylight Computer DC-1 something clicked. The smooth display, readability in daylight, none of the cons of e-ink (except being grayscale), the blue light free display, Android 13 with 16 being on the horizon. Downsides are the uneven display borders, higher battery usage when the display is powered on/no always on display and the rather flimsy feeling of the back cover. DPI is a bit below modern ereaders as well. I'm still happy I went with it. It's now my daily carry for writing in the cafe, scribbling down notes with the included pen and it's the best device I ever used to read PDFs and scanned books from archive.org that aren't downloadable.

    Lots of cons still but no one is going to pull your ebooks from your device and it's still their 1st gen and only device they have. I just hope they ship export functionality for their custom app so that you can keep local copies of the documents you send to their backend.

    It's great this was mentioned in a podcast once which got me curious about it in the first place. And without the blue lights you actually can read in bed without ruining your circadian rhythm.

  • Havoc 2 days ago ago

    Made copies of most of mine before their last lockdown kicked in.

    That said still buying. Their daily sales are (occasionally) just too good value for money even if I know it’s locked.

  • buyucu 2 days ago ago

    I just use libgen / Anna's Archive. No need to pay money to Amazon.

    • ugjka 2 days ago ago

      So how exactly does pirating the books give money to the writers?

      • shakna 2 days ago ago

        The more people who pirate my books, the greater my sales across all platforms. That's not hyperbole - it's something I track.

        Individuals who pirate my books are also more likely to buy them in the future.

        Piracy is just about accessibility and trust. If the person can't afford to take a chance, they pirate. And if you win them there, they'll buy.

        (Nit: Zero of that applies to corps. Thanks Anthropic, Meta, and everyone else.)

        • SCdF 2 days ago ago

          I am guessing this works for you because more people reading = more people talking = more readers discovering and potential sales?

          It would be interesting to see at what point of notoriety that is no longer true. Like is this still a factor for Stephen King, or at that point is it really just lost sales?

          • shakna 2 days ago ago

            That's my interpretation of it.

            As for scale... There is only a tiny fraction of the industry that can support their life on writer's income, let alone be a household name.

            It probably does become just lost sales at that point, but to reach that, you're probably already beyond most competitive forces, leaving only piracy around.

        • qmr 2 days ago ago

          Are you looking forward to tens of ... dollars from that recent suit?

          • shakna 2 days ago ago

            Some of my publishers are, as they're American. I'm unlikely to see any of that.

            Unfortunately, I'm Australian, and my government saw fit to narrow their interpretation of current laws, to make AI scraping of illegally obtained data, legal.

            You now have to prove direct harm - not the indirect harm happening to the entire industry.

        • gruez 2 days ago ago

          >The more people who pirate my books, the greater my sales across all platforms.

          You think more piracy leads to more sales, but surely this is correlation, not causation? It seems far more plausible that popular books get pirated and bought more, hence the correlation.

          • shakna 2 days ago ago

            I mostly sell by word of mouth. You've certainly never heard of my books before. I am in no way "popular".

            Piracy creates an invested reader. It's not much different than games selling by offering free demos.

            There is a _causation_ there, because the reader likely never would have discovered me, otherwise.

          • thaumasiotes 2 days ago ago

            It could be pure correlation if you, personally, are a household name. If you're secretly Stephen King commenting on Hacker News, then yes, exposure isn't going to help you.

            But if you're not Stephen King, then more piracy is going to make a direct, causal, positive impact on your sales.

      • blagie 2 days ago ago

        The cycle has been:

        Piracy -> Friendly ways to buy -> Unfriendly ways to buy -> Piracy -> ...

        Unfortunately, giving money back to writers involves hopping through piracy. At that point, a new, consumer-friendly service will sprout up. Everyone will use it.

        Over time, the service will want to profit-maximize, and will adopt anti-consumer techniques. Leading people to go to Pirate Bay. Leading to friendly services.

        Rinse, repeat.

        • lotsofpulp 2 days ago ago

          How many times has this happened, such that it can be called a cycle?

          There are other possibilities, such as people simply not writing as much anymore, or higher quality writers exiting the market due to lack of sufficient return.

          • blagie 2 days ago ago

            Bad DRM led to Napster led to Netflix led to a fragmentation of services led to a resurgence of piracy.

            Similar thing happened with music, only rather than piracy, it landed on legal / free (e.g. Youtube). Youtube is just starting to do the consumer-unfriendly thing (but it's got a long ways to go before piracy comes out competitive).

            Similar in books.

            I'll mention: A lot of these are consumer-unfriendly in some ways (e.g. Netflix DRM), but friendly in others. $20/month for all the movies you can watch beats piracy.

          • al_borland 2 days ago ago

            It’s happened to some degree with music, movies, and TV shows.

      • portaouflop 2 days ago ago

        Instead of lining Bezos's pockets, get your ebooks from the above sources and go to a real bookshop to buy hard copies of the books you especially like - you can give them away, and so support the actual author while not supporting bozo

      • keanb 2 days ago ago

        Where in the message did he claim that pirating the books compensated the writers?

      • trcf22 2 days ago ago

        How different is it from going to a public library?

        • Mindwipe 2 days ago ago

          Almost every country in the world apart from the US pays authors for library lending.

          • morsch 2 days ago ago

            I wasn't sure how it works where I live, so I looked it up and apparently in Germany (according to Wikipedia) public libraries pay 3-4c per checkout to a central private body which redistributes it somehow.

            So unless the book is checked out a thousand times over its lifetime, buying it still dominates overall.
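The break-even point above can be checked with a rough sketch. Only the 3-4 cent per-checkout figure comes from the comment; the book prices are assumptions picked for illustration, and the function name is hypothetical.

```python
def checkouts_to_match(price_cents: int, fee_cents: int) -> int:
    """Number of checkouts whose fees add up to at least one purchase price.

    Works in integer cents to avoid floating-point rounding.
    """
    if fee_cents <= 0:
        raise ValueError("fee_cents must be positive")
    # Ceiling division: -(-a // b) rounds up for positive integers.
    return -(-price_cents // fee_cents)


# Assumed book prices of 30 and 40 EUR; fees of 3 and 4 cents per checkout
# are the range quoted in the comment.
for price_cents, fee_cents in [(3000, 3), (4000, 4)]:
    n = checkouts_to_match(price_cents, fee_cents)
    print(f"{price_cents / 100:.0f} EUR book at {fee_cents}c per checkout: {n} checkouts")
```

Under these assumed prices the "thousand checkouts" figure holds; a cheaper book or a comparison against the author's royalty rather than the cover price would lower the break-even count considerably.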

            • layer8 2 days ago ago

              The libraries also bought the book originally.

      • throwbway37383 2 days ago ago

        You should check out Chokepoint Capitalism by Cory Doctorow and Rebecca Giblin. To put it briefly, you've been fooled.

        You're making an argument that empowers the likes of Amazon, not "writers", and it's by design that you've been fed that story.

      • wltr 2 days ago ago

        I pirate all the books, I treat that as a public library. I don’t read most of them. The ones I have read and found good, I talk about them, write about them, and I can buy them. For myself, plus as gifts to others. I just dislike buying highly marketed book that turned out to be useless.

        If I ever become an author myself, I don’t see any issue with that.

        • rs186 a day ago ago

          I use the local public library from time to time (physical/via Libby) while reading on my kindle otherwise. Libby is something else, but for the physical books, I just see zero difference between going to the library in person, checking it out and returning it later vs just pirating it online. It's not like the publisher gets any more money. OK there is a difference where there is a limit to the number of copies available, so some people have to wait, just typical of public resources. But I noticed that most books I borrow are always available, especially with interlibrary borrowing. So what difference does it make?

          In the end I pirated more often. I am not proud of that, but I also don't see how any of this makes any difference. It's not like I'll ever buy the book with my own money.

      • buyucu 2 days ago ago

        If they sell a PDF that I can download, then I'll give my money to them. But I'm not giving money for DRM.

      • nashashmi 2 days ago ago

        Book authors should make money from concert ticket sales, not books. /s

    • qmr 2 days ago ago

      How does that work? I used it once to look up an old book and ended up on Discord?

      Reminiscent of #bookz on Undernet.

      • layer8 2 days ago ago

        You post the code you received on the Discord channel, which replies with an actual link.

      • akho 2 days ago ago

        It works fine. You click a link, and the file downloads. I have no idea what you mean. Discord?..

    • thaumasiotes 2 days ago ago

      I was kind of mystified by Amazon's move against downloading Kindle books, since ebooks on Anna's Archive are of much lower quality. Wherever they're coming from, it's not Amazon.

      • mandelken 2 days ago ago

        Could you explain? From my experience, they are exact digital copies of the ebooks on amazon. It’s not like mp3 in lower audio quality, it’s word by word the same book.

        • thaumasiotes 2 days ago ago

          I tried downloading several Dresden Files ebooks from Anna's Archive. They contain various difficult-to-understand errors including word substitutions that tend to ruin the text. AA offers several editions, but they appeared to be the same error-filled book in different file formats. The errors are not present in the Kindle editions of the same books.

  • aftergibson 2 days ago ago

    Just buy an old kindle. Mod it and get koreader running on it. Then put it in airplane mode and load books via USB from whatever source you like.

    It's great hardware.

    • goosedragons 2 days ago ago

      Just buy a Kobo. You can get a new one, you don't have to jailbreak it to run KOReader, it already actually reads ePubs out of the box, they don't have as aggressive DRM either. And the hardware is good.

    • barnabee 2 days ago ago

      My Onyx Poke 5 is better than every Kindle I’ve owned.

      It’s super compact (great for travel), has a good screen, reads anything I’ve thrown at it, runs Android apps, and is simple to send files to over its (local) web interface.

  • devinprater 2 days ago ago

    Thankfully I deDRM'd most of my Kindle books last year or so, so I can read them in Braille on my Braille display without needing to connect to my phone and use the Kindle app. Also luckily for me it's legal for me to do that since I'm blind. One of the few perks I guess.

  • GardenLetter27 2 days ago ago

    At least e-Ink screens are becoming cheaper and better so there'll be plenty of alternatives if they block side-loading.

    • miohtama 2 days ago ago

      I switched from Kindle to Boox. The Boox e-ink tablet can still run the Kindle Android app, but overall it is a much better device.

      • rpdillon 2 days ago ago

        +1. I have two Boox devices and they're both excellent. One is a 10 inch reader that's quite thin, and the other is essentially the equivalent of an e-ink phone. Where to buy books is still a challenge for me, though. So far I've been relying on my decrypted library that I've been buying over the years, but now that Amazon closed that hole, I'm no longer buying from them.

        • miohtama 2 days ago ago

          What are the best alternatives for Kindle store?

          • rpdillon 2 days ago ago

            I don't have a good answer for you, I'm afraid. I'm gonna be looking into alternative stores that still allow you to download the files. I've never worked with Kobo before, but I heard that that could work. Google Play can work. And there's always Anna's archive, that's kind of a last resort for me. I prefer to pay for high-quality copies than get some scanned low-quality version, but publishers don't want to sell me a file anymore.

          • nunez 2 days ago ago

            ebooks.com, kobo.com or the publisher's website.

      • nunez 2 days ago ago

        Did the same. The Kindle app has a page-turn effect that can't be turned off, though, so I switched to KOReader. Unintuitive interface but fantastic for book reading.

      • xtracto 2 days ago ago

        Second this. I've got an Onyx Boox and it's been amazing.

      • lostlogin 2 days ago ago

        Kobo is another option too, and it’s easy to control and manage.

    • pasc1878 2 days ago ago

      I don't see any incentive for Amazon to do that. Even if you do not buy any books from Amazon you still paid them for the Kindle. And if you have a Kindle you might buy the odd book from them.

      Also the Kindle is marketed as allowing you to put your own documents on it.

      Yes, Amazon wants to stop you reading books licensed from them on anything but a Kindle, but not the other way around.

      • NoboruWataya 2 days ago ago

        As I understand it they don't make much money from the sale of the Kindle itself as they want to lock you into their ebook ecosystem which is where they make their money. I don't have hard figures though.

        They may market it like that now, but that can change. If Google can stop you sideloading apps on Android, I have no doubt Amazon could try to stop you sideloading books on your Kindle.

        • jimnotgym 2 days ago ago

          This will be a nice foundation for an anti trust case.

          1) sell device at break even/loss to gain market share

          2) retrospectively lock out all content not from amazon

          • nikanj 2 days ago ago

            Now you just need a legal system that will pursue anti trust cases

            • itopaloglu83 2 days ago ago

              And before that you need a congress that will pass legislation to prevent such cases or even allow existing laws to be applied by not actively intervening.

    • stavros 2 days ago ago

      Where are you going to get the books from, if Amazon doesn't let you download them?

      • goosedragons 2 days ago ago

        Kobo, Google Play Books, smaller niche sites like Baen, Humble Bundle, J-novel club, etc.

        • stavros 2 days ago ago

          They let you download books with no DRM (or with DRM that works with any reader)?

          • barnabee 2 days ago ago

            In the worst case, you can always buy it from Amazon and download it from Anna's Archive.

            Or buy the physical book, download from AA, and give the book to charity when you’re done.

            • stavros 2 days ago ago

              I don't want to give Amazon money for its anticonsumer practices, though, so I'll need to find something else.

          • goosedragons 2 days ago ago

            Some have no DRM, like Baen, most Humble Bundles and J-novel club. Most books on Kobo/Google Play have DRM but it's really really easy to remove.

  • djoldman 2 days ago ago

    Like my "smart" TCL Roku TV, I almost never allow my Kindle to connect to the internet.

    When I do, I subsequently remove wifi passwords and delete the networks.

    Additionally, I disable software updates if possible.

  • QuiEgo a day ago ago

    I’ve personally found reading on an iPad mini to be great. You can use Libby, Apple Books, Kindle, or whatever else floats your boat.

    Get that it’s not e-ink, but I’ve found a setting that does not fatigue me (black text on tan background - just like hacker news!) and have never looked back.

  • 1vuio0pswjnm7 2 days ago ago

    What is the role of "software updates" in this "war"

    This phrase "software update" is mentioned several times

    Usually this refers to someone other than the computer owner being able to remotely access and install software on the owner's computer, often without the explicit, non-compulsory consent of the owner

  • heelix 2 days ago ago

    Not sure what they were thinking, but for a long time if you mixed side loaded books with kindle ones, the front cover would go missing from the side loaded when it synced. I've never let my kindle hit the web after I converted my Amazon purchases.

  • explorigin 2 days ago ago

    I bought some ebooks from other vendors to avoid lock-in and side-loaded them on my kindle. Last year, if Amazon sold one of these titles it would disappear if I turned on wifi. I now have a kobo.

  • echelon_musk 2 days ago ago

    Still happily using a 4th gen D01100 with physical page turn buttons. Works fine with Calibre.

  • nunez 2 days ago ago

    Precisely why I stopped buying eBooks from Amazon. DRM Kindle Unlimited books? Sure; makes sense. Books that I paid full price on? No way.

  • elorant 2 days ago ago

    That's why I always have mine on airplane mode.

    • notesinthefield 2 days ago ago

      My second generation Paperwhite hasn't seen the internet in close to a decade. I use Calibre and source books from all over.

  • daft_pink 2 days ago ago

    Their underlying problem is that the iPad and Android based tablets exist.

    • carlosjobim 2 days ago ago

      Not at all. Completely different use cases.

      • daft_pink 2 days ago ago

        I own both. I'm not so sure.

        If you're going to pirate books, it is easy enough to just do it on your iPad or purchase a Boox tablet.

        I own an iPad and I've often considered getting an e-ink Android tablet just for reading PDFs.

  • qmr 2 days ago ago

    "If buying isn't owning, piracy isn't stealing."

    • barnabee 2 days ago ago

      To steal you have to take something from someone, i.e. deprive them of it, as well as acquire it for yourself.

      Piracy can be illegal, may be considered unethical by some, but it definitely isn’t stealing.

  • touwer 2 days ago ago

    Good. Now it's more clear for users that putting all your reading-eggs in the basket of a nasty tech-company, is not ok. Glad I removed my Amazon account a long time ago

  • WalterBright 2 days ago ago

    I'd buy another Kindle if it displayed pages like a book - two pages, side by side.

    • globular-toast 2 days ago ago

      KOReader can do it. But get a Kobo, not a Kindle.

    • wilsonnb3 2 days ago ago

      The Kindle scribe does that

      • WalterBright a day ago ago

        Update: I took a look at a Kindle scribe. The only layout options are portrait and landscape. I could not find a two page option. The Scribe is updated to the latest software.

        • wilsonnb3 10 hours ago ago

          It only does it for books in KFX format, that could be why. Rather annoying because you can’t convert books to KFX in Calibre on Linux.

      • WalterBright 2 days ago ago

        It doesn't mention that in the Amazon page for it. Thanks for the info!

  • charcircuit 2 days ago ago

    >Draconian measures are being taken to prevent Kindle books from being downloaded

    Using keys stored in hardware for DRM is industry standard. It's not draconian.

    • azalemeth 2 days ago ago

      I have never bought a system that requires DRM. It may be a standard amongst those who use DRM, but DRM itself is far from universal.

    • xethos 2 days ago ago

      Presuming it is industry-standard (not my industry, so I'm not about to concede the point), that doesn't make it less draconian.

    • OtherShrezzing 2 days ago ago

      Draconian doesn’t mean “not widely adopted”. It means “heavy-handed”.

    • carrychains 2 days ago ago

      The industry standard is draconian.

    • qmr 2 days ago ago

      Argumentum ad populum.

  • spwa4 2 days ago ago

    Luckily one thing LLMs with image input are ridiculously good at is piracy. You want to get a book off a kindle? Easier than with a real book, easily.

    What amazon could block is getting books from other sources onto a kindle. But there's plenty of devices. I use an iPad.

    • duskwuff 2 days ago ago

      An LLM? Just what I always wanted - an OCR tool that hallucinates.

      • spwa4 2 days ago ago

        You don't? Think about it. If your picture/source data is not perfectly clear ... what do you want? We all want perfection, but if you can't have that ...

        Would you prefer what current OCR does and just suddenly sentences go 2#!@%7Q&*@3 ladfk !@$?

        Or would you rather have a reasonable completion of a sentence that is nearly always (but not quite always) correct, that even actually takes the context into account?

        • duskwuff 2 days ago ago

          > Would you prefer what current OCR does and just suddenly sentences go 2#!@%7Q&*@3 ladfk !@$?

          Yes, actually. I'd rather be aware that the OCR tool failed somewhere than have the tool silently fabricate part of the text, or "correct" perceived errors which were present in the source document.

          • boredhedgehog 2 days ago ago

            But you aren't aware, because the OCR doesn't know that it failed. You would have to go through the entire text by hand to fix the corruptions, but that's too much work, so you won't, and the corruptions stay in.

            In practice and at scale, the guesses of the LLM are the superior outcome.

            • thaumasiotes 2 days ago ago

              > But you aren't aware, because the OCR doesn't know that it failed. You would have to go through the entire text by hand to fix the corruptions, but that's too much work, so you won't, and the corruptions stay in.

              Well, if you assume that you're never going to read the book, then sure. But in that case it's even more efficient to not OCR the book either. You'll never know the difference.

              If you do read the book, you'll know where the failures are. And they're easy to correct if you can edit the document. I usually file reports of printing errors in Kindle books when I encounter them.

              (Do the errors get corrected? No.)

        • akho 2 days ago ago

          Your picture of an ebook is perfectly clear.

      • spookie 2 days ago ago

        While dictionaries are often reliable enough to get the correct words back when some glyphs are misrecognised, I wonder if some type of LLM would help in some cases. Not a worry in modern digital-first documents though.
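For what it's worth, the dictionary approach mentioned above could look roughly like this. It is a minimal sketch using Python's standard `difflib`, not how any particular OCR tool does it; the word list, the function name, and the 0.75 cutoff are all assumptions for illustration.

```python
import difflib

# Hypothetical tiny word list; a real tool would load a full dictionary.
VOCABULARY = ["piracy", "library", "reader", "kindle", "books"]


def correct_token(token, vocabulary=VOCABULARY, cutoff=0.75):
    """Return the closest vocabulary word, or the token unchanged if none is close."""
    lowered = token.lower()
    if lowered in vocabulary:
        return token
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio.
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


for raw in ["1ibrary", "b00ks", "2#!@%7Q"]:
    print(raw, "->", correct_token(raw))
```

A single misread glyph ("1ibrary") is close enough to be repaired, while heavier damage ("b00ks", or pure garbage) falls below the cutoff and is left as-is, so the failure stays visible rather than being silently replaced.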

    • qmr 2 days ago ago

      I've tried to coax ChatGPT to do this but have not been successful beyond cover shots and random page views.

      • spwa4 2 days ago ago

        I have. Hell, these days, both ChatGPT-5 and GPT-OSS are very good at taking my writing (as in paper writing) and, as long as you specify step-by-step what to do, get it through. Either discussing those with me in voice, or correcting assignments I make on paper. I use it to practice language and math.

        Oh, and to pirate textbooks. The issue is that an LLM-entered (as in in context) version of part of a textbook is something that I can talk to, write to, and have it judge my skills. Normally I'd have to find someone who'd be willing to spend a short time talking to me about a subject, and correct me, and who's willing to spend hours correcting assignments from me. Even when paying, that's essentially unavailable.

        Now I take a few pages, let's say up to a chapter but usually less, load it into ChatGPT-5, tell it to ask me progressively harder questions when I activate voice mode. Or I take one of those for-teacher "how to grade X" notes, write an assignment, scan the whole thing into ChatGPT, and tell it to correct my assignment, justifying everything on the teacher note and deliver a final grade. I tell it to be way too strict, and this has helped me, among other things, get very good scores, and one perfect score, on language certs. I can prove that I am fluent in 4 languages (en, fr, de, and my mother tongue). If we're talking anything but specialized language it's even true.