Apple Books Quietly Launches AI-Narrated Audiobooks (theverge.com) 29

Audiobooks narrated by a text-to-speech AI are now available via Apple's Books service, in a move with potentially huge implications for the multi-billion dollar audiobook industry. From a report: Apple describes the new "digital narration" feature on its website as making "the creation of audiobooks more accessible to all," by reducing "the cost and complexity" of producing them for authors and publishers. The feature represents a big shift from the current audiobook model, which often involves authors narrating their own books in a process that can take weeks and cost thousands for a publisher.

Digital narration has the potential to allow smaller publishers and authors to put out an audiobook at a much lower cost. Apple's website says the feature is initially only available for romance and fiction books, where it lists two available digital voices: Madison and Jackson. (Two more voices, Helena and Mitchell, are on the way for nonfiction books). The service is only available in English at present, and Apple is oddly specific about the genres of books its digital narrators are able to tackle. "Primary category must be romance or fiction (literary, historical, and women's fiction are eligible; mysteries and thrillers, and science fiction and fantasy are not currently supported)," its website reads.

This discussion has been archived. No new comments can be posted.

  • Robots are taking all the jobs of hardworking Voice Actors across this great nation!
  • by Budenny ( 888916 ) on Thursday January 05, 2023 @10:09AM (#63182022)

    If AI can now perfectly imitate the style of a given painter, and paint perfectly acceptable pictures in a neutral style, it's not long before it can perfectly imitate a human reader and record audio in a neutral and acceptable style.

    Think positive. More accessible books for more people, cheaper.

    Yes, less paid work for actors, and less demand for librivox.org volunteers. True. But a lot more librivox books, and no waiting for them.

    • by caseih ( 160668 )

      Librivox is about reading public-domain books. While Apple could do AI-generated readings of public-domain works, their goal is to make lots of money so no doubt they will concentrate on current, copyrighted books.

      However if Apple did release an AI reading of a public-domain work, at the moment AI-generated works have been deemed not possible to copyright. Wonder how long that would last.

    • I can imagine a large market for women's "clit lit" read out in the voices of Morgan Freeman, George Clooney, et al. I have no doubt AI could do it... the question is, could publishers get away with it?
    • by carterhawk001 ( 681941 ) on Thursday January 05, 2023 @01:20PM (#63182534) Journal
      The problem with machine narration is that listeners don't want "neutral"; the best audiobooks are more like Patrick Stewart performing a one-man Christmas Carol than Alexa reading out words at you. Great audiobook performances give each character their own voice: not just a pitch or accent but tone, inflection, nuance, and improvised subtleties that give them life. There's a reason someone like R.C. Bray is in such high demand as a narrator; the top tier of the profession are up there with the likes of Mel Blanc and Tress MacNeille when it comes to voicework. Truly supplanting human narrators will take not just the best speech synthesis software ever created, but an entire markup language to assign meaning to every word that is spoken, plus someone to fine-tune an entire book's worth of synthesis. Listening to Apple's samples, I think we are still some way off from that.
      • by AmiMoJo ( 196126 )

        That's what the AI is here. It does the different characters, adds all the inflections. It's not simple TTS.

        • Nah, I listened to the sample on the actual product page. Two characters are talking to each other and I couldn't tell them apart; they sound the same. I could barely even catch when the AI was shifting from narrative/description to character talking. Even with a first-person perspective book you want to be able to tell when the POV character is speaking vs. when you are 'hearing their monologue'.

          Compare this: https://books.apple.com/us/aud... [apple.com]
          To this: https://www.audible.com/pd/Hea... [audible.com]

          The differen
  • Maybe the new speaker could be an AI chatbot with an AI voice. Would be an improvement.

    From the use of the word "AI" I assume that Apple is focusing on getting natural-sounding inflections and tones, which is actually pretty cool. Hopefully the rest of the text-to-speech world will catch up. We could sure use something better on Linux for the screen reader.

    I've been listening to books using text to speech for years. Quite listenable if you get a good voice. Sadly the best Android voices out there were from

  • The original Nook could do basic text to speech - which was hailed as a real boon to people with sight challenges. I'm certain the AI does a better job with inflection and such, but the fundamental problem remains.
    The reason almost no one knows or remembers this feature is that the moment it was revealed, book authors invoked their copyright authority to prohibit "derivative works". Barnes & Noble was forced to disable this feature. Of course, Apple is probably too big to be dissuaded by a fe

    • by EvilSS ( 557649 )
      This is a service Apple is offering to Authors, not something they are enabling on ebooks like B&N did.
  • by MDMurphy ( 208495 ) on Thursday January 05, 2023 @11:16AM (#63182226)
    I listen to audiobooks quite a bit. While I'll muddle through poorly narrated ones if the content warrants it, there are some things in narration that can make a huge difference in how you enjoy or comprehend the book.

    Some non-fiction books can be very technical. Not all narrators can do what is essentially a nine-hour continuous lecture and not bore you so much that your mind wanders. With no multiple characters or varying emotions from the narrator you might think this would be the easiest type of book to narrate, but I've developed an appreciation for narrators who do a better job than just dry information transfer.

    In a fiction book, multiple characters can be interesting. In a conversation between multiple speakers the narrator needs to find ways to let the listener know who's speaking. Some do it very well with very subtle shifts in tone so you always know which character is speaking. There might be male and female characters or ones with different accents. The samples on Apple's page didn't have multiple characters, but I'd be very curious to hear how it handles those cases.

    I wonder what the editing process will be for these books. Will a person listen to the entire book to make sure that it is all understandable and all the characters come across properly? It doesn't do science fiction, which might be due to new words and terms that are made up for the book. How many unusual words or names can be in a book before it fouls the whole thing up?
    • by EvilSS ( 557649 )
      I suspect we will eventually end up with a (hopefully standardized) markup language for authors/editors to use to let the software know when to make those tonal shifts and other things like how to inflect certain words, what tone (cheerful, serious, etc) to use in different parts of the text, etc.
      • by bartle ( 447377 )

        I suspect we will eventually end up with a (hopefully standardized) markup language for authors/editors to use to let the software know when to make those tonal shifts and other things like how to inflect certain words, what tone (cheerful, serious, etc) to use in different parts of the text, etc.

        That raises the question: is it more cost-effective to have editors pore over the text, add markup, and run it through QA versus simply hiring someone to read the text? Perhaps a self-published author would spend the time on their book to optimize it for audio, but it seems like there is still plenty of advantage to hiring an experienced narrator.

        • by EvilSS ( 557649 )
          Probably, yes. Good VAs aren't cheap, and the pool of people who could edit a script for an AI VA is probably a lot larger than the pool of good VAs. A VA can run $200/finished hour or more for in-demand or celebrity voice actors. Then figure another $200/finished hour for the producer. Of course that is if you pay up front and don't do a royalty sharing deal with 30-40% of the audiobook profits going to the VA/Producer. Or you can hire an audiobook production service like who will handle everything for you,
          • Probably, yes. Good VAs aren't cheap, and the pool of people who could edit a script for an AI VA is probably a lot larger than the pool of good VAs. A VA can run $200/finished hour or more for in-demand or celebrity voice actors. Then figure another $200/finished hour for the producer. Of course that is if you pay up front and don't do a royalty sharing deal with 30-40% of the audiobook profits going to the VA/Producer. Or you can hire an audiobook production service like who will handle everything for you, for around $500-750 per finished hour.

            Your comment about someone editing a script for AI being cheaper, and a larger pool, got my attention. I got this CD years ago: https://en.wikipedia.org/wiki/... [wikipedia.org] The liner notes said that Gershwin had actually played the music on a recording piano, which cut the rolls that were scanned and played on a new Yamaha Disklavier. So Gershwin was performing the music. The notes also mentioned that someone could just cut the piano rolls by hand with no original piano player needed. Those cutters h

        • I suspect we will eventually end up with a (hopefully standardized) markup language for authors/editors to use to let the software know when to make those tonal shifts and other things like how to inflect certain words, what tone (cheerful, serious, etc) to use in different parts of the text, etc.

          That raises the question: is it more cost-effective to have editors pore over the text, add markup, and run it through QA versus simply hiring someone to read the text? Perhaps a self-published author would spend the time on their book to optimize it for audio, but it seems like there is still plenty of advantage to hiring an experienced narrator.

          That was my thought. If editing takes more time and effort than the initial creation, is it worth it? I've noticed over the last 20+ years of listening to audiobooks that the editing has improved quite a bit. I could tell by the sound of the room if a new chapter was recorded in a different session than the previous one. The really bad thing was if there were two or more narrators. They would sound like they were recorded separately and when merged together fell into a sort of "uncanny valley" of speech

  • Pretty sure Kate Reading and Michael Kramer are just fake names for AI-narration, they certainly sound that way.

  • Speechify and I imagine some other AI voice companies already offer audio book narration to authors. I am curious to see how much market Apple sees for this. For once they are not even close to the market leader, Audible (AKA Amazon).
  • AI narration sucks. #realvoices
  • I've heard all manner of TTS synthesizers, including some sophisticated AI/deepfake models from researchers. They all sound terrible. You can't replace a human voice yet.

  • Should be added to the OS, so I don't have to download massive audio books but just the eBooks. And it should be able to read any website or any text document at better quality.

    Now what would be great would be if eBooks could be annotated so that the software knows who is the narrator, and who are characters. Plus an annotation about characteristics of the characters, like one is a 16 year old girl and another is a toothless 80 year old man.
  • So, the text-to-speech software we've had since the mid '80s isn't A.I.? What's different about Apple Books' tools that it merits the name "A.I."?
  • I seem to recall some of the earlier Kindle models came with a text-to-speech feature in order to be more accessible to (and to sell Kindles to, of course) the vision impaired. And the book publishing industry flipped its collective shit, sued Amazon over it, and got the speech function pulled from subsequent Kindles. I wonder how Apple will pull this off considering not just Amazon's own difficulties, but the fact that the last time Apple tried to take on Amazon in the eBook business, the *government* fl

