No, nor should it. While the fears are not totally unfounded—AI is transforming the dubbing industry in ways unimaginable only a few years ago—it is unlikely to replace human voice acting entirely. Advancements in speech technology over the past few years have made AI voices more realistic than ever, fueling voice actors’ fears that the technology could eclipse them altogether. However, as generative AI transforms the way we all work, the voice acting industry, like other creative fields, is more likely to harness AI to augment its existing role than to see that role automated away.
Voice actors record their voices in a dubbing studio to create voiceovers, character voices and localized versions of a wide range of content, including video games, animated films, audiobooks, adverts, explainer videos, films and television series. AI voice technology cannot yet match, at scale, the range of expressivity that voice actors achieve in the traditional dubbing process, which means voice acting remains the go-to dubbing method for highly expressive content.
Artificial intelligence does, however, play a valuable role in the media and entertainment space. First, it opens up content that wouldn’t ordinarily qualify for investment in voice actor dubbing. Second, as video content proliferates and outpaces the capacity of traditional dubbing methods, AI dubbing means the industry can meet demand, particularly when it comes to localizing content.
In certain areas of media and entertainment, companies are turning to AI voices to make voiceover creation more affordable, less time-consuming and less complex. AI dubbing solutions are also used to translate and localize content across different languages, making it accessible to a global audience efficiently and affordably. Why hire an expensive voice actor for dubbing when AI can do it twice as fast for half the cost? For many reasons, chief among them that not all dubbing methods can tackle all content types.
While AI voices lend themselves particularly well to single-speaker content in simple environments, like audiobooks, corporate work and news, they are less able to tackle, at scale, the complex storylines of movies or television series involving multiple characters in many different settings.
It's all too easy to say that AI is taking single-speaker work away from voice actors right now. In fact, the availability of AI for narration and localization is increasing content production across the board. In tandem, use cases for voice actors and AI are becoming increasingly distinct. For example, in audiobook narration, the main narrator may be human to maximize audience engagement, while secondary characters could be voiced using AI.
Beyond the suitability of the dubbing method, there are billions of hours of content locked in a single language, much of which would never have qualified for dubbing by traditional methods because of the cost, time and complexity of the process. Daily news, for instance, cannot be dubbed into other languages by traditional methods because of its quick turnaround times.
Similarly, the ROI on corporate and social media videos rarely justifies dubbing them with voice actors. So, while it’s true that voice actors' jobs are being affected by the rise of realistic AI voices, the situation is more nuanced than is widely reported.
Companies like OpenAI, Papercup, ElevenLabs and many more have developed sophisticated machine learning systems that generate human-sounding synthetic voices that can narrate text (a process known as text-to-speech) or take human speech and convert it into a new artificial voice (a process known as speech-to-speech, or voice conversion).
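To make the distinction concrete, here is a minimal text-to-speech sketch using OpenAI's Python SDK. It is illustrative only, not how any particular dubbing provider works; the model and voice names are examples, and a speech-to-speech (voice conversion) pipeline would start from a recorded human performance rather than plain text.

```python
# Minimal text-to-speech sketch: plain text in, synthetic narration out.
# Model and voice names are illustrative; check the provider's
# documentation for what is currently available.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

speech = client.audio.speech.create(
    model="tts-1",       # text-to-speech model
    voice="alloy",       # preset synthetic voice
    input="Welcome back. Here are today's headlines.",
)

# Save the generated audio so it can be attached to a video or article.
with open("headline.mp3", "wb") as f:
    f.write(speech.content)
```

A speech-to-speech system, as described above, takes human speech as its input instead of text and converts it into a new artificial voice; the output step is otherwise similar.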
However, it’s voice cloning technology specifically that has given rise to fears about the consent and misuse of voice data among voice actors, who are concerned that their likeness could be used without their knowledge, consent or remuneration.
The regulation of artificial intelligence in the acting world was a key topic in the SAG-AFTRA strikes of 2023, and following negotiations, specific guardrails were implemented around consent and usage. At Papercup, we never use a voice likeness unless we have explicit consent to do so.
AI is increasingly capable of emotional delivery, but voice actors naturally convey nuanced elements of speech, like emphasis and hesitation, and produce complex emotions, like sarcasm and flirtatiousness. This, coupled with their ability to adapt their delivery, ad-lib and draw on their own emotional experience, is what sets them apart.
Language has many different variables—accent, turns of phrase, timbre, and intonation all come into play. AI can replicate everyday speech to an impressive degree, but learning these idiosyncrasies requires a lot of data, time, and fine-tuning.
Actors can take on feedback and adjust their delivery in real time, something AI cannot do. Their ability to interpret a script and deliver a performance based on their own lived experience is unique.
AI voices are instrumental in making content more accessible to diverse audiences. They can quickly and cost-effectively produce voiceovers for various media, ensuring that content is available in multiple languages and formats, including for those with visual impairments or reading difficulties.
One of the major benefits of AI voices is affordability; AI voices reduce the need for expensive studio time and the associated costs of hiring voice actors. This is invaluable for small-scale video productions, educational materials, corporate training modules and video content, like news, that wouldn't qualify for the time and cost investment of studio dubbing.

Fast turnaround times
AI voices provide rapid turnaround times in industries where speed is crucial, like breaking news. AI can generate voiceovers almost instantly for these use cases, ensuring that information can be quickly distributed across different platforms and in different languages.
One reason much of the world's content remains locked in a single language is that before AI dubbing, there was no scalable way to localize it. With the introduction of AI voices for video, there is an opportunity for increased content output that isn't reliant on the turnaround times or budgets commanded by dubbing in the studio with voice actors.
AI voices can be customized to fit specific needs, including adjusting tone, accent, and style to match the desired output. In studio dubbing, this involves a lengthy casting process. This level of personalization is beneficial for brands looking to create a unique auditory identity or for tailoring content to specific audiences.
AI technology and voice actors can form a powerful alliance, drawing on each other's strengths to create high-quality content more efficiently. Here’s how.
AI can assist voice actors by providing previews of scripts with synthetic voices. Before the final recording, actors can listen to AI-generated versions to understand the pacing, tone, and flow. This helps identify potential issues and make necessary corrections beforehand, ensuring a smoother and more efficient recording session and fewer returns to the studio for re-recording.
Voice actors can currently only work on a limited number of projects simultaneously. By using AI to replicate their voice likeness (which doesn’t necessarily mean outright cloning it), actors can give their consent for this likeness to be used for multiple pre-agreed projects at once.
When voice actors consent to their voice likeness being used for specific pre-approved projects, they can generate income while not physically working or while working on a different project.
By collaborating with AI, voice actors can expand their range and take on projects they might not have previously considered. For example, a voice actor could use the technology to produce different accents, or they could permit their voice data to be used to create younger- or older-sounding voices for secondary characters. This diversification can lead to new opportunities and a broader portfolio of work.
The fear that AI might eclipse human voice actors is not unfounded, but it is unlikely. AI is indeed transforming the dubbing industry, making certain processes more efficient and cost-effective. However, the unique human touch that voice actors bring to their performances—nuanced emotions, spontaneity, and creativity—remains irreplaceable.
Voice actors can leverage AI to augment their capabilities rather than view it as a threat. By embracing AI tools, they can enhance their efficiency, take on more diverse projects, and continue to play a vital role in the media and entertainment industry.
To talk about how you can use AI voices to dub your content, get started.