Audio platform Pocket FM, backed by Lightspeed Ventures, has announced a partnership with voice-cloning firm ElevenLabs to quickly turn text content, such as scripts, into audio series using AI.
Pocket FM, which raised $103 million in Series D funding in March, told TechCrunch at the time that it was already experimenting with converting text content into audio using ElevenLabs' tech. Now, the India-based company is expanding that partnership to make the conversion tool available to all creators over the next few weeks.
During the testing phase, Pocket FM created 30,000 hours of audio series using ElevenLabs' AI tech. With this rollout, the startup believes it can triple its content library from over 100,000 hours of audio content this year. Pocket FM added that AI-powered tools also cut its audio production costs by 90 per cent during the experimental phase.
On a call with TechCrunch, Pocket FM's co-founder and CTO Prateek Dixit said, "This partnership helps the company in making it easier for writers to convert their writings into audio series."
"We have more than 250,000 writers (including the ones on the company's Pocket Novel writing platform) and this partnership lowers the cost of setting up and recording audio for them," he said.
"With a good setup of recording tools and equipment, writers can produce about 30 minutes of good quality audio content in a day. With the AI tools, that can easily be 10 times more," he added.
Pocket FM has built a tool on ElevenLabs' technology that gives writers 50 voices with which to convert their content into audio. According to ElevenLabs co-founder Mati Staniszewski, the tool automatically infers the emotion a voice should convey from the context of the writing.
"Working with Pocket FM, we are deploying our newer models that understand the genre of writing and are emotionally better," said Staniszewski.
Dixit said the company will also recommend voices suited to a writer's genre, based on data from user interactions with that type of content.
Pocket FM isn't the first audio series platform to try out AI-powered tools. Google-backed Kuku FM is using GPT-4, Claude, BandLab, and even ElevenLabs to assist writers at different stages of the creative process, whether that is perfecting the script, generating thumbnails, adding sound effects, or converting text into audio.
Kuku FM also told TechCrunch that it is looking into using visual generation tools like Midjourney and Runway to produce advertising material.
Quality of the content and effect on artists
While AI promises to generate more content faster, it doesn't promise good content. Pocket FM's approach to the problem of discovering and surfacing quality content is to make its discovery algorithm more sophisticated and to experiment with user engagement.
Dixit explained that when a writer publishes an audio series, the platform surfaces it to a small number of users to see how they engage with it. If the engagement metrics are good, the content is propagated further.
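The staged exposure Dixit describes can be sketched roughly as follows. This is only an illustration: the function name, the engagement threshold, the growth factor, and the audience cap are all assumptions made for the example, not details Pocket FM has disclosed.

```python
def next_audience_size(current_size, engaged_users,
                       threshold=0.5, growth=10, max_size=1_000_000):
    """Decide how many users should see a series in the next round.

    All parameters are hypothetical: `threshold` is the minimum share of
    the current audience that must engage, `growth` is how much the
    audience is multiplied when that bar is met, and `max_size` caps it.
    """
    if current_size <= 0:
        raise ValueError("current_size must be positive")

    engagement_rate = engaged_users / current_size

    # Good engagement: propagate the series to a larger audience.
    if engagement_rate >= threshold:
        return min(current_size * growth, max_size)

    # Weak engagement: keep the series at its current reach.
    return current_size
```

Under these assumed defaults, a series shown to 100 users of whom 60 engage would move on to 1,000 users, while one with only 20 engaged users would stay at 100.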
Kuku FM said its quality control team verifies that content on the app meets its standards, even if creators have used AI as part of their process.
"We understand that a human Quality Control team has to be the core of our decision-making when it comes to audio content production. We have established a core team of Content Producers who have great ownership & authority on the artistic standards," said the company's co-founder & CEO Lal Chand Bisu.
For these platforms, AI tools mean faster production and a larger content library, but they also shrink the role of the voiceover artists who work with them. The Association of Voiceover Artists in India has voiced its concerns about AI taking over.
According to Amarinder Singh Sodhi, general secretary of the Association, "If AI takes over, we are finished. As voice artists, we need to get some regulation in place so that our livelihood is protected."
Sodhi also told Scroll about cases in which voiceover artists were brought into a studio to record samples that were then used to train AI without their knowledge or consent.
The Delhi-based voiceover artist, who spoke to TechCrunch, said, "On an emotional level, it scares me. You're diluting the human experience of storytelling by using AI. You lose out on an emotional connection."
He said that giving people who lack the taste and skill to produce quality content access to premium voices will flood the market with bad content.
Voice artists elsewhere in the world have also voiced fears that AI could take their jobs. Even those who have worked with AI companies say they are disturbed by the distorted versions of their voices.
We asked the company about the impact of AI voice generation on Pocket FM. The company did not answer directly, but Dixit said engagement with the AI-generated content in its experiments is "as good as human voiceover production." The company is also working on technology to include more than one voice in a single audio output.
Neither Pocket FM nor Kuku FM labels its content to indicate whether AI was used in the production process.