Zoom wants to make you an AI-animated, photorealistic avatar, but not until sometime next year.
Today, at its annual developer conference, the company announced a new feature that will turn a video clip users shoot of themselves into a digital clone, complete with a head, upper arms, and shoulders, that can say or do anything. Users can type up a script of what they'd like their digital double to say, and Zoom will generate audio synced to the avatar's lip movements.
Custom avatars are intended to let people communicate "asynchronously" with colleagues in a "faster, more productive" way, Zoom chief product officer Smita Hashim told TechCrunch in an interview.
"Avatars save users precious time and effort recording clips, and enable them to scale video creation," Hashim said.
They might also pose a deepfake risk, however.
Several companies have developed AI technology that generates a digital "clone" of someone's face and combines it with pretty natural-sounding synthetic speech. Tavus, for instance, lets brands create virtual personas to use in personalized video ads; last year, Microsoft rolled out a service capable of generating convincing digital stand-ins for a person.
But many of these services have put stringent safeguards in place against abuse. Tavus requires a verbal consent statement, while Microsoft obliges customers to obtain written permission and consent from any talent featured in an avatar.
Zoom was vaguer about its own measures to prevent misuse.
Zoom's usage policy prohibits misuse, said Hashim, pointing to features such as "advanced authentication" and watermarking that the company is "engineering into" its custom avatar feature.
"We will continue to monitor and update security controls as required in the future," said Hashim. "We use (….) tech to clearly indicate when a clip was created using an avatar, and (….) to support the trustworthiness of avatar-created content."
Zoom's digital doppelgangers are part of a larger ambition of Zoom CEO Eric Yuan to create AIs that will one day be able to speak on your behalf during Zoom meetings, answer your emails and even take your phone calls.
But Zoom's avatars come at a time when deepfakes are spreading across social media like wildfire, making it harder to distinguish between truth and falsehood.
So far this year, deepfakes featuring President Joe Biden, Taylor Swift, and Vice President Kamala Harris have received millions of views and reshares. Recently, fake generative AI images of destruction and human suffering flooded the web in the aftermath of Hurricane Helene.
Deepfakes have also been used to target individuals, for instance by impersonating loved ones. Impersonation scams cost people over $1 billion last year, according to the FTC.
How will Zoom prevent hackers from using its service for nefarious purposes, such as creating phony clips of people saying things they never actually said? It's too early to tell. The company today unveiled a version that shows a visible watermark in the upper-right corner of a custom avatar video. But such watermarks are easy to crop out with screen-recording tools.
We'll have to wait a bit longer to find out. Zoom plans to roll out custom avatars for Zoom Clips, its asynchronous video tool, in the first half of 2025 as part of a $12-per-user-per-month premium add-on.
Whatever steps Zoom does or doesn't end up taking, regulatory efforts are underway to beat back the deluge of deepfakes.
In the U.S., there isn't a federal law that criminalizes deepfakes. Still, more than 10 states have passed statutes against AI-aided impersonation. California's law – currently stalled – would be the first to empower a judge to order posters of deepfakes to take them down or potentially face monetary penalties.