These 'Expressive Avatar' Deepfakes From a Billion-Dollar AI Startup Look Scary Real: Here's Who's Already Using the Technology
Is that a real person or an AI clone? New technology makes it nearly impossible to tell.
By Sherin Shibu
Key Takeaways
- Billion-dollar AI startup Synthesia unveiled Expressive Avatars on Thursday.
- Expressive Avatars are the world's first AI digital clones capable of producing human facial expressions from a written prompt.
- Synthesia markets the technology as a way to revamp trainings and presentations.
Is a human speaking behind the camera or an AI clone? A startling innovation from a unicorn startup backed by Nvidia makes it almost impossible to tell the difference.
AI startup Synthesia, which reached unicorn status with a billion-dollar valuation last year, released new technology on Thursday called Expressive Avatars: the world's first AI digital clones capable of producing human facial expressions and the right tone of voice from written prompts.
The technology starts with an AI avatar, which can be customized to reflect real faces.
Photo Credit: Synthesia
The AI makes a digital copy of a person based on footage recorded through their webcam or at a certified studio. It can also clone the person's voice and pair it with their digital likeness.
Those wary about creating an AI avatar that takes on their face and voice can opt instead for one of the more than 160 preloaded AI avatars that Synthesia has in its database.
Once a user creates or selects an AI avatar, they just have to do one more thing: Write what they want their digital selves to say.
In a demo seen by CNBC, a user wrote "I am happy. I am sad. I am frustrated." and had the AI-generated digital clone read the text. The avatar conveyed the facial expressions and tone associated with happiness when saying "I am happy" and changed its inflection appropriately when saying "I am frustrated."
With an AI clone and a written prompt, a free user can generate up to 36 minutes of personalized video per year in more than 120 languages. Paid plans go up to $67 per month for up to 360 minutes of video per year, and businesses that opt for an enterprise plan get unlimited minutes.
Synthesia is a startup that major companies are using behind the scenes. Zoom, Xerox, Microsoft, and Reuters are all internally using Synthesia's programs. Synthesia CEO Victor Riparbelli told the MIT Technology Review that 56% of the Fortune 100 were using the technology.
Synthesia markets the technology as a way to create expressive digital avatars for corporate training and presentations. For example, Zoom designers created sales training videos with Synthesia in 90% less time than it took people to create the videos themselves.
"Zoom's subject matter experts no longer need to record themselves, freeing up 15-20 hours each month to work on their actual job," the Synthesia website reads.
Still, the ability to create scary-good deepfakes, meaning AI-generated media that clones and manipulates a person's voice, likeness, or other attributes without their permission, can lead to misuse.
Last month, Tennessee became the first US state to pass legislation protecting music industry professionals from deepfakes.