Tech experts optimistic about voice cloning AI – even as they sound alarm over devastating ‘deepfakes’
DESPITE the enormous risk associated with AI voice cloning tools, some experts believe the emergent technology could be used for good.
Podcastle, a podcast platform powered by artificial intelligence, aims to rewrite the conversation surrounding the technology.
The company wants to simplify content creation with the assistance of AI.
CEO and founder Artavazd Yeritsyan spoke to The U.S. Sun to shed light on the company's mission.
"We are changing the way audio and video content is created, making it a lot easier for creators and teams by natively integrating AI technologies," Yeritsyan explained.
"We basically want to make content creation radically simple and accessible to everyone."
In simple terms, users can record audio and video using the Podcastle infrastructure and make adjustments using AI.
This means removing pauses, cutting out words, or simply improving the quality - all with the help of an artificial intelligence model that is trained in-house.
Users can even clone their voices and use a text-to-speech function in case they don’t feel like recording something.
But artificial intelligence remains a highly divisive issue. As models are trained on vast data sets, critics question where exactly this information comes from.
Adding to those concerns, tech behemoths like Meta have admitted to scraping data from public social media profiles to train AI.
This revelation sparked concern among data privacy experts and even triggered an inquiry led by the Information Commissioner's Office in the United Kingdom.
Stoking those fears is the rise of so-called “vishing” attacks, where scammers pose as a victim’s friends or relatives by replicating their voices - a process commonly assisted by AI.
The results are so convincing that a victim may willingly surrender information like credit card numbers or bank account details.
Podcastle has checks in place to prevent the creation of deepfakes, or synthetic audio that portrays a person saying something they didn't say.
"When we started building the technology, we really wanted to be the most ethical and safe platform doing voice clones," Yeritsyan said.
To discourage improper use, Podcastle implements "roadblocks" in the content creation process.
"In order to clone your voice, you need to actually record the sentences that we give you," Yeritsyan explained.
"Based on how you pronounce them, how you create them, we understand that it's you and only you can use it."
A user's content is then encrypted, or scrambled so hackers can't interpret it.
"That's why we don't have a single case of deepfakes in our platform," Yeritsyan concluded.
The CEO's optimism is a reminder that some tech pioneers have found a silver lining amid the doom and gloom.
Yeritsyan anticipates the expansion of voice cloning technology in the near future, especially as it concerns accessibility and translation features.
"People with disabilities who can't speak can easily use text-to-speech to get content out there," the CEO said.
He is also a proponent of AI education to encourage responsible use of the technology.
Podcastle offers discounted subscriptions to students in the belief they will be required to use similar tools once they enter the job market.
And Yeritsyan isn't alone in his hopes that voice cloning tech could be used for good. Companies like Microsoft have similar aspirations for tools like VALL-E.
As it stands, the addition of subtitles on streaming platforms is seen as an inconvenience due to the requirement for "manual labor," Yeritsyan said.
"And it's very expensive for the company, so unless there is regulation by the government, a lot of companies will just not do it because of the cost."
However, AI voice cloning technology can reduce time and money spent, leaving it up to companies to exercise goodwill.
"I think we are at a stage where those technologies really can be helpful in a more tangible and practical way," Yeritsyan said.
What are the arguments against AI?
Artificial intelligence is a highly contested issue, and it seems everyone has a stance on it. Here are some common arguments against it:
Loss of jobs - Some industry experts argue that AI will create new niches in the job market, and as some roles are eliminated, others will appear. However, many artists and writers insist the issue is an ethical one, as generative AI tools are trained on their work and wouldn't function otherwise.
Ethics - When AI is trained on a dataset, much of the content is taken from the Internet. This is almost always, if not exclusively, done without notifying the people whose work is being taken.
Privacy - Content from personal social media accounts may be fed to language models to train them. Concerns have cropped up as Meta unveils its AI assistants across platforms like Facebook and Instagram. Regulators have responded: in 2016, the EU adopted legislation to protect personal data, and similar laws are in the works in the United States.
Misinformation - As AI tools pull information from the Internet, they may take things out of context or suffer hallucinations that produce nonsensical answers. Tools like Copilot on Bing and Google's generative AI in search are always at risk of getting things wrong. Some critics argue this could have lethal effects - such as AI dispensing incorrect health information.