OpenAI Can Reproduce Human Voices, but It Isn't Releasing the Tech Yet


Voice synthesis has come a long way since 1978's Speak & Spell toy, which once wowed people with its state-of-the-art ability to read words aloud using an electronic voice. Now, using deep-learning AI models, software can not only create realistic-sounding voices but also convincingly imitate existing voices using small audio samples.

Along those lines, OpenAI this week announced Voice Engine, a text-to-speech AI model that creates synthetic voices from a 15-second segment of recorded audio. The company has provided audio samples of Voice Engine in action on its website.

Once a voice is cloned, a user can input text into Voice Engine and get an AI-generated voice result. But OpenAI is not ready to widely release its technology. The company initially planned to launch a pilot program for developers to sign up for the Voice Engine API earlier this month. But after further consideration of the ethical implications, the company has decided to scale back its ambitions for now.

“In line with our approach to AI safety and our voluntary commitments, we are choosing to preview but not widely release this technology at this time,” the company writes. “We hope this preview of Voice Engine both underscores its potential and also motivates the need to bolster societal resilience against the challenges brought by ever more convincing generative models.”

Voice-cloning technology in general is not particularly new: several AI voice synthesis models have appeared since 2022, and the technology is active in the open source community through packages like OpenVoice and XTTSv2. But the idea that OpenAI is moving toward letting anyone use its own version of the technology is notable. And in some ways, the company's reluctance to release it fully may be the bigger story.

OpenAI says the benefits of its voice technology include providing reading assistance through natural-sounding voices, enabling creators to reach a global audience by translating content while preserving native accents, supporting people who are nonverbal with personalized speech options, and helping patients recover their own voice after speech-impairing conditions.

But it also means that anyone with 15 seconds of someone's recorded voice could effectively clone it, and that has obvious implications for misuse. Even if OpenAI never widely releases Voice Engine, the ability to clone voices has already caused trouble in society through phone scams in which someone imitates a loved one's voice and through election campaign robocalls featuring cloned voices of politicians like Joe Biden.

Also, researchers and journalists have shown that voice-cloning technology can be used to break into bank accounts that use voice authentication (such as Chase's Voice ID). That prompted US senator Sherrod Brown of Ohio, chairman of the US Senate Committee on Banking, Housing, and Urban Affairs, to send a letter to the CEOs of several major banks in May 2023 asking about the security measures banks are taking to counter AI-driven risks.

OpenAI recognizes that the technology could cause trouble if released too broadly, so it is initially trying to work around those problems with a set of rules. It has been testing the technology with selected partner companies since last year. For example, video synthesis company HeyGen has been using the model to translate a speaker's voice into other languages while keeping the same vocal sound.

