OpenAI says it can clone a voice from just 15 seconds of audio

OpenAI just announced a new tool called Voice Engine. This is voice cloning technology that can mimic any speaker by analyzing 15 seconds of audio. The company says it produces natural-sounding speech with emotive, realistic voices.

The technology was developed by the company and has been in the works since 2022. OpenAI has already been using it to power the voices available in its current text-to-speech API and the Read Aloud feature. There are several examples on the company's official blog, and they sound very close to the real thing. I encourage you to listen to them and think about the possibilities, good and bad.

OpenAI says it sees the technology as useful for reading assistance, language translation and helping those with sudden or degenerative speech conditions. The company described a case in which the tool helped a patient with a speech impairment by creating a Voice Engine model derived from audio recorded for a school project.

Despite the benefits, bad actors could use this technology for more serious abuses. With this in mind, Voice Engine is not ready for prime time, as there are significant privacy concerns that need to be addressed before it can be fully released.

OpenAI acknowledges that this technology has “significant risks, which are particularly high in an election year.” The company says it’s incorporating feedback from “US and international partners from government, media, entertainment, education, civil society and beyond” to ensure the product launches with minimal risk. All of the preview testers agreed to OpenAI’s terms of use, which prohibit cloning a person's voice without permission or legal right.

In addition, anyone using the technology will have to disclose to their audience that the speech is AI-generated. OpenAI has implemented safety measures, such as watermarking to trace the origin of each voice and “rapid monitoring” of how the system is being used. When the product is released, there will be a “no-go list” that detects and blocks AI-generated voices that are too similar to those of celebrities.

As for when this release will happen, OpenAI remains tight-lipped. TechCrunch has reported on it, though, and a launch seems to be getting closer. Voice Engine could cost $15 per million characters, which is about 162,500 words. That is almost as long as Stephen King’s The Shining. It sounds like a budget-friendly way to produce an audiobook. The marketing materials also mention an “HD” model that costs twice as much, but the company did not explain in detail how this will work.
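To put those numbers in perspective, here is a rough back-of-the-envelope sketch of what narrating a book might cost. It assumes the reported $15 per million characters holds, that the "HD" tier is exactly double, and that the article's own ratio of 162,500 words per million characters applies to your text. The function name and constants are made up for illustration and are not part of any OpenAI API.

```python
# Rough cost estimate based on the figures quoted in the article.
# All constants are assumptions taken from the reported (unconfirmed) pricing.
PRICE_PER_MILLION_CHARS = 15.00     # reported standard price, in US dollars
HD_MULTIPLIER = 2                   # "HD" model is said to cost twice as much
WORDS_PER_MILLION_CHARS = 162_500   # the article's words-per-million-characters figure


def estimate_narration_cost(word_count: int, hd: bool = False) -> float:
    """Return the estimated cost in dollars to synthesize `word_count` words."""
    # Convert words to millions of characters using the article's ratio.
    millions_of_chars = word_count / WORDS_PER_MILLION_CHARS
    cost = millions_of_chars * PRICE_PER_MILLION_CHARS
    return cost * HD_MULTIPLIER if hd else cost


if __name__ == "__main__":
    for words in (80_000, 162_500):
        standard = estimate_narration_cost(words)
        hd = estimate_narration_cost(words, hd=True)
        print(f"{words:,} words: ${standard:.2f} standard, ${hd:.2f} HD")
```

By this estimate, a typical 80,000-word novel would come in under $10 at the standard rate, though real character counts vary with the text and the final pricing may differ.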

OpenAI has been doing big things this week. It just announced a partnership with Microsoft to build an AI supercomputer called “Stargate.” The project will cost $100 billion.

This article contains affiliate links; if you click on such a link and make a purchase, we may earn a commission.
