OpenAI had originally planned to roll out the new features to some paying subscribers in late June, but the company announced that it would delay the initial release by a month. The company says the feature will be available to all paid users in the fall, but adds the disclaimer that “the exact timeline will depend on meeting our high safety and reliability standards.”
OpenAI first added the ability to speak in one of several synthetic voices, or “personas,” to ChatGPT late last year. In a demo in May, it used one of those voices to show off a newer, more capable AI system, GPT-4o, in which the chatbot speaks in an expressive tone, responds to a person’s tone of voice and facial expressions, and has more complex conversations. One of the voices, which OpenAI named Sky, sounds similar to the AI bot played by Johansson in “Her,” the 2013 film about a lonely man who falls in love with his AI assistant.
OpenAI CEO Sam Altman denied that the company trained its bots with Johansson’s voice, and The Washington Post reported last month that the company had hired a different actor to provide the training audio, according to internal records and interviews with casting directors and actors’ agents.
As the world’s largest tech companies and startups like OpenAI race to develop generative AI, some projects are hitting unexpected roadblocks. Last month, Google reduced how often it showed AI-generated answers at the top of search results after the tool made bizarre errors, such as telling people to put glue on pizza. In February, it pulled an AI image-generating tool that had been criticized for creating historically inaccurate images, such as a female pope. Microsoft made changes to its AI chatbot last year after it sometimes gave strange and offensive answers.
OpenAI said Tuesday it needed more time to ensure the new voice version of its chatbot could better detect and block certain content, but gave no details. Many AI tools have been criticized for making up false information, spitting out racist or sexist content, or showing bias in their output. Designing a chatbot that tries to interpret and mimic emotions complicates interactions and creates new opportunities for problems.
“ChatGPT’s advanced voice modes can understand and respond to emotions and non-verbal cues, bringing us closer to having a real-time, natural conversation with an AI,” OpenAI said in a statement. “Our mission is to deliver these new experiences thoughtfully.”