OpenAI boosts ChatGPT, adding voice and image recognition.
With today’s update to the iOS and Android ChatGPT mobile apps, users will be able to ask the chatbot questions aloud and hear it respond in a synthesized voice. The update also adds visual intelligence: when you upload or take a picture in the app, ChatGPT responds with a description of the image and related information, much like Google’s Lens feature.
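The article describes the mobile app rather than the developer API, but a rough sketch of the same image-description idea against OpenAI's Python SDK might look like the following; the model name, image URL, and prompt are illustrative assumptions, not details from the announcement.

```python
# Illustrative sketch only: asks a vision-capable model to describe an image,
# roughly mirroring the "take a picture, get a description" feature.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this picture?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder image
            ],
        }
    ],
)
print(response.choices[0].message.content)
```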
The new ChatGPT features show how OpenAI is treating its AI models as products that receive periodic, incremental updates. ChatGPT, the startup’s unexpected hit, is starting to resemble a consumer app like Apple’s Siri or Amazon’s Alexa.
The ability to converse with ChatGPT relies on two separate models. Whisper, OpenAI’s existing speech-to-text model, translates what you say into text, which is fed to the chatbot. A new text-to-speech model then speaks ChatGPT’s responses aloud.
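To make that pipeline concrete, here is a minimal sketch of the speech-to-text, chatbot, and text-to-speech flow described above, written against OpenAI's public Python SDK rather than the mobile app itself; the model names, voice, and file paths are assumptions for illustration, not the app's internals.

```python
# Minimal sketch of the three-step voice pipeline the article describes:
# Whisper transcribes speech -> the chatbot answers -> a TTS model speaks the reply.
from openai import OpenAI

client = OpenAI()

# 1. Whisper transcribes what the user said (file path is a placeholder).
with open("question.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1", file=audio_file
    )

# 2. The transcription is fed to the chatbot as an ordinary text prompt.
chat = client.chat.completions.create(
    model="gpt-4",  # assumed chat model
    messages=[{"role": "user", "content": transcript.text}],
)
answer = chat.choices[0].message.content

# 3. A text-to-speech model reads the answer back as audio.
speech = client.audio.speech.create(
    model="tts-1", voice="alloy", input=answer
)
speech.write_to_file("answer.mp3")
```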
Product manager Joanne Jang gave me a demonstration of ChatGPT’s selection of synthetic voices last week. They were created using recordings of actors OpenAI hired to train the text-to-speech model.
Users may even be able to create their own voices in the future. The main consideration in designing the voices, Jang says, was whether each one is a voice you could listen to all day.