Researchers at computer chip giant Nvidia have developed an AI audio generator that they claim can produce sound that hasn’t been heard before.
The new generative AI model is called Fugatto, and the team at Nvidia said they wanted to create a “Swiss army knife for sound”.
Fugatto can transform or generate any combination of music, sound or speech based on text prompts given by the user.
This includes making a soundscape based on a text prompt, adding instruments to or removing them from an existing piece of music, or changing the accent or emotion of a voice.
In a blog post by Nvidia, Rafael Valle, the manager of applied audio at Nvidia and an orchestra conductor and composer, said the team wanted to create a model that understands and generates sounds like humans do.
“Fugatto is our first step toward a future where unsupervised multitask learning in audio synthesis and transformation emerges from data and model scale,” Valle said.
Fugatto took more than a year to develop and required millions of audio samples to train the AI.
While the team at Nvidia said the new AI will revolutionise music, the exponential growth of generative AI products has led to concerns about how it will impact people working in creative industries.
Earlier this year the Australian Association of Voice Actors told a parliamentary committee that an estimated 5,000 local voice actors could soon be put out of a job due to audio AI.
Similar concerns have been raised in the music industry about how generative AI might infringe on copyrighted materials, with the Recording Industry Association of America filing lawsuits against AI companies for allegedly replicating its artists' music.
Despite the pushback, there are many artists who believe AI could prove to be an asset in their creative work.
“This thing is wild,” said Ido Zmishlany, a multi-platinum producer and songwriter — and cofounder of One Take Audio, a member of the NVIDIA Inception program for cutting-edge startups.
“The idea that I can create entirely new sounds on the fly in the studio is incredible.”
