New AI music generator makes songs from text prompts

Udio lets you create 1,200 songs per month for free.

This article is an installment of Future Explored, a weekly guide to world-changing technology. You can get stories like this one straight to your inbox every Thursday morning by subscribing here.

On April 10, a new release shook the world of music. 

No, it wasn’t a new Frank Ocean album or even another Drake diss track (more on those later). This drop wasn’t actually a new piece of music at all. It was Udio, an app that uses AI to generate music from users’ text prompts — think ChatGPT for instrumentals.

Though not the first product of its kind, Udio is arguably the best of the bunch — so how exactly does it work, and what could it mean for the future of music?

Music-making AIs

Udio, the startup behind the app, was founded in December 2023 by a group of former researchers at Google DeepMind, the tech giant’s AI laboratory. After operating in a closed beta, the platform launched publicly earlier this month.

“It’s truly unbelievable.”

David Ding

Now, anyone can use Udio to generate up to 1,200 songs a month for free — all you have to do is sign in with your Google, Twitter, or Discord account and choose a username. 

Once logged in, you type a description of the song you want to create into a text box — you can include style descriptors here or click on suggested genre and mood tags. You then choose whether you want the song to have auto-generated lyrics, be instrumental, or use lyrics that you enter manually.

Click “Create,” and within about five minutes, Udio’s AI generates two songs, each 32 seconds long, based on your input. 

If you like either, you can download the file, share a link to it, or publish it to the Udio community. You also have the option to extend the track — adding an intro, an outro, or a section before or after the one you already generated — or remix it, generating a slightly different version.

“Udio enables everyone from classically trained musicians to those with pop star ambitions to hip hop fans to people who just want to have fun with their friends to create awe-inspiring songs in mere moments,” David Ding, co-founder and CEO of Udio, told Freethink. “It’s truly unbelievable.”

Udio isn’t the first platform of its kind — OpenAI has developed an AI music generator (Jukebox), as have Google (MusicLM) and Meta (AudioCraft), and the number of startups releasing such tools keeps growing — but tech reviewers are praising Udio as one of the best of the lot, if not the best.

“We are higher quality across the board in instrumental and vocals,” Ding told Freethink. “While others are good for AI music, we are comparable to the best human created music.”

Udio isn’t saying how it built the platform, but OpenAI, Google, and other developers have revealed how their AI music generators work, and it basically comes down to feeding the model a huge amount of music — 1.2 million songs, in the case of Jukebox — along with written lyrics and text descriptions of the audio.

The AI breaks the music into discrete units, called “tokens,” and then learns to predict which tokens should be combined in what order to create brand new compositions that satisfy a text prompt.
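
To make that idea concrete, here is a deliberately tiny sketch of next-token prediction in Python. It is not Udio’s code (the company hasn’t published any), and real systems use neural audio codecs and large transformer models rather than the simple bigram counts shown here; the style tags, token values, and function names are all invented for illustration.

```python
# Toy illustration of the "predict the next audio token" idea described above.
# Real systems (e.g. Jukebox, MusicLM) use learned audio codecs and large
# transformers; this sketch swaps both for the simplest possible stand-ins:
# plain integers as "audio tokens" and bigram counts as the "model."

import random
from collections import defaultdict

def train_bigram_model(token_sequences):
    """Count how often each token follows another across the training songs."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in token_sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return counts

def generate(counts, start_token, length=32):
    """Sample a new token sequence one step at a time (autoregressively)."""
    seq = [start_token]
    for _ in range(length - 1):
        followers = counts.get(seq[-1])
        if not followers:
            break  # no continuation was ever seen in training data
        tokens, weights = zip(*followers.items())
        seq.append(random.choices(tokens, weights=weights)[0])
    return seq

# Hypothetical tokenized training songs, grouped by a style tag that a
# text prompt might map to (real tokens come from an audio codec, not by hand).
training_songs = {
    "lo-fi hip hop": [[1, 2, 3, 2, 4, 2, 3, 1], [1, 3, 2, 4, 4, 2, 1, 3]],
    "synthpop": [[7, 8, 9, 7, 9, 8, 7, 9], [8, 7, 9, 9, 8, 7, 8, 9]],
}

prompt_style = "lo-fi hip hop"  # pretend this was extracted from the user's prompt
model = train_bigram_model(training_songs[prompt_style])
new_song_tokens = generate(model, start_token=1)
print(new_song_tokens)  # a decoder would turn these tokens back into audio
```

The real models condition on far richer signals than a single style tag, but the loop is the same in spirit: look at the tokens generated so far, pick a plausible next one, and repeat until a full clip has been composed.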

The controversy

Because AI music generators are trained on real songs, created by real artists, their outputs can sound very similar to music created by those artists, and some platforms, like Jammable and Uberduck, even let users choose AI clones of specific artists’ voices to use in their creations.

Some people are taking advantage of these tools to create and release tracks that sound like they were made by real artists, but that are actually AI-generated. 

Just this week, two newly leaked diss tracks in the ongoing beef between Drake and Kendrick Lamar were confirmed to be AI-generated, and tracks mimicking Eminem, Frank Sinatra, and many other artists can be found online — sometimes labeled as AI-generated, sometimes not.

The legality of all this is… TBD. In January 2024, US lawmakers put forth the No AI FRAUD Act, which is designed to prevent unauthorized voice or likeness cloning, and another bill, introduced in March, would require that all AI-generated content (audio, images, etc.) be labeled as such.

“An artist’s voice is often the most valuable part of their livelihood and public persona, and to steal it, no matter the means, is wrong,” Jeffrey Harleston, general counsel for Universal Music Group, told Congress in July 2023 while arguing for new laws to protect artists’ rights.

Even if an AI music generator’s output doesn’t sound like a specific artist, the systems need to be trained on a lot of music in order to work, and tech companies typically don’t ask artists or labels if they’re OK with having their music included in the training data, let alone compensate them for its use.

This has led to several pending lawsuits against the developers of generative AIs (not just music-making ones), with plaintiffs arguing that tech companies’ use of copyrighted data to train the systems is illegal. AI developers’ general defense, meanwhile, has been that training data falls under “fair use.”

“Our very culture depends on getting this right.”

Jen Jacobsen

Regardless of the outcome of those lawsuits, many musicians are not happy about having their work used to train generative AIs — on April 1, more than 200 artists, including Billie Eilish, Nicki Minaj, and Metro Boomin, signed an open letter, released through the Artist Rights Alliance (ARA), decrying this practice and urging tech platforms to “stop devaluing music.”

“Unethical AI use not only impacts the ability of artists to make a living, it threatens to destroy the music listening experience for all of us,” Jen Jacobsen, executive director of the ARA, told Freethink.

“If AI developers and online platforms choose to engage in practices that devalue human artists, we could end up limiting the quality, diversity, and accessibility of music for generations to come,” she continued. “Our very culture depends on getting this right.”

“We are deeply committed to ensure all works created on our platform are novel.”

David Ding

While Udio hasn’t said what data was used to train its AI music generator, it does prohibit users from including a specific artist’s name in prompts, which makes it very difficult to intentionally create a musical deepfake using the platform.

Ask for a song “in the style of Taylor Swift,” for example, and you’ll get a message saying that Udio does not “generate artist likeness without permission” and that it is replacing the phrase “Taylor Swift” in your prompt with “female vocalist,” “country pop,” and a few other related descriptors.

That doesn’t mean the AI doesn’t occasionally create music that sounds a lot like a real person — Rolling Stone pointed out that two tracks it generated with Udio included vocals that sounded a lot like Tom Petty — but the company says that’s unintentional, and that it’s working to prevent it.

“We have extensive automated copyright filters in place to ensure our outputs do not infringe copyrighted material,” Ding told Freethink. “The team is continually refining our safeguards, but we are deeply committed to ensure all works created on our platform are novel.”

The path forward

Like Udio, other generative AI developers seem to want to figure out a way to peacefully coexist with the music industry, rather than fight against it, and some musicians seem to at least want to be involved in the conversation as the tech develops. 

“AI is going to transform the world and the music industry in ways we do not yet fully understand.” 

Charli XCX

In 2023, for example, Google DeepMind teamed up with YouTube on Dream Track, a project that let select YouTube creators make 30-second songs for Shorts using Google’s AI music generator Lyria and the voices of nine major-label musicians, including Charli XCX, Demi Lovato, and John Legend, with the artists’ permission.

“When I was first approached by YouTube, I was cautious and still am; AI is going to transform the world and the music industry in ways we do not yet fully understand,” said Charli XCX. “This experiment will offer a small insight into the creative opportunities that could be possible and I’m interested to see what comes out of it.”

Udio, meanwhile, says it is in discussions with artists interested in leveraging its AI to make money, and while building the platform, it sought feedback from artists and music producers, including will.i.am, Common, and Tay Keith, all of whom have invested in the startup.

“At every stage of development, we talked to people in the industry about how we could bring this technology to market in a way that benefits both artists and musicians,” said Ding.

“I think that, over time, consenting data will actually be better for artists and individuals, and be better for AI.”

Holly Herndon

Some artists are already embracing generative AIs, using the tech to create their own new music, while others are making it easier for fans to get creative with their voice clones.

Pop artist Grimes, for instance, has built a website where people can create and distribute tracks featuring a clone of her voice — she then splits any royalties made from the songs 50/50 with the uploader. Artists and technologists Holly Herndon and Mat Dryhurst, meanwhile, used AI to create a clone of Herndon’s voice that anyone can use to make music.

“Do you say, ‘Nobody should be able to create as me, and I’m gonna shut this down’?” Dryhurst told Freethink. “Or do you lean into it and say, ‘Let’s acknowledge that this is now a thing, and see how far we can take it’?”

Ultimately, Herndon and Dryhurst see consent and compensation as the keys to a harmonious relationship between the music industry and generative AI developers: Artists should be asked for consent before their music is used for training, and if their voices are going to be used to create something new, they should be compensated, the same way musicians are compensated when someone samples their recordings.

“Now is the time to get excited about it and start having some big ideas because I think that over time, consenting data will actually be better for artists and individuals, and be better for AI, and everyone will want to opt-in — but on their own terms,” Herndon told Freethink.

