Google MusicLM AI: Turn text to... music!

Google has introduced MusicLM, an AI system capable of composing music in any genre from a text description. Despite its impressive capabilities, the company is erring on the side of caution and has no current plans to release it publicly. Earlier generative AI systems for music, such as Riffusion and Dance Diffusion, have attempted to compose songs, but technical constraints and limited training data have left their output lacking in complexity and fidelity. Google MusicLM AI represents a significant leap forward and may be the first system to overcome these challenges.

whoa, this is bigger than ChatGPT to me.
google almost solved music generation, i'd say. https://t.co/s9PQaJ5R6A
— Keunwoo Choi (@keunwoochoi) January 27, 2023

Meet Google MusicLM AI, the first significant text-to-music tool

A recent academic paper describes MusicLM, an AI system trained on a dataset of 280,000 hours of music. The system’s goal is to generate songs of “significant complexity” from text descriptions such as “enchanting jazz song with a memorable saxophone solo and a solo singer” or “Berlin ’90s techno with a low bass and strong kick.” The output of Google MusicLM AI is not necessarily as imaginative or musically cohesive as the work of a human artist, but it bears a remarkable resemblance to human-composed music.

It is hard to overstate how impressive the output of Google MusicLM AI is, particularly given that no human musicians or instrumentalists are involved in creating it. Even when given lengthy and complex descriptions, MusicLM manages to capture intricate elements such as instrumental riffs, melodic lines, and emotional undertones in its compositions.

The abilities of Google MusicLM AI go beyond generating short musical snippets. The Google research team demonstrated that the system can build on existing melodies, whether hummed, sung, whistled, or played on an instrument. MusicLM can also take several descriptions written in sequence and craft a melodic narrative spanning several minutes, which could suit a movie soundtrack.

Google MusicLM AI can also be directed through a combination of images and captions, and it can produce audio that mimics the sound of a specified instrument within a particular genre. The experience level of the AI “musician” can be adjusted as well, and the system can generate music inspired by particular places, time periods, or purposes, such as uplifting music for physical exercise.

You can visit Google MusicLM AI’s GitHub page to listen to some samples!

Is Google’s AI music composer really good?

Google MusicLM AI is not without faults. Some of its output has a disjointed quality, a natural outcome of the training procedure. Its ability to generate vocals, including choral harmonies, falls short of expectations. The generated lyrics are often incomprehensible, consisting of broken English or meaningless gibberish, and the synthesized vocals do not sound like a single artist but like a hybrid of various voices.

Google's new music model MusicLM is the breakthrough of the week.
Here it is in action.
Just describe the music and it'll generate the track: pic.twitter.com/xAhzHfGnMH
— Pete (@nonmayorpete) January 27, 2023

For all the system’s impressive capabilities, the Google researchers acknowledge the many ethical dilemmas posed by a system like MusicLM, including its tendency to incorporate copyrighted material from the training data into the generated songs. During their experiments, they found that approximately 1% of the generated music was an exact copy of songs from the training dataset. This rate of duplication led the researchers to decide against releasing Google MusicLM AI in its current form.

“We acknowledge the risk of potential misappropriation of creative content associated to the use case. We strongly emphasize the need for more future work in tackling these risks associated to music generation,” the co-authors of the paper stated.

If MusicLM or a similar system were released in the future, significant legal issues would likely arise, whether or not the system is marketed as an aid to artists. This concern has already come up with simpler AI systems. In 2020, Jay-Z’s record label filed copyright infringement claims against the YouTube channel Vocal Synthesis for using AI to create Jay-Z covers of songs such as Billy Joel’s “We Didn’t Start the Fire”. After initially removing the videos, YouTube later reinstated them, determining that the takedown requests were “incomplete”. The legality of AI-generated music remains a grey area.

Google announces MusicLM: a model to generate music from text. Here are some crazy things it can do:
1. Given audio of a melody, it can generate new music inspired by that melody customized by prompts! Here's someone humming bella ciao turned into a cappella chorus, EDM, etc. pic.twitter.com/HKDnXI1C8U
— bleedingedge.ai (@bleedingedgeai) January 27, 2023

As AI technology for music generation continues to advance, questions surrounding its legality remain at the forefront. Eric Sunray, a legal intern at the Music Publishers Association, has authored a whitepaper that argues that systems such as MusicLM infringe on the rights protected under the United States Copyright Act through the creation of “tapestries of coherent audio” from copyrighted material used in their training.

These concerns have been echoed regarding AI systems in other fields, including image, code, and text generation, as their training data is often sourced from the web without consent from creators. The issue of fair use has also been debated following the release of OpenAI’s Jukebox, with some questioning the use of copyrighted material in the training of AI models.

From a user’s perspective, Waxy’s Andy Baio speculates that music generated by an AI system would be deemed a derivative work, in which case only its original components would be protected by copyright. What constitutes “originality” in this context remains unclear, making commercial use of such music uncharted territory. If the generated music is used for purposes covered by fair use, such as parody or commentary, the issue becomes less complicated. Even so, Baio expects that the courts would need to decide case by case.

As the legal landscape continues to evolve, clarity on the issues surrounding music-generating AI may be imminent. Several ongoing lawsuits, including one addressing the rights of artists whose work is used to train AI systems without their authorization or awareness, will likely affect the industry. The outcome of these proceedings remains to be seen.
