Meta’s MusicGen can generate short new pieces of music from text prompts, optionally aligned to an existing melody.
MusicGen is built on a Transformer model, like most of the language models in use today. It predicts the next segment of a piece of music much as a language model predicts the next words in a sentence. The researchers use Meta’s EnCodec audio tokenizer to break the audio data down into smaller pieces, and because MusicGen processes those tokens in parallel in a single stage, it is quick and efficient.
For training, the team used 20,000 hours of licensed music: specifically, 10,000 high-quality audio recordings from an internal dataset, plus music data from Shutterstock and Pond5.
MusicGen: What is it?
Like most language models in use today, MusicGen is built on a Transformer model. Just as a language model predicts the next words in a sentence, MusicGen predicts the next section of a piece of music.
The audio data is divided into smaller bits using Meta’s EnCodec audio tokenizer by the researchers. MusicGen is a single-stage method that processes tokens quickly and effectively in parallel.
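To make the pipeline concrete, here is a minimal sketch of text-to-music generation using the MusicGen integration in the Hugging Face transformers library; the checkpoint name, prompt, and generation length are illustrative assumptions, so check the model card for current details.

```python
# Minimal text-to-music sketch with the Hugging Face transformers MusicGen integration.
# Checkpoint name, prompt, and generation length are assumptions for illustration.
from transformers import AutoProcessor, MusicgenForConditionalGeneration

processor = AutoProcessor.from_pretrained("facebook/musicgen-small")
model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-small")

# The processor turns the text prompt into model inputs; MusicGen then predicts
# EnCodec audio tokens autoregressively and decodes them back into a waveform.
inputs = processor(
    text=["an upbeat lo-fi hip hop track with mellow piano"],
    padding=True,
    return_tensors="pt",
)
audio_values = model.generate(**inputs, max_new_tokens=256)  # roughly 5 seconds of audio
sampling_rate = model.config.audio_encoder.sampling_rate     # typically 32 kHz
```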
Beyond the efficiency of its design and the speed of generation, MusicGen stands out in its ability to handle both text and melody prompts; the generated music follows the basic style established in the text.
The melody is not reproduced exactly, though: you cannot, say, hear it rendered note for note in other musical genres. It serves only as a broad guide for generation and is not replicated precisely in the output.
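For melody conditioning, Meta’s audiocraft library ships a melody-guided variant of the model. The sketch below assumes the audiocraft package is installed and that a reference recording exists at the illustrative path reference_melody.wav:

```python
# Sketch of combining a text prompt with a reference melody via audiocraft.
# The checkpoint name and file path are illustrative assumptions.
import torchaudio
from audiocraft.models import MusicGen

model = MusicGen.get_pretrained("facebook/musicgen-melody")
model.set_generation_params(duration=8)  # length of the generated clip in seconds

melody, sr = torchaudio.load("reference_melody.wav")

# The melody acts only as a broad guide: the output follows the style described
# in the text rather than reproducing the reference note for note.
wav = model.generate_with_chroma(
    descriptions=["a cheerful country song with acoustic guitars"],
    melody_wavs=melody[None],   # add a batch dimension
    melody_sample_rate=sr,
)
```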
While many models now handle text generation, speech synthesis, image generation, and even short videos, few high-quality music generation models have been made accessible to the public.
MusicGen: How to use it?
Users can try MusicGen through the Hugging Face API, but depending on how many people are using it at once, it can take a while to produce any music. For much faster results, you can set up your own instance of the model through the Hugging Face website, or, if you have the necessary knowledge and hardware, download the code and run it yourself (a sketch of that route follows).
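If you do go the local route, a rough sketch looks like this, assuming Meta’s audiocraft package (installable with pip) and its audio_write helper:

```python
# Rough sketch of running MusicGen locally with Meta's audiocraft package
# (e.g. after `pip install audiocraft`); prompt and output names are illustrative.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained("facebook/musicgen-small")  # smallest checkpoint for quick tests
model.set_generation_params(duration=10)                    # seconds of audio to generate

wav = model.generate(["a calm ambient pad with slow strings"])  # batch of waveforms

for idx, one_wav in enumerate(wav):
    # Writes e.g. output_0.wav at the model's sample rate with loudness normalization.
    audio_write(f"output_{idx}", one_wav.cpu(), model.sample_rate, strategy="loudness")
```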
If, like most people, you just want to try the online version, here’s how:
- Launch a web browser.
- Go to the Hugging Face website.
- At the upper right, choose Spaces.
- Search for “MusicGen” in the box.
- Locate the Space published by Facebook.
- In the box on the left, type your prompt.
- Choose “Generate”.
That’s everything you need to know!
MusicGen beats MusicLM by a hair
The study’s authors tested three sizes of the model: 300 million (300M), 1.5 billion (1.5B), and 3.3 billion (3.3B) parameters. They found that the larger models generally produced higher-quality audio, yet human listeners judged the 1.5 billion parameter model to be the best. The 3.3 billion parameter model, on the other hand, matches text input to audio output more accurately.
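If you want to compare the sizes yourself, the public checkpoints on Hugging Face appear to map onto these three configurations roughly as follows; the mapping is an assumption drawn from the model cards, not something stated in the paper:

```python
# Assumed mapping of the three model sizes to public MusicGen checkpoints.
from transformers import MusicgenForConditionalGeneration

checkpoints = {
    "300M": "facebook/musicgen-small",
    "1.5B": "facebook/musicgen-medium",  # rated best overall by human listeners
    "3.3B": "facebook/musicgen-large",   # closest adherence of audio to the text prompt
}
model = MusicgenForConditionalGeneration.from_pretrained(checkpoints["1.5B"])
```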
MusicGen scores better than other music models such as Riffusion, Mousai, MusicLM, and Noise2Music on both objective and subjective measures of how well the music matches the text and how plausible the composition sounds. Overall, the models are slightly better than Google’s MusicLM.
Meta has released the code and models as open source on GitHub, and commercial use is allowed. There is a demo on Hugging Face.