Google has created a program that uses text descriptions to generate complex music that could plausibly pass for the work of human musicians. Citing concerns about the possible implications, Google is opting not to release the program, called MusicLM, to the public just yet.
While possibly not as unified or cohesive as music composed by the best human musicians, MusicLM is still able to generate tracks that faithfully follow the text input that created them, such as “a calming violin melody backed by a distorted guitar riff,” or even longer, complex descriptions such as “The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls.”
The science behind MusicLM is detailed in an academic paper, which explains that the program has been trained on 280,000 hours of music. The paper adds that, in addition to generating its own tracks from scratch, MusicLM is able to build on existing melodies that are fed into it, such as a guitar part or a vocal line. The software also works without specific musical phrases as input: it can create cohesive pieces from abstract ideas, such as “time to meditate,” and can even build tracks from several different ideas stacked together.
Google is opting not to release the program to the public, at least for now, because of the ethical dilemmas associated with this type of software. Google conducted an experiment to determine how much of the music generated by MusicLM was directly lifted from the songs it was trained on. The answer wound up being roughly 1%, a result the tech giant deemed too high.
While plenty more samples created by MusicLM are available on the program’s GitHub, here are some demos:
Here’s a track created from the input, “the main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls”:
This one’s input was “a rising synth is playing an arpeggio with a lot of reverb. It is backed by pads, sub bass line and soft drums. This song is full of synth sounds creating a soothing and adventurous atmosphere. It may be playing at a festival during two songs for a buildup”:
Finally, here’s “Slow tempo, bass-and-drums-led reggae song. Sustained electric guitar. High-pitched bongos with ringing tones. Vocals are relaxed with a laid-back feel, very expressive”: