    AI in Music

    The possibilities of AI-powered music have been murmuring beneath the surface of the music industry for years, but it wasn’t until the release of ChatGPT in 2022 that the broader conversation around AI began to spread into the mainstream. We’re at a point now where some musicians and music industry professionals are fascinated by the possibilities of AI-powered music, while others are wary of the unknown, especially when regulation is still in its infancy. A study by music distribution company Ditto found that nearly 60 percent of artists surveyed say they use AI in their music projects, while 28 percent say they wouldn’t use AI for music purposes.

    Christopher Wares, associate chair of the music business/management department at Berklee College of Music, is a proponent of AI music technology. He even wrote a master’s thesis back in 2016 on why Warner Music should invest in AI (spoiler alert: they did, along with every other major label). Wares has introduced AI into his courses at Berklee and has seen mixed reactions from students.
    “Some of my students love AI and are already using it in different ways, while others want nothing to do with it,” Wares says. “There’s a lot of heated debate in the conversations, and I try to encourage my students to embrace the technology and find new ways to use it to improve their creative processes.”

    Another course author and instructor with a similar mindset is Ben Camp, an associate professor of songwriting at Berklee College of Music and the author of Songs Unmasked: Techniques and Tips for Songwriting Success. They’ve been fascinated by AI music technology since 2016, after hearing “Daddy’s Car,” one of the first AI-generated pop songs, created by a system trained on the Beatles’ music.

    Camp also gives their students the opportunity to use AI in the classroom, as long as they fact-check all the information they get from ChatGPT or any other large language model.

    “I think everyone has to make their own choice,” says Camp. “I mean, I have friends who still use flip phones because they’re not comfortable with having all their information on their phone. I have friends who still have landlines. So I’m not saying, ‘Hey, everyone, you need to do this.’ But it’s definitely here. It’s not going away. It’s only going to get better.”

    Whether you’re actively using AI in your music or have some doubts, it’s becoming increasingly clear that AI will play a major role in the music industry in the future. With the expertise of Wares and Camp, we discuss the current state of AI in the music industry, including the tools that are available now.

    What is AI Music?

    Before we define what AI music means, let’s first define artificial intelligence. Here’s Wares’ definition:
    “Artificial intelligence is like the intelligence of a computer; it is a technology that enables machines to imitate human thinking or behavior, such as problem solving, learning, or recognizing patterns.”

    In the context of music, AI technology has reached a point where it can generate, compose, and enhance musical content in ways that previously required human performers. AI music can take many forms and offer many kinds of assistance, from creating an entire song from start to finish to writing specific parts of a composition, mixing and mastering a production, cloning voices, and more. We’ll also list some specific AI music tools that can perform these tasks, capabilities that have opened a Pandora’s box of copyright issues.

    History

    In music, artificial intelligence traces its origins to the transcription problem: accurately recording a performance into musical notation as it is played. Père Engramelle’s scheme for a “piano roll,” a way of automatically recording note timing and duration so that it could easily be transcribed into proper musical notation by hand, was first implemented by the German engineers J. F. Unger and J. Hohlfeld in 1752.
    In 1957, the ILLIAC I (Illinois Automatic Computer) created the “Illiac Suite for String Quartet,” a completely computer-generated piece of music. The computer was programmed to perform this task by composer Lejaren Hiller and mathematician Leonard Isaacson. In 1960, Russian researcher Rudolf Zaripov published the world’s first paper on algorithmic music composition, using the Ural-1 computer.
    In 1965, inventor Ray Kurzweil developed software that could recognize musical patterns and synthesize new compositions from them. The computer first appeared on the quiz show I’ve Got a Secret.

    By 1983, Yamaha’s Kansei Music System had gained traction, and a paper on its development was published in 1989. The software used music processing and artificial intelligence techniques to essentially solve the transcription problem for simpler melodies, although higher-level melodies and musical complexities are still considered difficult deep learning problems today, and near-perfect transcription is still a subject of research.

    In 1997, an artificial intelligence program called Experiments in Musical Intelligence (EMI) outperformed a human composer at the task of composing a piece of music imitating the style of Bach. EMI later became the basis for a more sophisticated algorithm called Emily Howell, named after its creator.

    In 2002, a group of music researchers at the Sony Computer Science Laboratory in Paris, led by French composer and computer scientist François Pachet, developed the Continuator, an algorithm uniquely capable of resuming a composition after a live musician stopped playing.

    Emily Howell continued to advance musical AI, releasing its first album, From Darkness, Light, in 2009. Since then, many more AI-generated works have been published by various groups.
    In 2010, Iamus became the first AI to create a piece of original modern classical music in its own style: “Iamus’ Opus 1.” Housed at the University of Málaga in Spain, the computer can generate a completely original piece in a variety of musical styles. In August 2019, a large dataset of 12,197 MIDI songs, each with its own lyrics and melody, was created to investigate the feasibility of neural melody generation from song lyrics using a deep conditional LSTM-GAN method.

    With advances in generative AI, models have begun to emerge that can create complete musical compositions (including lyrics) from simple text descriptions. Two notable web applications in this area are Suno AI, which launched in December 2023, and Udio, which followed in April 2024.

    Software Applications

    ChucK

    Developed at Princeton University by Ge Wang and Perry Cook, ChucK is a text-based, cross-platform language. By extracting and classifying theoretical techniques it finds in musical pieces, the software is able to synthesize entirely new pieces based on the techniques it has learned. The technology is used by SLOrk (Stanford Laptop Orchestra) and PLOrk (Princeton Laptop Orchestra).

    Jukedeck

    Jukedeck was a website that allowed people to use artificial intelligence to create original, royalty-free music for use in videos. The team began developing the music-generating technology in 2010, formed a company around it in 2012, and launched the website publicly in 2015. The technology used was initially a rule-based algorithmic composition system, which was later replaced by artificial neural networks. The website has been used to generate over 1 million pieces of music, and brands that have used it have included Coca-Cola, Google, UKTV, and the Natural History Museum in London. In 2019, the company was acquired by ByteDance.

    MorpheuS

    MorpheuS is a research project by Dorien Herremans and Elaine Chew at Queen Mary University of London, funded by an EU Marie Skłodowska-Curie fellowship. The system uses an optimization approach based on a variable neighborhood search algorithm to morph existing template pieces into new pieces with a set level of tonal tension that changes dynamically throughout the piece. The approach integrates pattern-detection techniques to ensure long-term structure and recurring themes in the generated music. Pieces composed by MorpheuS have been performed in concerts both at Stanford and in London.
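
    To make the optimization idea concrete, here is a toy variable neighborhood search over a short melody, written as a minimal sketch: the tension measure, target profile, and neighborhood moves below are invented placeholders, not MorpheuS’s actual model or constraints.

        import random

        PITCHES = list(range(60, 73))            # MIDI notes C4 through C5
        TARGET = [0.1, 0.3, 0.6, 0.9, 0.5, 0.2]  # hypothetical tension profile to match

        def tension(melody):
            """Crude stand-in for a tension model: bigger melodic leaps read as more tense."""
            return [abs(b - a) / 12.0 for a, b in zip(melody, melody[1:])]

        def cost(melody):
            """Squared distance between the melody's tension curve and the target profile."""
            return sum((t - g) ** 2 for t, g in zip(tension(melody), TARGET))

        def neighbor(melody, k):
            """Re-randomize k notes; larger k means a larger 'neighborhood' of changes."""
            candidate = melody[:]
            for i in random.sample(range(len(candidate)), k):
                candidate[i] = random.choice(PITCHES)
            return candidate

        def variable_neighborhood_search(melody, max_k=3, iterations=2000):
            best, best_cost, k = melody, cost(melody), 1
            for _ in range(iterations):
                candidate = neighbor(best, k)
                if cost(candidate) < best_cost:
                    best, best_cost, k = candidate, cost(candidate), 1  # improvement: shrink the move size
                else:
                    k = k % max_k + 1                                   # stuck: widen the neighborhood
            return best

        print(variable_neighborhood_search([60] * 7))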

    AIVA

    Founded in February 2016 in Luxembourg, AIVA is a program that produces soundtracks for any type of media. The algorithms behind AIVA are based on deep learning architectures. AIVA has also been used to compose a rock track called On the Edge, as well as a pop tune called Love Sick, in collaboration with singer Taryn Southern for her 2018 album I am AI.

    Google Magenta

    Google’s Magenta team has published several AI music apps and white papers since the project launched in 2016. In 2017, the team released the NSynth algorithm and dataset, along with an open-source hardware instrument designed to make the algorithm easier for musicians to use. The instrument has been used by notable artists like Grimes and YACHT on their albums. In 2018, the team released a piano improvisation app called Piano Genie, later followed by Magenta Studio, a set of five MIDI plugins that let music producers elaborate on existing music in their DAW. In 2023, their machine learning team published a technical paper on GitHub describing MusicLM, a proprietary text-to-music generator they had developed.

    Riffusion

    Riffusion is a neural network developed by Seth Forsgren and Hayk Martiros that generates music using images of sound (spectrograms) rather than audio directly. It was created by fine-tuning Stable Diffusion, an existing open-source model for generating images from text prompts, on spectrogram images. The result is a model that uses text prompts to generate image files, which can then be put through an inverse Fourier transform and converted to audio files. Although these files are only a few seconds long, the model can also use the latent space between outputs to interpolate different files together, via a feature of the Stable Diffusion model known as img2img. The resulting music has been described as “de otro mundo” (otherworldly), though it is unlikely to replace human-made music. The model was released on December 15, 2022, and the code is freely available on GitHub. It is one of many models derived from Stable Diffusion, and it falls into the category of AI-based text-to-music generators. In December 2022, Mubert similarly used Stable Diffusion to turn descriptive text into music loops. In January 2023, Google published a paper on its own text-to-music generator, MusicLM.
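
    As a rough illustration of the spectrogram-to-audio step, the sketch below uses librosa’s Griffin-Lim implementation to recover a waveform from a magnitude spectrogram. The array shape, hop length, and synthetic “image” are placeholder assumptions, and the real Riffusion pipeline differs in its details (mel scaling, exact STFT parameters, and so on).

        import numpy as np
        import librosa
        import soundfile as sf

        def spectrogram_to_audio(magnitude, hop_length=512):
            """Recover a waveform from a magnitude spectrogram.

            Griffin-Lim iteratively estimates the phase information an image cannot
            carry, then applies the inverse short-time Fourier transform.
            """
            return librosa.griffinlim(magnitude, hop_length=hop_length)

        # Demo with a synthetic spectrogram standing in for a generated image:
        # one bright horizontal band, which inverts to a rough sustained tone.
        spec = np.zeros((513, 256), dtype=np.float32)  # (frequency bins, time frames)
        spec[40, :] = 1.0
        audio = spectrogram_to_audio(spec)
        sf.write("riff_sketch.wav", audio, 22050)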

    Spike AI

    Spike AI is an AI-powered audio plugin developed by Spike Stent in collaboration with his son Joshua Stent and friend Henry Ramsey that analyzes tracks and makes recommendations for clarity and other aspects during mixing. Communication is carried out via a chatbot trained on Spike Stent’s personal data. The plugin integrates into a digital audio workstation.

    Music Applications

    Artificial intelligence has the potential to influence the way producers create music by generating track iterations based on prompts given by the creator. These prompts allow the AI to follow a specific style that the artist is trying to achieve.

    AI has also been applied to music analysis, including feature extraction, pattern recognition, and music recommendation.

    Composition

    Artificial intelligence has had a major impact on composition: it has influenced the ideas of composers and producers, and it has the potential to make the industry more accessible to newcomers. Artists already use this software in collaboration with producers to help generate ideas and identify musical styles, prompting the AI to follow specific requirements that suit their needs. Future impacts on composition include the emulation and fusion of styles, as well as revision and refinement. Producers have used software like ChatGPT for these tasks, while tools like iZotope’s Ozone 11 automate time-consuming and complex tasks such as mastering.

    Risks and Harm

    Musicians, producers, and others have been using non-generative AI tools for years. Cher popularized Auto-Tune with “Believe” more than a quarter century ago, and countless artists have since used it to “correct” their pitch. Record labels use AI to scan social media for unlicensed uses of songs they own, and Shazam works in much the same way when it recognizes audio. Engineers use it to streamline the mixing and mastering process. More recently, Get Back director Peter Jackson used the technology to isolate individual parts from mixed recordings, reconstructing studio conversations and helping to complete a lost Beatles song.

    But there’s a key difference between these ancillary tools and generative AI apps like Suno and Udio, which can create entire songs from just a few words. Each new music AI works a little differently and continues to evolve, but they generally operate in a similar way to other generative AI tools: they analyze a huge data set and use the patterns found in it to make probabilistic predictions.

    To do this for audio, developers collect a huge collection of songs (through agreements with license holders and/or by scraping publicly available data without permission) and their associated metadata (artists and song titles, genres, years, descriptions, annotations, anything relevant and available). All of this is usually made possible by low-paid workers in the Global South who annotate this data on a gigantic scale.

    The developers then feed this data set to a machine learning model, which is (in short) a vast network of connections, each assigned a numerical “weight.” Humans then “train” the model, teaching it to find patterns in the data set and providing feedback by scoring its predictions. Based on those patterns, the model can take a short piece of audio or a text prompt and predict what should come next, then what comes after that, and so on.
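
    To anchor the idea of probabilistic prediction, here is a deliberately tiny stand-in: instead of a neural network with millions of weighted connections, it just counts which note tends to follow which in a hypothetical corpus and samples continuations from those counts.

        import random
        from collections import Counter, defaultdict

        def train(sequences):
            """Count how often each note follows each other note in the data set."""
            transitions = defaultdict(Counter)
            for seq in sequences:
                for current, following in zip(seq, seq[1:]):
                    transitions[current][following] += 1
            return transitions

        def generate(transitions, seed, length=16):
            """Extend a short seed by repeatedly sampling a probable next note."""
            melody = list(seed)
            for _ in range(length):
                counts = transitions.get(melody[-1])
                if not counts:
                    break
                notes, weights = zip(*counts.items())
                melody.append(random.choices(notes, weights=weights)[0])
            return melody

        # Hypothetical training data: two tiny note sequences.
        corpus = [["C", "E", "G", "E", "C"], ["C", "E", "G", "A", "G", "E"]]
        model = train(corpus)
        print(generate(model, ["C", "E"]))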

    Developers tweak the weights to coax more listenable and predictable results from the same inputs. AI-powered music generators combine two strands of technology: the musical tools that professionals have been using in studios for decades, and the large language models that allow everyday users to harness their power. Any AI music generator is only as good as the data it’s trained on. These systems require vast amounts of data, and a model trained on a biased dataset will reproduce those biases in its output. Whose voices are included in this huge crate of music, and whose are left out? Today’s AI models tend to exclude huge swaths of music, especially from musical traditions that predate recording technology or originate outside the West. As currently designed, they’re more likely to produce stereotypical sounds within a genre or style than anything unusual, let alone innovative or interesting. Generative AI systems are prone to mediocrity, but transcendent music is found at the margins.

    “What will be lost in human creativity and diversity if musicians start relying on predictive models trained on selective data sets that exclude most of the world’s cultures and languages?” Lauren M.E. Goodlad, chair of Rutgers University’s Critical AI initiative, told me.

    From a legal perspective, musicians watching AI models learn from their work have the same concerns as the New York Times, Getty, and other publishers and creators who are suing AI companies: the provenance of the data. While some companies are careful to train their models only on licensed data, others use whatever they can get their hands on, arguing that anything in the public domain falls under fair use for this purpose. The RIAA, the dominant music trade body in the US, is now suing Suno and Udio for “copyright infringement… on a massive scale.” (Disclosure: Vox Media is one of several publishers that has signed partnership deals with OpenAI. Our reporting remains editorially independent.)

    Polls often show that most people disapprove of AI companies copying public data without permission. But while a number of high-profile lawsuits are on the table, it’s not yet clear whether the legal system will rein in companies mining all that human creativity without permission, let alone make them compensate the people who produced it. If these practices aren’t curbed soon, the least scrupulous players will quickly gain power, along with the fancy lobbyists and lawyers that come with it. (Callousness: It’s not just for machines!) These issues are pressing now because they only become harder to solve over time, and some in the field are pushing back. Ed Newton-Rex was vice president of audio at Stability AI when it launched Stable Audio, an AI-powered music and sound generator, last fall.

    He left the company just a couple of months later over its stance on data collection: Newton-Rex’s team had trained Stable Audio only on licensed data, but the company’s leadership filed a public comment with the U.S. Copyright Office arguing that AI development was “an acceptable, transformative, and socially beneficial use of existing content protected by fair use.” To combat unlicensed scraping, Newton-Rex founded Fairly Trained, which verifies and certifies the datasets used by AI companies. For now, the nonprofit can only certify whether the content in a company’s dataset has been properly licensed. Someday, it may be able to take into account finer details (such as whether an artist explicitly consented to such use or simply did not opt out) and other issues like mitigating bias.

    As a musician and composer of choral and piano music, he sees this as a turning point for the field. “Generative AI models are usually competing with their training data,” Newton-Rex said. “Honestly, people only have a limited amount of time to listen to music. There’s a limited pool of royalties. And so the more music that’s created through these systems, the less that goes to human musicians.”

    As FTC Chairwoman Lina Khan noted last month, if a person creates content or information that an AI company copies, and then the content or information produced by the AI generator competes with the original producer “in order to drive it out of the market and divert business… that could be an unfair method of competition” that violates antitrust laws.
    Marc Ribot is one of more than 200 musicians who signed an Artist Rights Alliance statement opposing the practice earlier this year, and he’s an active member of the Music Workers Alliance’s AI steering committee. A practicing guitarist since the 1970s, Ribot has seen how technology has shaped the industry, watching recording budgets steadily shrink for decades.

    “I’m not against the technology itself in any way, shape or form,” Ribot says. Having lost the master recordings he made in the ’90s, he himself used AI to isolate individual tracks from the final mix. But he sees the current moment as a critical opportunity to push back against the technology before the firms that own it become too big to regulate it.
    “The real dividing line between useful and disastrous is very simple,” Ribot said. “It’s all about whether the producers of the music or whatever else is being input [as training data] have a real, functional right of consent. [AI music generators] spit out what they consume, and often they produce things with large chunks of copyrighted material in them. That’s the output. But even if they didn’t, even if the output isn’t infringing, the input itself is infringing.”

    Ribot said musicians have long been indifferent to AI, but in the past few years he’s seen a “seismic shift in attitudes toward issues of digital exploitation,” fueled by last year’s SAG-AFTRA and Writers Guild of America strikes, ongoing lawsuits against AI companies, and a greater understanding of surveillance capitalism and civil liberties.

    While musicians may have seen each other as competitors just a few years ago — even if the pie is getting smaller, there are still a few artists who can get rich — AI poses a threat to the entire industry that may not benefit even the luckiest of them.

    What AI Can and Could Do

    One of the first examples of music created by artificial intelligence dates back to 1956: a piece for string quartet composed by the ILLIAC I computer and programmed by University of Illinois at Urbana-Champaign professors LeJaren Hiller and Leonard Isaacson.

    Following the technological leaps of recent years, artists like Holly Herndon, Arca, YACHT, Taryn Southern, and Brian Eno are now using generative AI to experiment with their creative practices. AI’s tendency to produce “hallucinations” and other nonsensical results, while dangerous in other contexts, could be a source of inspiration in music. Just as other audio technologies have come to be defined by their dissonance—CD distortion, 8-bit compression, the cracked human voice too powerful for the throat that emits it, “events too important for the medium intended to record them,” as Brian Eno writes in A Year with Swollen Appendices—AI-generated music may be most valuable when it’s most distinct. Ivan Paz, a musician with a PhD in computer science, is developing AI systems for his own live performances.

    Starting with a blank screen, he writes code in real time (displayed for the audience to read) and trains the model by responding to the sounds it makes, which can be unexpected, jarring, or just plain catastrophic. The result is a bit like playing an instrument, but also like improvising with another musician. “If your algorithm is operating at a very low level, then you feel like you’re playing a musical instrument because you’re actually tweaking, for example, the parameters of the synthesis,” Paz said. “But if the algorithm is determining the shape of a piece of music, then it’s like playing with an agent that’s determining what happens next.”

    For an exhibition at the Centre for Contemporary Culture in Barcelona earlier this year, Paz worked with singer Maria Arnal to create a timbre-rendering model for her voice. They asked visitors to sing short snippets of songs; the model then mixed those voices with Arnal’s to create a new singing voice. In another project, Paz’s colleague Shelley Knotts trained a model on her own compositions to avoid repetition in her work: it analyzes her music to detect patterns, but instead of suggesting her most likely next move, it suggests a less likely continuation.
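
    A minimal sketch of that “less likely continuation” idea, under my own assumptions about how such a suggestion could be made: weight each possible next note by the inverse of how often the model has seen it, so rare moves are favored over familiar ones. The transition counts below are hypothetical.

        import random

        # Hypothetical transition counts learned from a composer's past work.
        transitions = {"C": {"E": 8, "G": 3, "B": 1}, "E": {"G": 6, "C": 2, "F#": 1}}

        def unlikely_next(note):
            """Suggest a continuation the model considers improbable rather than probable."""
            counts = transitions[note]
            total = sum(counts.values())
            candidates = list(counts)
            weights = [total / counts[c] for c in candidates]  # rare moves get the heaviest weight
            return random.choices(candidates, weights=weights)[0]

        print(unlikely_next("C"))  # usually lands on the rarely seen "B" rather than the familiar "E"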

    The next step in AI’s musical evolution may come down to processing speed. Live coding is possible with some types of models, but others take too long to render the music to create it in a live show. Electronic instruments like synthesizers were originally designed to imitate acoustic sounds and have developed their own unique character over time. Paz sees the ultimate potential of generative AI as creating new sounds that we can’t currently imagine, let alone produce. In this context — in which AI assists a performer — AI is no more likely to “replace” a musician than a digital tuner or delay pedal.

    However, other corners of the music industry are adopting AI for more disruptive purposes. While AI may not (and perhaps never will) create music better than a human can, it can now create acceptable music at a much faster speed and on a larger scale — and “acceptable” is often the only bar a track has to clear.

    Most of the time, when you hear music, you don’t know who created it. The jingle you hear in an ad. The ambient score in a movie or TV show, podcast or video game. The loops a hip-hop producer samples into a beat. This is the part of the industry most likely to be upended by generative AI. Bloomberg reports that teachers are using Suno to create music teaching aids. Gizmodo notes that the target audience for Adobe’s Project Music GenAI Control, another AI-powered music generator, is people who want to make background music quickly and cheaply, like podcasters and YouTubers, with the ability to specify the mood, tone, and length of a track.
    Whether you like it or even notice it, these types of music have historically been created by humans. But automated AI music generation could cost these musicians their jobs — and many of them use that income to support their more creatively satisfying, but less financially viable, pursuits. You may never see an AI musician on stage, but you’ll still probably see fewer human musicians because of the technology.

    For their part, influential players in the music industry already believe that AI will become a mainstay of their business — they’re worried about who will reap the benefits. Spotify won’t restrict AI-generated music unless it’s outright imitation, which risks litigation. Universal Music Group (UMG) and YouTube have launched the YouTube Music AI Incubator to develop AI tools with UMG artists. Meanwhile, UMG is also one of more than 150 organizations — including ASCAP, BMI, RIAA, and the AFL-CIO — in the Human Artistry Campaign coalition, which seeks to establish ethical frameworks for the use of AI in creative fields. They don’t want to ban the technology, but they do want a stake in the results.

    With more than 100,000 new tracks uploaded to streaming services every day, digital streaming platforms have a strong incentive to reduce the share of royalty-bearing, human-made tracks their users play. Spotify alone paid out $9 billion in royalties last year, the bulk of its $14 billion in revenue. The world’s largest music streaming company has historically increased the availability and visibility of royalty-free tracks, and it may continue to do so. AI-powered music generators are an easy way to create royalty-free music that could displace real, royalty-earning artists from popular playlists, shifting streaming revenue away from artists and toward the platform itself.

    There’s a new power — and a new danger — for established artists. After a stroke, country star Randy Travis has trouble speaking, let alone singing, but with the help of AI trained on his existing catalog, he can reproduce his vocals digitally.

    Meanwhile, an anonymous producer can create a believable-sounding Drake/The Weeknd collaboration and rack up millions of streams. In May, producer Metro Boomin entered the fray of Drake’s very real beef with Kendrick Lamar: he released a beat built on AI-generated samples for anyone to use, which Drake then sampled and rapped over, releasing the new track to streaming services. King Willonius, who used Udio to create the original track that Metro Boomin remixed, hired a lawyer to secure the rights to his contributions.
    These latest examples show how music made quickly can crowd out music made well. In the streaming economy, volume and speed are everything: artists are incentivized to produce quantity, not quality.

    “[A future AI-generated hit] won’t be something that people go back and study the way they continue to do with the great releases of the record era,” said musician Jamie Brooks. Brooks has released records under her own name and with the bands Elite Gymnastics and Default Genders, and blogs about the music industry in her newsletter The Seat of Loss. “But it still generates engagement, and so a world where whatever’s at the top of the Spotify charts isn’t meant to last, that’s just meant to be entertaining that day and never thought about again, would be a good thing for all these companies. They don’t need it to be art to make money.”

    So much of today’s technology exists primarily to imitate or simplify, which can foster amateurism. File sharing has made compulsive record collecting accessible to anyone with a hard drive and a modem, cell phone cameras have allowed everyone in the crowd to document the show, and now streaming audio gives us all dynamic playlists tailored to our moods and advertising cohorts. Generative AI could make music creation easier for non-experts, too. This could radically change not just how much music we hear, but our relationship to the form as a whole. If creating a hit song requires no more effort than writing a viral tweet, much of the creative energy currently contained in social media could be redirected toward generating music based on prompts.

    Brooks sees it as a regressive phenomenon, emphasizing the immediate over lasting depth: charts topped by audio memes and novelty singles aimed at the least demanding listeners, just as the airwaves were once dominated by frothy songs like “Take Me Out to the Ball Game,” written by two people who had never been to a baseball game.

    “That’s the direction these services are going to push music,” Brooks said. “It’s not going to be about creativity at all. Between the way these models work and the algorithmic feeds, it’s all just a big repository of the past. It’s not going to move records forward in sound. It’s going to accelerate records from the center of American pop culture to the trash can.”

    Copyright and AI Music

    One of the most debated issues surrounding AI in the music industry concerns who makes money from AI-generated work, especially when the algorithm is trained on existing copyrighted material. In March 2023, the U.S. Copyright Office launched an initiative to investigate AI-related copyright issues. Camp is confident that regulators will eventually step in with a fix, but they worry the issue is difficult to solve because of the U.S. copyright system that artists operate under.

    “A number of the laws and precedents that ultimately led to our modern copyright system just don’t fit with what’s going on in music right now,” Camp says. “I do believe that creators should have authorship, should be credited, and should be compensated. But again, the entire system through which we do that is very outdated.”

    AI music is still in a legal gray area, raising the question of whether a compromise is possible in which artists are credited, compensated, and able to consent to the use of their work or likeness by AI, without limiting the potential for musical creativity using AI technology. To some extent, all art is derivative of other art, and the line between inspiration and theft is currently blurred. Some record labels are starting to fight back.

    In May 2023, Universal Music Group called on streaming services to block AI companies from using their artists’ music to train their algorithms, saying it would take legal action if necessary. Spotify responded by removing tens of thousands of AI-generated songs uploaded through the AI music service Boomy, about 7 percent of that service’s tracks. In July 2023, UMG called on Congress to enact a nationwide policy to protect creators from AI-powered copyright infringement. The record label is one of 40 members of the Human Artistry Campaign, an organization advocating for the responsible use of AI.

    In the United States, the current legal framework tends to apply traditional copyright law to AI, despite its differences from the human creative process. However, musical works created solely by AI are not protected by copyright. In its Compendium of U.S. Copyright Office Practices, the Copyright Office has stated that it will not grant copyright to “works that lack human authorship” and that “the Office will not register works created by a machine or by a mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.” In February 2022, the Copyright Review Board rejected a copyright application for an AI-generated work of art on the grounds that it “lacked the requisite human authorship necessary to sustain a copyright claim.”

    The situation in the European Union (EU) is similar to that in the US, as its legal framework also emphasizes the role of human involvement in copyright-protected works. According to the European Union Intellectual Property Office and recent case law of the Court of Justice of the European Union, the originality criterion requires that a work be the author’s own intellectual creation, reflecting the author’s personality through the creative choices made during its creation, which implies a certain level of human involvement. The reCreating Europe project, funded by the European Union’s Horizon 2020 research and innovation programme, examines the challenges posed by AI-generated content, including music, aiming to provide legal certainty and balanced protection that encourages innovation while respecting copyright rules. Against that backdrop, AIVA’s recognition by the authors’ rights society SACEM, which registered the system as a composer, marks a significant departure from traditional views on authorship and copyright in musical composition, allowing an AI system to release music and earn royalties. This recognition makes AIVA a pioneer in the formal recognition of AI in music production.

    Recent advances in artificial intelligence by groups like Stability AI, OpenAI, and Google have led to a wave of copyright infringement lawsuits against generative technologies, including AI music. If these lawsuits succeed, the datasets behind the machine learning models that power these technologies may be limited to works in the public domain.

    Drake and The Weeknd

    While there isn’t much legal precedent for voice cloning, for celebrities it can fall under their right of publicity as a violation of their image, name, and voice. One key example from last year was when a TikToker going by the name Ghostwriter used AI to create a fake duet between Drake and The Weeknd called “Heart on My Sleeve.” The song has since been taken down, but versions are still floating around the internet.

    “On one hand, you could argue that it’s an original work,” says Wares. “On the other hand, it could be seen as a form of infringement, since the AI learned to write lyrics in Drake’s style by analyzing his catalog, without his express permission. Another concern is the unauthorized use of artists’ names and likenesses.”

    The ability to copy someone’s name and likeness using AI is troubling the music industry, as well as the entertainment industry as a whole. One of the main demands of the current SAG-AFTRA strike is to protect creators from having their work used to train AI generators, and actors from having their likenesses and voices copied without consent.

    Ethical Issues with AI

    Copyright is just one of many ethical issues surrounding AI, and it’s important to remember that this technology and its development are not without consequences.

    One immediate concern is bias in the training dataset. An example is the virtual rapper FN Meka, who signed with Capitol Music Group in 2022 but was dropped by the label shortly afterward amid criticism that the character perpetuated racial stereotypes.

    “One of the big issues is garbage in and garbage out,” says Camp. “If we’re training these language models, or these image generators, or these music generators on data that’s inherently biased, inherently racist, then everything we’re asking is going to perpetuate those stereotypes. We need to make sure that we have good data going in and that we’re monitoring it.”

    Monitoring that data isn’t without its harms, either. Another ethical concern involves part of the training process known as reinforcement learning from human feedback, which relies on people reviewing and rating a range of disturbing content. A recent episode of the Wall Street Journal podcast The Journal features a Kenyan data worker who, among many others, helped train ChatGPT to distinguish “right from wrong” at a steep cost to their mental health.

    “Basically, it’s giving a thumbs up or a thumbs down on responses,” says Camp. “Is this an inappropriate response? Is it too violent or graphic or disturbing? OpenAI contracted out that work to people in Kenya, paying them $2 an hour to read those responses. So imagine being paid $2 an hour to show up to work and read some of the most horrific, psychologically disturbing text, and you do that for 10 hours, and then you go home and it’s all swirling around in your head. So there are a lot of flaws in the way sausage is made right now.”

    Music Deepfakes

    A more nascent development of AI in music is the use of audio deepfakes, which take the lyrics or musical style of an existing song and make them resemble the voice or style of another artist. This has raised many concerns about the legality of the technology, as well as the ethics of its use, especially in the context of artistic identity. It has also raised the question of who is credited for these works. Since AI cannot hold authorship of its own, current speculation suggests there will be no clear answer until further rulings are made about machine learning technologies in general. The most recent preventive measures have begun to be developed by Google and Universal Music Group, which have entered talks on royalties and credit attribution for producers who use AI to copy artists’ voices and styles.

    “Heart on My Sleeve”

    In 2023, an artist known as ghostwriter977 created a musical deepfake called “Heart on My Sleeve” that cloned the voices of Drake and The Weeknd. Vocal tracks from each artist were fed into a deep learning algorithm, creating artificial models of their voices that could then be applied to reference vocals carrying the original lyrics. The track went viral on TikTok and received a positive response from audiences, leading to its official release on Apple Music, Spotify, and YouTube in April 2023; it was later submitted for Grammy consideration for Best Rap Song and Song of the Year. Many believed the track was written entirely by AI software, but the producer claimed that the songwriting, production, and original vocals (before the conversion) were still his own work. The song was later removed from Grammy consideration because it did not meet the eligibility requirements, and the track was taken down from all music platforms by Universal Music Group. The song was a turning point for voice cloning using artificial intelligence, and since then, voice models have been created for hundreds, if not thousands, of popular singers and rappers.

    “Where That Came From”

    In 2013, country singer Randy Travis suffered a stroke that left him unable to sing; vocalist James Dupré toured on his behalf, performing his songs. Travis and longtime producer Kyle Lehning released a new song in May 2024 called “Where That Came From,” Travis’s first new music since his stroke. The recording uses artificial intelligence to recreate Travis’s singing voice, built from more than 40 of his existing vocal recordings along with new recordings by Dupré.

    AI Musical Tools

    Now that we’ve covered what AI is, as well as some of its major downsides, we can discuss the AI musical tools that exist. At Berklee Onsite 2023, an annual music conference held on the campus of Berklee College of Music in Boston, Wares shared a few AI musical tools to know about; some you can start learning right now, and some you might just want to learn about.

    BandLab SongStarter

    BandLab’s SongStarter is an AI-powered song generator that lets you choose a genre, enter song lyrics (and even emoji), and generate ideas for free. You can then take those ideas into BandLab’s studio feature to make them your own. It’s a great way to get started on a song if you need some initial inspiration.

    Midjourney

    As one of the most popular AI-powered image generators, Midjourney can be used to create album art, song covers, posters, Spotify loops, merch images, and more. What sets it apart from some other AI-powered image generators is its surreal, dream-like style, which may be better suited for music projects. The program is easy to use, but there is a definite learning curve. Like many new tech programs, be sure to watch a few tutorials before diving in.

    Mix Monolith

    The Mix Monolith plugin is an automatic mixing system from AYAIC designed to balance the levels of your mix. In a Mix Online article, the developer says, “Its purpose is not to automatically create a finished mix, but to establish fundamental gain relationships between tracks and ensure proper gain adjustments.”
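
    As a generic illustration of what “establishing gain relationships” can mean, the sketch below measures each track’s RMS level and computes the gain needed to bring it to a chosen reference level; this is my own simplification, not how the AYAIC plugin works internally.

        import numpy as np

        def rms_dbfs(samples):
            """RMS level of a signal in dB relative to full scale."""
            rms = np.sqrt(np.mean(np.square(samples)))
            return 20 * np.log10(max(rms, 1e-9))

        def gain_to_target(samples, target_dbfs=-18.0):
            """Gain in dB that moves the track's RMS level to the target."""
            return target_dbfs - rms_dbfs(samples)

        # Hypothetical tracks: a loud drum bus and a quiet vocal, one second at 44.1 kHz.
        rng = np.random.default_rng(0)
        tracks = {"drums": 0.5 * rng.standard_normal(44100), "vocal": 0.05 * rng.standard_normal(44100)}
        for name, audio in tracks.items():
            print(f"{name}: apply {gain_to_target(audio):+.1f} dB")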

    LANDR AI Mastering

    LANDR’s AI mastering tool lets you drag and drop your track into the program, which then analyzes it and offers simple options for style and loudness. Once you select these two options, the program masters your track and presents further options for file type and distribution. LANDR boasts that over 20 million tracks have been mastered with its engine.
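
    To show one piece of what an automated mastering service does under the hood, here is a minimal sketch that normalizes a finished mix to a streaming-style integrated loudness target, assuming the soundfile and pyloudnorm packages and a placeholder file name; a real mastering chain also involves EQ, compression, and limiting.

        import soundfile as sf
        import pyloudnorm as pyln

        data, rate = sf.read("mix.wav")              # placeholder path to a finished mix
        meter = pyln.Meter(rate)                     # ITU-R BS.1770 loudness meter
        loudness = meter.integrated_loudness(data)   # measure the mix's current loudness
        normalized = pyln.normalize.loudness(data, loudness, -14.0)  # aim near a common streaming target
        sf.write("mix_normalized.wav", normalized, rate)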

    AIVA

    AIVA is an artificial intelligence program trained on more than 30,000 iconic scores from history. You can choose from several preset styles of music, from modern cinematic to twentieth-century cinematic, tango to jazz. You then have the option to set the key signature, time signature, tempo, instrumentation, duration, and more; if you don’t know what to enter, AIVA will choose for you. Finally, you can generate a track, customize the instrumentation, and download it in a variety of file types. As a subscriber, you have a full copyright license for everything you create.

    ChatGPT for Musicians

    One of the most widely used AI tools, OpenAI’s ChatGPT has a variety of uses for musicians. The company is currently under investigation by the Federal Trade Commission, so you should take precautions about what information you share with ChatGPT, as well as verify any facts you receive from ChatGPT.

    With that in mind, the program does have the potential to reduce the time you spend on tasks that take you away from actually making music. Wares and Camp have been experimenting with ChatGPT since its release and have some specific tips that musicians and music professionals might find useful.

    Social Media Strategy

    Social media can be a huge time sink for an amateur musician, and ChatGPT can help ease the load. Wares says you can start by telling ChatGPT what kind of artist you are, what genre of music you play, and what your hobbies and interests are. Then you can request 30 pieces of content for the next 30 days across TikTok, Instagram, Facebook, or whatever social media platform you use. Not only can you request social media content ideas, but you can also ask ChatGPT to create optimized captions and hashtags.
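
    For musicians who would rather script this than type it into the chat window each month, a minimal sketch using OpenAI’s Python client might look like the following; the model name, artist description, and prompt wording are illustrative assumptions, not recommendations.

        from openai import OpenAI

        client = OpenAI()  # reads the OPENAI_API_KEY environment variable

        prompt = (
            "I'm an independent synth-pop artist who also loves hiking and film photography. "
            "Suggest 30 pieces of social media content, one per day for the next 30 days, "
            "for TikTok and Instagram, each with an optimized caption and hashtags."
        )

        response = client.chat.completions.create(
            model="gpt-4o-mini",  # example model name; use whichever model you have access to
            messages=[{"role": "user", "content": prompt}],
        )
        print(response.choices[0].message.content)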

    Technical Riders for Touring

    When going on tour, musicians will typically hire someone to create a technical rider that outlines all the details needed to pull off their show. This could include equipment, stage setup, sound engineering, lighting, hospitality, gig contracts, tour itineraries, venue options, ticket prices, and more. Wares says ChatGPT could be the one to write that tech rider, and he recently worked with a band to plan their tour using the technology.

    “We started by creating their tech rider, which included backline requirements, a detailed list of inputs, and even specific microphone recommendations, all based on a few simple prompts,” Wares says. “We then asked for recommendations on a tour itinerary in the Northeast, how much we should charge for tickets, and merch ideas based on the unique interests and demographics of the band’s fan base. What would have taken days was done in under an hour.”

    Writing Song Lyrics

    If you need help writing song lyrics, need inspiration, or want to use some word suggestions, ChatGPT can be a useful songwriting tool. Camp gives the example of working with former Berklee student Julia Perry (who interviewed them for a Berklee Now article about AI and music) to generate song ideas using ChatGPT.

    “We were talking about how the universe is magic, and how she wanted to express this deep, unknowable truth about the universe,” Camp says. “And I basically condensed everything she said into two or three paragraphs and said [ChatGPT], give me 20 opening lines for this song.”

    They ended up using one of the 20 options as a starting point for a new song.

    Content Writing

    ChatGPT can help with a variety of content writing and copywriting tasks, whether it’s a press release, an artist bio at multiple character lengths, an album release strategy, a blog post, website copy, emails, and more.

    Agreements and Contracts

    In an ideal world, you’d have a lawyer write and review all of your agreements and contracts, but that’s not always realistic or affordable. In some cases, you might want to have ChatGPT draft an agreement rather than have nothing at all. This can be used for management agreements, band agreements, split sheets, performance agreements, and more. But again, an entertainment lawyer is always preferable when possible.

    Where are the people?

    The current state of AI generative music is more mix-and-match than true generation. It’s not really a tribute band, but rather an expansive approach to revival. It can only produce sounds from what’s in the training data, and while it can combine, mix, and refract those elements in new ways, it can’t truly experiment beyond that.

    Musicians will tell you that there are only a limited number of notes that can be played, or that all sounds are just a matter of frequency and wavelength, and therefore there’s only a limited amount of what can be done in purely musical terms. But there’s more to music than just arranging some chords or rhythms, just as there’s more to creating recipes than just choosing from a finite list of ingredients and techniques.

    Ribot is a guitarist known for his experimentation and his ability to draw from disparate influences and mix them into something new. At first glance, this sounds a lot like the value proposition put forward by proponents of generative AI, but he says there are fundamental differences between a human and a machine doing the same thing.

    “I can’t get through a 12-bar blues solo without quoting someone,” Ribot said. “That has to remain a privilege reserved for humans. I’m pretty good at knowing when I’m crossing the line. I know I can quote this part of a Charlie Parker song without it being a Charlie Parker song, and I know I can screw it up this badly and it’ll be cool.”
    Ribot’s 1990 album Rootless Cosmopolitans includes a cover of Jimi Hendrix’s “The Wind Cries Mary.” In an homage to Hendrix, Ribot’s version is abstract, the lyrics barked over a scratchy guitar, bearing little resemblance to the original song other than the guitar tone, omitting Hendrix’s melody, chords, and rhythm. Still, Ribot listed it as a cover on the album and pays a mechanical royalty on every sale or stream.
    “This system needs to be preserved and it’s worth fighting for,” Ribot said. “We don’t get a minimum wage when we’re making a record. We have no guarantees even when we’re performing. [Copyright] is literally the only economic right we have.”

    Ribot’s discursive practice is part of a long tradition: music as a medium is defined by an awareness of and respect for what came before, and by what can still grow and change rather than simply be recycled. “What drives change in music is changes in people’s moods, their needs and possibilities, and what they love and what pisses them off. People can learn to take feelings, events, and the fullness of their lives and represent them on their guitar or piano. The field expands as experience expands, as history lengthens, and as new feelings and ideas emerge that need expression.”

    Historically, there has been a sacred contract between musicians and audiences that implies authenticity and humanity. Of the millions of Taylor Swift fans who attended the Eras Tour, many could give you a detailed account of her personal life. The same goes for the audiences of Beyoncé, Harry Styles, Elton John, or any of the biggest touring artists. You need a real person to sell out stadiums. No one would even watch The Masked Singer if they didn’t think they’d recognize the performers when they were unmasked.

    When we listen to music intentionally, we’re often listening hermeneutically, as if the song were a doorway into a larger space of understanding other people’s experiences and perspectives. Consider Nirvana. Nevermind found a huge audience partly because grunge’s aesthetic deviance met modern studio technology at just the right moment, but also because Kurt Cobain’s personal arc—the meteoric rise and tragic early death of an anxious suburban kid who became a rock superstar by openly challenging (some) pop-star conventions—resonated with people.

    While the band acknowledged the musicians who inspired them—the Pixies, the Gap Band, and others—Nirvana’s records are ultimately the unique product of the choices made by Cobain, his bandmates, and their collaborators, an expression and reflection of their experiences and ideals. Art, by definition, is the product of human decision-making.

    Some AI-generated music, like other forms of musical process, still retains that human element: even when artists like Ivan Paz and Shelley Knotts rely heavily on automated models, they create the system, make countless decisions about how it works, and decide what to do with any sounds it produces.
    But the AI music that threatens human musicians, which takes little more than a few words and produces entire songs from them, is inherently limited because it can only look inward and backward in time from its data, never outward and thus never forward. The guitar was invented centuries ago, but an AI model trained on music before Sister Rosetta Tharpe’s heyday in the 1940s is unlikely to produce anything resembling an electric guitar. Hip-hop is a style of music based on sampling and repackaging other artists’ work (sometimes in forms or contexts that the original artist doesn’t like), but a model trained on music before 1973 won’t be able to create anything like that.

    There are countless reasons why people listen to music, but there are just as many reasons why people make it. People have been making sounds for each other for thousands of years, and for most of that time it would have been foolish to imagine making a living from it—it would have been impossible to even think about amplifying it, let alone recording it. People made music anyway.

    There’s a tension here that predates AI. On the one hand, record labels and digital streaming platforms believe, largely correctly, that the music market wants recognition above all else, so much of the money comes from sales of established artists’ catalogs, with one report suggesting that those sales accounted for 70 percent of the U.S. music market in 2021. The chart-toppers sound increasingly similar. Streaming platform algorithms often feed the same songs over and over again.

    On the other hand, there’s an intrinsic human need for surprise, innovation, transgression. It’s different for each person. The goals of a huge corporation—its scale and oversight, basically—are different from those of its users as a whole and for the individual, and the larger its user base becomes, the more it will tend to automate. Neither AI music generators nor dynamically generated playlists nor any other algorithmically predictive system are inherently good or bad: the results depend entirely on who’s running them and for what purpose.

    But whatever happens, no company will ever have a monopoly on music. No species does. Birds do it. Bees do it. Whales in the sea do it. Some of it, to the human ear, is quite beautiful. But even with all that natural melody, all the music humans have already created, and all the music that AI will either help create or create itself, the human urge to create and express ourselves persists. Music exists in our world for reasons other than commercialism.

    More often than not, the reason is quite simple: a person or group of people decided it should exist, and then made it so. It will continue to exist, no matter how much sonic sludge the machines pump out.

    Embrace or Resist?

    One of the recurring themes when it comes to AI and other emerging technologies is that they will be a big part of the music industry (and most industries) in the future, and that ignoring them will not help the industry’s future leaders.

    “I think AI can help my students be more productive and support their creative process, and allow them to focus on what matters most to them, which is creating and performing music or exploring new business ideas,” says Wares. “However, as a responsible educator, I have to make sure my students don’t become too dependent on these tools, and I’m constantly looking for ways to use AI to help develop their critical thinking skills.”

    Camp agrees, and also encourages people to do what they’re comfortable with as AI continues to evolve.

    “I certainly encourage you, if you want to stay current and use technology to advance what you’re on the planet for, then yes, join in,” says Camp. “But like I said, I have friends who use landlines. I have friends who prefer to buy vinyl records. AI is here. It has a huge impact. You don’t have to use it, but a lot of people choose to.”

    AI at Berklee Online

    Recently, Berklee Online launched an initiative called ARIA: AI-enhanced Realities & Immersive Applications. The project is led by Gabriel Raifer Cohen, associate director of support and audio technology at Berklee Online and a Berklee College of Music alumnus.

    “Like calculators, computers, the internet, and search engines before it, GenAI is here to stay,” says Raifer Cohen. “Ignoring the reality that all of these tools are readily available is a disservice to students. . . . Teaching students how to best—and responsibly—use these technologies as tools of empowerment may be a more worthwhile endeavor than trying to fight them.”

    And just because AI will play a major role in the future of the music industry doesn’t mean we can’t criticize this new technology or advocate for safety measures. “At the same time, we must resist the spread of mediocrity and creative insensitivity fueled by the mindless use of GenAI, while remaining ethically aware and proactive,” he says. “There’s nothing easy about this, but we must consider that developments in AI also open up opportunities for potentially transformative educational experiences.” Raifer Cohen says that as part of the ARIA initiative, Berklee Online will continue to explore these new tools, and only after they’ve been tested and thoroughly studied will the school consider implementing them in the classroom. “Ultimately, we must not forget that for students and teachers, viewers and creators, all these powerful tools are just that: tools,” Raifer Cohen says.
