Can a computer generate realistic music?

26 April 2023

by Jacopo de Berardinis & Max Tiel

Many people would probably answer this question with a clear “no”. Is it because they really believe computers cannot? Or is it an inner voice that refuses to believe that musical creativity is not exclusive to humans? Meanwhile, AI systems such as DALL-E and ChatGPT have already demonstrated unprecedented performance in generating images and textual artefacts, and are becoming ubiquitous in our society. What about AI-generated music?

The automatic composition of music dates back to the ancient Greeks, and persisted up to and beyond Mozart’s era with the famous “Dice Game”. Even Ada Lovelace, an esteemed mathematician, speculated that the “calculating engine” might compose elaborate and scientific pieces of music of any degree of complexity or extent. In recent years, AI systems have achieved remarkable results in both symbolic and audio music generation [1]. The variety of computationally creative methods for music is broad and diverse, and has already enabled the exploration of novel forms of artistic co-creation [2]. These range from the automatic generation, completion, and alteration of chord progressions and melodies, to the creation of mashups and of audio snippets from textual prompts [3].
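
The “Dice Game” itself is already an algorithm. As a minimal sketch in the spirit of the “Musikalisches Würfelspiel” attributed to Mozart: two dice are rolled for every measure of the piece, and each roll picks one of several pre-composed bars. The measure table below is a placeholder rather than the historical one.

```python
# Minimal sketch of a musical "Dice Game" (Musikalisches Wuerfelspiel):
# each measure of the piece is chosen from a table of pre-composed bars,
# indexed by the roll of two dice. The table here is a placeholder;
# historical versions map each (dice total, measure position) pair to a
# specific bar of written music.

import random

NUM_MEASURES = 16  # length of the generated minuet in Mozart-attributed versions

# Placeholder table: measure_table[total][position] -> identifier of a bar.
measure_table = {
    total: [f"bar_{total}_{pos}" for pos in range(NUM_MEASURES)]
    for total in range(2, 13)  # possible totals of two six-sided dice
}

def roll_two_dice():
    return random.randint(1, 6) + random.randint(1, 6)

# Assemble a "piece" by rolling the dice once per measure position.
piece = [measure_table[roll_two_dice()][pos] for pos in range(NUM_MEASURES)]
print(piece)
```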

However, music AI systems are still far from generating full musical pieces that can be deemed realistic across several musical dimensions (harmony, form, instrumentation, etc.). First, most systems require hours and hours of human-composed (real) music before they can start generating interesting artefacts. More importantly, although the machine can explore thousands of musical ideas, the resulting compositions are rarely used as final outputs. To make these “musical drafts” realistic, human intervention is often needed to correct, adapt, and extend the generations, depending on the creative workflow put forth by the artist. Hence, human involvement is required at least upstream (data curation) and downstream (music adaptation) of the generation process.

Human participation is also needed to evaluate music generation systems, in a variety of forms ranging from Turing tests to musicological evaluations. Recently, researchers have also started to devise computational methods to automate this process. This way, every time a new music generation system is developed, its outputs can be coherently measured and compared within a controlled framework. In this direction, Dr. Jacopo de Berardinis, a postdoctoral researcher at King’s College London, has recently published a new method for evaluating the structural complexity of machine-generated music: letting an algorithm decide whether a piece has a realistic musical structure or not. De Berardinis is part of the Polifonia consortium, an AI-music project funded by the European Union’s Horizon 2020 research and innovation programme.

De Berardinis: “Composing musical ideas longer than phrases is still an open challenge in computer-generated music, a problem that is commonly referred to as the lack of long-term structure in the generations. In addition, the evaluation of the structural complexity of artificial compositions is still done manually – requiring expert knowledge and time, and involving the subjectivity that is inherent in the perception of musical structure.”

AI can create short pieces of music, or make variations and interpolations on existing pieces. But at some point the music will start diverging, because machine learning models still struggle with long-term dependencies. Automating the evaluation of musical output can thus save a lot of the resources that currently go into human analysis of the music. To address this, de Berardinis’ method detects musical structures from the audio [4] and describes their decomposition process [5]. This allows the system to “judge” the structural complexity of music on a scale from “real music” to “random music”. In other words: you provide a music dataset and let the system place the input music on a continuum between these two complexity classes.
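
As a toy illustration of this idea (not the actual method of [4, 5], which works on hierarchical structural segmentations), the sketch below scores a piece’s sectional structure by the entropy of its section labels and places that score on a 0–1 continuum between reference values for “real” and “random” music. All names and reference values here are hypothetical.

```python
# Illustrative sketch: place a piece on a continuum between "real" and
# "random" music using a simple structural complexity proxy.
# segment_entropy and the reference values are hypothetical stand-ins.

import math
from collections import Counter

def segment_entropy(labels):
    """Shannon entropy of a piece's section labels (e.g. A-A-B-A).
    Low entropy = highly repetitive; high entropy = little repeated structure."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def complexity_position(labels, real_ref, random_ref):
    """Map a piece onto a 0..1 continuum, where 0 ~ typical real music and
    1 ~ random music, by linear interpolation between reference scores."""
    score = segment_entropy(labels)
    return min(max((score - real_ref) / (random_ref - real_ref), 0.0), 1.0)

# Example: an AABBAACC-form piece vs. reference entropies that would, in a
# real setting, be estimated from corpora (1.2 and 3.5 are made up here).
piece = list("AABBAACC")
print(complexity_position(piece, real_ref=1.2, random_ref=3.5))  # ~0.13
```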

In conclusion, the automatic generation of music that cannot be distinguished from human compositions remains an open challenge for AI music research. In the meantime, opening a debate on the potential implications of these methods is also necessary, as having a system that can realistically generate music raises ethical concerns. Instead of designing tools that could potentially replace artists and composers (for certain commissions), de Berardinis argues that research should focus on leveraging the generative capabilities of AI models to design new systems that enhance and augment the creative potential of artists – thereby enabling novel opportunities for Artificial Intelligence Augmentation (AIA) [6]. With these concerns and objectives, a team in Polifonia is currently working towards the creation of resources and algorithms to promote more transparent, fair, and reliable paradigms in music AI.

[1] Briot, J. P., Hadjeres, G., & Pachet, F. D. (2020). Deep learning techniques for music generation (Vol. 1). Heidelberg: Springer.

[2] Huang, C. Z. A., Koops, H. V., Newton-Rex, E., Dinculescu, M., & Cai, C. J. (2020). AI song contest: Human-AI co-creation in songwriting. arXiv preprint arXiv:2010.05388.

[3] Agostinelli, A., Denk, T. I., Borsos, Z., Engel, J., Verzetti, M., Caillon, A., … & Frank, C. (2023). MusicLM: Generating music from text. arXiv preprint arXiv:2301.11325.

[4] de Berardinis, J., Vamvakaris, M., Cangelosi, A., & Coutinho, E. (2020). Unveiling the hierarchical structure of music by multi-resolution community detection. Transactions of the International Society for Music Information Retrieval, 3(1), 82-97.

[5] de Berardinis, J., Cangelosi, A., & Coutinho, E. (2022). Measuring the structural complexity of music: from structural segmentations to the automatic evaluation of models for music generation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30, 1963-1976.

[6] Carter, S., & Nielsen, M. (2017). Using artificial intelligence to augment human intelligence. Distill, 2(12), e9.

Recent News

During the spring semester, first-year information and computer science students created a user interface to make data on pipe organs accessible to a wider audience. Polifonia is excited to report on this fruitful collaboration between our stakeholder Utrecht University (Frans Wiering) and the ORGANS pilot.

9 June 2023

In this new series on the Polifonia project website, the team would like to shed light on early adopters who have re-used Polifonia’s output. Polifonia’s primary goal is for its software, datasets, and other applications to be used and exploited by stakeholders and other parties. One of our early adopters is the Global Education Digest, which re-uses Polifonia’s CLEF, the platform for crowdsourcing and cataloguing that was developed for musoW.

2 June 2023

Andrea Poltronieri (UNIBO) will present the paper “The Harmonic Memory: a Knowledge Graph of harmonic patterns as a trustworthy framework for computational creativity” at the Web Conference 2023, an annual international conference on the future directions of the World Wide Web.

2 May 2023

Following Polifonia’s successful Dom Tower excursion (ORGANS pilot) and Haptic Device workshops (ACCESS pilot), Polifonia again invited music lovers to come and explore music in relation to (art) history. Paul Mulholland (leader WP5) invited the Apollo Youth Panel to try out the Deep Viewpoints app with an exhibition at the Irish Museum of Modern Art.

7 April 2023

Peter van Kranenburg (Meertens Institute, KNAW), pilot leader of ORGANS and TUNES, is part of the upcoming AVA_Net webinar on connecting music collections.

6 March 2023

Università di Bologna (UniBo) launches the long-awaited web application Polifonia Corpus. This latest Polifonia tool opens the doors to multilingual textual musical heritage resources. Find out what you can do with this tool and how it was developed.

24 February 2023

Polifonia’s project leader Valentina Presutti (University of Bologna) is one of the four keynote speakers at the upcoming edition of ICAART 2023.

22 February 2023

Applications are now open for participation in The International Semantic Web Research Summer School (ISWS 2023) in Bertinoro (Italy), from June 11th to June 17th, 2023. Sign up quickly as there are only 60 spots and the closing date of March 30th 2023 is approaching!

2 February 2023

In our newest video, Paul Mulholland, leader of WP5 (Human Interaction with Musical Heritage), explains how Polifonia intends to let people access or contribute to the music data it offers, in a way that fits their level of expertise, interest, or physical abilities.

25 January 2023

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement N. 101004746