Introducing the Polifonia Corpus: explore music concepts and texts from the Polifonia Project with this new web tool

The latest Polifonia tool opens the door to multilingual textual musical heritage resources. Find out what you can do with this tool and how it was developed.

24 February 2023

Università di Bologna (UniBo) has launched the long-awaited web application Polifonia Corpus as part of the Polifonia H2020 project. The corpus is accessed through an interactive dashboard with a user-friendly design modelled on a music player. The corpus consists of Wikipedia data (all music-related pages), books (e.g. from the Biblioteca Nacional de España), influential music periodicals (e.g. The Musical Times) and the textual sources belonging to the Polifonia pilots BELLS, CHILD, MEETUPS, MUSICBO and ORGANS (e.g. the Dutch organ encyclopaedia). The tool will help linguists, scholars and students to access multilingual music-related corpora and to investigate them according to new and different criteria.

Challenges of a multilingual corpus and going beyond keyword-based search
The new tool interrogates a collection of Italian, English, French, Spanish, German and Dutch sources. The large, modularized corpus contains more than 100 million words for each language. A significant part of the sources was only available as images or PDF files and required Optical Character Recognition (OCR) to convert them into a processable format. The team from UniBo, consisting of Valentina Presutti, Rocco Tripodi, Arianna Graciotti and Marco Grasso, has applied further Natural Language Processing (NLP) techniques to process the corpus and produce automatic morphosyntactic, semantic and musical-heritage-specific (MH-specific) annotations. In addition, custom APIs enable domain experts, scholars and music professionals to use these annotations to perform advanced structured queries on the corpus. The available search capabilities go beyond standard keyword-based search and allow the corpus to be queried using this richer semantic information.
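To illustrate why annotations allow searches that plain keyword matching cannot, here is a minimal Python sketch. It is not the Polifonia API: the Token structure, the search function and the hand-annotated toy sentences are all hypothetical, and only show how a lemma layer lets a query for 'guitar' also retrieve 'guitars'.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    """Hypothetical annotated token: surface form plus its lemma."""
    form: str
    lemma: str


def search(sentences: List[List[Token]], query: str, search_type: str = "keyword") -> List[int]:
    """Return the indices of sentences containing the query.

    search_type "keyword" compares against the surface form,
    "lemma" compares against the lemma annotation.
    """
    if search_type not in ("keyword", "lemma"):
        raise ValueError(f"unsupported search type: {search_type}")
    q = query.lower()
    hits = []
    for idx, tokens in enumerate(sentences):
        for tok in tokens:
            value = tok.form if search_type == "keyword" else tok.lemma
            if value.lower() == q:
                hits.append(idx)
                break  # one hit per sentence is enough
    return hits


# Toy, hand-annotated data (an assumption, not taken from the real corpus).
toy_corpus = [
    [Token("She", "she"), Token("plays", "play"), Token("two", "two"), Token("guitars", "guitar")],
    [Token("The", "the"), Token("guitar", "guitar"), Token("is", "be"), Token("old", "old")],
    [Token("He", "he"), Token("sings", "sing"), Token("an", "a"), Token("aria", "aria")],
]

keyword_hits = search(toy_corpus, "guitar", "keyword")
lemma_hits = search(toy_corpus, "guitar", "lemma")
```

In this toy setting the keyword search only finds the sentence containing the exact form 'guitar', while the lemma search also finds the sentence containing 'guitars'.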

How to use Polifonia Corpus
To search the corpus, the user first sets a few parameters. Typical users, such as linguists or students in the field, start by entering a keyword in the “Query” section, which should be a musical concept such as ‘guitar’, ‘opera’ or ‘aria’. In the “Type” section, users specify how the tool should search: by keyword, lemma, concept or named entity. Next, the “Module” selection determines which source collection the tool should search (Wikipedia, Books, Periodicals or Pilots), and the “Language” section sets the language of that module. The results are the sentences in which the input word is found. These sentences are listed in a Key Word In Context (KWIC) index, a well-known practice in querying linguistic corpora. The results are presented as concordance lines, which means that each line shows the text immediately preceding and following the keyword. It is also possible to access the full sentence and its source.
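For readers unfamiliar with KWIC, the following minimal Python sketch shows how concordance lines can be built from a list of sentences. It is an illustration only and not the implementation behind the Polifonia Corpus; the function name kwic, the width parameter and the sample sentences are assumptions made for this example.

```python
import re
from typing import List, Tuple


def kwic(sentences: List[str], keyword: str, width: int = 30) -> List[Tuple[str, str, str]]:
    """Return (left context, match, right context) for every whole-word hit.

    Matching is case-insensitive; each context is cut to at most
    `width` characters on either side of the keyword.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for sentence in sentences:
        for m in pattern.finditer(sentence):
            left = sentence[max(0, m.start() - width):m.start()]
            right = sentence[m.end():m.end() + width]
            lines.append((left, m.group(0), right))
    return lines


# Sample sentences invented for this example.
sample = [
    "The aria was sung beautifully.",
    "She played the guitar.",
    "An Aria closes the opera.",
]

concordance = kwic(sample, "aria")
for left, match, right in concordance:
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>30}[{match}]{right}")
```

Aligning the keyword in a fixed column is what makes a KWIC index easy to scan: patterns in the words just before and after the keyword become visible at a glance.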

Release
The Polifonia Corpus is now live and has been released through the dedicated Polifonia Corpus GitHub repository and the interactive website. The corpus, its metadata and statistics, along with its annotations and interrogation tools, are also part of the Polifonia Ecosystem.

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 101004746.