About Early Music Online Search

Early Music Online (EMO) was originally (July 2018) based on a collection of digital images of 16th-century printed music from the British Library (GB-Lbl). At that time the interface provided content-based search of about 32,000 pages from just under 200 of these books. Since then, it has been considerably expanded to include about the same number of images from other European music libraries, such as the Bibliothèque nationale in Paris (F-Pn), the Berlin Staatsbibliothek (D-Bsb) and the Polish National Library in Warsaw (PL-Wn). Very soon we shall be incorporating half a million pages from the Bayerische Staatsbibliothek in Munich (D-Mbs).

The images have been subjected to optical music recognition using Aruspix, and the musical contents indexed for efficient searching using a state-of-the-art method.

This interface simply allows searches for pages containing music similar to that contained in a query page. It does not yet allow searching for an arbitrary musical sequence of notes, though this is planned for a future version. Also, it makes no use of library metadata - information about the books (titles, composers, printers, dates, places, etc.); this, too, will be incorporated in a later version.

This experimental system runs on a server at Goldsmiths, University of London.

Enquiries to Tim Crawford

Queries - Search box

Queries may be entered into the Search box in the form of an ID based on the shelf-mark of a book in the British Library, the image-number from the microfilms which were digitised for EMO, and the left/right page selected. As such, they were never intended for user-entry but they may be entered for test purposes. (To explore the collection, it will be easier to start with a Random Search.)

The currently-searched page's ID will be displayed in the Search box, and the left/right arrow-keys can be used to navigate to and search the previous/next page. You can also skip to the first page of the previous/next book by holding down the shift key as you click on the respective arrows.

To perform a search, simply hit the Enter or Return key.

Random Search

This simply chooses a page at random from the entire collection. You can use the backslash ('\') as a keyboard shortcut.

Feedback - Relation to query

You can provide feedback concerning the nature of a result by entering your own judgements. To do this, before doing a search check the box marked “Provide judgements to help improve the system”. The result list will then contain drop-down menus which you can use to enter your judgements.

You are not obliged to do this, but all feedback is saved (anonymously) on the server and will be used in future for refining the way the interface operates (e.g. by initially eliminating all non-music pages). This will be enormously useful to us, and it takes very little time. If you activate the “Provide judgements” menus, we assume you are giving your consent. We have no way of tracing the saved information back to you.

The first result will always be the query itself, but this may be a page which doesn't contain any music (e.g. a title-page, or just text, or an image) - in which case, choose 'Not music!'. For all other results, do the same if the page does not contain music. Otherwise, choose from the categories 'Duplicate page' (where it is an extra copy of the same page), 'Same music' (from a different book or edition) or 'Related music' (which might be from a different voice-part, or a different section of a work). Use your own judgement about these decisions, as it is difficult to establish hard and fast rules for the last case in particular.

If you make a mistake, simply re-do it immediately - all feedback is time-stamped, so we will know if a change has been made within a short time.
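As an illustration only, the following Python sketch shows one way time-stamped feedback could be reduced so that a corrected judgement replaces an earlier one. The record fields (query, result, judgement, time) and the page IDs are hypothetical; the way feedback is actually stored and processed on the server is not described here.

```python
def latest_judgements(records):
    """Keep only the most recent judgement for each (query, result) pair.

    `records` is a list of dicts with hypothetical keys:
    'query', 'result', 'judgement' and 'time' (seconds, as a number).
    """
    latest = {}
    # Process in time order so that later entries overwrite earlier ones.
    for rec in sorted(records, key=lambda r: r["time"]):
        latest[(rec["query"], rec["result"])] = rec["judgement"]
    return latest


# Hypothetical feedback: the first judgement is corrected 20 seconds later.
records = [
    {"query": "q1", "result": "r7", "judgement": "Same music", "time": 100},
    {"query": "q1", "result": "r7", "judgement": "Related music", "time": 120},
    {"query": "q1", "result": "r9", "judgement": "Not music!", "time": 130},
]

final = latest_judgements(records)
```

Here the corrected entry for the pair (q1, r7) supersedes the first one, while the judgement for (q1, r9) is kept unchanged.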

Result ranking

Matches are made by finding the pages with the greatest 'overlap' with the query. One consequence is that pages containing a lot of notes are more likely to show such overlaps by accident.

There are two 'modes' for ranking results. The 'Basic' mode takes no account of the number of notes on a page, which can lead to false matches with pages containing many notes. The default 'Jaccard' mode uses the Jaccard distance measure instead, which tends on the whole to give better results.
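The difference between the two modes can be sketched in Python as follows. This is a minimal illustration, assuming each page is represented as a set of index tokens; the token names, page IDs and function names are hypothetical, and the system's actual indexing and scoring may differ in detail.

```python
def basic_score(query, page):
    """Raw overlap: the number of tokens shared by query and page."""
    return len(query & page)


def jaccard_distance(query, page):
    """Jaccard distance: 1 - |intersection| / |union| (0 means identical sets)."""
    union = query | page
    if not union:
        return 0.0
    return 1.0 - len(query & page) / len(union)


def rank(query, pages, mode="jaccard"):
    """Return page IDs ordered from best to worst match."""
    if mode == "basic":
        # Larger overlap first; page size is ignored.
        return sorted(pages, key=lambda pid: -basic_score(query, pages[pid]))
    # Smaller distance first; a large page is penalised via the union size.
    return sorted(pages, key=lambda pid: jaccard_distance(query, pages[pid]))


# Hypothetical data: a short page sharing 3 of the 4 query tokens, and a
# page with many notes that happens to contain all 4 among 40 others.
query = {"a", "b", "c", "d"}
pages = {
    "short_close": {"a", "b", "c", "e"},
    "long_dense": {"a", "b", "c", "d"} | {f"x{i}" for i in range(40)},
}

basic_order = rank(query, pages, mode="basic")
jaccard_order = rank(query, pages, mode="jaccard")
```

In this toy example the 'basic' ordering places the page with many notes first because its raw overlap is larger, whereas the 'jaccard' ordering prefers the short page, whose contents are proportionally much closer to the query.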

When you change this setting by choosing from the “Result Ranking” menu your search will be re-run.

Results to display

You can choose a number of results to display, or 'Best matches', which simply shows our estimate of the best results to be found. Very occasionally an interesting match can be found below this threshold (e.g. when the music is only fleetingly similar), but it is hard to predict when this might occur, so you might find it interesting to explore these options.

When you change this setting by choosing from the “Results to display” menu your search will be re-run.

Uploading an image

If you click on “Search with Image Upload” in the page header, you will be able to upload your own query page in one of a variety of common graphics formats. On clicking “Upload and Search” your image will be uploaded to our server (which will usually take a few extra seconds), recognised and encoded as a query which is then run against the database, and the result list presented as usual. (This is an experimental feature which has not been exhaustively tested, and may produce an error message from time to time.)

Comparing result with query

In order to gain some feeling for the quality of the recognition, and the nature and extent of a match, you can see a visualisation of the matched sequences of notes superimposed on the original image. Simply click on the central match-score section of the result-list where a number is displayed to suggest the quality of the match. This will bring up a new window displaying the two pages side-by-side; this may take a few seconds to display, since it uses a different mechanism to verify the match. Sometimes, one or other of the pages lacks the necessary data for the comparison, in which case an error message is shown.

NB: Results generated by a search with an uploaded image currently lack the data necessary for this parallel display; we intend to include this data for all pages in the near future.