Journal "IBC" XII, 2004, 3

Dossier: Carattere Europa

Searching manuscripts and printed books (full text)

Lotte Hellinga
[formerly secretary and now consultant of the Consortium of European Research Libraries - CERL]
Marian Lefferts
[project manager of the Consortium of European Research Libraries - CERL]

In the preceding pages David James Shaw has already mentioned that the Consortium has commissioned two pilot projects to assess the feasibility of providing access to manuscript databases through federated searches. It has taken the Consortium several years to arrive at this still preliminary development, years in which much consultation and discussion have taken place. It may be useful to retrace some of these successive steps, which have led to establishing the principles of the project.

As in every such undertaking, the Consortium's enterprise is a mixture of organisation, skills and expertise. The organisational aspect is fundamental. After the initial consultative international meetings in 1990 and 1991, there could be no doubt that there was enough genuine support for a European database of printed books to consider what form the organisation should take. The advice was that an international organisation of this kind, which would have to act as employer and contractor, should be formed as a company "limited by guarantee" in which each contributing member would have equal rights. The advice turned out to be sound, and on this basis we have been able to operate.

While the company was being established (this took time), the design for the database was proposed and discussed with future members. The database would be built by combining existing automated files in a single system, allowing searching across all the files in the system. A database host was selected (the Research Libraries Group Inc., RLG) and, once the company was established, a contract was agreed. With RLG the Consortium established a procedure for assessing files and carrying out the necessary adaptations before each file was loaded onto the Hand Press Book (HPB) database. Although often time-consuming, this procedure has led to the very satisfactory results described above by David James Shaw.

Once the Consortium's enterprise could generally be considered a success, the idea presented itself that the organisational form and the experience we had gained need not be limited to printed books of the Hand Press period. The idea was first mooted at the Annual General Meeting (AGM) of November 2000, where the members agreed that the matter should be investigated. Lotte Hellinga reported to the members in 2001 and, upon approval, prepared a much fuller report for the AGM of 2002, based on consultations and a survey of existing projects. Here the members approved commissioning a technical feasibility study by a consultancy firm. This report, known as the Radcliffe report, identified several service providers who were then invited to submit proposals. The reports can be found at www.cerl.org/Manuscripts/manuscripts_working_group.htm. Meanwhile, a working party was formed, under Lotte Hellinga's chairmanship, of experts on manuscript materials who also had experience of databases.

The Consortium received five proposals, which were submitted to the working party and also to several other experts, who unanimously selected two companies as potential service providers. These two companies are now preparing pilot projects, to be submitted to the members in November 2004 with a recommendation following consultation with experts.

In the lengthy process of consultations and subsequent submission to the Consortium's members, a number of principles became clear which were to guide this enterprise. As with the files for printed books, the Consortium should make use of existing automated catalogues of manuscript materials. However, instead of combining all records in one very large database, as with the HPB, the expectation is that modern technology will allow the Consortium to adopt a system for federated searching across any number of databases. Databases of some of the largest collections which are already fully operating, and the models on which they are formed, are a determining factor in the structure of the project.

Projects already exist that seek to bring together records of manuscripts in a large number of relatively small collections, with new cataloguing following mutually agreed standards; "MASTER" is the most notable among them. The Consortium, by contrast, is not to prescribe standards, but sets itself the task of exploiting new (e.g. portal) technology for searching systems that show considerable diversity in cataloguing formats and even in intellectual concepts. The structure of most major manuscript collections, often with hierarchical elements, is by the nature of the materials more complex than that of the automated systems recording printed materials, although even there the Consortium had already come across hierarchical problems, as in serial publications, that were not easy to resolve satisfactorily within existing systems.

Such a federated search system will ensure the autonomy of each single manuscript database; this is an area where standards and principles vary a great deal more than between the databases of printed materials. This system will also mean that the process of adapting files, which has proved not only time-consuming but also a considerable burden on file-providers, can largely be avoided. A much simpler analysis of the structure of each file will still be required.
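The principle of federated searching described above can be illustrated with a minimal sketch. It is not the design of either pilot: the two catalogues, their field names and the adapter functions are all hypothetical, and real systems would query remote databases instead of in-memory lists. The point is only that each file keeps its own record layout, while a small adapter per file answers the same question in a common form.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str     # name of the database the record came from
    shelfmark: str
    title: str


# Two hypothetical catalogues with different native record layouts.
LIBRARY_A = [
    {"sig": "MS 12", "titel": "Il Milione", "verfasser": "Polo, Marco"},
    {"sig": "MS 40", "titel": "Metamorphoses", "verfasser": "Ovidius"},
]
LIBRARY_B = [
    # (shelfmark, title, author)
    ("Cod. 7", "Travels of Marco Polo", "Marco Polo"),
]


def search_a(term: str) -> List[Hit]:
    """Adapter for catalogue A: dictionaries with its own field names."""
    term = term.lower()
    return [
        Hit("library_a", r["sig"], r["titel"])
        for r in LIBRARY_A
        if term in r["verfasser"].lower()
    ]


def search_b(term: str) -> List[Hit]:
    """Adapter for catalogue B: plain tuples."""
    term = term.lower()
    return [
        Hit("library_b", shelfmark, title)
        for shelfmark, title, author in LIBRARY_B
        if term in author.lower()
    ]


ADAPTERS: Dict[str, Callable[[str], List[Hit]]] = {
    "library_a": search_a,
    "library_b": search_b,
}


def federated_search(term: str) -> List[Hit]:
    """Send the same query to every adapter and merge the results."""
    hits: List[Hit] = []
    for search in ADAPTERS.values():
        hits.extend(search(term))
    return hits


for hit in federated_search("polo"):
    print(hit.source, hit.shelfmark, hit.title)
```

Neither catalogue is converted or reloaded; only the short adapter has to know the structure of its file, which corresponds to the "much simpler analysis" mentioned above.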

One problem of diversity, the various traditions of names (personal names, place-names) in the languages of the catalogues, had already been encountered in the HPB, and, as explained by David James Shaw, was met by the Consortium's development of searching assisted by the CERL Thesaurus. This important development of the last few years will find a further and very full application in the new system.
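Thesaurus-assisted searching of this kind can be sketched as follows. The structure shown is an assumption made for illustration and does not reproduce the CERL Thesaurus data model or interface; the variant groups are small hand-made samples. A search term is expanded to all the variant forms recorded with it before the catalogue is consulted.

```python
from typing import Dict, List, Set

# Hypothetical sample groups of variant name forms (not CERL Thesaurus data).
THESAURUS: List[Set[str]] = [
    {"Venice", "Venezia", "Venetiae", "Venedig"},
    {"Livy", "Livius, Titus", "Tite-Live", "Livio"},
]


def variants(term: str) -> Set[str]:
    """Return every recorded variant of a name, or the term itself if unknown."""
    wanted = term.casefold()
    for group in THESAURUS:
        if any(wanted == name.casefold() for name in group):
            return set(group)
    return {term}


def search(records: List[Dict[str, str]], field: str, term: str) -> List[Dict[str, str]]:
    """Match records whose field equals any variant of the search term."""
    wanted = {v.casefold() for v in variants(term)}
    return [r for r in records if r.get(field, "").casefold() in wanted]


catalogue = [
    {"place": "Venetiae", "title": "record from a Latin-language catalogue"},
    {"place": "Venedig", "title": "record from a German-language catalogue"},
    {"place": "Roma", "title": "unrelated record"},
]
print(search(catalogue, "place", "Venice"))
```

A user who types the English form thus also retrieves records catalogued under the Latin and German forms, without any of the underlying files being altered.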

From the premise of accessing autonomous individual files it follows that the material that is made available is what each individual institution decides to include in its manuscript cataloguing system. Some include charters, archival materials and modern literary manuscripts; others do not. We decided therefore that what is a manuscript will simply be defined as "any item in a manuscript database". This avoids all hair-splitting about time limits, categories of material, and other such debates.

We have found that the idea that the HPB file of printed books should be accessed in the same search system is widely welcomed. In one of the AGM discussions it was stressed that the division of manuscripts and printed materials was artificial, imposed by now obsolete notions of library organisation, and no longer has any intellectual validity. In the systems now on offer, the user will be able to determine which files are to be searched and which excluded. This will make it possible for the scholar who wishes to see only, for example, the manuscripts of Marco Polo's Travels, and is less fascinated by the printed versions, to work according to his preferences. Or, to take another example, a researcher may single out particular collections in preparation for library visits.
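The choice of files to be searched amounts to a simple include/exclude filter applied before any query is sent. The sketch below assumes hypothetical file names; it is not taken from either pilot system.

```python
from typing import Iterable, List, Optional

# Hypothetical names of the files offered by the search system.
AVAILABLE_FILES = ["hpb", "manuscripts_a", "manuscripts_b"]


def select_files(
    include: Optional[Iterable[str]] = None,
    exclude: Iterable[str] = (),
) -> List[str]:
    """Return the files to search: all by default, narrowed by the user's choice."""
    included = set(AVAILABLE_FILES) if include is None else set(include)
    excluded = set(exclude)
    return [f for f in AVAILABLE_FILES if f in included and f not in excluded]


# A scholar interested only in manuscripts leaves out the printed books.
print(select_files(exclude=["hpb"]))
# A researcher preparing a visit singles out one collection.
print(select_files(include=["manuscripts_b"]))
```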

The two companies propose to map the search terms to Dublin Core. In the pilots the following fifteen search options will be tested:

  • Keyword (to cover all elements listed below)
  • Names (to include personal and institutional name, in some cases place names)
  • Title (including uniform, standard, generic titles as well as title as in the manuscript)
  • Other names related to the document (e.g. recipient of letter)
  • Incipit, Initia
  • Language
  • Present location
  • Shelfmark, or nickname (e.g. Codex Argenteus)
  • Date
  • Place of origin, writing (imprint for printed books)
  • Scribe/copyist
  • Artist, illumination, illustration
  • Material description (size, paper, vellum)
  • Binding
  • Former owners/provenance
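How such a mapping to Dublin Core might work can be sketched as below. The fifteen element names are those of the Dublin Core Metadata Element Set, but the assignment of search options to elements is purely an illustrative assumption: the actual mapping chosen by the two companies is not given here, and options with no obvious counterpart are deliberately left out.

```python
from typing import Dict, List

# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical mapping of some search options to Dublin Core elements.
OPTION_TO_DC: Dict[str, List[str]] = {
    "title": ["title"],
    "names": ["creator", "contributor"],
    "scribe": ["contributor"],
    "incipit": ["description"],
    "language": ["language"],
    "date": ["date"],
    "shelfmark": ["identifier"],
    "material_description": ["format"],
}


def to_dublin_core(query: Dict[str, str]) -> Dict[str, List[str]]:
    """Translate a query expressed in search options into Dublin Core elements."""
    result: Dict[str, List[str]] = {}
    for option, value in query.items():
        if option not in OPTION_TO_DC:
            raise KeyError(f"no mapping defined for search option {option!r}")
        for element in OPTION_TO_DC[option]:
            result.setdefault(element, []).append(value)
    return result


print(to_dublin_core({"names": "Polo, Marco", "language": "fr"}))
```

A single search option may map to more than one element, as with names here; each underlying database would then translate the Dublin Core elements into its own fields.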

Once these searches have produced hits, the user will be able to enter the original database(s) of the institutions.

As a last important facility for scholars, the companies are required to develop an interactive "notepad" that will enable users to attach remarks, further information, corrections, without interfering with the original record, which remains the property and also the responsibility of the originating institution.
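The separation between the institution's record and the users' remarks can be illustrated with a small sketch. The class and field names are hypothetical and say nothing about how the companies will implement the notepad; the sketch only shows notes being stored beside a record, keyed by database and record identifier, without the record itself being changed.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Note:
    author: str
    text: str


class Notepad:
    """Keeps user notes apart from the records they refer to."""

    def __init__(self) -> None:
        self._notes: Dict[Tuple[str, str], List[Note]] = {}

    def add(self, database: str, record_id: str, author: str, text: str) -> None:
        """Attach a note to a record, identified by database and record id."""
        self._notes.setdefault((database, record_id), []).append(Note(author, text))

    def view(self, database: str, record_id: str, record: Dict[str, str]) -> dict:
        """Return a copy of the record together with its notes."""
        return {
            "record": dict(record),  # copy: the original is never handed out
            "notes": list(self._notes.get((database, record_id), [])),
        }


original = {"title": "Il Milione", "date": "s. XV"}
pad = Notepad()
pad.add("library_a", "MS 12", "reader1", "Watermark evidence suggests c. 1400.")
combined = pad.view("library_a", "MS 12", original)
print(combined["notes"][0].text)
print(original)  # unchanged by the note
```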

The HPB was conceived with a dual purpose: for derived cataloguing by institutions, and for facilitating scholars' access to vast quantities of heritage material. In practice it seems that the use for derived cataloguing is limited, but that the value for scholarship is much more than the sum of its parts. For manuscript material, the use for derived cataloguing will be minimal, but we expect that a system as flexible as the one now in preparation should prove to be an innovatory instrument for scholarship. The Consortium is therefore well aware that part of its development plan is to make the system known to the scholarly world, and to convince scholars to use it, for interests that may range from the classics in manuscript and print to authors of the Early Modern period and beyond. From Hippocrates, Livius and Ovid to Vico, Voltaire and Stendhal, via Alcuin, Dante, Poggio Bracciolini, Machiavelli and Newton (to gather just a handful), their works were all transmitted in manuscript and print. Our hope is that in due course, scholars of any part of the whole wide range of the written heritage will learn to appreciate the database as an indispensable working instrument.
