
Pitti, Daniel, Institute for Advanced Technology in the Humanities, University of Virginia, USA, dpitti@Virginia.edu
Simon, Agnès, Bibliothèque nationale de France, France, agnes.simon@bnf.fr
Vitali, Stefano, La Soprintendenza archivistica per l’Emilia-Romagna, Italy, vitali.stefano@gmail.com
Arnold, Kerstin, Bundesarchiv, Germany, k.arnold@barch.bund.de

People live and work in socio-historical contexts: over the course of their lives they produce a range of records that document their lives and the contexts in which they lived: birth records, correspondence, books, articles, photographs, works of art, films, notebooks, collected artifacts, school records, manuscripts, employment records, etc. Social relations between people, relations among documents, and relations among people and the artifacts created by them constitute a vast social-document network1 that can be drawn on by scholars needing to reconstruct and study the lives, works, and events surrounding historical persons.

In the past, cultural heritage professionals have largely viewed document networks and social networks in isolation, with a primary focus on documents. Increasingly, in parallel with (and perhaps inspired by) the emergence of social computing on the Web, cultural heritage professionals and scholars are expanding their focus to social-document networks. Scholarly projects that focus on social-document networks or social networks, such as Research-oriented Social Environment (RoSE)2 and The Crowded Page,3 have begun to emerge. The cultural heritage communities (library, archive, and museum) have begun to make explicit the implicit networks found in the descriptions of books, articles, manuscripts, correspondence, art objects, and other artifacts in their care, and the authority files used to describe people who created or are documented in the resources described. Particularly important examples are the Virtual International Authority File (VIAF, a collaboration between OCLC Research and major national and research libraries)4 and WorldCat Identities.5

This panel will focus on four cultural heritage projects, three national and one international, that facilitate humanities research by providing innovative access to social-document networks, and through them to both resources and the socio-historical contexts in which those resources were created.

Interconnecting French cultural heritage treasures on the Web: data.bnf.fr – France

Agnès Simon (Curator, Département Information bibliographique et numérique, Bibliothèque nationale de France (BnF)) will discuss data.bnf.fr, a service the BnF is developing to facilitate the discovery of its holdings, interrelated with one another and interconnected with other resources on the Web.

The BnF is the most important heritage institution in France, with a history going back to the 16th century. It has fourteen divisions spread over many sites. A large variety of materials are processed in different catalogues, reflecting not only the history of the materials themselves but also that of the methods and technologies used for their description. As a result, the descriptions are maintained in disparate BnF catalogues and databases, which complicates discovery of these resources and of the relations among them.

data.bnf.fr will leverage both new conceptual models for organizing bibliographic information (such as FRBR) and Linked Data technologies to provide integrated access to both the holdings of the BnF and related resources available on the Web. A high-level ontology has been designed to make the bibliographic, archival, and other metadata models used in the BnF interoperable. RDF is used to express and expose data extracted from the various library descriptive systems and other complementary systems available on the Web. By incorporating other Semantic Web-related ontologies, data.bnf.fr provides a scalable foundation for interconnecting diverse resources, particularly cultural heritage resources.
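As a rough illustration of what expressing catalogue data in RDF can look like, the following minimal sketch serialises a few statements as N-Triples using only the Python standard library. The example.org URIs, the sample person and work, and the choice of FOAF and Dublin Core terms are assumptions made for illustration; they are not the data.bnf.fr ontology or its identifiers.

```python
# Minimal N-Triples serialisation using only the standard library.
# All example.org URIs are hypothetical placeholders, not data.bnf.fr identifiers.

FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, lang=None):
    """Quote a string literal, escaping backslashes, double quotes and newlines."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


def to_ntriples(triples):
    """Serialise (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


author = uri("http://example.org/person/victor-hugo")
work = uri("http://example.org/work/les-miserables")

triples = [
    (author, uri(RDF_TYPE), uri(FOAF + "Person")),
    (author, uri(FOAF + "name"), literal("Victor Hugo")),
    (work, uri(DCTERMS + "title"), literal("Les Misérables", lang="fr")),
    (work, uri(DCTERMS + "creator"), author),
]

print(to_ntriples(triples))
```

Because the person is identified by a URI rather than a name string, any other dataset that uses the same URI can attach further statements to it, which is the property that makes interconnection across catalogues possible.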

The project’s effectiveness is made possible by innovative use of the semantics and quality of the structured data contained in the various BnF catalogues. Persons, corporate bodies, and works, accurately identified through authority files, constitute nodes in a network of descriptions of related resources, regardless of resource type. Descriptions of persons and corporate bodies are enhanced with information about those entities from Encoded Archival Description (EAD) finding aids. Other Internet resources are used in a similar, complementary way. The bibliographic data is remodeled and collocated according to FRBR categories, and displayed in a user-friendly way, with direct links to the digitized material from the Gallica digital library whenever it exists.
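Collocation according to FRBR categories can be pictured as regrouping flat catalogue records under the work they realise. The sketch below is a simplified assumption, not the BnF's actual processing: the records and field names are invented, and language is used as a crude stand-in for the FRBR expression level.

```python
from collections import defaultdict

# Hypothetical flat catalogue records; titles, editions and field names are
# placeholders invented for illustration.
records = [
    {"work": "Work A", "language": "fre", "edition": "Edition 1"},
    {"work": "Work A", "language": "eng", "edition": "Edition 2"},
    {"work": "Work A", "language": "fre", "edition": "Edition 3"},
    {"work": "Work B", "language": "fre", "edition": "Edition 4"},
]


def collocate(records):
    """Group editions (manifestations) by work, then by language (expression)."""
    tree = defaultdict(lambda: defaultdict(list))
    for rec in records:
        tree[rec["work"]][rec["language"]].append(rec["edition"])
    # Convert nested defaultdicts to plain dicts for predictable output.
    return {work: dict(expressions) for work, expressions in tree.items()}


for work, expressions in collocate(records).items():
    print(work)
    for language, editions in expressions.items():
        print(f"  {language}: {', '.join(editions)}")
```

A display built on such a grouping can show one entry per work with its editions beneath it, instead of one entry per catalogue record.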

data.bnf.fr began in July 2011 with a significant amount of information. It is continuously broadening its scope and exploring new ways to collaborate with ongoing initiatives in other cultural heritage institutions in France, notably the current work on remodeling the French Archives’ databases to fit better into the Web landscape of French and European cultural heritage treasures.

Social Networks and Archival Context (SNAC) Project – United States

Daniel Pitti, Associate Director of the Institute for Advanced Technology in the Humanities, University of Virginia, will describe the SNAC project.

The initial phase of the two-year SNAC research and demonstration project began in May 2010 with funding from the National Endowment for the Humanities (U.S.). The project’s objectives are to:

  • extract data describing creators of and others documented in records from EAD-encoded descriptions of records (finding aids),
  • migrate this data into Encoded Archival Context-Corporate Bodies, Persons, Families (EAC-CPF)-encoded authority descriptions,
  • augment authority descriptions with additional data from matching library and museum authority records,
  • and, finally, use the resulting extracted and enhanced archival authority descriptions to build a prototype system that provides integrated access to the finding aids (and thus the records) from which the descriptions of people were extracted and to the socio-historical contexts in which the records were created.
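The first two objectives above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SNAC project's actual pipeline: the EAD fragment is invented, only names inside <origination> are considered, and the EAC-CPF output is a skeleton that omits the <control> section a valid record requires.

```python
import xml.etree.ElementTree as ET

EAC_NS = "urn:isbn:1-931666-33-4"  # EAC-CPF namespace

# Invented EAD-style fragment (un-namespaced, as with the EAD 2002 DTD).
EAD_SAMPLE = """
<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Jane Doe papers</unittitle>
      <origination label="Creator">
        <persname>Doe, Jane, 1850-1920</persname>
      </origination>
    </did>
  </archdesc>
</ead>
"""

# EAD name elements mapped to EAC-CPF entityType values.
ENTITY_TYPES = {"persname": "person", "corpname": "corporateBody", "famname": "family"}


def extract_creators(ead_xml):
    """Return (entity_type, name) pairs found under <origination>."""
    root = ET.fromstring(ead_xml)
    creators = []
    for origination in root.iter("origination"):
        for child in origination:
            if child.tag in ENTITY_TYPES and child.text:
                creators.append((ENTITY_TYPES[child.tag], child.text.strip()))
    return creators


def to_eac_cpf(entity_type, name):
    """Build a skeletal EAC-CPF identity; a valid record also needs <control>."""
    ET.register_namespace("", EAC_NS)
    record = ET.Element(f"{{{EAC_NS}}}eac-cpf")
    description = ET.SubElement(record, f"{{{EAC_NS}}}cpfDescription")
    identity = ET.SubElement(description, f"{{{EAC_NS}}}identity")
    ET.SubElement(identity, f"{{{EAC_NS}}}entityType").text = entity_type
    name_entry = ET.SubElement(identity, f"{{{EAC_NS}}}nameEntry")
    ET.SubElement(name_entry, f"{{{EAC_NS}}}part").text = name
    return record


for entity_type, name in extract_creators(EAD_SAMPLE):
    print(ET.tostring(to_eac_cpf(entity_type, name), encoding="unicode"))
```

Matching the resulting descriptions against library and museum authority records, the third objective, would require name normalisation and disambiguation that this sketch does not attempt.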

In the initial phase of SNAC, the primary source of data was 28,000 finding aids. In the second phase, the source data will vastly expand and include not only descriptions of archival records but also original archival authority descriptions. The number of finding aids will increase to more than 148,000, and up to two million OCLC WorldCat collection-level descriptions will be added. The National Archives and Records Administration (NARA), British Library, Smithsonian Institution, BnF, and Archives nationales will contribute nearly 500,000 archival authority descriptions.

While SNAC’s immediate objectives are to significantly refine and improve the effectiveness of the methods used in building an innovative research tool, its long-term objective is to provide a solid foundation of both methods and data for establishing a sustainable national archival program cooperatively governed by and maintained by the professional archive and library community.

Catalogo delle Risorse archivistiche (CAT) – Italy

Stefano Vitali, Soprintendente archivistico per l’Emilia Romagna (Supervising Office for the Archives in Emilia Romagna Region), will describe the Catalogo delle Risorse archivistiche (CAT). CAT provides integrated access to archival resources held in national, regional, and local repositories. Access to these holdings will be provided via a central system based on archival descriptions of the custodians and creators of archival records.

CAT will sketch a general map of the national archival heritage, providing initial orientation to researchers and guiding them towards more informative resources available in the systems participating in the National Archival Portal. It will contain descriptive records of both the current custodians and original creators of archival records. Data harvesting techniques based on the OAI-PMH protocol will be used to aggregate descriptive data from systems distributed throughout Italy. In addition, CAT will explore direct submission of XML-based descriptions of archival repositories and creators, as well as direct data entry into the CAT maintenance interface. 
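Harvesting over OAI-PMH follows a simple request and paging pattern, sketched below under stated assumptions: the base URL, record identifiers, and sample response are invented, and the code only builds request URLs and parses a response held in memory rather than contacting any real CAT endpoint.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(response_xml):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(response_xml)
    identifiers = [el.text for el in root.iter(f"{OAI_NS}identifier")]
    token_el = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Invented response from a hypothetical repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:creator-1</identifier></header></record>
    <record><header><identifier>oai:example.org:creator-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

identifiers, token = parse_page(SAMPLE)
print(identifiers)
print(list_records_url("https://example.org/oai", resumption_token=token))
```

A harvester would repeat the request with each returned token until a response carries no token, at which point the full list has been retrieved.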

CAT’s goal is to provide a comprehensive list of all of the creators of archival records (persons, corporate bodies, and families) held in Italian repositories, and a comprehensive guide to the custodians (institutional and non-institutional) of the archival records. In addition to providing access to and context for archival records in Italy, CAT will be a bridge to other catalogs and descriptive systems in the cultural heritage domain, including the National Library System.

The Archives Portal Europe: Research and Publication Platform for European Archives – Europe

Kerstin Arnold, Scientific Manager, Bundesarchiv, and Leader of the Europeana interoperability work package of APEx, will describe the APEx project and its predecessor, APEnet.

The Archives Portal Europe has been developed within APEnet, a collaboration of nineteen European national archives and the Europeana Foundation, to build a common access point for searching and researching archival content6. Funded by the European Commission in the eContentplus programme, APEnet began in January 2009 and celebrated the release of Archives Portal Europe 1.0 in January 2012. At the moment the portal contains more than 14.5 million descriptive units linked to approximately 63 million digitized pages of archival material. By bringing together the materials of what are currently 62 institutions from 14 European countries, the portal has become a major actor on the European cultural heritage scene.

The tasks achieved in APEnet will be taken one step further with its successor, APEx, which recently held its kick-off meeting in The Hague, The Netherlands. Funded by the European Commission in the ICT Policy Support Programme (ICT-PSP), APEx will include 28 European national archives plus ICARUS (International Centre for Archival Research) as project partners.

While APEnet focused on integrating access to EAD-encoded archival finding aids contributed by the participating national archives and to EAG (Encoded Archival Guide)-encoded descriptions of the national archives themselves, APEx will increase the number of participating archives, enhance and improve the training of archival staff in participating institutions, and improve the quality of the integrated access system, in particular by adding Web 2.0 functionality and by examining how Linked Data approaches can be adapted within the Archives Portal Europe.

A primary objective will be incorporating EAC-CPF, the international standard for describing the creators of records and the people documented in those records. The resulting archival authority descriptions will enhance access to archival records and provide socio-historical context for understanding them.

Notes

1. In describing the RoSE project to Daniel Pitti, Alan Liu referred to this network as a ‘social-document graph.’

2. RoSE: http://transliteracies.english.ucsb.edu/category/research-project/rose

3. The Crowded Page: http://www.crowdedpage.org/

4. VIAF: http://viaf.org/

5. WorldCat Identities: http://www.worldcat.org/identities/

6. Archives Portal Europe: http://www.archivesportaleurope.eu/ and APEnet: http://www.apenet.eu/