Online resources that reference ancient places are multiplying rapidly, and they hold huge potential for the researcher, provided that they can be found; at present, however, users have no easy way of navigating between them or comparing their contents. Pelagios, a growing international cooperative of online ancient world projects,1 addresses the problems of discovery and reuse with the twin aims of helping digital humanists to make their data more discoverable, and of empowering real-world users (scholars, students and the general public) to find information about particular ancient places and visualize it in meaningful ways. While the project focuses on the ancient world, the methodology and tools developed will be of interest to anyone working with data containing spatial references. The Pelagios family purposefully includes partners maintaining a wide range of document types, including texts, maps and databases. In doing so we take some of the first steps towards building a Geospatial Semantic Web for the Humanities.2
In this paper we discuss two of the major workflows underpinning Pelagios. First, we address the method by which the partners prepare their data so that it can be linked together in an open and transparent manner: i.e. what are the processes that you should undertake if you want to make your data Pelagios compliant? Second, we consider the various ways in which the results can be visualized, paying particular attention to the tools and technologies used and the problems encountered. We end with a brief description of our visualization services – both a Graph Explorer and various embeddable web widgets – which, we believe, demonstrate the value of taking a lightweight Linked Open Data approach to addressing problems of discoverability, interconnectivity and reusability of online resources. At the same time, however, we use this paper to discuss real-world practical concerns as well as engage in deeper speculation about the significance of this type of approach for escaping the ‘siloing’ mentality that inhibits many other data integration initiatives.
The first part of the paper will sketch out and reflect upon the architectural aspects of Pelagios, in particular the requirements for maximizing the exposure and interconnectivity of the data themselves. The data are structured to help the user groups of each partner accomplish two kinds of task:
A) Discovering the references to places within a single document (text, map, database);
B) Discovering all the documents (texts, maps, databases) that reference a specified place.
In order to achieve this we (i) use a common RDF model to express place references (the Open Annotation Collaboration, or OAC, model); and (ii) align all local place references to the Pleiades Ancient World Gazetteer. The OAC annotation ontology provides an extremely lightweight framework for associating global concepts (such as places) with specific documents (and fragments of them).3 An OAC annotation is a set of RDF triples which identify a target document (by means of its URI) and the body of the annotation itself, in our case a URI identifying a specific place in Pleiades. An example is given below:
ex:ann1 rdf:type oac:Annotation ;
    dcterms:title "Example annotation" ;
    oac:hasTarget <some resource> ;
    oac:hasBody <http://pleiades.stoa.org/places/[PLEIADES ID]> .
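As a rough illustration of how a partner might generate such annotations from local records, the following Python sketch writes one annotation as Turtle using only the standard library. It is not the project's own tooling: the function name, the `ex:` namespace, the target URI and the numeric place ID are hypothetical placeholders, and the `oac:` namespace URI is an assumption that should be checked against the OAC specification.

```python
# Minimal sketch: serialize one OAC-style place annotation as Turtle.
# All names below are illustrative assumptions, not part of Pelagios itself.

PLEIADES_BASE = "http://pleiades.stoa.org/places/"

PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "oac": "http://www.openannotation.org/ns/",   # assumed namespace URI
    "ex": "http://example.org/annotations/",      # hypothetical local namespace
}


def annotation_to_turtle(local_id, title, target_uri, pleiades_id):
    """Return a Turtle document holding a single annotation.

    The annotation target is the partner's document (or fragment) URI;
    the annotation body is the Pleiades URI of the referenced place.
    """
    header = "\n".join(f"@prefix {p}: <{u}> ." for p, u in PREFIXES.items())
    # Escape backslashes and double quotes so the title is a valid literal.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    body = (
        f"ex:{local_id} rdf:type oac:Annotation ;\n"
        f'    dcterms:title "{escaped}" ;\n'
        f"    oac:hasTarget <{target_uri}> ;\n"
        f"    oac:hasBody <{PLEIADES_BASE}{pleiades_id}> .\n"
    )
    return header + "\n\n" + body


if __name__ == "__main__":
    # The target URI and the place ID are placeholders for illustration only.
    print(annotation_to_turtle(
        "ann1",
        "Example annotation",
        "http://example.org/texts/sample#book1",
        "579885",
    ))
```

In practice an RDF library would handle escaping and serialization more robustly; plain string formatting is used here only to keep the shape of the four triples visible.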
The decision to use the OAC model typifies our pragmatic lightweight approach, which has not been to reinvent the wheel but to reduce, reuse and recycle wherever possible. Using a publicly available, lightweight core ontology permits modular extensions for different kinds of document so that details specific to each type do not add unnecessary complexity for users wishing to publish data in conformance with the core ontology. We have also found that Pelagios partners are afforded a great deal of flexibility in implementation. For example, the RDF may be expressed in a number of different serializations (RDF/XML, Turtle, RDFa, etc.) or exposed through a SPARQL endpoint, and the simplicity of the ontology allows partners to focus on the considerably more challenging task of aligning local place referencing systems with the global Pleiades gazetteer.
Pelagios has been documenting the various processes by which each partner has identified place references and aligned those references to Pleiades: one significant outcome of the project will be a ‘cookbook’ guide for those looking to adopt a lightweight Linked Open Data approach in related domains, which we outline here. The Pelagios process, however, extends well beyond the needs of digital classicists. Although we are using URIs for places in the ancient world, the OAC ontology is equally applicable to other gazetteers (including those based on modern placenames, such as GeoNames) or even to non-spatial entities such as periods or people.
The second part of the paper assesses the possibilities for data exploration once different projects have adopted the core model for representing place references. Here we discuss the technologies exploited in developing the Pelagios Explorer, a prototype Web application that makes discovery and visualization of the aggregated data simple, and various web widgets that can sit on partner websites to provide a window onto the Pelagios linked data world. By aggregating Pelagios partners’ place metadata in a Graph Database, the Explorer supports a number of common types of query using visually-oriented interaction metaphors (‘which places are referenced in these datasets?’, ‘what is the geographical footprint of these datasets?’, ‘which datasets reference a particular place?’) and displays results using a graph-based representation.4 In addition, the Pelagios Explorer exposes all data available in the visualizations through an HTTP API to enable machine access. For their part, the widgets that we are currently developing provide more specific views on Pelagios data than the graph explorer allows (such as an overview of the data about a particular place). Equally, we are exploring ways of making these widgets easy to configure, customize and embed in external websites, thereby maximizing the potential reuse value of the data.
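The two directions of query described above (from datasets to places, and from a place to datasets) amount to traversing a bipartite graph. The Python sketch below illustrates that idea with a small in-memory index; it is an assumption-laden stand-in and not the Explorer's graph database or its HTTP API, and the class and method names are hypothetical.

```python
# Minimal sketch of a bipartite dataset/place index (illustrative only).
from collections import defaultdict


class PlaceIndex:
    """Stores dataset-to-place references and answers queries both ways."""

    def __init__(self):
        self._places_by_dataset = defaultdict(set)
        self._datasets_by_place = defaultdict(set)

    def add(self, dataset, place_uri):
        """Record that `dataset` references the place identified by `place_uri`."""
        self._places_by_dataset[dataset].add(place_uri)
        self._datasets_by_place[place_uri].add(dataset)

    def places_in(self, *datasets):
        """Which places are referenced in any of these datasets?"""
        result = set()
        for dataset in datasets:
            result |= self._places_by_dataset.get(dataset, set())
        return result

    def datasets_referencing(self, place_uri):
        """Which datasets reference this particular place?"""
        return set(self._datasets_by_place.get(place_uri, set()))

    def shared_places(self, dataset_a, dataset_b):
        """Which places do two datasets have in common?"""
        return (self._places_by_dataset.get(dataset_a, set())
                & self._places_by_dataset.get(dataset_b, set()))
```

Because every partner aligns its references to the same gazetteer URIs, set operations of this kind are enough to relate otherwise unconnected datasets; a geographical footprint would additionally require coordinates for each place, which this sketch leaves out.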
There are at least two different user groups whose interests and concerns we address. For the ‘super users’ interested in contributing data, adhering to the two principles of the OAC model and Pleiades URIs is the essential step in preparing data for use in these various Pelagios visualizations. With these basic issues established, we go on to discuss more detailed control over how data can be represented in a generic interface, for instance using names and titles to label data, or structuring a dataset hierarchically into subsets, such as individual books, volumes, chapters, and pages. The toolset and methods are still at an early stage but some useful resources have already been made available for general use via our blog.5
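To make the idea of labelled, hierarchically structured datasets concrete, here is a minimal Python sketch. It assumes a simple tree in which each node carries a title and its own place references; the class name, field names and URIs are hypothetical and do not reflect any particular Pelagios vocabulary.

```python
# Minimal sketch of a labelled dataset hierarchy (illustrative only).
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Dataset:
    title: str                                   # human-readable label
    place_uris: List[str] = field(default_factory=list)
    subsets: List["Dataset"] = field(default_factory=list)

    def all_places(self) -> Set[str]:
        """Places referenced here or in any nested subset."""
        places = set(self.place_uris)
        for subset in self.subsets:
            places |= subset.all_places()
        return places

    def outline(self, depth: int = 0) -> List[str]:
        """Indented titles with the number of distinct places under each node."""
        lines = [f"{'  ' * depth}{self.title} ({len(self.all_places())} places)"]
        for subset in self.subsets:
            lines.extend(subset.outline(depth + 1))
        return lines
```

A generic interface could then show the whole work, a single book, or a single chapter with the same code, since each level reports the places referenced beneath it.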
Our second user group comprises scholars, students and members of the public who may wish to discover the resources that reference an ancient place of interest. We consider how the Pelagios framework is beginning to bring together an enormous diversity of online data – such as books that reference places, and archaeological finds discovered there – which a user can search through, combine and visualize in various ways depending on their needs.6 In particular, we consider the challenges, both technological and intellectual, regarding the development, production and use of tools and services that aim to provide a useful and intuitive resource for subject specialists who are not technically minded.
We conclude with a brief reflection on the processes by which the Pelagios family has been developing. In particular we stress the digital services that have made the coordination of such an international initiative possible, and outline the challenges that remain for anyone wishing to add their data to the Pelagios multiverse.7
This work has been supported by JISC (the Joint Information Systems Committee) as part of the Geospatial and Community Outreach programme (15/10) and the Resource and Discovery programme (13/11).
[Figure: Screenshots of the Pelagios Explorer]
Elliott, T., and S. Gillies (2009). Digital Geography and Classics. DHQ: Digital Humanities Quarterly 3(1).
Harris, T. M., L. J. Rouse, and S. Bergeron (2010). The Geospatial Semantic Web, Pareto GIS, and the Humanities. In D. J. Bodenhamer, J. Corrigan, and T. M. Harris (eds.), The Spatial Humanities: GIS and the Future of Scholarship. Bloomington: Indiana UP.
Sanderson, R., B. Albritton, R. Schwemmer, and H. van de Sompel (2011). SharedCanvas: A Collaborative Model for Medieval Manuscript Layout Dissemination. Proceedings of the 11th ACM/IEEE Joint Conference on Digital Libraries. Ottawa, Canada.
Schich, M., C. Hidalgo, S. Lehmann, and J. Park (2009). The network of subject co-popularity in Classical Archaeology. In Bollettino di Archeologia On-line.
1. Pelagios includes: Arachne, http://www.arachne.uni-koeln.de; CLAROS, http://explore.clarosnet.org; GAP, http://googleancientplaces.wordpress.com; Nomisma, http://nomisma.org; Open Context, http://opencontext.org; Perseus, http://www.perseus.tufts.edu; Pleiades, http://pleiades.stoa.org; Ptolemy Machine, http://ptolemymachine.appspot.com; SPQR, http://spqr.cerch.kcl.ac.uk; Ure Museum, http://www.reading.ac.uk/Ure.
2. See Harris et al. (2010).
3. OAC is a powerful information model in its own right and has recently been used as the basis for the SharedCanvas annotation system. See Sanderson et al. (2011).
4. Network visualizations are ideal for representing bipartite networks such as this. See, for example: Schich et al. (2009).
7. A vision that coincides well with the discussion of the future of Classical scholarship by Elliott and Gillies (2009).