<?xml version="1.0" encoding="UTF-8"?>
<?oxygen RNGSchema="../schema/xmod_web.rnc" type="compact"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:xmt="http://www.cch.kcl.ac.uk/xmod/tei/1.0" 
     xml:id="ab-366">
    <teiHeader>
        <fileDesc>
            <titleStmt>
                <title>Designing a national ‘Virtual Laboratory’ for the humanities: the Australian
                    HuNI project</title>
                <author>
                    <name>Burrows, Toby Nicolas</name>
                    <affiliation>University of Western Australia, Australia</affiliation>
                    <email>toby.burrows@uwa.edu.au</email>
                </author>
            </titleStmt>
            <publicationStmt>
                <publisher>Jan Christoph Meister, Universität Hamburg</publisher>
                <address>
                   <addrLine>Von-Melle-Park 6, 20146 Hamburg, Tel. +4940 428 38 2972</addrLine>
                   <addrLine>www.dh2012.uni-hamburg.de</addrLine>
              </address>
            </publicationStmt>
            <sourceDesc>
                <p>No source: created in electronic format.</p>
            </sourceDesc>
        </fileDesc>
        <revisionDesc>
            <change>
                <date>2012-04-15</date>
                <name>DH</name>
                <desc>generate TEI-template with data from ConfTool-Export</desc>
            </change>
            <change>
                <date>2012-04-13</date>
                <name>LS</name>
                <desc>provide metadata for publicationStmt</desc>
            </change>
        </revisionDesc>
    </teiHeader>
    <text type="paper">
        <body>
            <div>
            <head>Context</head>
                <p>This paper looks at the design and architecture of the Humanities Networked
                    Infrastructure (HuNI), a national ‘Virtual Laboratory’ which is being developed
                    as part of the Australian government’s NeCTAR (National e-Research Collaboration
                    Tools and Resources) programme. One of NeCTAR’s main goals is the development of
                    up to ten discipline-based ‘Virtual Laboratories’. The aims of this programme
                    are to integrate existing capabilities (tools, data and resources), support
                    data-centred research workflows, and build virtual research communities to
                    address existing well-defined research problems. </p>
                <p>HuNI began in May 2012 and is funded until the end of 2013. It is being
                    developed by a consortium of thirteen institutions, led by Deakin University in
                    Melbourne.</p>
            </div>
            
            <div>
                <head>Design Framework</head>
                <p>HuNI is specifically designed to cover the whole of the humanities (defined as
                    the disciplines covered by the Australian Academy of the Humanities). It uses
                    scientific e-Research consortia as its model: large-scale, multi-institutional,
                    interdisciplinary groups with an e-Research framework covering the entire field
                    of research.</p>
                <p>This approach has a sound academic basis. It emphasizes the interdisciplinary and
                    trans-disciplinary reach of the e-Research services and tools which are included
                    in the ‘Virtual Laboratory’ environment – and their value to researchers across
                    the full range of humanities disciplines.</p>
                <p>HuNI aims to join together the various digital services and tools which have
                    already been developed for specific humanities disciplines, both by collecting
                    institutions (libraries, archives, museums and galleries) and by academic
                    research groups. It builds on these services and strengthens them, rather than
                    superseding them.</p>
            </div>
            
            <div>
                <p><hi rend="italic">Data-Centred Workflows</hi></p>
                
                <p>The very concept of ‘data’ can be problematic for the humanities.
                            Nevertheless, a distinctive type of humanities data can be identified,
                            different from the quantitative and qualitative data of the social
                            sciences. This ‘humanities data’ consists of the various annotations,
                            tags, links, associations, ratings, reviews and comments produced during
                            the humanities research process, together with the entities to which
                            these annotations refer: concepts, persons, places and events.</p>
                <p>It is important to draw a distinction between ‘data’ in this sense and
                            primary source materials, particularly in digitized form (Borgman 2007:
                            215-217 fails to make this distinction). Primary materials – even in the
                            form of digital objects – are sources of data, rather than data per
                            se.</p>
                <p>A data-centred virtual laboratory for the humanities needs to include
                            services for identifying these semantic entities and their relationships
                            (using the Linked Open Data technical framework), and for capturing and
                            sharing the annotations and other scholarly outputs which refer to them.
                            While the Linked Open Data framework is a relatively recent development,
                            enough projects and services are already under way in Europe and North
                            America to serve as case studies demonstrating the viability and value
                            of this approach (Bizer, Heath &amp; Berners-Lee 2009).</p>
            </div>
                    
            <div>       
                <p><hi rend="italic">Integration of Existing Capabilities</hi></p>
                <p>Many significant Australian collections of digital content relevant to humanities
                    research already exist. Some of these are descriptions of physical objects
                    (e.g., books, museum objects and art works) and entities (e.g., people and
                    places), some are collections of digital objects, and others are a mixture of
                    the two. In most cases, these collections are not connected with each other
                    except to the extent that they could be searched by services like Google.
                    Working effectively across such a disparate range of sources has been a major
                    challenge for humanities researchers.</p>
                <p>The production-level tools for working with these collections of digital content
                    are relatively limited. Most tools are designed to work with a single service or
                    a single type of content, such as the visualization tools developed by
                    AusStage<note>AusStage is the Australian performing arts service (<ref target="www.ausstage.edu.au" type="external">www.ausstage.edu.au</ref>)</note>, the user tagging developed by the
                    Powerhouse Museum in Sydney<note><ref target="www.powerhousemuseum.com" type="external">www.powerhousemuseum.com</ref></note>, and the Heurist
                    software<note><ref target="www.heuristscholar.org" type="external">www.heuristscholar.org</ref></note> developed by the University of Sydney for
                    archaeology. The LORE tool (for annotation and the construction of virtual
                    collections) works mainly with AustLit<note>AustLit is the Australian literature service (<ref target="www.austlit.edu.au" type="external">www.austlit.edu.au</ref>)</note>, though its
                    federated search also covers some other content services. </p>
                <p>To meet NeCTAR’s Virtual Laboratory criteria, content sources need to be
                    integrated (or at least inter-linked), and tools need to be usable across as
                    many <hi rend="italic">sources</hi> and <hi rend="italic">types</hi> of content
                    as possible. It is neither practical nor desirable to merge content from
                    multiple disciplines into a single enormous database, given the extensive
                    variations in standards and approaches. Federated searching across many
                    services, on the other hand, will not build the data-centred platform required
                    to support the other functions of the Virtual Laboratory. The only feasible
                    solution for data integration is to deploy a Linked Open Data environment on a
                    national scale.</p>
            </div>
            
            <div>
                <p><hi rend="italic">Architecture and Services</hi></p>
                <p>The project has defined a data-centred workflow for the humanities, with three
                    main stages:</p>
                
                <xmt:uList>
                    <item>
                        Discovery (search and browse services);
                    </item>
                    <item>
                        Analysis (annotation, collecting, visualization and mapping);
                    </item>
                    <item>
                        Sharing (collaborating, publishing, citing and referencing).
                    </item>
                </xmt:uList>
                
            </div>
            
            <div>        
                   
                <p>For the Analysis and Sharing functions, a suite of existing Open Source tools
                    developed in Australia is being used and adapted as part of the project. These
                    include:</p>
                
                <xmt:uList>
                    <item>
                        LORE – developed by AustLit for annotation, federated searching,
                            visualization, aggregation and sharing of compound digital objects
                            (Gerber, Hyland &amp; Hunter 2010);
                    </item>
                    <item>
                        Visualization tools developed by AusStage (Bollen et al. 2009);
                    </item>
                    <item>
                        OHRM<note><ref target="www.esrc.unimelb.edu.au/ohrm/" type="external">www.esrc.unimelb.edu.au/ohrm/</ref></note> – developed by the University of Melbourne
                            to model entity relationships and publish information about collections
                            into aggregated frameworks;
                    </item>
                    <item>
                        Heurist and FieldHelper – developed by the University of Sydney to
                            aggregate data, model entity relationships and publish collections of
                            data to the Web (including maps and timelines).
                    </item>
                    
                </xmt:uList>
            </div>
            
            <div>
                <p>The main adaptation required is to extend the functionality of these tools to work
                    with Linked Data URIs and to support cloud-based hosting. Where no Australian tool
                    can be used or adapted, international Open Source tools will be used.</p>
                <p>The annotations, compound objects and tags created by researchers
                    using the Analysis tools will be stored in RDF as part of HuNI’s Linked Data
                    Service. Descriptions of these data collections will also be made available for
                    harvesting in RIF-CS format by the Australian National Data Service (ANDS) for
                    its Research Data Australia service.</p>
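                <p>As a minimal illustration of what storing an annotation in RDF could involve, the
                    following Python sketch serialises a single comment as N-Triples. The
                    example.org URIs, the function names and the choice of the W3C Web Annotation
                    vocabulary are assumptions made for this example only; they are not a
                    description of the vocabulary or software used by HuNI.</p>
                <eg xml:space="preserve"><![CDATA[
```python
# Hypothetical sketch: serialise one researcher annotation as N-Triples.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# W3C Web Annotation vocabulary; chosen for illustration, not HuNI's stated choice.
OA = "http://www.w3.org/ns/oa#"


def escape_literal(text):
    """Escape the characters that may not appear raw inside an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def annotation_to_ntriples(annotation_uri, target_uri, comment):
    """Return three N-Triples lines describing a plain-text annotation."""
    triples = [
        (f"<{annotation_uri}>", f"<{RDF_TYPE}>", f"<{OA}Annotation>"),
        (f"<{annotation_uri}>", f"<{OA}hasTarget>", f"<{target_uri}>"),
        (f"<{annotation_uri}>", f"<{OA}bodyValue>", f'"{escape_literal(comment)}"'),
    ]
    return "\n".join(" ".join(triple) + " ." for triple in triples)


if __name__ == "__main__":
    # All URIs below are placeholders.
    print(annotation_to_ntriples(
        "http://example.org/huni/annotation/1",
        "http://example.org/huni/entity/person/42",
        "Possibly the same person as the performer listed elsewhere.",
    ))
```
]]></eg>
                <p>Because the target of the annotation is itself a URI, the same entity can be
                    annotated by several researchers without the annotations being stored inside
                    the contributing data set.</p>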
                <p>HuNI’s Discovery environment builds on the technologies used by a variety of
                    Australian and international services to provide sophisticated searching and
                    browsing across data extracted from heterogeneous data sets and combined into a
                    Linked Data Service. These services include the Atlas of Living
                            Australia<note><ref target="www.ala.org.au" type="external"
                            >www.ala.org.au</ref></note>, as well as general humanities-related
                    services like SOCH (Swedish Open Cultural Heritage) and discipline-specific
                    services like CLAROS. The British Museum’s new ResearchSpace will also serve as
                    a key exemplar. </p>
                <p>The Discovery environment is underpinned by the Linked Data Service. The
                    preferred solution for supporting and presenting Linked Data is the Vitro
                    software developed by Cornell University, which serves as the basis for the VIVO
                    research profiling service. VIVO has been implemented at two of the HuNI partner
                    institutions: the University of Melbourne and the University of Western
                    Australia. </p>
                <p>The outputs from the Discovery environment will be produced in formats which can
                    be consumed by tools performing Analysis and Sharing functions. The Linked Data
                    Service will support an API for exposing data in RDF/XML and JSON formats. This
                    API will form the basis for reuse of the data by service providers other than
                    HuNI, as well as enabling the custodians of data sets which contribute to HuNI
                    to build workflows for pulling new content from the Linked Data Store into their
                    own data sets.</p>
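                <p>The idea of one record being exposed in both RDF/XML and JSON can be sketched as
                    follows. This is only an illustration under stated assumptions: the function
                    name, the format labels, the JSON key names and the use of the FOAF
                    <hi rend="italic">name</hi> property are all hypothetical, and no actual HuNI
                    API is being described.</p>
                <eg xml:space="preserve"><![CDATA[
```python
import json
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"  # illustrative vocabulary choice

# Ask ElementTree to use readable prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("foaf", FOAF_NS)


def serialise_person(uri, name, fmt):
    """Render one person record as JSON or RDF/XML, selected by `fmt`."""
    if fmt == "json":
        # Key names are hypothetical; a real service would document its own.
        return json.dumps({"@id": uri, "name": name}, sort_keys=True)
    if fmt == "rdfxml":
        root = ET.Element(f"{{{RDF_NS}}}RDF")
        description = ET.SubElement(
            root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": uri}
        )
        ET.SubElement(description, f"{{{FOAF_NS}}}name").text = name
        return ET.tostring(root, encoding="unicode")
    raise ValueError(f"unsupported format: {fmt}")


if __name__ == "__main__":
    record = ("http://example.org/huni/entity/person/42", "A. Example")
    print(serialise_person(*record, "json"))
    print(serialise_person(*record, "rdfxml"))
```
]]></eg>
                <p>Keeping both serialisations behind a single function means that a consuming tool
                    and a contributing custodian receive the same underlying record, whichever
                    format they request.</p>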
                <p>A necessary prerequisite for assembling heterogeneous data from different data
                    sets into a Linked Data format is a Semantic Mediation and Mapping Service.
                    There will be two main components to this service:</p>
                
                <xmt:uList>
                    <item>Tools for extracting, exposing and transforming entity data contained in
                        existing cultural data sets and in digital objects;</item>
                    <item>An environment for harvesting, ingesting, matching, aligning, and linking
                        the entity data.</item>
                </xmt:uList>
                    
                <p>For extracting and exposing entity data, two different approaches will be
                    supported. The major content providers will develop and provide RESTful Web APIs
                    for their data sets. A service will also be established to allow smaller data
                    providers to expose their data for transformation and ingest without the need to
                    develop an API. This service will use the harvest and ingest functionality of the
                    Vitro software. Tools for converting static databases and spreadsheets to RDF
                    will also be deployed. Ingest of entity data identified through text mining will
                    follow at a later stage.</p>
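                <p>To make the spreadsheet-to-RDF step concrete, the sketch below converts rows of
                    CSV text into N-Triples, one subject per row and one triple per non-empty
                    cell. It is not the Vitro ingest functionality or any specific conversion
                    tool: the base URI, the vocabulary namespace, the column names and the function
                    names are hypothetical and serve only to illustrate the principle.</p>
                <eg xml:space="preserve"><![CDATA[
```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/huni/entity/"  # hypothetical base URI for row subjects
VOCAB = "http://example.org/huni/vocab#"  # hypothetical vocabulary for column names


def escape_literal(text):
    """Escape the characters that may not appear raw inside an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text, id_column="id"):
    """Turn each CSV row into N-Triples: one subject per row, one triple per filled cell."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            # Skip the identifier itself, empty cells and cells without a header.
            if column is None or column == id_column or not value:
                continue
            predicate = f"<{VOCAB}{quote(column, safe='')}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "id,name,place\n1,A. Example,Perth\n2,B. Example,\n"
    print(csv_to_ntriples(sample))
```
]]></eg>
                <p>A conversion of this kind produces only string literals. Replacing those
                    literals with links to shared identifiers for people and places is the task of
                    the matching and alignment environment.</p>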
                <p>The Semantic Mediation and Mapping Service will draw on a range of vocabulary and
                    identifier services, which will be managed through a controlled vocabulary
                    registry with links to the ANDS identifier and vocabulary services. HuNI’s
                    initial focus will be on matching, aligning and linking data relating to people,
                    places and objects, using several high-level vocabularies and ontologies. These
                    include Australian vocabularies like the Gazetteer of Australia<note><ref target="www.ga.gov.au/place-names/" type="external">www.ga.gov.au/place-names/</ref></note> and the PeopleAustralia service (Dewhurst 2008).</p>
                <p>The HuNI service will enable researchers to find and analyse data across a range
                    of humanities disciplines, and to save the outputs of their analysis in a
                    variety of forms, including compound digital objects, annotations, maps,
                    timelines, and graphs. They will be able to share their results and outputs with
                    other researchers. </p>
                
                <p><figure>
                        <graphic url="img366-1.jpg" rend="left" height="256px" width="341px"
                            mimeType="image/jpeg"/>
                        <head>Figure 1: HuNI Architecture</head>
                    </figure></p>
            </div>
        </body>
        <back>
            <div>
                 <head>References</head>

            <p><hi rend="bold">Bizer, C., T. Heath, and T. Berners-Lee</hi> (2009). Linked Data –
                The Story So Far. <hi rend="italic">International Journal on Semantic Web and
                    Information Systems</hi> 5(3): 1-22.</p>
            <p><hi rend="bold">Bollen, J., N. Harvey, J. Holledge, and G. McGillivray</hi> (2009).
                AusStage: e-Research in the Performing Arts. <hi rend="italic">Australasian Drama
                    Studies</hi> 54: 178-194.</p>
            <p><hi rend="bold">Borgman, C. L.</hi> (2007). <hi rend="italic">Scholarship in the
                    Digital Age: Information, Infrastructure, and the Internet</hi>. Cambridge,
                Mass.: MIT Press.</p>
            <p><hi rend="bold">Dewhurst, B.</hi> (2008). People Australia: a Topic-Based Approach to
                Resource Discovery. In <hi rend="italic">VALA2008 Conference proceedings</hi>.
                Melbourne: VALA. <ref target="http://www.valaconf.org.au/vala2008/papers2008/116_Dewhurst_Final.pdf" type="external">http://www.valaconf.org.au/vala2008/papers2008/116_Dewhurst_Final.pdf</ref> (accessed
                30 March 2012).</p>
            <p><hi rend="bold">Gerber, A., A. Hyland, and J. Hunter</hi> (2010). A Collaborative
                Scholarly Annotation System for Dynamic Web Documents – A Literary Case Study. In
                    <hi rend="italic">The Role of Digital Libraries in a Time of Global Change</hi>
                (Lecture Notes in Computer Science 6102). Berlin: Springer, pp. 29-39. </p>
        </div>
        </back>
    </text>
</TEI>