Friday, May 28, 2021

Tampa Museum of Art joins Kerameikos + OpenRefine templates

The Tampa Museum of Art (TMA) has recently joined the Kerameikos project, supplying data and Creative Commons-licensed images for a dozen Attic vases that have been digitized so far as part of their new collections management system. These objects can be seen at the Kerameikos URI for the TMA.

Tampa Museum of Art
 

Importantly, this is the first collection normalized in OpenRefine and directly exported into the Linked Art CIDOC-CRM RDF/XML aggregation model. Previous collections from the British Museum and Fitzwilliam were reconciled to Kerameikos URIs in OpenRefine and then exported into CSV for external processing with PHP scripts. I'd like to get away from this bespoke scripting, and OpenRefine's export templates are more than adequate for generating RDF for import into the Kerameikos SPARQL endpoint.

I have added this template to a Gist, and hopefully other projects can use it to do their own reconciliation and normalization and provide RDF to us without my doing this work personally. In the longer term, we aim to harvest Linked Art JSON-LD directly, which I prototyped in October 2019 with data from the Indianapolis Museum of Art.

First, you can see the TMA spreadsheet, post-reconciliation, here.

The forNonBlank GREL statement enables including properties or nodes only if a URI is present in the spreadsheet:

{{forNonBlank(cells["Shape URI"], c, '<kon:hasShape rdf:resource="' + c.value + '"/>', "")}}

Here, kon:hasShape is a subproperty of crm:P2_has_type; otherwise, the Kerameikos data model follows the Linked Art profile closely. Concept URIs should be Kerameikos ones. The Linked Art JSON-LD harvester normalizes Getty and other vocabulary URIs to Kerameikos ones when they are linked via skos:exactMatch.
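For readers less familiar with GREL, the behavior of forNonBlank can be approximated in Python. This is only a rough sketch of the idea, not how OpenRefine implements it: the function names, the sample rows, and the example.org URI are all hypothetical, and the attribute escaping via quoteattr is an addition that the template itself does not perform.

```python
from xml.sax.saxutils import quoteattr


def for_non_blank(value, render, fallback=""):
    """Approximate GREL's forNonBlank: render the value only when it is non-blank."""
    if value is None or value == "":
        return fallback
    return render(value)


def shape_property(row):
    # quoteattr escapes the URI and wraps it in double quotes.
    return for_non_blank(
        row.get("Shape URI"),
        lambda uri: "<kon:hasShape rdf:resource=" + quoteattr(uri) + "/>",
    )


# Hypothetical rows: one reconciled, one without a shape URI.
rows = [
    {"Shape URI": "http://example.org/id/amphora"},
    {"Shape URI": ""},
]
fragments = [shape_property(row) for row in rows]
```

The second row yields an empty string, so no kon:hasShape property is written for records that were not reconciled.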

Findspot URIs should be reconciled to Wikidata places:


{{forNonBlank(cells["Findspot URI"], c, '<crmsci:O19i_was_object_found_by>
    <crmsci:S19_Encounter_Event>
        <crm:P7_took_place_at>
            <crm:E53_Place>
                <rdfs:label xml:lang="en">' + cells["Findspot"].value + '</rdfs:label>
                <crm:P89_falls_within rdf:resource="' + c.value + '"/>
            </crm:E53_Place>
        </crm:P7_took_place_at>
    </crmsci:S19_Encounter_Event>
</crmsci:O19i_was_object_found_by>', "")}}

Measurements are expressed using Getty AAT URIs for the measurement type (e.g., height, width) and unit (e.g., cm, mm). The following template renders a height measurement in centimeters from the spreadsheet into RDF:

{{forNonBlank(cells["Height (cm)"], c, '<crm:P43_has_dimension>
    <crm:E54_Dimension>
        <crm:P90_has_value rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">' + c.value + '</crm:P90_has_value>
        <crm:P2_has_type rdf:resource="http://vocab.getty.edu/aat/300055644"/>
        <crm:P91_has_unit rdf:resource="http://vocab.getty.edu/aat/300379098"/>
    </crm:E54_Dimension>
</crm:P43_has_dimension>', "")}}
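The same dimension structure could also be produced outside of OpenRefine. The following is a minimal sketch using Python's ElementTree, assuming the standard CIDOC-CRM and RDF namespace URIs; the function name and the sample value 32.5 are hypothetical, and the two AAT URIs are the height and centimeter ones from the template above.

```python
import xml.etree.ElementTree as ET

CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
AAT_HEIGHT = "http://vocab.getty.edu/aat/300055644"
AAT_CM = "http://vocab.getty.edu/aat/300379098"

# Serialize with the familiar crm: and rdf: prefixes.
ET.register_namespace("crm", CRM)
ET.register_namespace("rdf", RDF)


def dimension(value, type_uri, unit_uri):
    """Build a crm:P43_has_dimension element, or return None for a blank value."""
    if value is None or str(value).strip() == "":
        return None
    prop = ET.Element(f"{{{CRM}}}P43_has_dimension")
    dim = ET.SubElement(prop, f"{{{CRM}}}E54_Dimension")
    val = ET.SubElement(
        dim, f"{{{CRM}}}P90_has_value", {f"{{{RDF}}}datatype": XSD_DECIMAL}
    )
    val.text = str(value)
    ET.SubElement(dim, f"{{{CRM}}}P2_has_type", {f"{{{RDF}}}resource": type_uri})
    ET.SubElement(dim, f"{{{CRM}}}P91_has_unit", {f"{{{RDF}}}resource": unit_uri})
    return prop


# Hypothetical height value for illustration.
height = dimension("32.5", AAT_HEIGHT, AAT_CM)
xml_fragment = ET.tostring(height, encoding="unicode")
```

Passing a different AAT type or unit URI reuses the same structure for other measurements.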

Presently (until the TMA sorts out a server issue that prevents their IIIF image info.json from being accessible), the TMA images are JPEG files requested through the image API at a width of 800 pixels. The model for representing IIIF services and manifests can be found at https://linked.art/model/digital/#iiif.
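For reference, an 800 pixel wide JPEG request follows the IIIF Image API pattern of region, size, rotation, and quality.format appended to the image service URI. A small sketch of forming such a URL, where the function names and the iiif.example.org base URI are hypothetical placeholders rather than the TMA's actual service:

```python
def iiif_image_url(service_base, width=800):
    """Full region, fixed width with proportional height, no rotation, default quality, JPEG."""
    return f"{service_base.rstrip('/')}/full/{width},/0/default.jpg"


def iiif_info_url(service_base):
    """Location of the image information document for the same service."""
    return f"{service_base.rstrip('/')}/info.json"


# Hypothetical image service URI.
base = "https://iiif.example.org/image/vase-001/"
jpeg_url = iiif_image_url(base)
info_url = iiif_info_url(base)
```

The size segment "800," asks the server to scale the image to that width while preserving the aspect ratio.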

Wednesday, May 5, 2021

Prototype Object Viewer in Kerameikos

Over the last few days, I have put together a prototype of an object viewer within the Kerameikos.org framework that reads the vase URI from a URL parameter and executes a SPARQL query of the underlying Linked Art-compliant CIDOC-CRM data to gather all of the metadata necessary to create a nice, human-readable page. The construction of these pages includes an API call to Kerameikos to get all of the associated SKOS concept data for any kerameikos.org URI referred to by the vase RDF. This pipeline can be extended to query data APIs from Nomisma.org, the Getty vocabularies, or other controlled vocabulary data systems.
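The first step of such a pipeline, turning a URI taken from a request parameter into a SPARQL query, might look roughly like the sketch below. This is not the query Kerameikos runs: the generic property/value pattern, the function names, and the example.org endpoint are assumptions for illustration. The validation rejects characters that cannot appear in a SPARQL IRI, so that a parameter value cannot break out of the angle brackets.

```python
import re
from urllib.parse import urlencode

# Characters that are not allowed inside a SPARQL IRI reference (<...>).
_FORBIDDEN = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def object_query(object_uri):
    """Build a simple query for all direct properties of one object URI."""
    if not object_uri.startswith(("http://", "https://")) or _FORBIDDEN.search(object_uri):
        raise ValueError(f"not a usable object URI: {object_uri!r}")
    return (
        "SELECT ?property ?value WHERE {\n"
        f"  <{object_uri}> ?property ?value .\n"
        "}"
    )


def request_url(endpoint, object_uri):
    """Encode the query as a SPARQL Protocol GET request."""
    return endpoint + "?" + urlencode({"query": object_query(object_uri)})


# Hypothetical endpoint; the object URI is the British Museum example used in this post.
url = request_url(
    "https://example.org/sparql",
    "https://www.britishmuseum.org/collection/object/G_1836-0224-127",
)
```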

I have taken the additional step of implementing an XSLT function that returns multilingual UI labels, even though almost none of these UI labels have been translated into other languages yet. However, the language (whether set by the browser's Accept-Language header or manually overridden with the 'lang' request parameter) is used to display the preferred label for the Kerameikos.org SKOS concept, if it is available in the underlying RDF data. This is often, though not always, the case for concepts that have been aligned to Wikidata, whose labels are extracted programmatically from the Wikidata API.
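The label selection logic can be illustrated in Python (the site itself does this in XSLT). This is a sketch under assumptions: the function names and sample labels are hypothetical, and the fallback order (the 'lang' parameter, then the Accept-Language header, then English) is a guess at reasonable behavior rather than a description of the actual implementation.

```python
def parse_accept_language(header):
    """Return lowercase language tags from an Accept-Language header, highest q first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            ranked.append((-q, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(ranked)]


def preferred_label(labels, accept_language="", lang_param=None, default="en"):
    """Pick a label: 'lang' parameter first, then the header, then the default language."""
    candidates = [lang_param.lower()] if lang_param else []
    candidates += parse_accept_language(accept_language)
    for tag in candidates:
        # Try the full tag, then its primary subtag (e.g. fr-ca, then fr).
        for key in (tag, tag.split("-")[0]):
            if key in labels:
                return labels[key]
    return labels.get(default)


# Hypothetical labels keyed by language code.
labels = {"en": "Black Figure", "fr": "Figure noire"}
from_header = preferred_label(labels, "fr-FR,fr;q=0.9,en;q=0.8")
overridden = preferred_label(labels, "fr-FR,fr;q=0.9", lang_param="en")
```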

Collections that make their images available through IIIF manifests (represented by crm:P129i_is_subject_of), such as the Fitzwilliam Museum, will have these manifests rendered by Mirador. For other collections that conform to IIIF image APIs but do not produce manifests, such as the British Museum, the image(s) will be displayed in the Leaflet IIIF viewer. Eventually, I will create an intermediate API that dynamically generates a manifest from the underlying IIIF image URIs, so that these images can be annotated with iconographic URIs in order to build a more LOD-integrated research tool for iconography. This framework will extend beyond just vases to encompass other types of material culture.
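The choice of viewer described here amounts to a simple fallback. A hedged sketch of that decision, where the function name, the return values, and the example.org URIs are hypothetical and the static-image case is an assumption about what happens when no IIIF resource exists:

```python
def choose_viewer(manifest_uri=None, image_service_uri=None):
    """Prefer a manifest (Mirador), then a bare IIIF image service (Leaflet IIIF)."""
    if manifest_uri:
        return ("mirador", manifest_uri)
    if image_service_uri:
        # Leaflet-IIIF is initialized from the service's info.json document.
        return ("leaflet-iiif", image_service_uri.rstrip("/") + "/info.json")
    # Assumed fallback when neither is available.
    return ("static-image", None)


# Hypothetical URIs for illustration.
with_manifest = choose_viewer(manifest_uri="https://example.org/iiif/vase-1/manifest")
image_only = choose_viewer(image_service_uri="https://example.org/iiif/image/vase-2")
```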


British Museum vase of Exekias, partially displayed in French.

These pages are constructed by the following URL pattern:

http://kerameikos.org/object/?uri={object URI}

A dynamic GeoJSON response that may contain the production place coordinates or polygon and/or the findspot coordinates follows the pattern:

http://kerameikos.org/object/geoJSON?uri={object URI}

Example: http://kerameikos.org/object/geoJSON?uri=https://www.britishmuseum.org/collection/object/G_1836-0224-127
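Both URL patterns can be assembled with a couple of helper functions. The sketch below uses hypothetical function names and keeps ':' and '/' unescaped so that the output matches the example above; other reserved characters in an object URI are percent-encoded, which is an assumption about what the server accepts.

```python
from urllib.parse import quote

BASE = "http://kerameikos.org/object/"


def object_page_url(object_uri):
    """URL of the human-readable object page."""
    return f"{BASE}?uri={quote(object_uri, safe=':/')}"


def object_geojson_url(object_uri):
    """URL of the GeoJSON response for the same object."""
    return f"{BASE}geoJSON?uri={quote(object_uri, safe=':/')}"


vase = "https://www.britishmuseum.org/collection/object/G_1836-0224-127"
page = object_page_url(vase)
geojson = object_geojson_url(vase)
```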

A link has been added to any image popup in the various concept pages (see below).

A popup of a vase of the Achilles Painter.
 

In the long term, I hope to be able to peel this functionality from the Kerameikos.org software architecture and turn it into a standalone system that is generalizable to any CIDOC-CRM data conforming to the profile expressed by the Linked Art community. This system is entirely driven by SPARQL queries at the moment, but I plan to integrate Fuseki with Solr or ElasticSearch to build out a faceted search interface and various data visualization tools, from geographic distributions to networks of artists to other sorts of statistical distributions. The system will be agnostic about specific types of content (such as vases), and could serve as a large-scale aggregation and research tool for many types of objects, a sort of new rendition of Pelagios' dormant Peripleo.