Monday, August 27, 2018

Kerameikos.org receives NEH Digital Humanities Advancement grant

The Kerameikos.org scientific committee is pleased to announce that the National Endowment for the Humanities has granted the project about $85,000 to develop the full range of Archaic and Classical Greek pottery concepts and a series of software development features and APIs. This is an 18-month Level II Digital Humanities Advancement Grant project. Below is a portion of our application that briefly describes our primary goals for this phase of the project:


Final Product and Dissemination


Experience from the development and proliferation of Nomisma.org principles suggests that community buy-in of shared vocabularies and technical methodologies can only come when non-technical specialists (in Nomisma’s case: numismatists and archaeologists) are able to see and use functional web applications. In numismatics, broad buy-in therefore only came after the publication of the first phase of the Online Coins of the Roman Empire project, with its multilingual interfaces, geographic and quantitative distribution visualizations, and photographs of coins aggregated from several prominent collections. Similar analytical and visualization tools have been built into Kerameikos.org directly, derived from the small subset of Greek vases from the British Museum and the Getty that were ingested in the initial proof-of-concept phase in 2014. It is our hope that the Linked Open Greek Pottery project will foster similar buy-in among archaeologists, art historians, and museum professionals. These specialists will play an integral role in the long-term curation of data and, ultimately, in the expansion of Kerameikos.org beyond its current Archaic and Classical Greek scope. The final products of this 18-month phase fall into three broad categories, culminating in dissemination:

1. Archaic and Classical Greek Pottery Concepts

All time periods, materials, pottery shapes, styles, production places, and corporate and personal entities (painters, potters, and associated workshops) relevant to Archaic and Classical Greek pottery will be represented in Kerameikos.org according to our metadata application profile (a combination of linked data ontologies). These concepts will be linked to matching concepts in other relevant vocabulary systems, such as the Pleiades Gazetteer of Ancient Places and the Getty Art & Architecture Thesaurus. The metadata application profile will be fully documented in similar fashion to those of the Digital Public Library of America and Europeana. Furthermore, we will register a Wikidata.org property for Kerameikos.org and integrate our vocabularies into Wikidata for broader reuse within the Wikipedia community.

2. A Core Set of Research Data and Tools

Working with partners that embrace Open Data principles, we will aggregate Greek vases connected to the concepts defined in Kerameikos.org. These partners include large museums with internationally recognized collections, such as the Getty and the British Museum, but also smaller university museums including the Fralin Museum at the University of Virginia, the Harvard Art Museums, and the Ure Museum of Greek Archaeology at the University of Reading (UK). The aggregation of these collections will facilitate sophisticated (even if incomplete) geographic and distribution analyses and will demonstrate the potential of Kerameikos.org as a research tool for the discipline. The vases will be encoded in CIDOC-CRM conforming to the specifications outlined in linked.art, a community-oriented project driven primarily by Linked Open Data specialists at the Getty Museum.

3. Software Development Extensions for Data Aggregation, Manipulation, and Export

For the project to thrive and grow, we must build and document tools that enable non-technical specialists to clean and harvest data. A harvesting application will be developed in the Kerameikos.org back-end that will be able to consume vase data from linked.art-compliant JSON-LD (a linked data model aimed at developers) APIs. One of the most important features of the Linked Open Greek Pottery project is its OpenRefine reconciliation APIs. The Kerameikos.org reconciliation API enables museum curators or archaeologists to standardize labels of shapes, styles, people, etc. to the preferred labels defined in Kerameikos.org, and insert Kerameikos.org URIs directly into the source data. This sort of normalization makes it easier to integrate datasets into the broader Greek vase LOD cloud. The Linked Open Greek Pottery project will expand the functionality of existing APIs, as well as provide documentation for usage. Finally, like Nomisma.org, Kerameikos.org can serve as a content hub for smaller collections that lack personnel or technical expertise to provide data directly to the Pelagios Commons project. Object data stored within the Kerameikos.org SPARQL endpoint will be exported directly into Pelagios’ own RDF model.

Wednesday, November 1, 2017

OpenRefine Reconciliation API for Kerameikos

Over the last few days, I developed and launched an OpenRefine reconciliation service for Nomisma.org. This service allows a user to clean up spreadsheets (whether for museum objects or personal research data) by reconciling columns of concepts (denominations, mints, people, etc.) against the concepts defined on Nomisma.

Since Kerameikos is built entirely from the same software architecture as Nomisma, I was able to copy and paste all of the associated Orbeon XML Pipeline files and XSLT stylesheets in order to deploy the same reconciliation service at http://kerameikos.org/apis/reconcile.

Like Nomisma, the service supports the main reconciliation API, the autosuggest feature for additional matches, and preview popups that will display the skos:prefLabel and skos:definition in a small HTML popup window.
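For readers who want to call the service outside of OpenRefine, here is a minimal Python sketch of how a batch request could be assembled. It assumes the endpoint follows the standard OpenRefine reconciliation protocol, in which queries are passed as a JSON object in a `queries` parameter; the helper name `build_reconcile_url` and the sample labels are my own illustrations, not part of the service.

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs

# Endpoint named in this post.
RECONCILE_URL = "http://kerameikos.org/apis/reconcile"


def build_reconcile_url(labels, limit=3):
    """Batch several labels into one OpenRefine-style 'queries' request URL.

    Assumption: the service accepts the standard 'queries' parameter, a JSON
    object keyed q0, q1, ... with one query object per label.
    """
    queries = {
        f"q{i}": {"query": label, "limit": limit}
        for i, label in enumerate(labels)
    }
    return RECONCILE_URL + "?" + urlencode({"queries": json.dumps(queries)})


if __name__ == "__main__":
    # Hypothetical column values from a spreadsheet of vases.
    print(build_reconcile_url(["amphora", "Berlin Painter"]))
```

The function only builds the URL; sending it (and handling errors or rate limits) is left to whichever HTTP client you prefer.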

Preview API


Entity suggestion API

This new feature necessitated the upgrade to the newest version of Orbeon Forms in order to support the processing of JSON in XSLT. These XML pipelines serve as an intermediary process between the OpenRefine API requests, Solr search queries, and the resulting JSON output.
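The production code for this step is XPL and XSLT, but the shape of the transformation can be sketched in Python. The sketch below assumes a standard Solr JSON response (`response.docs`) and the usual OpenRefine result structure; the Solr field names `id` and `prefLabel`, the default type, and the single-candidate matching rule are hypothetical and do not come from the Kerameikos schema.

```python
def solr_to_reconcile(query_id, solr_response, type_id="skos:Concept"):
    """Convert Solr search hits into an OpenRefine reconciliation result.

    Assumptions: each Solr doc carries 'id' and 'prefLabel' fields
    (hypothetical names) and optionally a relevance 'score'.
    """
    results = []
    for doc in solr_response["response"]["docs"]:
        results.append({
            "id": doc["id"],
            "name": doc["prefLabel"],
            "type": [{"id": type_id, "name": type_id}],
            "score": doc.get("score", 0.0),
            "match": False,
        })
    # Illustrative rule only: treat a single candidate as a confident match.
    if len(results) == 1:
        results[0]["match"] = True
    return {query_id: {"result": results}}
```

Keeping the conversion in one small function makes it easy to see which parts are protocol (the `result` list and its keys) and which are local choices (field names and the matching rule).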

With this new feature, it will be much easier and more efficient to normalize large quantities of Greek vase data against Kerameikos concepts, which will hopefully pave the way toward greater aggregation of data and facilitate more complete and accurate distribution and geographic visualizations.

We are beginning the process of reconciling the Beazley Archive's term lists against Wikidata entities and the Getty vocabularies in order to generate all of the distinct concepts for Greek pottery in Kerameikos. This new API will then facilitate the normalization of Beazley data against Kerameikos, enabling the data to come full circle through a cleaning process.

Thursday, December 1, 2016

Experimenting with IIIF, CIDOC-CRM, and the Europeana Data Model

One of our major tasks in the future is to facilitate sophisticated analysis of Greek pottery aggregated by means of coreferencing between Kerameikos.org concepts and other vocabulary systems. The proof of concept that we demonstrated at CAA in Paris included dozens of Greek vases from the Getty and British Museum. One immediate potential data partner beyond these is the Harvard Art Museums.

The Harvard Art Museums have adopted an open approach to their collection and have implemented a powerful and well-documented API. This API has allowed us to integrate thousands of Greek and Roman coins from their collection into Nomisma.org to be made available through several type corpus projects (like Online Coins of the Roman Empire). Harvard holds a nice collection of Greek pottery and offers a means of harvesting these materials programmatically. Furthermore, they are an adopter of the International Image Interoperability Framework (IIIF), which would enable zooming of large-scale images or dynamic extraction of portions of images, as well as facilitate the annotation of these images with related iconographic or decorative subject matter or inscriptions. Since we are using CIDOC-CRM to describe the vases, the question is: how can we extend our model to include metadata that will enable the integration of IIIF functionality directly in Kerameikos.org?

Fortunately, the hard work has already been done for us. Europeana has already published specifications for linking to IIIF services and metadata manifests within the Europeana Data Model, and there are a number of useful examples, such as those provided by John Howard at the University College Dublin Library.

While we may migrate from FOAF to EDM properties for linking to large images or thumbnails (edm:preview and others), we do not need to modify the current system of foaf:thumbnail and foaf:depiction in order to accommodate IIIF integration.

We can do this by adding some more triples about the URL of the foaf:depiction.  E.g.,

?vase a crm:E22_Man-Made_Object ;
    foaf:depiction <http://nrs.harvard.edu/urn-3:HUAM:DDC251369_dynmc>.


<http://nrs.harvard.edu/urn-3:HUAM:DDC251369_dynmc> a edm:WebResource ;
    svcs:has_service <https://ids.lib.harvard.edu/ids/iiif/46594017> ;
    dcterms:isReferencedBy <http://iiif.harvardartmuseums.org/manifests/object/288118>.

<https://ids.lib.harvard.edu/ids/iiif/46594017> a svcs:Service ;
    dcterms:conformsTo <http://iiif.io/api/image> ;
    doap:implements <http://iiif.io/api/image/2/level2.json>.

In our SPARQL query for aggregating objects, we can optionally extract the dcterms:isReferencedBy for the foaf:depiction of our Greek vase. There's an XSLT conditional for parsing the SPARQL response so that our fancybox jQuery plugin will either show a popup of an image file or a popup window of the Leaflet IIIF plugin.
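The conditional itself lives in XSLT, but its logic is simple enough to sketch in Python. This assumes one row of a SPARQL JSON result in which the depiction is bound to `?depiction` and the optional dcterms:isReferencedBy value to `?manifest`; both variable names and the returned dictionary layout are hypothetical.

```python
def choose_viewer(binding):
    """Pick the popup type for one row of SPARQL JSON results.

    Assumption: 'depiction' is always bound, while 'manifest' is bound only
    when the OPTIONAL dcterms:isReferencedBy pattern matched.
    """
    depiction = binding["depiction"]["value"]
    manifest = binding.get("manifest")
    if manifest is not None:
        # IIIF metadata is available: open the zoomable viewer.
        return {"viewer": "iiif", "image": depiction, "manifest": manifest["value"]}
    # No IIIF metadata: fall back to a plain image popup.
    return {"viewer": "image", "image": depiction}
```

Because the IIIF triples are optional, vases from collections without IIIF support keep working exactly as before.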

As a simple proof of concept, I have extracted two vases (RDF here) from Harvard of the Berlin Painter and successfully implemented the RDF model and modified the accompanying SPARQL queries and code accordingly to show zoomable images of these vases.

Thursday, November 10, 2016

Distribution visualization with SPARQL and d3js

After more than a year of dormancy, I have picked up Kerameikos.org development again in preparation for a collaboration with the Beazley Archive of the University of Oxford and, hopefully, a grant application. We hope to publish the entire array of identifiers necessary for Archaic and Classical Greek pottery and develop more advanced analysis and visualization systems built upon open vase data we can acquire from a variety of sources (e.g., the British Museum and the Harvard Art Museums).

Aside from some minor stylistic updates to the site, I implemented two major changes:

1. I rewrote the geographic visualizations to serialize the SPARQL response into GeoJSON to render in Leaflet instead of the OpenLayers-based Timemap library, which has not seen active development in at least five years. I really like being able to scroll through a timeline of objects, but I will have to wait until another Leaflet plugin can do something similar.

2. I implemented SPARQL-based distribution visualization with the d3plus plugin to d3js. The code was almost entirely ported from the Nomisma.org distribution analysis features I have recently been working on.
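To illustrate the first change, here is a rough Python equivalent of serializing a SPARQL response into GeoJSON (the site itself does this in XSLT). It assumes the standard SPARQL JSON results format and hypothetical variable names `?place`, `?label`, `?lat`, and `?long`; rows without coordinates are skipped.

```python
def bindings_to_geojson(sparql_json):
    """Turn SPARQL JSON results into a GeoJSON FeatureCollection of points.

    Assumption: each row may bind 'place', 'label', 'lat' and 'long'
    (hypothetical variable names chosen for this sketch).
    """
    features = []
    for row in sparql_json["results"]["bindings"]:
        if "lat" not in row or "long" not in row:
            continue  # no coordinates, so nothing to plot
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as [longitude, latitude].
                "coordinates": [
                    float(row["long"]["value"]),
                    float(row["lat"]["value"]),
                ],
            },
            "properties": {
                "uri": row["place"]["value"],
                "name": row["label"]["value"],
            },
        })
    return {"type": "FeatureCollection", "features": features}
```

The resulting dictionary can be dumped with `json.dumps` and handed to a Leaflet GeoJSON layer on the client side.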

This builds on the previously established model where request parameters are parsed within Orbeon's XML Pipeline Language and constructed into an XML object that is then transformed with XSLT into a textual SPARQL query. The difference here is that the example vases from the Getty and British Museum are represented as Linked Open Data with CIDOC-CRM, as well as defined by the typological URIs in their own vocabulary systems (AAT/ULAN/TGN and the British Museum's own internal LOD thesaurus, respectively). As a result, the XML model that represents the query is significantly more complex than the Nomisma visualizations, which are built on a simpler RDF model and only a single vocabulary system.

In the query below, we are getting the distribution of shapes for Red Figure pottery:

SELECT DISTINCT ?concept ?label (count(?concept) as ?count) WHERE {
  {
    SELECT ?1 WHERE { kid:red_figure skos:exactMatch ?1}
  }
    ?object crm:P32_used_general_technique ?1.
    ?object kon:hasShape ?dist  
  {
    SELECT ?dist ?label ?concept WHERE {
      ?concept skos:exactMatch ?dist;
               skos:prefLabel ?label FILTER langMatches(lang(?label), "en")}
  }
} GROUP BY ?concept ?label ORDER BY ?label

As you can see, there is a subselect where we gather all of the URIs that are SKOS exact matches for the Kerameikos URI and then get the objects created with this technique. Using a simplified semantic that better represents knowledge organization specifically within ceramics studies, we use kon:hasShape to get the shape URIs. Like techniques, these URIs may be in the AAT or BM thesaurus. We therefore have to get the matching Kerameikos URI and extract the English label. Here is the full query. Here are the results of the SPARQL query in HTML.

With regard to the XML model that forms the SPARQL query, the XPL/XSLT stylesheet is on GitHub. Below is an example, where $object is the object of the predicate-object pair in the request parameter (here, kid:red_figure). The $id variable is derived from the position of that piece of the query within the HTTP request parameter, so that it is unique within the query. The parameter, in this case, is 'compare=technique kid:red_figure'. Queries can be made more precise by concatenating multiple predicate-object pairs with a semicolon.

<statements>
    <select id="{$id}">
        <triple s="{$object}" p="skos:exactMatch" o="?{$id}"/>
    </select>
     <triple s="?object" p="crm:P32_used_general_technique" o="?{$id}"/>
     <triple s="?object" p="kon:hasShape" o="?dist"/>
</statements>

This XML is transformed with XSLT into SPARQL and executed in the XPL. Like in Nomisma, you can compare multiple query sets.
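As a rough illustration of that transformation (the real one is an XSLT stylesheet), the Python sketch below walks a `<statements>` document of the shape shown above and emits the corresponding SPARQL graph pattern. It handles only the `select` and `triple` elements from the example, produces only the body of the WHERE clause, and its function names are my own.

```python
import xml.etree.ElementTree as ET


def triple_to_sparql(element):
    """Render one <triple s=".." p=".." o=".."/> element as a SPARQL triple."""
    return f'{element.get("s")} {element.get("p")} {element.get("o")} .'


def statements_to_sparql(xml_text, indent="  "):
    """Convert a <statements> XML model into a SPARQL graph pattern.

    Only the two element types from the example are supported: <select>
    (rendered as a subselect on ?{id}) and top-level <triple>.
    """
    lines = []
    for child in ET.fromstring(xml_text):
        if child.tag == "select":
            lines.append(f'{{ SELECT ?{child.get("id")} WHERE {{')
            for triple in child.findall("triple"):
                lines.append(indent + triple_to_sparql(triple))
            lines.append("} }")
        elif child.tag == "triple":
            lines.append(triple_to_sparql(child))
    return "\n".join(lines)


if __name__ == "__main__":
    # The example above with $object = kid:red_figure and $id = 1.
    model = """<statements>
      <select id="1">
        <triple s="kid:red_figure" p="skos:exactMatch" o="?1"/>
      </select>
      <triple s="?object" p="crm:P32_used_general_technique" o="?1"/>
      <triple s="?object" p="kon:hasShape" o="?dist"/>
    </statements>"""
    print(statements_to_sparql(model))
```

Modeling the query as XML first, rather than concatenating strings directly, is what lets the same pipeline serve several vocabularies and multiple comparison sets.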

Distribution of shapes for Red vs. Black Figure Greek pottery (from a limited sample size)

On Kerameikos ID pages, charts are generated via AJAX; on the distribution page, they are generated by passing request parameters, enabling the copying and pasting of charts. Furthermore, you can download a CSV file that represents the datasets, which will include geographic coordinates if Production Place is the distribution category.
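The CSV export can be pictured with a short Python sketch. This is not the site's implementation; it simply assumes the standard SPARQL JSON results format, uses the `head.vars` list as the column order, and leaves unbound values empty.

```python
import csv
import io


def bindings_to_csv(sparql_json):
    """Serialize SPARQL JSON results as CSV text, one column per variable."""
    columns = sparql_json["head"]["vars"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for row in sparql_json["results"]["bindings"]:
        # Unbound variables (e.g. missing coordinates) become empty cells.
        writer.writerow([row.get(col, {}).get("value", "") for col in columns])
    return buffer.getvalue()
```

Using the `csv` module rather than joining strings by hand means labels containing commas or quotes are escaped correctly.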

Thursday, April 30, 2015

Kerameikos.org editor extended to support Getty ULAN linking

This morning, the Getty Museum announced the release of the Union List of Artist Names (ULAN) into their new linked open data publication system. As soon as I had a few minutes free, I updated the Getty SPARQL lookup mechanism in the Kerameikos.org editor to extend the XForms component for querying the ULAN to link painters and potters in Kerameikos to Getty URIs. The updates were pushed into production in less than five minutes.



Now that lookups to the Getty and British Museum SPARQL endpoints and the VIAF RSS feed are available for people, I was able to create the id for Kleitias in about a minute (having also extracted preferred labels from dbpedia).

For a bit more information about how this works under the hood:

If you are editing a foaf:Person concept, the link to the Getty ULAN lookup will appear under "Import" in the right sidebar. If a preferred label has already been entered, clicking the Getty ULAN link will automatically query with that label; otherwise you can enter your own string. The query string is passed into an XForms instance containing the following SPARQL query:

PREFIX gvp: <http://vocab.getty.edu/ontology#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xl: <http://www.w3.org/2008/05/skos-xl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
PREFIX ulan: <http://vocab.getty.edu/ulan/>
SELECT ?c ?label ?scopeNote WHERE {
?c a gvp:PersonConcept; skos:inScheme ulan: ;
gvp:prefLabelGVP/xl:literalForm ?label ;
skos:scopeNote/rdf:value ?scopeNote ;
luc:term "SEARCH_QUERY"} LIMIT 25


The luc:term is the property used to query a Lucene index. The query is sent to the Getty SPARQL endpoint via an XForms submission, and the XML response is then represented graphically as a table of results with checkboxes to include as skos:exactMatch. The checkboxes are not really necessary for the Getty lookup, as we can probably assume there is one URI per entity, but the mechanism is carried over from the VIAF lookup. VIAF has not fully disambiguated entities across all authority records.
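Outside of XForms, the same substitution can be sketched in a few lines of Python. The template is the query shown above with the xl: and rdf: prefixes declared; the endpoint URL `http://vocab.getty.edu/sparql` and the helper names are assumptions made for this sketch, and the escaping covers only backslashes and double quotes.

```python
from urllib.parse import urlencode

QUERY_TEMPLATE = '''PREFIX gvp: <http://vocab.getty.edu/ontology#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xl: <http://www.w3.org/2008/05/skos-xl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
PREFIX ulan: <http://vocab.getty.edu/ulan/>
SELECT ?c ?label ?scopeNote WHERE {
?c a gvp:PersonConcept; skos:inScheme ulan: ;
gvp:prefLabelGVP/xl:literalForm ?label ;
skos:scopeNote/rdf:value ?scopeNote ;
luc:term "SEARCH_QUERY"} LIMIT 25'''


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_ulan_query(term):
    """Insert the user's search string into the query template."""
    return QUERY_TEMPLATE.replace("SEARCH_QUERY", escape_literal(term))


def build_request_url(term, endpoint="http://vocab.getty.edu/sparql"):
    """Build a GET URL for the lookup; the endpoint is an assumed default."""
    return endpoint + "?" + urlencode({"query": build_ulan_query(term)})
```

Escaping the search string before substitution keeps a stray quotation mark in a painter's name from breaking the query.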

Sunday, February 8, 2015

Kerameikos updates

Ontology

After much discussion on the Kerameikos list, we are moving forward with the initial proposed set of classes in the ontology, plus one: ProductionPlace. A production place in Kerameikos is a theoretical concept. It may be as specific as a workshop, if it can be ascertained, or as generic as a large region, such as West Greece. Presently, there are four pottery-specific classes in the ontology besides ProductionPlace: Shape, Style, Technique, and Ware (still under debate). There are a variety of classes borrowed from CIDOC-CRM, such as E4_Period.

The complexity of instances in the Kerameikos id namespace is variable. Shapes and places are relatively simple. Techniques, styles, and periods can be built from the ground up, starting simply but with the potential for greater complexity. For example, the Black Figure technique is invariably composed of incision and silhouette, which are also techniques. A style, like Marine Style, has a single identifier in a system like the Getty AAT, but is a combination of concepts: it is a particular decorative style from a particular place (Crete), culture (Minoan), and period (Late Minoan), which falls in the Bronze Age (which has vague absolute dates attached).

The Kerameikos editor allows for the creation and linking of these concepts on a simple level, but I aim to extend the editing functionality to enable more complex typologies.

Concept Schemes

Previously, the Kerameikos id namespace (http://kerameikos.org/id/) was the landing page for browsing the collection, but it now resolves to a skos:ConceptScheme. It is possible to use content negotiation to get this scheme in RDF/XML, Turtle, and JSON-LD. The browse page has been moved to the browse/ pipeline.
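As a small illustration of content negotiation, the Python sketch below prepares (but does not send) a request for the concept scheme in each serialization. The media types are the standard registered ones for RDF/XML, Turtle, and JSON-LD; that the server responds to exactly these Accept values is an assumption, and the short format keys are my own.

```python
from urllib.request import Request

SCHEME_URI = "http://kerameikos.org/id/"

# Standard media types; the short keys are hypothetical labels for this sketch.
MIME_TYPES = {
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "jsonld": "application/ld+json",
}


def negotiate(fmt, uri=SCHEME_URI):
    """Prepare a request asking for the resource in the given serialization."""
    return Request(uri, headers={"Accept": MIME_TYPES[fmt]})


if __name__ == "__main__":
    request = negotiate("turtle")
    print(request.full_url, request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual fetch.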

The introduction of concept schemes into Kerameikos.org will enable the next iteration of ids. We plan to begin attributing concepts in the id namespace to bibliographic references, i.e., to link painters and potters to references that identify them. This means that in the near future, we'll introduce a 'work' concept scheme. We'll be able to link ids to Worldcat or JSTOR URIs. In the case of Worldcat, at least, we'll be able to extract bibliographic and authority RDF metadata from OCLC. The attribution of ids is absolutely necessary for the project, and will likely open up other avenues of inquiry. In the future, one might get a list of all painters identified by John Beazley or identify all vases found in Vulci that were published in 1890-1900.

Data Dumps

Data dumps are linked from the home page. The Kerameikos.org data are available in RDF/XML, JSON-LD, and Turtle. The Pelagios dump is now available as well, although only a few production places are linked to Pleiades.

Thursday, January 22, 2015

Roman cooking ware terminology, function, real use: problems and solutions for standardization, recording, sharing

Submitted by Laura Banducci, Carleton University


Kerameikos.org has focused thus far on creating and connecting data about Greek painted pottery; it could be extended to serve usefully in Roman pottery studies. In Roman pottery there are similar complexities of language differences (within ancient languages, and among modern researchers) which create obstacles to knowledge sharing. In the case of some wares, namely cooking vessels, we also have substantial cultural differences which befuddle attempts to associate like form with like form. For example, a casserole in North American English can mean two very differently shaped vessels. Then, in Italian the idea of a casseruola only applies to one of these English forms. The term “casserole” itself has a particular food associated with it. There are cooking jars versus cooking bells, pans versus trays, etc. The examples of this are myriad.

Furthermore, the understanding of function and use is quite a significant facet in many research questions regarding cooking wares, since food and cooking can touch upon technology, environment, and cultural identity. Yet function and use are typically understood from the observation of form. These formal definitions are wrapped not only in our own cultural biases but also frequently in the connections we have drawn between vessels named in ancient texts and artefacts (Bats 1988). Yet scrutiny of Latin sources reveals that “patinae”, “ollae”, or “testa” are used in inconsistent ways (Donnelly Forthcoming 2015); thus their strict association with certain shapes, recipes, or food groups is inappropriate.

This paper elucidates these problems and proposes several solutions for the roundtable. I suggest a standardized way of choosing terms for shape using specific physical descriptions. Next, a major way to contribute to the understanding of both the intended function of these vessels and their actual ancient use is to add use-wear analysis to ceramic study as a standard practice. Use-wear analysis or “ceramic alteration analysis” (Skibo 1992) is increasingly acknowledged to be the next logical step in the close study of utilitarian vessels (Lis 2010; Pena 2014; Swift 2014; Banducci 2014). Traces of wear can be combined with observations made about form to determine use. This type of analysis also has the potential to reveal multi-functionality, including both contemporaneous multiple uses of one object as well as the use of an object for its non-intended purpose. The difficulty has been in creating a recording system to observe essentially qualitative data (the observation of different types of wear) in a quantitative way. The recording system is comprehensive enough to admit wide variability while also sufficiently well-defined to permit focused analyses of characteristics within and across the dataset. This system was inspired in part by the framework used by conservators completing surveys of the conditions of artefacts in museum collections. Databases of wear and morphology have the potential to be scalable to many different types of archaeological vessels, tracking function, use, and use-life. Disseminating systematic ways of observing and recording this information requires a robust digital platform like kerameikos.org.