Locating provenance for a RESTful web service

This blog post shows how RESTful web services can provide, and link to, provenance data for their exposed resources by using the PROV-AQ mechanism of HTTP Link headers. This is demonstrated by showing how to update a hello world REST service implemented with Java and JAX-RS 2.0 to provide these links.

The PROV-AQ HTTP mechanism is most easily explained with an example:

GET http://example.com/resource.html HTTP/1.1
Accept: text/html

HTTP/1.1 200 OK
Content-type: text/html
Link: <http://example.com/resource-provenance>;
      rel="http://www.w3.org/ns/prov#has_provenance";
      anchor="http://example.com/resource"

<html>
<!-- … -->
</html>


This request for http://example.com/resource.html returns some HTML, but the response also carries a Link: header saying that the provenance is located at http://example.com/resource-provenance. Within that provenance document, the resource is identified by the anchor URI http://example.com/resource rather than by http://example.com/resource.html. The anchor can be omitted if it is the same as the requested URI.

Link headers are specified by RFC 5988, which also defines standard relations like rel="previous". PROV-AQ uses rel="http://www.w3.org/ns/prov#has_provenance" to say that the linked resource holds the provenance data for the requested resource. PROV-AQ also defines other relations, for provenance query services and provenance pingback, which are not covered by this blog post.
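
As a rough sketch of how such a header could be produced on the server and read on the client, here is a small helper in plain Java with no JAX-RS dependency. The class ProvLinkHeader and its methods format and parse are hypothetical names made up for this illustration; they are not part of PROV-AQ or of any library. The regex-based parsing is a simplification that assumes a single link with quoted parameters, not the full RFC 5988 grammar.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not a library API. */
final class ProvLinkHeader {
    /** The link relation that PROV-AQ uses to point at provenance. */
    static final String HAS_PROVENANCE = "http://www.w3.org/ns/prov#has_provenance";

    // The <target> at the start of the header value.
    private static final Pattern TARGET = Pattern.compile("^\\s*<([^>]*)>");
    // ;name="value" parameters (quoted form only, a simplifying assumption).
    private static final Pattern PARAM =
            Pattern.compile(";\\s*([A-Za-z*]+)\\s*=\\s*\"([^\"]*)\"");

    final URI provenance; // where the provenance data lives
    final URI anchor;     // how the resource is identified in that data

    private ProvLinkHeader(URI provenance, URI anchor) {
        this.provenance = provenance;
        this.anchor = anchor;
    }

    /** Server side: build the Link header value; a null anchor is omitted. */
    static String format(URI provenance, URI anchor) {
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(provenance).append(">; rel=\"")
          .append(HAS_PROVENANCE).append('"');
        if (anchor != null) {
            sb.append("; anchor=\"").append(anchor).append('"');
        }
        return sb.toString();
    }

    /**
     * Client side: read one Link header value. Returns empty unless the
     * link carries the has_provenance relation. If no anchor parameter is
     * present, the anchor defaults to the requested URI.
     */
    static Optional<ProvLinkHeader> parse(String headerValue, URI requested) {
        Matcher target = TARGET.matcher(headerValue);
        if (!target.find()) {
            return Optional.empty();
        }
        String rel = "";
        URI anchor = requested;
        Matcher param = PARAM.matcher(headerValue);
        param.region(target.end(), headerValue.length());
        while (param.find()) {
            String name = param.group(1).toLowerCase(Locale.ROOT);
            if (name.equals("rel")) {
                rel = param.group(2);
            } else if (name.equals("anchor")) {
                anchor = requested.resolve(param.group(2));
            }
        }
        // rel may list several space-separated relation types.
        if (!Arrays.asList(rel.trim().split("\\s+")).contains(HAS_PROVENANCE)) {
            return Optional.empty();
        }
        URI provenance = requested.resolve(target.group(1));
        return Optional.of(new ProvLinkHeader(provenance, anchor));
    }
}
```

A service would put the string returned by format into its Link response header. A client would pass the received header value, together with the URI it requested, to parse, then fetch the provenance URI and look up the anchor URI in the returned data.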


Recording authorship, curation and digital creation with the PAV ontology

PAV is a lightweight ontology for tracking Provenance, Authoring and Versioning. PAV supplies terms for distinguishing between the different roles of the agents that contribute content in current web-based systems: contributors, authors, curators and digital artifact creators. The ontology also provides terms for tracking the provenance of digital entities that are published on the web and then accessed, transformed and consumed.

PAV version 2.1.1 was released on 2013-03-27. This release makes PAV an extension of the W3C provenance ontology PROV-O, enabling interoperability between PAV and PROV-compliant tools such as ProvToolbox.

Overview

[Figure: pav-simpler, overview diagram of PAV]

Note: PAV does not define any classes, and the PAV properties do not put any explicit restrictions on their domains or ranges. The classes in the diagram above, like “another resource”, therefore only illustrate typical use. The diagram does not show data properties attached to resources, like pav:createdOn.

Example

Here’s an example of using PAV:

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix pav: <http://purl.org/pav/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix : <http://example.com/blog#> .

<http://example.com/blog.html>
    pav:createdBy :alice ;
    pav:createdWith :wordpress ;
    pav:importedFrom <http://example.com/data.csv> ;
    pav:importedBy :csv2html ;
    pav:authoredBy :bob ;
    pav:curatedBy :charlie ;
    pav:authoredOn "2012-12-24T15:15:15Z"^^xsd:dateTime ;
    pav:importedOn "2013-03-27T10:06:17Z"^^xsd:dateTime .

:alice foaf:name "Alice" .
:bob foaf:name "Bob" .
:charlie foaf:name "Charlie" .

:csv2html a prov:SoftwareAgent ;
    foaf:homepage <https://github.com/mrc/csv2html> .

:wordpress a prov:SoftwareAgent ;
    foaf:homepage <http://wordpress.org/> .



Tutorial on the W3C PROV family of specifications

Provenance, a form of structured metadata designed to record the origin or source of information, can be instrumental in deciding whether information is to be trusted, how it can be integrated with other diverse information sources, and how to establish attribution of information to authors throughout its history.

The PROV set of specifications, produced by the World Wide Web Consortium (W3C), is designed to promote the publication of provenance information on the Web, and offers a basis for interoperability across diverse provenance management systems. The PROV provenance model is deliberately generic and domain-agnostic, but extension mechanisms are available and can be exploited for modelling specific domains.

Paolo Missier, Khalid Belhajjame and James Cheney gave a tutorial on these specifications at the EDBT conference on 2013-03-20 in Genova, Italy. Starting from intuitive and informal examples that present idiomatic provenance patterns, the tutorial progressively introduced the relational model of provenance along with the constraints model for validating provenance documents, and concluded with example applications that show the extension points in use.

Tutorial material

The tutorial is in three parts, each about 30 minutes long, and consists of the following material:

There is also a short paper describing the motivation, structure and content of the tutorial, published in the EDBT’13 proceedings: "The W3C PROV family of specifications for modelling provenance metadata" by Paolo Missier, Khalid Belhajjame and James Cheney.