The PROV working group received a question from Mike:
My understanding is that an entity referenced in a PROV bundle (e.g. via wasGeneratedBy) must be in the bundle…but I do not wish to duplicate entity definitions throughout my bundles. My entities are long lived and will exist in multiple bundles.
So let's say I have a resource for alarms which contains a list of all alarms my company monitors. If I turn off the alarm at
alarm/1, my understanding is that in PROV a new entity is created for the new state of
alarm/1. But in my actual data store, I don’t create a new record, I just toggle a flag.
So there is a disconnect between how my PROV looks and how my data looks. This is by design, as I understand it. So I would have a new entity in my PROV for
alarm/1 in the new state which is a specialization of
Ultimately, I want to display all of the provenance for
alarm/1 so I can see its history from creation to invalidation. Am I going about this the wrong way?
This is a very good question. I am not sure how this relates to bundles, so I’ll start with the topic of entities and then move on to specializations and finally bundles.
In PROV we describe entities as in one way or another being ‘static’ descriptions of things in the world. In your example, there are two apparent abstraction levels of how ‘static’ an alarm is. The most general entity is:
|<alarm/1> a prov:Entity, ex:Alarm ;|
|prov:atLocation <customer/5> .|
Here we consider the alarm over its lifetime at a given customer, no matter its current status. So we can describe its installation date as its provenance using prov:generatedAtTime:
|<alarm/1> prov:generatedAtTime "1984-05-15T17:19:41Z" .|
We can also describe properties that fluctuate more and might change during the lifetime of the entity:
|<alarm/1> ex:currentStatus "active" ;|
|ex:brightness 0.80 ;|
|ex:noiseLevel 0.50 .|
If I retrieve the same resource later today, this might instead show:
|<alarm/1> ex:currentStatus "disabled" ;|
|ex:brightness 0.20 ;|
|ex:noiseLevel 0.89 .|
This is fine: a prov:Entity is not required to keep its attributes fixed. Its provenance, however, should not change.
Now what if we wanted to know how it changed from active to disabled, but don’t really care about all the possible levels of brightness and noise it had in between? Then it might make sense to specialize the alarm entity into what in everyday programming we would probably just call the “alarm state”. It still describes the same alarm, but at a finer granularity and over a shorter time span:
|<alarm/1/state/123> a prov:Entity, ex:AlarmState ;|
|prov:specializationOf <alarm/1> ;|
|ex:currentStatus "active" ;|
|prov:generatedAtTime "2013-10-28T18:00:00Z" ;|
|prov:invalidatedAtTime "2013-10-28T23:50:00Z" .|
|<alarm/1/state/124> a prov:Entity, ex:AlarmState ;|
|prov:specializationOf <alarm/1> ;|
|ex:currentStatus "disabled" ;|
|prov:generatedAtTime "2013-10-28T23:50:00Z" .|
We might specify a new subclass
ex:AlarmState with the understanding of ‘locking down’ the
ex:currentStatus field. Such subclasses would allow different kinds of specialization, in case you also needed a specialization like
Each state has a different generation and invalidation time, indicating the life span of the state. This is a continuous span, so the alarm state that was disabled last week is different from the disabled alarm state today, because the alarm was active in the meantime. However, both of these are specializations of the entity that describes the alarm regardless of status.
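The life spans described above can be sketched in Python. This is only an illustration, not part of PROV: the `AlarmState` class, the `utc` helper and the `state_at` function are hypothetical names, and treating each span as half-open (generation time included, invalidation time excluded) is an assumption made so that exactly one state applies at the moment of a transition.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AlarmState:
    """One specialization of an alarm with its status frozen (hypothetical class)."""
    uri: str
    specialization_of: str
    status: str
    generated_at: datetime
    invalidated_at: Optional[datetime] = None  # None: the state is still current

    def covers(self, moment: datetime) -> bool:
        """True if `moment` falls inside this state's life span (half-open, assumed)."""
        if moment < self.generated_at:
            return False
        return self.invalidated_at is None or moment < self.invalidated_at


def utc(text: str) -> datetime:
    """Parse timestamps written like the ones in the examples above."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# The two states from the example above.
states = [
    AlarmState("alarm/1/state/123", "alarm/1", "active",
               utc("2013-10-28T18:00:00Z"), utc("2013-10-28T23:50:00Z")),
    AlarmState("alarm/1/state/124", "alarm/1", "disabled",
               utc("2013-10-28T23:50:00Z")),
]


def state_at(states, moment):
    """Return the first state whose life span contains `moment`, or None."""
    for state in states:
        if state.covers(moment):
            return state
    return None
```

Asking for the state at 20:00 on 2013-10-28 would return the active state, while a time before 18:00 returns `None` because no recorded state covers it.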
You might want to put these states in order using prov:wasRevisionOf, so you don’t need to compare the start and end timestamps:
|<alarm/1/state/124> prov:wasRevisionOf <alarm/1/state/123> .|
|<alarm/1/state/123> prov:wasRevisionOf <alarm/1/state/122> .|
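To show how such revision links give an ordering without any timestamp comparison, here is a minimal Python sketch. It assumes the links have already been loaded into a plain dictionary; the `was_revision_of` mapping and the `history` function are hypothetical names for illustration.

```python
# prov:wasRevisionOf links from the example: newer state -> the state it revised.
was_revision_of = {
    "alarm/1/state/124": "alarm/1/state/123",
    "alarm/1/state/123": "alarm/1/state/122",
}


def history(latest, was_revision_of):
    """Follow wasRevisionOf links back from `latest`; return oldest state first."""
    chain = [latest]
    seen = {latest}
    while chain[-1] in was_revision_of:
        previous = was_revision_of[chain[-1]]
        if previous in seen:  # guard against accidental cycles in the data
            raise ValueError(f"revision cycle at {previous}")
        chain.append(previous)
        seen.add(previous)
    chain.reverse()
    return chain
```

Calling `history("alarm/1/state/124", was_revision_of)` walks back to `alarm/1/state/122` and returns the three states oldest first.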
So what if we want to describe who disabled it? A simple solution is to provide prov:wasAttributedTo on each state, indicating who was associated with the activity that led to its creation:
|<alarm/1/state/124> prov:wasAttributedTo <customer/5>.|
So now we know that
customer/5 caused the alarm to be disabled somehow (it was probably not a corrupt supervisor at the security company).
If you want to detail this more, say to record how the customer did this (e.g. clicking the alarm panel) – then you can introduce an activity to describe the transition:
|<alarm/1/state/123> prov:wasInvalidatedBy <activities/987> .|
|<alarm/1/state/124> prov:wasGeneratedBy <activities/987> .|
|<activities/987> a prov:Activity, ex:AlarmPanelAction ;|
|prov:wasAssociatedWith <customer/5> .|
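Since the data store only toggles a flag, the statements for each transition could be generated at the moment the flag changes. The following Python sketch shows one way that might look. The `transition_turtle` function and its parameters are hypothetical, and like the snippets in this post it leaves out the prefix declarations and writes timestamps as plain literals (a full PROV-O document would normally type them as xsd:dateTime).

```python
def transition_turtle(alarm, old_state, new_state, activity, agent, new_status, when):
    """Return Turtle statements for one status change of `alarm` (relative IRIs)."""
    return "\n".join([
        # The previous state ends when the activity happens.
        f"<{old_state}> prov:wasInvalidatedBy <{activity}> ;",
        f'    prov:invalidatedAtTime "{when}" .',
        # The new state starts at the same moment and revises the old one.
        f"<{new_state}> a prov:Entity, ex:AlarmState ;",
        f"    prov:specializationOf <{alarm}> ;",
        f"    prov:wasRevisionOf <{old_state}> ;",
        f'    ex:currentStatus "{new_status}" ;',
        f"    prov:wasGeneratedBy <{activity}> ;",
        f'    prov:generatedAtTime "{when}" .',
        # The activity records who caused the transition.
        f"<{activity}> a prov:Activity, ex:AlarmPanelAction ;",
        f"    prov:wasAssociatedWith <{agent}> .",
    ])


print(transition_turtle(
    "alarm/1", "alarm/1/state/123", "alarm/1/state/124",
    "activities/987", "customer/5", "disabled", "2013-10-28T23:50:00Z"))
```

The long-lived `alarm/1` entity is only referenced here, not redefined, so its own description does not need to be repeated for every toggle.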
Now, to get back to bundles: a prov:Bundle is defined as
a named set of provenance descriptions, and is itself an Entity, so allowing provenance of provenance to be expressed
Now I would say that any resource that contains provenance statements (and in particular PROV statements) is a prov:Bundle. However this fact and typing might not be recorded anywhere, and it would generally only be used as a term when you want to describe provenance of provenance records, or if you are cataloguing provenance traces.
So moving to your example, perhaps the provenance of the longer-living entities (the alarms and their installations) is recorded in a separate resource, e.g.
alarm/all; while you record short-lived alarm states in a different resource, e.g.
events/2013-10 for alarm events this month.
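With the provenance split across resources like this, showing the full history of one alarm means gathering statements from several bundles. Here is a rough Python sketch of that step, assuming each bundle has already been parsed into (subject, predicate, object) tuples. The `bundles` dictionary, the `provenance_of` function, and the `alarm/2` entries are all made up for illustration.

```python
# Hypothetical parsed bundles: bundle name -> list of (subject, predicate, object).
bundles = {
    "alarm/all": [
        ("alarm/1", "rdf:type", "prov:Entity"),
        ("alarm/1", "prov:atLocation", "customer/5"),
        ("alarm/2", "rdf:type", "prov:Entity"),
    ],
    "events/2013-10": [
        ("alarm/1/state/123", "prov:specializationOf", "alarm/1"),
        ("alarm/1/state/124", "prov:specializationOf", "alarm/1"),
        ("alarm/1/state/124", "prov:wasRevisionOf", "alarm/1/state/123"),
        ("alarm/1/state/124", "prov:wasAttributedTo", "customer/5"),
        ("alarm/2/state/7", "prov:specializationOf", "alarm/2"),
    ],
}


def provenance_of(entity, bundles):
    """Collect (bundle, triple) pairs about `entity` and its direct specializations."""
    all_triples = [(name, t) for name, triples in bundles.items() for t in triples]
    # The entity itself plus everything declared as a specialization of it.
    subjects = {entity}
    subjects.update(s for _, (s, p, o) in all_triples
                    if p == "prov:specializationOf" and o == entity)
    return [(name, t) for name, t in all_triples if t[0] in subjects]


for bundle_name, triple in provenance_of("alarm/1", bundles):
    print(bundle_name, *triple)
```

Each result keeps the name of the bundle it came from, so the display can still say where a statement was recorded. Only direct specializations are followed here. Nested specializations would need a repeated lookup.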
Now you might not want to include the full provenance of
events/2013-10 – and so you can use prov:mentionOf and prov:asInBundle to indicate that you are specializing an entity that is described in a different bundle.
In a way this is just a more formal way of saying:
Above we used prov:has_provenance, which is a looser way to say that you should be able to find a provenance description in the given resource that somewhat involves the resource.
Using prov:mentionOf and prov:asInBundle adds the additional meaning that you intend to specialize the entity as it is described in that bundle. This might not always be what you want, and the exact interpretation is still application specific. For instance, above we don’t intend for the current
ex:currentStatus to be inherited by the different timestamped states, but the prov:atLocation of
alarm/1 is also true for both
As in any kind of modelling, consideration needs to be taken as to what granularity of description is beneficial for each particular use case. Entities provide a way to describe a resource with some aspects frozen, enabling us to describe its provenance. For instance, an alarm can be described as a series of alarm states, each of which exists over a certain time period. It is, however, domain-specific what we consider to be included in the snapshot and what is considered just additional properties. PROV specialization allows us to relate entities that are descriptions of the same thing in the world, but at different abstraction levels, for instance covering different time spans.