Comments on Records in Contexts

In September last year the International Council on Archives’ Experts Group on Archival Description (EGAD) published the first consultation draft of Records in Contexts: A Conceptual Model for Archival Description with the call for comments closing on 31 January 2017. I submitted some hastily-assembled thoughts on the final day. For those interested in such things, here they are.

Comments on Records in Contexts: A Conceptual Model for Archival Description – Consultation Draft v0.1 – September 2016

I would like to thank the International Council on Archives and the Experts Group on Archival Description (EGAD) for the opportunity to comment on this draft, and congratulate EGAD for the work and depth of thinking that has clearly gone into preparing the document.

General comments

I fully support the intent of the work as outlined in the introductory sections, particularly the move to bring key aspects of the four existing ICA standards into a single combined conceptual model. I also recognise the necessity of more fully supporting the ability of the archival community to make description available as Linked Open Data (LOD).

The consultation draft is strong throughout and provides an excellent starting point for discussion with the broader community.

However, while the entire draft conceptual model is clearly based on a lengthy series of discussions, internal (to EGAD) drafting processes, and no doubt some disagreements and debates, the justification for particular decisions is not always clear.

For example, the introductory sections make a clear case for the need to look beyond hierarchical, inward-looking description to effectively capture contextual complexity; but prior to this the ‘primary description entities’ are presented without explanation or discussion. The document states that “there has been extensive analysis of each of the existing standards” leading to this decision, but the justification for the resulting list is not provided.

Similarly with the distinction between ‘entities’ and ‘properties’, and the definition of these (and other key terms) used within the document. What is an entity here? Why was the decision made to make date an entity rather than a property within this model? Though what there is in the introduction is mostly strong, there is implicit knowledge which needs to be made explicit to bring the rest of the community (people not involved in the EGAD process) on the journey.

Specific comments

p9-10: Without seeing further justification, I am not convinced by some of the ‘primary entities’ as given, in particular ‘Agent’, ‘Function (Abstract)’, and ‘Date’.

Agent: The term ‘agent’ raises a lot of issues. I see agency as relational, emerging from the interaction between a thing and other things to which it relates – its context. It is neither an entity type nor a property; and my preference would be to use e.g. ‘Person’ and ‘Group’ as entities.

Leaving to one side the complexities of contemporary theories of agency, in this document Agent (RiC-E4) is defined as: “A person or group, or an entity created by a person or group, that is responsible for actions taken and their effects” (p.14). Then there is an entity ‘Concept/Thing’ (RiC-E14) which includes in its scope “all RiC entities” (p.19).

The fact the properties for this second entity have no scope or examples listed (p.38) may be creating confusion; but it seems to suggest a person could be a concept/thing in some contexts and an agent in others. Or, if all people are agents, the definition seems to strongly preference recording creators and collectors over others (contrary to the more complex ideas about provenance supported in the introduction).

To understand a collection of records created by an orphanage about a child we might want to capture that child as a person in our network – are we going to suggest they were “responsible for actions taken and their effects” in relation to the records about them? Or are they a ‘Concept/Thing’ and only become an ‘Agent’ when they collect their own records? That person may have differing degrees and types of agency at different times and in relation to different things; agency is not a binary, something one has or does not.

Function (Abstract): I don’t see the value of including this. The definition states that it is “independent of the instances of the Function that is specific to a particular social and cultural context.” As both the examples are organisational, does this just mean “distinct from an individual organisation”? In a broader context? In the context of the domain being described? If the word “particular” is intended to suggest one or other of these things, more clarity is needed, and the purpose of capturing such an entity should be made clear. Why is capturing contextualised functions not enough?

Date: A justification is needed for this being an entity and not a property. It seems likely, on reading the document and knowing something of its context, that the idea is to separate out time and place from the specifics of records and other entities, because they are ‘things’ in their own right.

In separating out place as an entity (which I agree with in RiC-CM) it seems as though ‘date’ followed. Time and space are dimensionally interrelated, both endlessly divisible, the ways they are described are culturally specific, and we want to locate our (other) entities in time and space. But, though theoretically interesting, there are significant practical problems in terms of the complexity of the resulting network and the document does not give enough guidance to readers as to how this would work (and why it is necessary).

I understand this is not an implementation document, but the example network provided in Appendix 1 (p.93) seems to suggest dates would be, for all intents and purposes, used and appear like properties for any actual implementation. If date is to be retained as an entity, I would request that the examples provided in the document more clearly illustrate how the multitude of dates required in description will be incorporated into the network the model proposes.

Also, ‘Date’ finally appears as a property, but only for relations (RiC-P68 – p.91), with no explanation as to why there is a sudden change. (Is it simply because relations are not treated as entities? See comments on Relations below.)

A few more minor points:

RiC-P5 and RiC-P8 seem to have quite a bit of overlap as described, in terms of whether the record is “whole and complete” and whether the record has “legibility and completeness”, and the examples both mention lost text due to damage (p.22).

Looking at records, and compounding the complexity of ‘date as entity’ (above), the property-like dates which could be associated with a record include creation and modification dates, the temporal extent of content, the temporal extent of the carrier, time and date periods related to access, use and history, and more (p.22-26). As presented, with date removed from any discussion of properties, it is hard for a user to conceptualise what this might look like. A section integrating entities (particularly date and place) with properties might alleviate this.

RiC-P18 Conditions of Access: Access could be an entity type in its own right. Ideas like ‘Restricted’ or ‘Closed under data protection legislation’ are contextual and bounded by time and place, and often require description, explanation, related records/evidence, stem from Mandates, etc. Furthermore, elevating access in some form to an entity would shift the balance a little away from what is still in some ways a very ‘creator-centric’ model and show that user and access conditions are also first-level entities in the archival view of the world (p.25).

RiC-P26 Arrangement could do with a little more clarity in terms of its scope and examples. The idea of a set of records intellectually but not physically arranged alphabetically is a curious one – does this mean e.g. ‘in folders labelled A, B, C, etc.’ even if they haven’t been stored in this order? (p.27)

RiC-P34 Language information: is the language ‘used by the Agent’ a property of a relationship rather than a property of the Agent? In the context of archival description a person doesn’t ‘have a language’, they are using a language to write a document, to conduct business, to take part in a transaction, or similar. The property should be attached to the relationship; otherwise we may know they use four languages but don’t know when they use them and for what.
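
To illustrate what I mean, here is a minimal sketch in Python. It is not drawn from RiC-CM; the class name `LanguageUse`, its fields, and the sample data are all hypothetical and exist only to show the language sitting on a dated relationship rather than on the person.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LanguageUse:
    """Hypothetical relation: a person used a language for some purpose."""
    person: str
    language: str
    purpose: str
    start_year: int
    end_year: int


# Invented example data, not taken from any real description.
USES = [
    LanguageUse("Person A", "French", "business correspondence", 1950, 1960),
    LanguageUse("Person A", "English", "personal diaries", 1945, 1970),
]


def languages_used(person: str, year: int) -> List[str]:
    """Return the languages a person is recorded as using in a given year."""
    return sorted(
        use.language
        for use in USES
        if use.person == person and use.start_year <= year <= use.end_year
    )
```

With the language recorded this way we can ask when and for what a language was used, which a flat ‘languages’ property on the Agent cannot answer.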

RiC-P36 Gender: in support of comments made by others as part of this consultation, the way gender is applied, the vocabulary used, the means for representing people who identify with different genders or have varying gender expressions at different times, and the reasons for capturing gender in the first place (when is it required? why is it required?) need to be carefully considered.

Relations

Some points on relations:

I do not believe past and present tense relations are required. Apart from being a potential maintenance nightmare, it assumes the perspective of the present in a way which runs counter to the intention of contextual networks, which should use entities, relations and properties to show change through time, allowing people to view what that network looked like from different historical perspectives. If I am interested in 1962, in that context J.F. Kennedy is president of the United States of America. As someone in the present the date range of 1961-1963 on the relationship tells me when that statement was true.
Continuing the above, if users want some sort of visibility of ‘past/present’ relationships it should be inferred and displayed by a system using the date information available (i.e. be part of implementation), not supported by baking different relation tenses into the conceptual model.
Working with an extensive list of relations, and collaborating with allied professionals with their own lists, will require some sort of ‘meta-category’ of relation, such as: temporal/associative/familial/hierarchical, and categories to denote whether something is part of a larger whole (like a faculty in a University) or separate but related (like a commercial entity run by a University but not ‘part of’ it). That way people can maintain their own social languages and names and have a higher-level category to support aggregation without having to go through a cross-mapping/semantic equivalence process.
If the previous point is taken, perhaps the conceptual model only needs to include these meta-categories, with the mapping of specific relations to these categories becoming an implementation issue.
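
The points above can be sketched roughly as follows. This is an illustration under my own assumptions, not anything proposed in the draft: the `Relation` class, the `META_CATEGORY` table, and the `display_tense` function are hypothetical names, and the category assigned to each relation name is invented for the example. The relation name is tense-neutral, the date range carries the temporal information, and ‘past’ or ‘present’ is worked out by the system at display time.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from locally chosen relation names to shared
# meta-categories; in practice each community would maintain its own.
META_CATEGORY = {
    "is president of": "associative",
    "is faculty of": "hierarchical",
    "is parent of": "familial",
}


@dataclass(frozen=True)
class Relation:
    source: str
    name: str                       # tense-neutral relation name
    target: str
    start_year: int
    end_year: Optional[int] = None  # None means no recorded end

    def holds_in(self, year: int) -> bool:
        """True if the relation is recorded as holding in the given year."""
        if year < self.start_year:
            return False
        return self.end_year is None or year <= self.end_year

    def meta_category(self) -> str:
        """Look up the shared category used for aggregation."""
        return META_CATEGORY.get(self.name, "uncategorised")


def display_tense(relation: Relation, viewing_year: int) -> str:
    """Infer tense from the dates, relative to the year being viewed."""
    if relation.holds_in(viewing_year):
        return "present"
    return "past" if viewing_year > relation.start_year else "future"


jfk = Relation("J.F. Kennedy", "is president of",
               "United States of America", 1961, 1963)
```

Viewed from 1962 the relation is ‘present’; viewed from 2017 the same, unchanged relation is ‘past’. Nothing in the stored data needs maintaining as time moves on.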

Finally, and perhaps most importantly to my mind, relations should be entities, or at least treated more like entities than is currently apparent in the draft document.

The table of ‘Shared Properties’ only shows ‘Date’ and ‘Place’, which are suddenly both properties rather than entities (p.91). Each instance of a relation also needs an identifier – so records (or evidence) and entities can be related to them if needs be – and a ‘General Note’ property akin to RiC-P4 for “Description of the relation that is not otherwise addressed.”

The capacity to work with relations as entities (including the capability to attach evidence, mandates, and other entities to relations) is essential for future practice, even if not something which will be employed in many implementations in the near future.

And if ‘Date’ and ‘Place’ are going to be treated as entities they need to be entities here too – not entities and properties simultaneously within a single model – which presumably requires that relations be entities for any implementation to work.
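
As a rough sketch of what treating relations as entities might look like (again in Python, and again entirely hypothetical: the class names, field names, identifiers and sample data are my own and are not taken from the draft), each relation has its own identifier and note, points to Date and Place entities rather than holding them as properties, and can have evidence attached to it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entity:
    identifier: str
    entity_type: str  # e.g. "Group", "Date", "Place", "Record", "Relation"
    label: str


@dataclass
class RelationEntity(Entity):
    """A relation modelled as an entity in its own right."""
    source_id: str = ""
    target_id: str = ""
    date_id: Optional[str] = None    # reference to a Date entity
    place_id: Optional[str] = None   # reference to a Place entity
    general_note: str = ""
    evidence_ids: List[str] = field(default_factory=list)


graph: Dict[str, Entity] = {}


def add(entity: Entity) -> Entity:
    """Register an entity in the network under its identifier."""
    graph[entity.identifier] = entity
    return entity


def evidence_labels(relation_id: str) -> List[str]:
    """Return the labels of the records attached to a relation as evidence."""
    relation = graph[relation_id]
    if not isinstance(relation, RelationEntity):
        raise TypeError(f"{relation_id} is not a relation")
    return [graph[eid].label for eid in relation.evidence_ids]


# Invented example data: a commercial entity run by a university.
add(Entity("g1", "Group", "Example University"))
add(Entity("g2", "Group", "Example Press Pty Ltd"))
add(Entity("d1", "Date", "1990-2005"))
add(Entity("p1", "Place", "Melbourne"))
add(Entity("r1", "Record", "Board minutes"))
add(RelationEntity("rel1", "Relation", "operates",
                   source_id="g1", target_id="g2",
                   date_id="d1", place_id="p1",
                   general_note="Run by the university but not part of it.",
                   evidence_ids=["r1"]))
```

Because the relation has an identifier, the board minutes can be linked to the relation itself as evidence, and Date and Place stay entities throughout rather than switching to properties.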

Thank you again for the opportunity to comment. I wish you the best of luck with the next stage of the process, and look forward to further opportunities to provide feedback as the conceptual model continues to develop.

Mike Jones (31 January 2017)

4 Comments

Richard Vines
February 2, 2017 at 10:11 am


Great work Michael. As always. I can only thank you for this diligence as someone who sees the importance of this way of thinking for the future of knowledge-intensive organisations.

- Mike Jones
  February 2, 2017 at 10:14 am
  
  
  Thanks Richard. I didn’t give myself enough time on this one so my thoughts are a little unformed and unfocussed, but I hope there is something here of value.
  
Conal
February 2, 2017 at 5:48 pm


Thanks Mike

I had a look at the document and was thinking of studying it in enough detail to make a useful submission, but I was just too busy and didn’t get to it in time. The archaic format of the draft was very unhelpful, I thought. Naively I’d thought they might produce a draft RDF schema. But all I could see was PDF! The numerous tables didn’t even rate a spreadsheet!

A few comments:

Re Person, Group, and Agent, I agree that Person and Group should be first-class objects in the model. In many parts of the draft they talk about “Agent (of type person)” or “Agent (of type group)” etc. and there are many properties which can validly apply only to Agents of this or that type; so in a real OWL or RDFS implementation of this model, I would expect “Person” and “Group” would for practical purposes have to be modelled as Classes in their own right (as sub-classes of Agent). The fact that there’s no “has type” property (with values “person”, “group”, etc.) defined for Agents is a good sign in this regard. I hope that’s what happens next.

Regarding Functions and Abstract Functions, it seems to me that instances of the RiC-E8 Function (Abstract) were useful designations of types of function (i.e. terms from controlled vocabularies of functions), whereas RiC-E7 Function class would be actual (concrete) activities which exemplified those abstractions. But in fact there’s RiC-E9 Activity class for that. I was left puzzled about what level of abstraction the intermediary (RiC-E7 Function) entity is supposed to model. I can certainly see scope for confusion there.

Finally regarding Dates as objects, my opinion is that this is a good idea (and that they should be consistent about it and eschew simple date properties, except obviously as properties of Date objects). It’s possible for instance that some event took place on a Date about which all we know is that it must have been after some other date, or before some other date, or both. Those other dates may also not be known with any accuracy or precision. If the data model treated dates as simple properties with values like “1888-01-05” then this would be a big problem in some instances. Whereas if you DO know the dates of things, it’s no big problem to represent that in RDF using a Date resource which in turn has a simple date property. The extra articulation of the model is worth it.

Regarding the division of relation types into current and historical; I agree that’s a bad idea and it is enough to reify the relations and then temporally qualify them. By the way I think it’s implied by the definition of RiC-P68 Date at the very top of p91 that relations will be first class entities that can themselves be linked to (implying that relations will not be able to be implemented in RDF as simple RDF properties). The definition says: “Chronological information associated with the relation that contributes to its identification and contextualization.”

Incidentally, the CIDOC CRM provides an example of how property names can imply a deliberately broad temporal coverage: e.g. http://erlangen-crm.org/current/P49i_is_former_or_current_keeper_of

- Mike Jones
  February 3, 2017 at 10:55 am
  
  
  Hi Conal – thanks for your thoughtful and insightful comments. Really interesting stuff. This year I will be diving into CIDOC CRM, LOD/RDF and similar more than I have in recent years (it’s relevant to my PhD research) so look forward to exploring some of this in more detail.
  
  Also, though you may have missed the formal cut-off, if you do want to contribute some comments via Gavan I’m sure he would be interested and happy to accept them.

1 Pingback

“The end is nigh”: RiC(h) Description – part 2 – In the mailbox
