Possibility of adding project-specific metadata templates

tobias · August 20, 2017, 1:28pm

Hello,

I read in the manual that it is possible to create new categories for metadata. Is there an option to name them freely, or do I have to choose from the existing categories?
What I would like to do is create project-specific metadata categories, e.g. “historical person represented”, “historical event represented”, etc. So the point would be not to use standardized metadata templates but to define them individually for each project.

Also, how is it possible to change the metadata template setup for more than one item at once? (I can’t even mark a large number of items other than clicking on each single item holding cmd, right?)

And my last question concerning this topic would be whether the categories that are displayed in the middle frame (title, creator, date, type) can be changed, or other ones added, as in iTunes?

Thank you for your help.

inukshuk · August 21, 2017, 7:57am

In Tropy we want to encourage the use of common/shared vocabularies. For that reason, you cannot create arbitrary properties. However, you can import RDF vocabularies for your domain (you can currently import vocabularies in .ttl or .n3 format); furthermore, if you have no need to share your data, you can also just rename existing properties (e.g., you could rename dc:subject to ‘historical event represented’).

If there is no suitable RDF vocabulary for your domain which covers the properties you need, you can also create your own vocabulary and import it into Tropy.

We’re currently working on making Tropy’s item table / grid support large projects: this includes making selections using shift to make it easier to select multiple items (you’re right, at the moment you need to use cmd and select each item individually). It will also be possible to configure which columns should be displayed in the item table. Stay tuned!

inukshuk · August 22, 2017, 12:09pm

Let me add and emphasize that we’re building Tropy specifically to support this kind of use case (i.e., you want custom properties like ‘historical person represented’ etc.). We’re still discussing various questions internally (some technical, some ‘political’) regarding the use of vocabularies, but I’d just like to stress that Tropy is definitely intended to support this specific use-case.

tobias · September 2, 2017, 10:00am

Thank you. For the moment, the ‘renaming’ works quite well for me. But thank you for considering the question; I guess it is important that the individualized and shared vocabularies don’t come into conflict.

inukshuk · September 2, 2017, 10:20am

We actually decided to make things easier for users who do not necessarily want to work with existing vocabularies: we’re adding a way to add custom vocabularies within Tropy (i.e., you won’t have to create a full RDF vocabulary yourself). This will be available in the next beta (out in 1-2 weeks).

DonaldsonCD · September 5, 2018, 9:51am

Did this feature roll out? I’m not seeing it when I try to make a custom template.

inukshuk · September 5, 2018, 10:03am

We haven’t enabled it, no. (By the way, the difference would show up in the vocabulary list, not in the template editor.)

I would strongly recommend using an existing vocabulary (you can customize the labels of each property in the vocabulary list if you like), but if there is no suitable existing vocabulary for you, I’d be happy to create one for you to import: just tell me which fields you’d need.

DonaldsonCD · September 6, 2018, 7:43am

Hi! Thanks for the offer. Is there anywhere you can point me to for learning more about the vocabularies and how they work, for a non-programmer? The Tropy intro is useful but the site for downloading/exploring open vocabularies is bewildering and my searches come up with no hits.

I can do some work-arounds for now using an existing vocabulary (Dublin Core is what I’m using now).

Ultimately though, I’m curious because I’m part of a cataloguing project that involves digitizing and cataloguing thousands of West African Islamic manuscripts that include annotations or portions in West African languages (as opposed to Arabic, the dominant language). Our catalogue template includes tons of specific information (39 fields in total with a few subfields). Currently, our team in Mali is using Access for the cataloguing aspect. For quality control and preliminary analysis, I’ve been using Tropy to do the following in a more user-friendly way: group images together, apply metadata, and then create “selections” with notes transcribing or commenting on segments of the images that are of interest or have been flagged in our catalogue (but with no hard link to specific coordinates within an image file).

What I realized in doing so for a very small portion of the manuscripts was that Tropy is a potentially great one-stop shop for us to look at, process and sort the manuscripts as well. That is, there are certain fields in our catalogue that are more useful for me in deciding which manuscripts may be of interest. These fields however don’t line up with any of the properties that I use in the vocabulary lists that I’ve found; they include things like “number of folios”, “relative proportion of African language text”, “African language present” etc.

A larger question would be about the possibility of just simply using Tropy or some modified version of it tailored for our purposes so that all the properties from the catalogue were automatically applied to the images in Tropy.

That’s probably way more information than you wanted, but any thoughts, links or recommendations would be appreciated. It’s not clear to me how a standards/librarian person would tackle such a project/needs as ours. Thanks!

inukshuk · September 6, 2018, 8:46am

I completely agree with you that the world of RDF / linked-data vocabularies and ontologies is daunting and can seem bewildering; it often is, to me as well. Part of the reason, I believe, is that working with metadata taxonomies is such a complex task. With Tropy we’re hoping to build a tool that is easy to use and yet produces data which can be processed further (or in the future) by other linked-data aware tools and platforms.

For the time being, Tropy is using mostly RDF properties defined by these vocabularies. For our purposes, you can think of an RDF property as consisting of an id and a label. The id is a URI such as http://purl.org/dc/terms/creator and a label is a corresponding name such as Creator. The label is basically a shorthand that we use in the UI, that can be customized and translated, but internally the URI is used for everything (basically, the URI is what the computer uses, because it is unique; the label is what we humans use, because we usually have enough context to understand the semantics). The reason I’m mentioning this is just to explain what you need in order to create your own vocabulary to use in Tropy: a list of id / label pairs, that’s it. If you do not care about linked-data then the id can basically be anything (it just needs to be unique in your Tropy database); if you’re planning for others to use your data (and vocabulary) it would make sense to devise a sensible URI scheme to use for your ids.
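To make the id / label idea concrete, here is a minimal Python sketch. It is only an illustration of the concept, not how Tropy stores anything internally: the dictionary layout and the `rename_label` helper are assumptions made up for this example. The point it shows is that renaming a label leaves the URI, and therefore the identity of the property, unchanged.

```python
# Hypothetical illustration: a vocabulary reduced to id (URI) / label pairs.
vocabulary = {
    "http://purl.org/dc/terms/creator": "Creator",
    "http://purl.org/dc/elements/1.1/subject": "Subject",
}


def rename_label(vocab, uri, new_label):
    """Return a copy of vocab with a new label for uri; the URI itself is untouched."""
    if uri not in vocab:
        raise KeyError(f"unknown property: {uri}")
    updated = dict(vocab)
    updated[uri] = new_label
    return updated


# Renaming dc:subject for a project changes only what humans see in the UI.
renamed = rename_label(
    vocabulary,
    "http://purl.org/dc/elements/1.1/subject",
    "Historical event represented",
)
print(renamed["http://purl.org/dc/elements/1.1/subject"])
```

Because the key stays the same, another linked-data tool reading the data would still recognize the property as dc:subject, whatever label is displayed locally.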

Tropy imports vocabularies in N3 notation. So, to create your own vocabulary, you would create a text file like this:

@prefix ex: <http://www.example.org/example#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:numberOfFolios a rdf:Property ;
  rdfs:label "Number of Folios" .

ex:relAfricanTextProportion a rdf:Property ;
  rdfs:label "Relative Proportion of African Language Text" .

You could save this file as ex.n3, for example, and import it into Tropy. If you make changes to the file you can delete the vocabulary in Tropy and import it again. Obviously you would use your own ‘namespace’ instead of ‘ex’ as well as your own URI scheme in the ex @prefix. Basically, your vocabulary is just a list of properties following the syntax: ns:name a rdf:Property ; rdfs:label "Name" .
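With 39 catalogue fields, typing every entry by hand gets tedious, so a short script can write the file from a list of name / label pairs. The following Python sketch is one possible way to do that; the `make_vocabulary` function is hypothetical, and the `ex` prefix and example.org namespace are the same placeholders as above that you would replace with your own.

```python
def make_vocabulary(prefix, namespace, properties):
    """Build N3 text declaring each (local_name, label) pair as an rdf:Property."""
    lines = [
        f"@prefix {prefix}: <{namespace}> .",
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    ]
    for name, label in properties:
        # Escape backslashes and double quotes so the label stays a valid string literal.
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        lines.append("")
        lines.append(f"{prefix}:{name} a rdf:Property ;")
        lines.append(f'  rdfs:label "{escaped}" .')
    return "\n".join(lines) + "\n"


n3_text = make_vocabulary(
    "ex",  # placeholder prefix, replace with your own
    "http://www.example.org/example#",  # placeholder namespace
    [
        ("numberOfFolios", "Number of Folios"),
        ("relAfricanTextProportion", "Relative Proportion of African Language Text"),
        ("africanLanguagePresent", "African Language Present"),
    ],
)
print(n3_text)
```

Saving the printed text as ex.n3 gives a file in the same shape as the hand-written example. The local names (the part after the colon) should stay stable once you have started entering data, since they form part of each property's id.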

DonaldsonCD wrote: “A larger question would be about the possibility of just simply using Tropy or some modified version of it tailored for our purposes so that all the properties from the catalogue were automatically applied to the images in Tropy.”

Tropy currently supports export plugins; we’ll be looking to add import plugins as well. If I understand you correctly, this sounds like a situation where an import plugin could be used (i.e. fetching properties from a different data source and applying them to items in Tropy). But that’s probably something to discuss separately.

DonaldsonCD · September 7, 2018, 9:21am

Hi @inukshuk. Thanks! I’ve read your message a dozen times or so and I think I get the larger point, but I have to admit I’m a bit lost with some things:

A (RDF) vocabulary is a list of ids and corresponding labels. Yes?
What is “linked-data”? You write…

Do you mean tied to some standard like the Dublin Core? Could I just make a custom RDF Vocabulary using the parts of the Dublin Core vocabulary that work for my purposes and then add additional niche ones for our project’s purpose? That way, our work would be partially transferable or readable by other machines/programs/people down the road even though we also use some custom ids/labels that aren’t standard?

How does an RDF vocabulary relate to a “URI scheme” and a “namespace”? Am I missing something about the fact that an RDF Vocabulary needs to be online? I guess I don’t see why there needs to be any web-addresses used in the N3 file at all if I’m simply creating a vocabulary which is basically a set of abbreviations and full semantic labels for describing an image file.

Apologies for all the questions.

inukshuk · September 7, 2018, 10:22am

In Tropy you can create metadata ‘templates’ (in the preferences window). A template is basically a list of RDF properties. To use a property Tropy first needs to know that it exists: that’s where RDF vocabularies come in. If you want to use mostly Dublin Core plus a handful of your own custom properties you need to import only those custom properties (since Tropy knows about Dublin Core already). Then you can create item templates which use properties from both Dublin Core and your own vocabularies.
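The relationship between templates and vocabularies can be sketched in a few lines of Python. This is a simplified mental model only, not Tropy's actual data format: the variable names, the `missing_properties` helper, and the example.org namespace are all assumptions for illustration. A template is treated as a list of property URIs, and a property is usable only if some installed vocabulary has declared it.

```python
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://www.example.org/example#"  # placeholder namespace for custom properties

# Properties the application already knows about: Dublin Core ships built in,
# and one custom property has been imported from a custom vocabulary.
known_properties = {
    DC + "title": "Title",
    DC + "creator": "Creator",
    DC + "date": "Date",
    EX + "numberOfFolios": "Number of Folios",
}

# A template mixing Dublin Core with custom properties.
template = [
    DC + "title",
    DC + "creator",
    EX + "numberOfFolios",
    EX + "africanLanguagePresent",  # not imported yet
]


def missing_properties(template, known):
    """List the template's property URIs that no installed vocabulary declares."""
    return [uri for uri in template if uri not in known]


missing = missing_properties(template, known_properties)
print(missing)
```

In this model the last property is reported as missing until it is added to the custom vocabulary and imported, which mirrors why only the custom properties need a vocabulary file of their own.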

RDF vocabularies are more than just lists of properties, but for your purposes that’s all you need so I tried to over-simplify in my post above.

The URIs in the N3 file are used as identifiers by Tropy; they do not need to be URLs / web addresses, but if you or your project controls a domain name, it makes sense to use it for this purpose. You could also use URIs which are just ids but this is not recommended. Here is an example of what this might look like:

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<urn:uuid:881d4108-9c97-4015-b267-ad9a832472bf> a owl:Ontology ;
    dc:title "My Custom Tropy Vocabulary"@en .

<urn:uuid:409cbab9-319d-4f52-908a-00cc15be6b3b> a rdf:Property ;
    rdfs:label "Number of Folios"@en ;
    rdfs:isDefinedBy <urn:uuid:881d4108-9c97-4015-b267-ad9a832472bf> .

<urn:uuid:0f8f333b-9e51-4883-88ea-cffe90e06a22> a rdf:Property ;
    rdfs:label "Relative Proportion of African Language Text"@en ;
    rdfs:isDefinedBy <urn:uuid:881d4108-9c97-4015-b267-ad9a832472bf> .

Here I’m using the urn:uuid scheme and random UUIDs that I generated using uuidgen -r on the command line. (Please note, I just noticed an issue with Tropy importing such a vocabulary so this example won’t work yet, but it’s already fixed in the dev version so this example will be working in Tropy soon).
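If uuidgen is not available, Python's standard `uuid` module can produce the same kind of random (version 4) UUIDs. The sketch below generates a vocabulary in the shape shown above; the `new_urn` and `make_uuid_vocabulary` helpers are hypothetical names for this example, and it assumes the title and labels contain no double quotes.

```python
import uuid

PREFIXES = """\
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
"""


def new_urn():
    """Return a fresh random (version 4) UUID as a urn:uuid URI."""
    return f"urn:uuid:{uuid.uuid4()}"


def make_uuid_vocabulary(title, labels):
    """Build N3 text with one ontology node and one rdf:Property per label.

    Assumes title and labels contain no double quotes or backslashes.
    """
    ontology = new_urn()
    blocks = [
        PREFIXES,
        f'<{ontology}> a owl:Ontology ;\n    dc:title "{title}"@en .\n',
    ]
    for label in labels:
        blocks.append(
            f"<{new_urn()}> a rdf:Property ;\n"
            f'    rdfs:label "{label}"@en ;\n'
            f"    rdfs:isDefinedBy <{ontology}> .\n"
        )
    # Joining with a newline leaves one blank line between blocks.
    return ontology, "\n".join(blocks)


ontology_id, n3_text = make_uuid_vocabulary(
    "My Custom Tropy Vocabulary",
    ["Number of Folios", "Relative Proportion of African Language Text"],
)
print(n3_text)
```

Every run produces different ids, so the script should be run once and the resulting file kept; regenerating it later would create new properties rather than update the existing ones.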

DonaldsonCD · September 7, 2018, 12:32pm

Hmm, ok.

So all of that top stuff starting with @prefix is just an identifier (NOT the same as an id, right?); meaning that it just points to where the vocabulary originates from? Kind of like some ReadMe file information?

Is there any reason that I shouldn’t use a random assortment of the pre-installed RDF properties (from the various RDF vocabularies that Tropy ships with) plus my own custom ones via a custom vocabulary? What are the potential alternatives within Tropy?

I’m not sure what urn or uuid are and why you used them instead of ex like in your earlier example, but I think it’s alright.

I’ve drafted up a list of essential information from our catalogue template that I’d like to be able to apply to images in Tropy, in case you’re still up for creating me a vocabulary that I can import and potentially add to down the road. I can’t upload it here but can write it out or email it if so…

In terms of learning more about how Tropy and metadata standards could be adapted or used by our team here, what would be the right kind of person to talk to? I’m not sure where to start looking for a programmer-type or library-type person that could work on such a project. I see for instance that at Boston University they’ve used some sort of metadata standard for some of their West African manuscripts, but they don’t have any of the useful custom information that would allow a researcher such as myself to actually quickly look at anything of interest. I’d need to basically download the jpegs and put them into Tropy and start from scratch again to pick out selections and do transcriptions. I would like to try to get over that hump for our project, which is larger and at the beginning of its 12-year life…

Sorry for abusing your time and expertise!

abbymullen · September 10, 2018, 1:35pm

Hi,

Apologies if you mentioned this above, but are you part of a university setting? If so, there’s a pretty decent chance your library has a metadata specialist, or at least someone who could help you. That’s where I’d recommend you start–they should be able to help you understand more about metadata in general (which I agree is very confusing!) and even craft your own metadata template.

inukshuk · September 10, 2018, 3:23pm

Yes, the university’s library is probably a good place to contact for a metadata specialist.

I think the best solution would be to create one or more Tropy templates for your project; ideally one that other projects in your field could use as well. We’re planning to create an online repository to make it easier to share and collaborate on templates, but in the meantime this forum is the best place to discuss this (ideally in a dedicated thread for that template, since the current thread is a more general discussion).

Ideally, the template would use only properties which have already been established; this is where a metadata specialist will be able to help. Note that it’s perfectly fine to use properties defined in different vocabularies. If you do need to define/add your own properties, that’s fine, too (in this case you’d have to add them to a custom vocabulary; for others to work with your template and data they would have to install both the template and the vocabulary).

DonaldsonCD · September 12, 2018, 6:46am

Thanks @inukshuk and @abbymullen. I am part of a university setting, but long story short, I’m at a German institution and so it’s been tough to navigate who helps who given my limited language skills. No excuse though, I’ll start emailing and keep you all posted. I’m speaking with BU’s librarian who worked on their similar project last year, so perhaps I’ll get her on this bandwagon and we’ll create a thread.