Plugin for Handwriting to Text OCR

Hi,

I’m from handwritingocr.com. Tropy was recommended by one of our users, who suggested it would be useful to create a plugin enabling OCR for handwritten texts stored in Tropy, via our API.

Is this something that would be useful to the community, and is there a marketplace for third-party plugins at present (or are they just distributed through GitHub)?

Thank you.

Matt

We’re currently working on adding a dedicated OCR plugin hook, so once that’s ready I think it should be relatively easy to add a plugin using your API. Transcription support in Tropy will be based on ALTO XML. Plain text works too, of course, but ALTO is required in order to use all of the features in the UI. Is there any chance of getting ALTO as a result from the API? Otherwise the plugin would probably have to convert the JSON to ALTO.
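
For anyone wondering what that JSON-to-ALTO conversion might look like, here is a rough sketch. The input shape (a `lines` list with `text`, `x`, `y`, `width` and `height` keys) is purely an assumption for illustration and is not the actual handwritingocr.com response format. The element and attribute names follow ALTO v4, but the output is minimal, has not been validated against the schema, and puts each line into a single `String` element instead of splitting it into words.

```python
import xml.etree.ElementTree as ET

ALTO_NS = "http://www.loc.gov/standards/alto/ns-v4#"


def _tag(name: str) -> str:
    """Return the namespace-qualified ALTO tag name."""
    return f"{{{ALTO_NS}}}{name}"


def json_to_alto(result: dict) -> str:
    """Convert a hypothetical OCR JSON result into a minimal ALTO document.

    Assumed (not real) input shape:
      {"width": 1000, "height": 1500,
       "lines": [{"text": "...", "x": 0, "y": 0, "width": 0, "height": 0}]}
    """
    ET.register_namespace("", ALTO_NS)  # serialize ALTO as the default namespace
    alto = ET.Element(_tag("alto"))
    layout = ET.SubElement(alto, _tag("Layout"))
    page = ET.SubElement(layout, _tag("Page"), {
        "ID": "page1",
        "PHYSICAL_IMG_NR": "1",
        "WIDTH": str(result["width"]),
        "HEIGHT": str(result["height"]),
    })
    space = ET.SubElement(page, _tag("PrintSpace"))
    block = ET.SubElement(space, _tag("TextBlock"), {"ID": "block1"})

    for index, line in enumerate(result["lines"], start=1):
        # Every line gets the same bounding box on TextLine and String,
        # because this sketch does not split lines into words.
        box = {
            "HPOS": str(line["x"]),
            "VPOS": str(line["y"]),
            "WIDTH": str(line["width"]),
            "HEIGHT": str(line["height"]),
        }
        text_line = ET.SubElement(block, _tag("TextLine"), {"ID": f"line{index}", **box})
        ET.SubElement(text_line, _tag("String"), {"CONTENT": line["text"], **box})

    return ET.tostring(alto, encoding="unicode")
```

If the API can return word-level boxes, emitting one `String` per word would probably work better for selecting text out of the image.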

Thanks for your reply, and it’s great to hear you’re working on an OCR plugin hook. Our API returns JSON, so we’d need to convert to ALTO.

I’ve created a simple plugin to send pages from Tropy to our API that I will publish soon. I would like, though, to get the response directly into the annotation field associated with a photo. I’m guessing that public webhooks are not available since it’s a local installation. Is the best way to receive a response from our API to just poll repeatedly after submitting the OCR request? Also, is there a way to show alerts to the user?

Thanks again.
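
As a rough illustration of the polling approach asked about above, here is a minimal sketch with exponential backoff and an overall timeout. Everything in it is an assumption: `fetch_status` stands in for whatever call retrieves the job state, and the `"processed"` and `"failed"` status values are hypothetical, not taken from the handwritingocr.com API.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    max_interval: float = 30.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[dict]:
    """Call fetch_status() until the job reaches a final state.

    Returns the final status dict, or None if the timeout would be exceeded.
    The status values checked here are hypothetical placeholders.
    """
    deadline = clock() + timeout
    delay = interval
    while True:
        status = fetch_status()
        if status.get("status") in ("processed", "failed"):
            return status
        if clock() + delay > deadline:
            return None  # give up instead of sleeping past the deadline
        sleep(delay)
        delay = min(delay * 2, max_interval)  # back off, up to a ceiling
```

The `sleep` and `clock` parameters are only there so the loop can be exercised without real waiting.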

Right, that’s the reason why we’re adding a dedicated OCR plugin hook. Basically, you create ‘transcription’ objects in Tropy which can be backed by a plugin. The plugin can update the status of transcriptions and use different methods to receive status updates - this way the plugin can also resume if a transcription wasn’t finished when you quit Tropy.

Instead of saving the transcribed text as regular text annotations, the transcriptions are built into the UI so you can view the text or select it out of the images as you’d expect (though that part uses the ALTO data).

This is not released yet, but for OCR plugins this is definitely going to be the best way to integrate them into Tropy.

That does sound better. In the meantime, can you tell me how I should be appending to an Item’s notes? In an Item object inside the Export hook, we have a notes array. How should I save/update text there?

Apologies if this is documented somewhere, I didn’t spot it though.

That’s not possible since the export hook is only for exporting data.

So at the moment you’d have to write the notes directly to the SQLite file or use the local HTTP API to create the notes. Neither option is ideal. We’re also adding a ‘processing’ hook that’s more general purpose and allows adding new tags or notes to the items, but in your case the OCR hook would be the best choice.

I note that, after adding "api": true to state.json, state.json is always reset to its default settings (i.e. without "api": true) on opening Tropy. This means the API is never enabled.

Also, is there any documentation for the HTTP API, specifically how to create a note?

Tropy saves the state.json file on shutdown, so if you make manual changes to the file you need to do it while Tropy is closed, otherwise the changes will be overwritten.
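
A small script can make that edit less error-prone. This is only a sketch: it assumes the `api` key sits at the top level of state.json, and the location of the file is left to the caller because it varies by platform. Run it only while Tropy is closed, for the reason given above.

```python
import json
from pathlib import Path


def enable_api(state_path: Path) -> dict:
    """Set "api": true in a state.json file and write it back.

    Assumes "api" is a top-level key; run only while Tropy is not running,
    otherwise Tropy overwrites the file on shutdown.
    """
    state = json.loads(state_path.read_text(encoding="utf-8"))
    state["api"] = True
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return state
```

All other keys in the file are written back unchanged.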

We don’t have documentation on the HTTP API at the moment, but you can see the available endpoints here. You can create notes with a POST request that has an ‘html’ parameter and either a ‘photo’ or a ‘selection’ parameter in the body.
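
To make the shape of that request concrete, here is a hedged sketch using only the Python standard library. The parameter names `html`, `photo` and `selection` come from the description above; everything else is an assumption, including the form-encoded body and the rule that exactly one of `photo` or `selection` is passed. The endpoint URL is deliberately left to the caller, since the host, port and path should be taken from the endpoint list.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_note_request(
    endpoint_url: str,
    html: str,
    photo: Optional[int] = None,
    selection: Optional[int] = None,
) -> Request:
    """Build a POST request that creates a note on a photo or a selection.

    endpoint_url is supplied by the caller; this sketch does not know the
    real path. The form encoding of the body is an assumption.
    """
    if (photo is None) == (selection is None):
        raise ValueError("pass exactly one of photo or selection")
    fields = {"html": html}
    if photo is not None:
        fields["photo"] = str(photo)
    else:
        fields["selection"] = str(selection)
    return Request(
        endpoint_url,
        data=urlencode(fields).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def create_note(endpoint_url: str, html: str, **target: int) -> bytes:
    """Send the request and return the raw response body."""
    with urlopen(build_note_request(endpoint_url, html, **target)) as response:
        return response.read()
```

Usage would look something like `create_note(url, "<p>Transcribed text</p>", photo=12)`, where `url` is the notes endpoint of the local API and `12` is a photo id from the project.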