AI integration: Tropy MCP someday?

I wonder if there are plans to enable Tropy integration with AI services via MCP.

There are already several MCPs for Zotero, some of them great (Beaver, Zotero-MCP), which index the full library (including full text if needed) and build an embedding database so that it can be connected to private and open AI services.

This would be an incredible feature to have in Tropy. I also wonder if there are any OCR plugins for handwritten and printed text on the way (as discussed in previous threads). Automatic OCR plus indexing and an MCP connection to AI services is extremely beneficial in Zotero, and would be a game changer in Tropy.

We’re experimenting with this. In our experience, it’s relatively easy to give access to a Tropy project to current AI models either via the developer API (REST APIs are easy for LLMs to understand if you point them at the source code) or the .tpy file, which is a self-documenting SQLite file.
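To illustrate the "self-documenting" point: any SQLite file stores its own CREATE statements in the built-in `sqlite_master` table, so the schema can be dumped and pasted into a prompt. This is only a minimal sketch using Python's standard `sqlite3` module; the function name `dump_schema` and the file name in the usage note are made up for illustration, and nothing here assumes specific Tropy table names.

```python
import sqlite3
from pathlib import Path


def dump_schema(db_path: str) -> str:
    """Return the CREATE statements stored in an SQLite file."""
    # Open through a read-only URI so the project file cannot be modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        rows = con.execute(
            "SELECT sql FROM sqlite_master "
            "WHERE sql IS NOT NULL ORDER BY type, name"
        ).fetchall()
    finally:
        con.close()
    # One statement per schema object, separated by blank lines.
    return "\n\n".join(row[0] + ";" for row in rows)
```

For example, `print(dump_schema("my-project.tpy"))` (hypothetical file name) prints the table, index, and trigger definitions, which is usually enough context for a model to write its own queries.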

With HTR/OCR in mind specifically, we’ve already done considerable exploration but are still cautious about the reliability of LLM results compared with more specialized HTR/OCR models. This concerns both the results themselves and their provenance (e.g., can you consistently explain why a certain word was recognized at a specific spot on the page?). Still, Tropy’s transcription support will be able to cover both approaches.

Thank you very much for those suggestions! I’ll try to explore the developer API. As for the .tpy file, are you suggesting giving it directly to an AI? For the moment, I was thinking of a pretty simple solution: a JSON export of all items, with the output file given to an AI (plus maybe an optional cleanup script to strip non-essential metadata and possibly convert it to Markdown). It actually works pretty well. I also want to experiment with NotebookLM to preserve images, since the interface lets you work with an image preview side by side with the OCR; in this case, the upcoming HTR/OCR implementation could be extremely useful before batch-exporting PDFs with a script. In this regard, Beaver for Zotero is a great plugin, because it points directly to image sources. Very happy to know that this is on the horizon!
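For anyone curious, the cleanup script I have in mind looks roughly like the sketch below. It makes no assumption about the exact structure of the export: it simply walks whatever nested JSON it is given. The key names in `DROP_KEYS` are placeholders I picked for illustration, not a list of what a Tropy export actually contains, so inspect your own file first.

```python
import json
from typing import Any

# Placeholder keys to drop (assumption); adjust after inspecting your export.
DROP_KEYS = frozenset({"@context", "checksum"})


def strip_keys(node: Any, drop: frozenset = DROP_KEYS) -> Any:
    """Recursively remove unwanted keys from nested dicts and lists."""
    if isinstance(node, dict):
        return {k: strip_keys(v, drop) for k, v in node.items() if k not in drop}
    if isinstance(node, list):
        return [strip_keys(v, drop) for v in node]
    return node


def md_lines(node: Any, depth: int = 0) -> list:
    """Render nested JSON as lines of an indented Markdown bullet list."""
    pad = "  " * depth
    if isinstance(node, dict):
        pairs = list(node.items())
    elif isinstance(node, list):
        # List entries get a 1-based position as their label.
        pairs = [(str(i + 1), v) for i, v in enumerate(node)]
    else:
        return [f"{pad}- {node}"]
    lines = []
    for key, value in pairs:
        if isinstance(value, (dict, list)):
            lines.append(f"{pad}- **{key}**:")
            lines.extend(md_lines(value, depth + 1))
        else:
            lines.append(f"{pad}- **{key}**: {value}")
    return lines


def export_to_markdown(src: str, dst: str) -> None:
    """Read a JSON export, strip unwanted keys, and write Markdown."""
    with open(src, encoding="utf-8") as fh:
        data = json.load(fh)
    with open(dst, "w", encoding="utf-8") as fh:
        fh.write("\n".join(md_lines(strip_keys(data))) + "\n")
```

Calling `export_to_markdown("export.json", "export.md")` (both file names hypothetical) gives a compact bullet list that takes fewer tokens than the raw JSON.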

In my experience, the JSON export is easier on human eyes, but the SQLite file should be perfectly fine for an AI model. The SQLite file also includes full-text indices for metadata and notes, which are ready to use. So for read-only access the SQLite file should be fine. For write access I personally feel more comfortable using the API, because it gives the model only constrained write access. That said, LLMs can of course generate SQLite content too (you can generate a Tropy project or modify an existing one), but I’d use that only for very specific purposes, not for a real project I want to keep working on. It would be very easy for the generated file to contain subtle inconsistencies or errors that go unnoticed at a quick glance.
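One way to enforce the read-only part in practice is to open the file through an SQLite URI with `mode=ro`, so that any write a model-generated query attempts is rejected by SQLite itself. The following is a minimal sketch with Python's standard `sqlite3` module; the helper names `open_read_only` and `run_query` are hypothetical, and the example query in the usage note relies only on the built-in `sqlite_master` table.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open an SQLite file so that write statements are rejected."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    # Return rows that can be converted to plain dicts.
    con.row_factory = sqlite3.Row
    return con


def run_query(con: sqlite3.Connection, sql: str, params: tuple = ()) -> list:
    """Run a query and return the rows as a list of dicts."""
    return [dict(row) for row in con.execute(sql, params)]
```

For instance, `run_query(open_read_only("my-project.tpy"), "SELECT name FROM sqlite_master WHERE type = 'table'")` (hypothetical file name) lists the tables, while an `INSERT` or `UPDATE` on the same connection raises `sqlite3.OperationalError`. Working on a copy of the project file is a sensible extra precaution.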

Of course, I would never use it to write to the DB (or elsewhere); it’s for querying and analysis. Thank you, this is very helpful.