Shared project, searchable material, grammar corrector and exploded files

Hi. We are a research team working on thousands of archival documents from the 50s until today. I am currently organizing the material and making sure it is available to all the team members. I have read that former users have experienced some challenges using the software as a research team and making changes in Tropy at the same time. Hopefully the software has been updated and there is a way to use Tropy as a team?

I’ve met some unique challenges that I would love some input on:

  • I have made a project file in a shared folder (OneDrive), where the archival material is also located. The project file is a standard project (not the advanced version). Despite everything being in a shared folder, my colleague can access the project file, but the archival material is offline on her computer. Do you know why that is?
  • What happens if we are both making changes in the project file at the same time, from two different devices?
  • I have read that some people have experienced sync conflict (I don’t know what that is), when working on the same project from a shared folder. Does this mean that it is not possible to make changes at the same time? If so, is it possible to use the same project on the same shared folder, but at different times?

Other areas of difficulties:

  • We have digitized the archival material using ABBYY FineReader. However, it is not possible to search for words within the documents using the search field in Tropy. Only words from the title, notes or metadata are found. Is searching for words inside the documents supposed to work, or is this not a feature Tropy offers? Is there any way we can make the documents searchable?
  • Is it possible to insert a grammar corrector in the settings or as a preset? English is not my first language, and it is important that all the words are spelled correctly in order to be searchable.
  • In our archival data I have numerous files that need to ‘be exploded’, before I can merge the correct items. I have experienced that if I merge some items and later discover that one of the documents belongs in another item, I cannot change the location of the document without ruining the order of the items. Is there a way of re-organizing the order of documents in an item?
  • Regarding tags: Is there any way to organize the tags by categorizing them? Is there a way of copying several tags and applying them on another item?
Thank you so much for your advice.

It is possible to share a standard project across several devices using OneDrive or similar cloud-sync providers. A standard project is basically a folder with the project.tpy database file and an assets folder containing your photos. When all of this is fully synced to your local device, the project is fully accessible there. OneDrive can be configured to download files only on demand, but this setting is often incompatible with the way Tropy accesses your files, so please make sure to configure OneDrive to fully sync/download the assets folder in order to view all images in Tropy.

When working on the project from several devices you need to be careful to avoid sync conflicts of the project.tpy file. The issue here is that OneDrive cannot see inside the project database file and update it partially. Instead, it replaces the entire file whenever it changes.

That means that if you have two devices A and B and open the project file on both of them, the following will happen: you enter new information on device A, Tropy saves the changes to disk, OneDrive detects those changes and updates the file in the cloud. At the same time, your project is open on device B and you add different information there; the file is saved, OneDrive detects the changes and updates the file in the cloud. At this point, you have this version history: the original file, the file with the changes from device A, and the original file with the changes from device B. When you later open your project on device A you will see the changes made on B, but the changes you made on device A are not there anymore.

Depending on the circumstances, OneDrive may or may not detect that a file was changed simultaneously. If it does detect this, it can mark the two versions as a sync conflict and prompt for a review. In the worst case, Tropy and OneDrive change the file at the same time and in the process corrupt the database file. For all these reasons you need to be careful when working on a shared project like this.
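To make the scenario with devices A and B concrete, here is a toy Python model of whole-file syncing. It is only an illustration: the `upload` and `download` helpers and the dictionary standing in for the cloud are made up for this sketch and have nothing to do with how OneDrive or Tropy are actually implemented.

```python
# Toy model: every device holds a full copy of the project file, and an
# upload replaces the cloud copy entirely (there is no merging).
# All names here are hypothetical and exist only for this illustration.

def upload(cloud, local_copy):
    """Replace the cloud copy with the device's complete local copy."""
    cloud["content"] = set(local_copy)

def download(cloud):
    """Fetch a complete copy of the file from the cloud."""
    return set(cloud["content"])

cloud = {"content": {"item 1"}}

# Both devices open the same, fully synced project.
device_a = download(cloud)
device_b = download(cloud)

# Device A adds information and its copy is uploaded.
device_a.add("note from A")
upload(cloud, device_a)

# Device B still works on the original copy, adds something else,
# and its upload replaces the version that came from A.
device_b.add("note from B")
upload(cloud, device_b)

# When device A syncs again, its own change is gone.
device_a = download(cloud)
print(sorted(device_a))
```

The point of the sketch is that neither device did anything wrong on its own; the loss comes from two full copies being edited in parallel and then replacing each other.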

The most important rule is to always make sure your project is fully synced before you open it. When you make changes, close the project when you’re done and again make sure that OneDrive uploads your changes to the cloud. In addition, configure OneDrive to keep a version history of the project.tpy file so that you can restore a previous version in case something goes wrong. To be extra safe, you can also make manual backup copies of the project.tpy file from time to time.
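If you would rather not make the manual backup copies by hand, a small script can do it. The following is a minimal sketch, not a Tropy feature: the function name `backup_project` and the folder layout are assumptions, and it should only be run while the project is closed in Tropy and fully synced.

```python
import shutil
from datetime import datetime
from pathlib import Path

def backup_project(project_file, backup_dir):
    """Copy the project file into backup_dir under a timestamped name.

    Hypothetical helper for illustration; close the project in Tropy first
    so the file is not being written to while it is copied.
    """
    project_file = Path(project_file)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # e.g. project-20241002-093709.tpy
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{project_file.stem}-{stamp}{project_file.suffix}"

    # copy2 copies the file contents and also tries to preserve timestamps.
    shutil.copy2(project_file, target)
    return target
```

You would call it with the path to your own project.tpy file and a backup folder, ideally one outside the synced OneDrive folder so that the copies are not affected by sync problems.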

Tropy is built for images. If you import a PDF, each page is treated as a separate image and there is no separate text layer. We hope to add support for a transcription layer based on the ALTO format in the future, so if you have transcriptions in ABBYY FineReader you’ll be able to import them using ALTO at that point. At the moment, however, the only way of adding transcriptions and making them searchable is by creating notes.

We’ve currently disabled the spell checker in Tropy because research material is often in several different languages and we believed a limited spell checker would probably be more of a nuisance than a help. We can consider adding an option to enable it, however, if there’s demand for it.

You can change the order of photos in an item by dragging the photos up or down in the photo panel.

There is currently no interface for applying several tags at once. However, there’s a trick that works like this: in the tag adder you can add several tags at once to an item by separating the tags with commas and holding the Shift key while pressing Enter. This way you can add a set of tags to one (or multiple) items.

In the screenshot above, I’ve selected three items and entered “A, B, C” in the tag adder. Pressing Enter would add the tag “A, B, C” to these three items; holding the Shift key while doing so would add three tags “A”, “B”, and “C” to the three items.
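The difference between the two key presses can be sketched in a few lines of Python. This only mimics the behaviour described here and is not Tropy’s actual code; the function name `tags_from_input` and the `split_on_comma` flag (standing in for the Shift key) are made up for the illustration.

```python
def tags_from_input(text, split_on_comma):
    """Return the tags that would be created from the tag adder input.

    split_on_comma=False models plain Enter (one tag, commas included);
    split_on_comma=True models Shift+Enter (one tag per comma-separated part).
    """
    if not split_on_comma:
        return [text.strip()]
    # Split on commas and drop empty parts such as a trailing comma.
    return [part.strip() for part in text.split(",") if part.strip()]

print(tags_from_input("A, B, C", split_on_comma=False))
print(tags_from_input("A, B, C", split_on_comma=True))
```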

Thank you very much for your thorough reply. This is very helpful.

We tried to access the project from a shared OneDrive, using a different device. First of all, all the files were offline. I don’t know why that is, since my colleague had access to the project file and the archival material.

Secondly, and even more importantly, even though we didn’t make any changes to the project, the project file is now corrupt. I got an error message every time I tried to upload archive documents, and after closing the project, it was not possible to re-open the project file. Please see the error message I have received. I have tried to update the project file to an earlier version, without success. Is there anything else I can try? Thank you in advance.

[Screenshot 2024-10-02 at 09.37.09]
[Screenshot 2024-10-02 at 14.01.55]

I’m not sure what you mean by the files being offline. As I’ve explained above, your files need to be fully synced and downloaded on your local device before Tropy can access them.

In addition, on macOS there can be file permission issues in accessing shared files. You can typically work around those by giving Tropy the full-disk-access permissions in the system preferences.

I’m not sure how the file can become corrupt without making any changes to it, but my guess would be that it is related to OneDrive being configured to fetch files on-demand and effectively trying to open the placeholder file before the actual file was downloaded.

In any case, it would be best to make a copy of the file on the original device before it syncs with OneDrive - if that hasn’t happened already. If the file is now corrupt on the original device, it would still be best to restore the previous version from OneDrive. If that’s not possible, all we can do is try to restore the data from the corrupted file, but whether that’s possible and whether data loss is involved depends on how severely the file is damaged.
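Before restoring anything, it can help to check whether a copy of the file is readable at all. The sketch below assumes that project.tpy is an SQLite database, which is an assumption on my part in this example, and the helper name `check_project_copy` is made up. Run it on a copy of the file only, never on the original inside the synced folder.

```python
import sqlite3
from pathlib import Path

# Every SQLite 3 database file starts with this 16-byte header.
SQLITE_MAGIC = b"SQLite format 3\x00"

def check_project_copy(path):
    """Return "ok" or a short description of what looks wrong.

    Assumes the project file is an SQLite database (an assumption made
    for this sketch). Use it on a copy of the file only.
    """
    path = Path(path)
    with path.open("rb") as f:
        header = f.read(16)
    if header != SQLITE_MAGIC:
        return "not an SQLite file (possibly a placeholder or an incomplete download)"

    # Open read-only so the check itself cannot modify the copy.
    con = sqlite3.connect(path.resolve().as_uri() + "?mode=ro", uri=True)
    try:
        rows = con.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as err:
        return f"SQLite could not read the file: {err}"
    finally:
        con.close()

    messages = [row[0] for row in rows]
    return "ok" if messages == ["ok"] else "; ".join(messages)
```

A result other than "ok" does not say how much data can be recovered; it only tells you whether that particular copy is worth keeping as a restore candidate.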

Real-time syncing is always problematic, and pointless if several people edit the same file, no matter what application you use.

Since sharing is “fragile”, you have to make sure that both research teams keep their modifications in their own offline copy and sync with the cloud only at predefined times. Example:

  • Research team A is allowed to sync their file at minute:50.
  • Research team B is allowed to edit files from minute:0.

I left 10 minutes as a buffer so that all of Tropy’s changes can be applied to the remote, shared cloud file. 10 minutes is plenty and you can cut it down to 5 minutes, but this depends on your internet speed, the number of modifications made on each side, and possible connection errors.

Thank you both for your replies. This is very helpful. Since we need to work simultaneously on the project and our archive material is mainly in PDF (and we need the content within the documents to be searchable), we are considering using Zotero instead of Tropy. What are the advantages and disadvantages of the two programs when working with archival material? Thank you in advance.

Zotero has a built-in PDF reader that allows you to view and annotate PDFs. Its main purpose is interacting with PDFs that have a text layer. Tropy’s image viewer is optimized for images. It allows you to import PDFs, but internally these are rendered and treated as images, so it is best suited for PDFs that consist mainly of images (or scanned text). We’re working on adding transcription support to Tropy, which will allow you to view and work with OCR results for your images (including PDFs).

So basically, I’d suggest using Zotero if your PDFs have full text layers and you’re working mostly with the text. If your PDFs are mostly visual or consist of handwritten text or print, and your research is not only about the text but also about the presentation, Tropy should be a good choice. That said, both Zotero and Tropy are free and this is not an exclusive choice; you can try both and see which one suits your workflow best, or even use them for different portions of your material.