New user here. I have several hundred multipage PDFs (scanned manuscripts) with metadata recorded in a CSV file. I used the CSV plug-in to import them. The metadata imported perfectly, but the plug-in imported only the first page of each PDF. To get the full file, I had to go to each individual item and use the “add photo” feature. Is the plug-in limited to importing a single page, or is there a way to import the entire file in one go? I have a few thousand more such documents to import, so I’d like to do this as efficiently as possible. Thank you.
The CSV import takes the metadata structure exactly as the file gives it and makes no assumptions of its own. It’s possible to import multi-page items this way, but you’d have to specify each photo in an item by repeating the relevant columns in the CSV file. That said, I believe this solution was intended for importing multiple files and does not work well with multi-page PDFs, because we currently have no column to indicate the page within a file. So this way you could import all the pages with their metadata, but each photo would always show the first page of the PDF.
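To make the "repeating the relevant columns" idea concrete, here is a rough Python sketch that builds such a wide CSV from a list of items, each with one image file per page. The column names (`title`, `date`, `photo_path`, `photo_title`) are placeholders I made up for illustration, not the plugin's actual headers, so check them against your own export or the plugin's documentation before relying on this. It also assumes each page is a separate image file, since the PDF limitation described above would still apply.

```python
import csv
import io

# Hypothetical column names; substitute the headers your own setup uses.
ITEM_COLUMNS = ["title", "date"]
PHOTO_COLUMNS = ["photo_path", "photo_title"]


def build_rows(items):
    """Flatten items into rows where the photo columns repeat once per photo."""
    max_photos = max((len(item["photos"]) for item in items), default=0)
    header = ITEM_COLUMNS + PHOTO_COLUMNS * max_photos
    rows = [header]
    for item in items:
        row = [item.get(col, "") for col in ITEM_COLUMNS]
        for photo in item["photos"]:
            row.extend(photo.get(col, "") for col in PHOTO_COLUMNS)
        # Pad shorter items so every row has the same number of cells.
        row.extend([""] * (len(header) - len(row)))
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample data: one two-page item and one single-page item.
items = [
    {"title": "Letter A", "date": "1850",
     "photos": [{"photo_path": "a-1.jpg", "photo_title": "p. 1"},
                {"photo_path": "a-2.jpg", "photo_title": "p. 2"}]},
    {"title": "Letter B", "date": "1851",
     "photos": [{"photo_path": "b-1.jpg", "photo_title": "p. 1"}]},
]
print(to_csv(build_rows(items)))
```

Items with fewer pages simply get empty cells in the trailing photo columns, so every row keeps the same width as the header.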
I think we’ll have to add a ‘page’ column to the CSV plugin to support this.
Related question: when importing multipage documents directly (not via the CSV plugin), is it easier to import individual images and then merge them within Tropy, or to import them as multipage PDFs (one document per PDF) and have Tropy convert them into multi-image items? I understand that Tropy was originally developed to work with individual images, but the conversion process for PDFs seems to simplify things by keeping the document pages together.
Also, regarding the original question: it sounds like the best approach with the CSV plugin would be to mass import the metadata without providing the path, and then add each PDF to its respective item one at a time. In other words, use the CSV plugin to import the collection metadata, then use “add photos” to bring in the PDFs. Does that make sense?
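If the metadata-only route turns out to work, the CSV itself could be prepared with something like the following Python sketch, which writes a copy of the data with the path column removed. The header name `photo_path` is a placeholder I'm assuming for illustration, and whether the plugin accepts rows without a path is exactly the open question above, so treat this as a starting point only.

```python
import csv
import io

PATH_COLUMN = "photo_path"  # hypothetical header name; use your file's own


def drop_column(csv_text, column):
    """Return the CSV text with every column named `column` removed."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    # Indexes of the columns to keep, based on the header row.
    keep = [i for i, name in enumerate(rows[0]) if name != column]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow([row[i] for i in keep if i < len(row)])
    return out.getvalue()


# Made-up sample data standing in for the real metadata file.
sample = (
    "title,date,photo_path\n"
    "Letter A,1850,scans/a.pdf\n"
    "Letter B,1851,scans/b.pdf\n"
)
print(drop_column(sample, PATH_COLUMN))
```

Working on a copy keeps the original file, paths included, available for matching each PDF to its item afterwards.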