Error importing PDFs

Hello! I have historically had no difficulty importing PDFs into Tropy, but today, when attempting to import PDFs (of historical New York Times articles, no more than 300KB each), I received the following error message. I am working on a Mac with updated software. Any suggestions would be much appreciated!

{"msg":"Failed to import item.","stack":"Error: image type not supported: undefined\n at Image.parse (/Applications/Tropy.app/Contents/Resources/app.asar/lib/index-7138169f.js:1267:58)\n at async Image.open (/Applications/Tropy.app/Contents/Resources/app.asar/lib/index-7138169f.js:1254:5)","system":"Darwin 22.4.0 (arm64)","time":1681242028445,"version":"1.13.0"}

Could you share one of the PDFs that fails to import so we can look into this further?

98811697.pdf (213.2 KB)

Thanks!

Tropy rejects this file because it isn’t technically a PDF: it’s an HTTP response containing a PDF. In other words, it’s a PDF with a little extra text at the beginning. Tropy is relatively strict when it evaluates files for import: it checks the beginning of the file to determine its type and rejects the file if it looks like something unsupported, as in this case. We should consider stripping off the HTTP header automatically, because we’ve seen this before, but in the meantime you need to remove the HTTP header from the file before importing it.
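
If re-downloading is not an option, the extra text can be removed with a short script. The following is only a rough sketch in Python, not something Tropy does or ships: it assumes the PDF data inside the file is intact (not chunked or compressed by the server) and simply drops everything before the first `%PDF-` marker, which is how a PDF file normally begins. The function names `strip_leading_bytes` and `repair_pdf` are made up for this example.

```python
from pathlib import Path

# Marker that a PDF file normally starts with.
PDF_MAGIC = b"%PDF-"


def strip_leading_bytes(data: bytes) -> bytes:
    """Return the data starting at the first %PDF- marker.

    Hypothetical helper: anything before the marker (for example an
    HTTP status line and headers) is discarded.
    """
    offset = data.find(PDF_MAGIC)
    if offset < 0:
        raise ValueError("no %PDF- marker found; this does not look like a PDF")
    return data[offset:]


def repair_pdf(src: str, dst: str) -> bool:
    """Write a cleaned copy of src to dst, leaving src untouched.

    Returns True if any leading bytes were removed.
    """
    raw = Path(src).read_bytes()
    cleaned = strip_leading_bytes(raw)
    Path(dst).write_bytes(cleaned)
    return len(cleaned) != len(raw)
```

For the attached file, calling `repair_pdf("98811697.pdf", "98811697-fixed.pdf")` would write a new file that can then be tried in Tropy. Writing to a separate output file keeps the original download around in case the result is still not a valid PDF.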

It’s probably easier to download the file again and make sure the download works. Can you tell us where you downloaded the file from?

Thank you for clarifying the issue! I downloaded these “PDFs” from the New York Times (TimesMachine: Tuesday March 20, 1979 - NYTimes.com).