License file filename errors with DSpace SWORD deposit #2096

@jameswsullivan

Describe the bug

Some submissions' license files have a WWW- prefix added to their filenames in the exported zip files.

During the deposit from Vireo to DSpace, these submissions fail with a 500 error like this:

INFO [org.purl.sword.client.Client] Checking the status code: 500
[Fatal Error] :1:1: Content is not allowed in prolog.
ERROR 1 --- [io-9000-exec-37] c.a.CustomResponseEntityExceptionHandler : SWORD deposit failed: Unprocessable Entity

The DSpace log gives more detail and points to the file names:

ERROR unknown 1234abcd-1234-567d-89f0-1234567891234abdc org.dspace.sword.SWORDMETSIngester @ caught exception:
org.dspace.content.crosswalk.MetadataValidationException: Manifest file references file 'Permission.pdf' not included in the zip.
        at org.dspace.content.packager.AbstractMETSIngester.getFileInputStream(AbstractMETSIngester.java:1337) ~[dspace-api-7.6.1.jar:7.6.1]
ERROR unknown 1234abcd-1234-567d-89f0-1234567891234abdc org.dspace.sword.DSpaceSWORDServer @ caught exception:
org.dspace.sword.DSpaceSWORDException: org.dspace.content.crosswalk.MetadataValidationException: Manifest file references file 'Permission.pdf' not included in the zip.
        at org.dspace.sword.SWORDMETSIngester.ingest(SWORDMETSIngester.java:173) ~[dspace-sword-7.6.1.jar:7.6.1]
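
To check an exported package for this mismatch before depositing, something like the following may help. It is only a rough sketch, not part of Vireo or DSpace: it assumes the manifest inside the zip is named mets.xml and that files are referenced through METS FLocat elements with an xlink:href attribute. The function name missing_manifest_files is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard METS and XLink namespaces (assumed to be what the export uses).
METS_NS = "{http://www.loc.gov/METS/}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def missing_manifest_files(package, manifest_name="mets.xml"):
    """Return hrefs the manifest references that are not entries in the zip.

    `package` is a path or file-like object for the exported zip.
    `manifest_name` is an assumption; adjust it if the export differs.
    """
    with zipfile.ZipFile(package) as zf:
        entries = set(zf.namelist())
        root = ET.fromstring(zf.read(manifest_name))

    # Collect every file location the manifest points at.
    hrefs = [
        flocat.get(XLINK_HREF)
        for flocat in root.iter(METS_NS + "FLocat")
        if flocat.get(XLINK_HREF)
    ]
    # Anything referenced but absent would trigger the DSpace error above.
    return [href for href in hrefs if href not in entries]
```

If the manifest lists Permission.pdf while the zip only contains WWW-Permission.pdf, the function would return a list containing Permission.pdf, which matches the name DSpace complains about.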

To Reproduce
You might be able to catch the errors by doing a batch deposit. I'm not sure why some submissions' license files get the WWW- prefix rather than the seeded timestamp prefix. If I rename such a file in the UI, the WWW- prefix is replaced with a timestamp prefix and the error goes away.
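
Since it is not obvious up front which submissions are affected, one way to find candidates is to scan the exported zips for entries whose filename starts with WWW-. This is only a sketch under that assumption; prefixed_entries is a hypothetical helper name, and the prefix is taken from what the exports show.

```python
import zipfile
from pathlib import PurePosixPath


def prefixed_entries(package, prefix="WWW-"):
    """Return zip entry names whose base filename starts with `prefix`.

    `package` is a path or file-like object for one exported zip.
    """
    with zipfile.ZipFile(package) as zf:
        return [
            name
            for name in zf.namelist()
            # Compare only the filename, ignoring any folder inside the zip.
            if PurePosixPath(name).name.startswith(prefix)
        ]
```

Running this over each zip from a batch export and printing the non-empty results should list the submissions worth renaming before the deposit.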
