Use droid signature files #10

Open
akkie opened this issue Feb 1, 2022 · 12 comments

@akkie

akkie commented Feb 1, 2022

Would it make sense to use the free DROID signature files to detect the mime type? They are completely free and actually maintained: https://www.nationalarchives.gov.uk/aboutapps/pronom/droid-signature-files.htm

@TonyValenti
Contributor

Hi @akkie - This is a great find! Do you understand the format of the XML files? If you can write some code to convert the XML to a Definition, I will certainly incorporate that!

@akkie
Author

akkie commented Feb 1, 2022

Sorry, I don't understand the format. I searched for a MIME detection lib and someone in the dotnet repo suggested starting with this source for a MIME type utility. Then I stumbled over your lib and thought this would be helpful.

@tiesont

tiesont commented May 4, 2022

I'd imagine LINQ-to-XML can handle deserializing those definitions fairly easily. If someone still wants to explore this route, I could probably spend some time building a parser/loader, although I'm not sure how quickly I would get to it.

@TonyValenti
Contributor

I could bang it out really quick but I don't know what the different xml nodes mean. Do you have a documentation link?

@tiesont

tiesont commented May 4, 2022

I was planning on starting with just being able to deserialize to a C# object, then figure out how those map to the relevant properties.

I was going to start here

https://www.nationalarchives.gov.uk/aboutapps/pronom/puid.htm

which does link to this:

https://www.nationalarchives.gov.uk/aboutapps/pronom/pdf/pronom_unique_identifier_scheme.pdf

That PDF at least describes the XML format, although I haven't dug into any of this enough to be able to answer any question beyond "is there anything out there explaining this format?".

@TonyValenti
Contributor

Hmm. Maybe I'm not understanding this, but the PDF doesn't seem to describe the XML at all. What I see is a brief description of how the "ids" are basically organized.

@tiesont

tiesont commented May 4, 2022

Like I said, I haven't dug too much into this. I know extracting the XML files into a usable format shouldn't be too difficult, and I think I see how to translate the nodes into the magic bytes and whatnot. I just can't explain any of it, at the moment.

Basically, I was volunteering to work on this, if someone still thought it was useful.
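
To make the "translate the nodes into the magic bytes" idea concrete, here is a minimal sketch in Python. It is not the library's loader and not a full DROID implementation. The element and attribute names (`InternalSignature`, `ByteSequence`, `SubSequence`, `Sequence`, `FileFormat`, `InternalSignatureID`, `Extension`, `PUID`, `MIMEType`), the namespace URI, and the `BOFoffset` reference value are assumptions about the signature file layout and should be checked against a current release. The `SAMPLE` document is hand-written for illustration, and `load_signatures` / `load_definitions` are hypothetical helper names. Only plain hex sequences at a fixed offset from the start of the file are handled; anything with wildcards, variable offsets, or multiple subsequences is skipped.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of DROID signature files; verify against the real file.
NS = {"d": "http://www.nationalarchives.gov.uk/pronom/SignatureFile"}

# Hand-written sample in the assumed layout (not copied from a real release).
SAMPLE = """<FFSignatureFile xmlns="http://www.nationalarchives.gov.uk/pronom/SignatureFile" Version="1">
  <InternalSignatureCollection>
    <InternalSignature ID="1" Specificity="Specific">
      <ByteSequence Reference="BOFoffset">
        <SubSequence Position="1" SubSeqMinOffset="0" SubSeqMaxOffset="0">
          <Sequence>89504E470D0A1A0A</Sequence>
        </SubSequence>
      </ByteSequence>
    </InternalSignature>
  </InternalSignatureCollection>
  <FileFormatCollection>
    <FileFormat ID="10" Name="Example PNG" PUID="example/1" MIMEType="image/png">
      <InternalSignatureID>1</InternalSignatureID>
      <Extension>png</Extension>
    </FileFormat>
  </FileFormatCollection>
</FFSignatureFile>
"""


def load_signatures(root):
    """Map signature ID -> list of (offset, magic bytes) for simple patterns."""
    signatures = {}
    for sig in root.iterfind("d:InternalSignatureCollection/d:InternalSignature", NS):
        patterns = []
        for seq in sig.iterfind("d:ByteSequence", NS):
            subs = seq.findall("d:SubSequence", NS)
            # Keep only single-subsequence patterns anchored at the file start.
            if seq.get("Reference") != "BOFoffset" or len(subs) != 1:
                continue
            sub = subs[0]
            lo = sub.get("SubSeqMinOffset", "0")
            hi = sub.get("SubSeqMaxOffset", "0")
            if lo != hi:
                continue  # variable offsets are not handled in this sketch
            text = sub.findtext("d:Sequence", default="", namespaces=NS)
            try:
                magic = bytes.fromhex(text.strip())
            except ValueError:
                continue  # wildcard or bracket syntax is not handled here
            if magic:
                patterns.append((int(lo), magic))
        signatures[sig.get("ID")] = patterns
    return signatures


def load_definitions(xml_text):
    """Join each file format to its simple signatures."""
    root = ET.fromstring(xml_text)
    signatures = load_signatures(root)
    definitions = []
    for fmt in root.iterfind("d:FileFormatCollection/d:FileFormat", NS):
        extensions = [e.text.strip() for e in fmt.iterfind("d:Extension", NS) if e.text]
        for ref in fmt.iterfind("d:InternalSignatureID", NS):
            for offset, magic in signatures.get((ref.text or "").strip(), []):
                definitions.append({
                    "puid": fmt.get("PUID"),
                    "name": fmt.get("Name"),
                    "mime": fmt.get("MIMEType"),
                    "extensions": extensions,
                    "offset": offset,
                    "magic": magic,
                })
    return definitions


if __name__ == "__main__":
    for definition in load_definitions(SAMPLE):
        print(definition)
```

The same two-pass shape (collect signatures by ID, then join them to file formats) should carry over to a LINQ-to-XML loader; the skipped cases are where most of the real work would be.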

@TonyValenti
Contributor

Yes. I definitely think this is useful.

Take a look at the existing code in the repo and how I deserialize the TRID signatures from xml. I would appreciate it if you followed a similar convention / process.

@TonyValenti
Contributor

Hi @tiesont . I just wanted to touch base and let you know that if you write the code, I will likely merge the PR.

@tiesont

tiesont commented Jun 17, 2022

> Hi @tiesont . I just wanted to touch base and let you know that if you write the code, I will likely merge the PR.

Sorry, other stuff has been keeping me busy. Still happy to take a crack at it, but it might be a little while.

@ArcticLampyrid

This may explain how to use DROID:
https://github.com/digital-preservation/droid/wiki/files/DPTP-01.pdf
