Wikimedia-Sverige/commons-diff

commons-diff

Process changes to file pages on Wikimedia Commons since a specific date.

Extracts changes made since the given date to:

  • a specific field in the information template
  • SDC captions
  • selected SDC statements

For example output, see example_output.json
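To illustrate what "changes to captions and selected statements" can mean in practice, here is a minimal Python sketch that compares two snapshots of a file's structured data. It is an assumption-laden illustration, not the tool's code: the function names `diff_captions` and `diff_statements`, the snapshot layout (plain dicts), and the shape of the result are all hypothetical and are not taken from example_output.json.

```python
def diff_captions(old, new):
    """Compare two {language: caption} dicts (hypothetical snapshot layout)."""
    changes = {}
    for lang in sorted(set(old) | set(new)):
        before, after = old.get(lang), new.get(lang)
        if before != after:
            # None means the caption did not exist on that side.
            changes[lang] = {"old": before, "new": after}
    return changes


def diff_statements(old, new, relevant):
    """Compare two {property: [values]} dicts, looking only at `relevant` properties."""
    changes = {}
    for prop in relevant:
        before = set(old.get(prop, []))
        after = set(new.get(prop, []))
        if before != after:
            changes[prop] = {
                "added": sorted(after - before),
                "removed": sorted(before - after),
            }
    return changes


# Invented sample data: state at the cutoff date versus the current state.
old_captions = {"sv": "Damskor"}
new_captions = {"sv": "Damskor", "en": "Women's shoes"}
old_sdc = {"P180": ["Q22676"], "P170": ["Q1"]}
new_sdc = {"P180": ["Q22676", "Q1000"], "P170": ["Q2"]}

caption_changes = diff_captions(old_captions, new_captions)
statement_changes = diff_statements(old_sdc, new_sdc, ["P180"])
```

Only properties listed in `relevant` are compared, so the change to P170 in the sample data is ignored, mirroring the idea of "selected" SDC statements.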

USAGE PARAMETERS

  • --cutoff 2023-01-10
    • Only process changes made since this date.
  • --list inputlist.txt
  • --category "Name of category on Commons"
    • Use either of these to specify which files to process. If using --list, the file must contain one file name per line, e.g.
Damskor - Nordiska museet - Nordiska kompaniet NK K3c 1 0134.tif
Damskor - Nordiska museet - Nordiska kompaniet NK K3c 1 0130.tif
  • --config configfile.json
    • This file defines which changes to extract. It must be structured like this:
{
"info_template": {"Nordiska museet Bildminnen image" : "description"},
"relevant_sdc": ["P180"]
}
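A config with this structure can be checked before use. The Python sketch below is one possible way to do that; `parse_config` is a hypothetical helper, and the rules it enforces (exactly one template mapped to one field, property IDs of the form "P" followed by digits, `relevant_sdc` defaulting to an empty list) are assumptions inferred from the example above rather than documented behaviour of the tool.

```python
import json


def parse_config(text):
    """Parse config JSON and return (template, field, sdc_properties)."""
    config = json.loads(text)

    info_template = config["info_template"]
    # Assumption: the mapping holds exactly one template name and one field name.
    if not isinstance(info_template, dict) or len(info_template) != 1:
        raise ValueError("info_template must map one template name to one field name")
    template, field = next(iter(info_template.items()))

    # Assumption: a missing relevant_sdc key means "no SDC statements to diff".
    relevant_sdc = config.get("relevant_sdc", [])
    for prop in relevant_sdc:
        # Property IDs look like "P180".
        if not (isinstance(prop, str) and prop.startswith("P") and prop[1:].isdigit()):
            raise ValueError(f"not a property ID: {prop!r}")

    return template, field, list(relevant_sdc)


EXAMPLE_CONFIG = """
{
"info_template": {"Nordiska museet Bildminnen image" : "description"},
"relevant_sdc": ["P180"]
}
"""

template, field, sdc_properties = parse_config(EXAMPLE_CONFIG)
```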

The config specifies three things:

  1. which information template to process,
  2. which field inside that template to process (it contains the descriptions to diff),
  3. which SDC statements to diff (P180 is "depicts").

  • --out outputfile.json
    • Optional. Name of the output file. If omitted, a generic timestamped filename is used.
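The parameters above could be modelled with Python's argparse roughly as follows. This is a sketch under stated assumptions, not the tool's actual argument handling: the program name, which options are required, the mutual exclusion of --list and --category, and the timestamped filename pattern in `default_output_name` are all hypothetical.

```python
import argparse
from datetime import date, datetime


def build_parser():
    """Build a parser mirroring the documented parameters (assumed semantics)."""
    parser = argparse.ArgumentParser(prog="commons-diff")
    # Assumption: the cutoff is an ISO date such as 2023-01-10.
    parser.add_argument("--cutoff", required=True, type=date.fromisoformat,
                        help="only process changes made since this date")
    # Assumption: exactly one of --list and --category must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--list", dest="list_file",
                        help="text file with one Commons file name per line")
    source.add_argument("--category", help="name of a category on Commons")
    parser.add_argument("--config", required=True,
                        help="JSON file defining which changes to extract")
    parser.add_argument("--out", help="optional name of the output file")
    return parser


def default_output_name(now):
    """Hypothetical timestamped fallback used when --out is not given."""
    return now.strftime("commons_diff_%Y%m%d_%H%M%S.json")


def read_file_list(lines):
    """Return non-empty, stripped file names from the lines of a --list file."""
    return [line.strip() for line in lines if line.strip()]


args = build_parser().parse_args(
    ["--cutoff", "2023-01-10", "--list", "inputlist.txt", "--config", "configfile.json"]
)
out_name = args.out or default_output_name(datetime(2023, 1, 10, 12, 0, 0))
```

Using a mutually exclusive group makes argparse reject a command line that passes both --list and --category, or neither.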
