14 validate against skohub shacl (#16)
* Use skohub shape to validate data

* Revert "Use skohub shape to validate data"

This reverts commit 2cec370.

* Use skohub shape to validate data #14

* Remove branches from github action #14

Action should run on every push not depending on branch

* Update changed files action version #14

* Run on changes of github action file (e.g. main.yml) #15

* Separate changed file action to find ttl and yml files #15

* WIP validate files with skohub shape #14

* WIP fix github environment variable

* WIP cat output to see why riot cries #14

* WIP disable workflow #14

* WIP Use pwd for providing full path in mount #14

* WIP Invalidate file for test purposes #14

* WIP cat out testfile #14

* WIP add riot validation step before validating with shacl #14

* WIP clean up a bit #14

* WIP reenable cleanup #14

* WIP GitHub Action not finding testfile. VM is torn down anyway, so it's OK

* WIP use correct SPARQL query to query result. We are checking for warnings and violations

* WIP Adjust path to query #14

* WIP Check only for violations. Warnings are also present in the SkoHub Vocabs output

* WIP make skos file valid turtle, but not valid skos for test purposes

* WIP test new matrix preparation

* WIP check for yml and ttl files in prepare matrix step #14

* WIP parse json output with jq #14

* WIP Testing: Just change a ttl file

* WIP Test: Just change README

* Cleanup #14

* Remove validation step #14

SkoHub Vocabs will already validate the vocabularies.
sroertgen authored Jan 30, 2024
1 parent e36407f commit 948f28c
Showing 6 changed files with 202 additions and 90 deletions.
94 changes: 62 additions & 32 deletions .github/workflows/main.yml
@@ -2,11 +2,6 @@ name: Build /public and deploy to gh-pages with docker container

 on:
   push:
-    branches:
-      - master
-      - main
-      - dev
-      - gh-pages
   workflow_dispatch:
     inputs:
       logLevel:
@@ -17,50 +12,85 @@ on:
         description: 'Test scenario tags'

 jobs:
+  all-ttl-files:
+    runs-on: ubuntu-latest
+    outputs:
+      matrix: ${{ steps.set-matrix.outputs.file-list }}
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v2
+
+      - name: Find TTL files
+        id: find-ttl
+        run: |
+          ttl_files=$(find . -name '*.ttl' -printf '"%p",' | sed 's/,$//')
+          echo "ttl_files=$ttl_files"
+          echo "ttl_files=$ttl_files" >> "$GITHUB_ENV"
+      - name: Set matrix for TTL files
+        id: set-matrix
+        run: echo "file-list=[${ttl_files}]" >> $GITHUB_OUTPUT
+
+      - name: List all changed files
+        run: echo "${ttl_files}"
+
   changedfiles:
     runs-on: ubuntu-latest
     outputs:
-      ttl: ${{ steps.set-matrix.outputs.ttl }}
+      ttl: ${{ steps.set-ttl-matrix.outputs.ttl }}
+      yml: ${{ steps.set-yml-matrix.outputs.yml }}
     steps:
       - uses: actions/checkout@v3
         with:
           fetch-depth: 2
-      - name: Get changed files
-        id: changed-files
-        uses: tj-actions/changed-files@v34
+      - name: Get changed ttl-files
+        id: changed-ttl-files
+        uses: tj-actions/changed-files@v42
         with:
           files: |
             **/*.ttl
-          json: "true"
-
-      - name: Set matrix
-        id: set-matrix
-        run: echo "ttl=${{ steps.changed-files.outputs.all_changed_files }}" >> $GITHUB_OUTPUT
+          json: true

-  validate:
-    runs-on: ubuntu-latest
-    needs: changedfiles
-    # only run if there are changed files
-    if: ${{needs.changedfiles.outputs.ttl != '[]'}}
-    strategy:
-      fail-fast: false # other validation jobs should continue checking even if one file is invalid
-      matrix:
-        file: ${{ fromJson(needs.changedfiles.outputs.ttl) }}
-    steps:
-      - uses: actions/checkout@v3
-      - name: echo changed files
-        run: echo "${{ matrix.file }}"
+      - name: Get changed yml-files
+        id: changed-yml-files
+        uses: tj-actions/changed-files@v42
+        with:
+          files: |
+            **/*.yml
+          json: true

-      - name: get shape
-        run: curl https://raw.githubusercontent.com/skohub-io/shapes/main/skos.shacl.ttl --output skos.shacl.ttl
+      - name: Set turtle file matrix
+        id: set-ttl-matrix
+        run: echo "ttl=${{ steps.changed-ttl-files.outputs.all_changed_files }}" >> $GITHUB_OUTPUT

-      - name: Validate with script
-        run: bash ${GITHUB_WORKSPACE}/scripts/validate-skos.sh -f ${{ matrix.file }}
+      - name: Set yml file matrix
+        id: set-yml-matrix
+        run: echo "yml=${{ steps.changed-yml-files.outputs.all_changed_files }}" >> $GITHUB_OUTPUT

+  prepare-matrix:
+    runs-on: ubuntu-latest
+    needs: [changedfiles, all-ttl-files]
+    outputs:
+      matrix: ${{ steps.set-matrix.outputs.matrix }}
+    steps:
+      - name: Determine matrix
+        id: set-matrix
+        run: |
+          changed_files_ttl=${{ toJSON(needs.changedfiles.outputs.ttl) }}
+          changed_files_yml=${{ toJSON(needs.changedfiles.outputs.yml) }}
+          if [ $(echo "$changed_files_ttl" | jq length) -gt 0 ]; then
+            echo "matrix=${{ toJSON(needs.changedfiles.outputs.ttl) }}" >> $GITHUB_OUTPUT
+          elif [ $(echo "$changed_files_yml" | jq length) -gt 0 ]; then
+            echo "matrix=${{ toJSON(needs.all-ttl-files.outputs.matrix) }}" >> $GITHUB_OUTPUT
+          else
+            echo "matrix=[]" >> $GITHUB_OUTPUT
+          fi
   build:
     runs-on: ubuntu-latest
-    needs: [changedfiles, validate]
+    needs: [prepare-matrix]
+    if: ${{ needs.prepare-matrix.outputs.matrix != '[]'}}
     steps:
       - name: Checkout 🛎️
         uses: actions/checkout@v2 # If you're using actions/checkout@v2 you must set persist-credentials to false in most cases for the deployment to work correctly.
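
The matrix selection in the `prepare-matrix` job of this workflow boils down to a three-way decision: changed Turtle files win, a change to a yml file alone triggers a build of every Turtle file, and anything else yields an empty matrix so that `build` is skipped. A minimal sketch of that rule as a standalone shell function follows. `choose_matrix` is a hypothetical name that does not appear in the commit, and the sketch compares against the literal string `[]` instead of calling `jq length`, on the assumption that an empty file list is always serialized exactly as `[]`.

```shell
# Hypothetical helper (not part of the commit): pick the build matrix.
# Arguments are JSON array strings:
#   $1 changed ttl files, $2 changed yml files, $3 all ttl files
# Assumption: an empty list is always serialized exactly as [].
choose_matrix() {
  local changed_ttl=$1 changed_yml=$2 all_ttl=$3
  if [ "$changed_ttl" != "[]" ]; then
    # Turtle files changed: build only those files.
    printf '%s\n' "$changed_ttl"
  elif [ "$changed_yml" != "[]" ]; then
    # Only yml files changed: build every Turtle file.
    printf '%s\n' "$all_ttl"
  else
    # Nothing relevant changed: empty matrix, the build job is skipped.
    printf '[]\n'
  fi
}
```

Keeping the rule in a function like this would make it checkable outside a workflow run; the workflow in this commit inlines the same logic in the `Determine matrix` step.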
1 change: 1 addition & 0 deletions README.md
@@ -97,3 +97,4 @@ jobs:

- In an earlier version, the .env variable `PATH_PREFIX` was set to point to the repository where the vocabulary is hosted. To align with the rest of the code, this was changed to `BASEURL`.
- The docker image now also supports i18n

2 changes: 1 addition & 1 deletion colors.ttl
@@ -11,7 +11,7 @@ colour: a skos:ConceptScheme ;
   skos:hasTopConcept colour:violet, colour:blue .

 colour:violet a skos:Concept ;
-  skos:prefLabel "Violett"@de, "violet"@en ;
+  skos:prefLabel "Violett"@de, "violet"@en;
   skos:altLabel "Lila"@de, "purple"@en ;
   skos:topConceptOf colour: .

22 changes: 0 additions & 22 deletions scripts/check-for-violation.rq

This file was deleted.

138 changes: 138 additions & 0 deletions scripts/validate-skos
@@ -0,0 +1,138 @@
#!/bin/bash
set -euo pipefail

scripts=$(realpath "$(dirname -- "$0")")
shape=$(realpath "$scripts/../skos.shacl.ttl")
severity=all
report=

usage() {
  echo "$0 [OPTION]... FILE"
  echo "Validate SKOS file (Turtle syntax). No return message means everything is fine."
  echo
  echo "Options:"
  echo " -s FILE shape file (default: $shape)"
  echo " -l LEVEL severity violation|warning|all (default: $severity)"
  echo " -o FILE keep full validation report in this file"
  echo " -r show raw validation report and exit"
  echo " -h show this help message"
  exit $1
}

die() {
  echo "$*" >&2
  exit 1
}

cleanup() {
  echo "Cleaning up"
  docker container stop validate-skos-fuseki > /dev/null
}

trap cleanup 0 2 3 15

while getopts s:l:o:rh flag
do
  case "${flag}" in
    s) shape=${OPTARG};;
    l) severity=${OPTARG};;
    o) result=${OPTARG};;
    r) report=1;;
    h) usage 0;;
    *) usage 1;;
  esac
done
shift $(($OPTIND - 1))

[ -z "${1:-}" ] && usage 1

file=$(realpath "$1")
[ -f "$file" ] || die "File not found: $file"
# create temporary testfile and make sure it gets deleted
testfile=$(mktemp /tmp/validate-script.XXXXXX)


shape=$(realpath "$shape")
[ -f "$shape" ] || die "File not found: $shape"
# add the skos definitions to the file if the shape is "skos.shacl.ttl"
if [ "$(basename "$shape")" = "skos.shacl.ttl" ]; then
  cat "$file" "$(realpath skosClassAndPropertyDefinitions.ttl)" > "$testfile"
else
  cat "$file" > "$testfile"
fi

grep -vE '^\s*(#.*)?$' "$file" >/dev/null || die "File contains no RDF statements: $file"

if [[ $severity == "warning" ]]; then
  SEVERITY_FILE="./scripts/checkForWarning.rq"
elif [[ $severity == "all" ]]; then
  SEVERITY_FILE="./scripts/checkForBoth.rq"
elif [[ $severity == "violation" ]]; then
  SEVERITY_FILE="./scripts/checkForViolation.rq"
else
  die "Unknown severity: $severity"
fi

# create temporary file (will be deleted in cleanup function)
if [[ -z "${result:-}" ]]; then
  result=$(mktemp /tmp/validate-script.XXXXXX)
else
  result=$(realpath "$result")
fi

# Check if the container is running
if docker ps | grep -q "validate-skos-fuseki"; then
  docker stop validate-skos-fuseki
  sleep 1
fi

# wait till fuseki is up
max_attempts=5
delay=3
attempt=1

echo "Starting validation container"

while [ $attempt -le $max_attempts ]; do
  # start fuseki
  docker run -d --rm --name validate-skos-fuseki -p 0:3030 -v $(pwd)/fuseki/config_inference.ttl:/fuseki/config_inference.ttl skohub/jena-fuseki:latest /jena-fuseki/fuseki-server --config /fuseki/config_inference.ttl > /dev/null
  port=$(docker port validate-skos-fuseki 3030/tcp | head -1 | awk -F: '{print $2}')
  sleep $delay
  curl "http://localhost:$port/$/ping" > /dev/null && break
  attempt=$((attempt + 1))
done

if [ $attempt -gt $max_attempts ]; then
  echo "The command has failed after $max_attempts attempts."
  exit 1
fi

# validate ttl file
riotResult="$(docker run --rm -v $testfile:/rdf/testfile.ttl skohub/jena:4.6.1 riot --validate /rdf/testfile.ttl)"
echo $?

# upload file
curl --request POST \
  --url "http://localhost:$port/dataset/data?graph=default" \
  --header 'Content-Type: text/turtle' \
  --data-binary @$testfile > /dev/null

# validate w/ shacl
curl --request POST \
  --url "http://localhost:$port/dataset/shacl?graph=default" \
  --header 'Content-Type: text/turtle' \
  --data-binary @$shape > "$result"

echo "Checking validation result"

if [[ "$report" -eq 1 ]]; then
  cat "$result"
else
  validationResult="$(docker run --rm -v $(pwd)/scripts/checkForViolation.rq:/rdf/checkForViolation.rq -v $result:/rdf/result.ttl skohub/jena:4.6.1 arq --data /rdf/result.ttl --query /rdf/checkForViolation.rq)"

  lines=$(echo "$validationResult" | wc -l )

  # Correct validation has 4 lines of output
  [[ ${lines} -eq 4 ]] || die "$validationResult"

fi
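
The startup loop in `scripts/validate-skos` launches the Fuseki container on every iteration and then polls the ping endpoint. Because the container is started with a fixed name each time, a second attempt would likely collide with the container from the first. One possible alternative, sketched here under that assumption, is to start the container once and repeat only the readiness check through a small generic helper. `retry` is a hypothetical function and is not part of the commit.

```shell
# Hypothetical helper (not part of the commit): run a command until it succeeds.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARG]...
# Returns 0 as soon as COMMAND succeeds, 1 after MAX_ATTEMPTS failures.
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    # Wait before the next attempt.
    sleep "$delay"
  done
}
```

With such a helper, the readiness check could be written as `retry 5 3 curl --silent --fail "http://localhost:$port/$/ping"` after a single `docker run`, followed by `|| die ...` to report the failure.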
35 changes: 0 additions & 35 deletions scripts/validate-skos.sh

This file was deleted.
