Extend the existing weekly GitHub Actions pipeline to include comprehensive data quality validation and reporting.
Implementation Plan
Shape Expressions Validation
- Add ShEx/SHACL validation to the weekly data generation pipeline (Saturday 08:00 UTC)
- Validate RDF structure, data completeness, and consistency
- Check reference integrity and identifier resolution
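The reference-integrity check above could be sketched roughly as follows. This is only an illustration using plain Python tuples in place of a real RDF library; the sample triples, the `ex:` identifiers, and the `find_dangling_references` helper are all hypothetical, and the actual pipeline would presumably run ShEx or SHACL shapes over the generated RDF instead.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def find_dangling_references(
    triples: Iterable[Triple], link_predicates: Set[str]
) -> List[Triple]:
    """Return triples that link to an object never described as a subject."""
    triples = list(triples)
    # Every node that has at least one statement about it.
    subjects = {s for s, _, _ in triples}
    return [t for t in triples if t[1] in link_predicates and t[2] not in subjects]


# Hypothetical sample data: ex:ke2 is referenced but never described.
sample = [
    ("ex:aop1", "ex:hasKeyEvent", "ex:ke1"),
    ("ex:aop1", "ex:hasKeyEvent", "ex:ke2"),
    ("ex:ke1", "ex:title", "Some key event"),
]
dangling = find_dangling_references(sample, {"ex:hasKeyEvent"})
```

Here `dangling` holds the single triple pointing at `ex:ke2`, which is the kind of result a validation step would report as a broken reference.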
Quality Metrics to Generate
- Coverage metrics: AOPs, Key Events, Key Event Relationships, Chemical Stressors
- Data completeness: Missing properties, empty values, required fields
- Consistency checks: Duplicate entities, conflicting data, invalid references
- Data freshness: Last update timestamps, source data age
- Reference integrity: Broken external links, missing identifiers (HGNC, ChEBI, etc.)
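As a rough sketch of how the completeness and consistency metrics might be computed, the example below works on a list of dictionaries rather than on RDF. The field names in `REQUIRED_FIELDS`, the record layout, and the `completeness_and_duplicates` helper are assumptions made for illustration, not part of the existing pipeline.

```python
from collections import Counter

# Assumed set of required properties; the real list would come from the shapes.
REQUIRED_FIELDS = ("id", "title")


def completeness_and_duplicates(records):
    """Summarize missing required fields and duplicate identifiers."""
    incomplete = []
    for record in records:
        # A field counts as missing if it is absent or empty.
        missing = [f for f in REQUIRED_FIELDS if not record.get(f)]
        if missing:
            incomplete.append({"id": record.get("id"), "missing": missing})

    counts = Counter(r.get("id") for r in records if r.get("id"))
    duplicate_ids = sorted(i for i, n in counts.items() if n > 1)

    total = len(records)
    return {
        "total": total,
        "complete_ratio": (total - len(incomplete)) / total if total else 1.0,
        "incomplete": incomplete,
        "duplicate_ids": duplicate_ids,
    }


# Hypothetical records: one empty title and one repeated identifier.
records = [
    {"id": "KE:1", "title": "A"},
    {"id": "KE:2", "title": ""},
    {"id": "KE:1", "title": "A again"},
]
metrics = completeness_and_duplicates(records)
```

Treating empty strings as missing keeps the "missing properties" and "empty values" checks in one pass; they could be reported separately if the dashboard needs the distinction.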
Output Format
- Generate quality-report.json alongside existing RDF files
- Include summary statistics, detailed validation results, and trend data
- Export as static files for consumption by SNORQL interface
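One possible shape for the report is sketched below. The key names (`generated_at`, `summary`, `validation_results`, `trend`) and the `build_report` helper are assumptions for illustration; the actual schema still needs to be agreed with the SNORQL side.

```python
import json
from datetime import datetime, timezone


def build_report(summary, validation_results, history=()):
    """Assemble a report dict with summary, details, and trend data."""
    generated_at = datetime.now(timezone.utc).isoformat()
    # Trend data: earlier summaries plus the current one, oldest first.
    trend = list(history) + [{"generated_at": generated_at, **summary}]
    return {
        "generated_at": generated_at,
        "summary": summary,
        "validation_results": validation_results,
        "trend": trend,
    }


# Hypothetical inputs standing in for real pipeline output.
report = build_report(
    {"aop_count": 2, "violations": 1},
    [{"shape": "KeyEventShape", "focus_node": "ex:ke2", "message": "missing title"}],
)
report_json = json.dumps(report, indent=2, sort_keys=True)
```

Writing `report_json` to `quality-report.json` next to the RDF files would then make it available as a static file.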
Integration Points
- Leverage existing data pipeline infrastructure
- Add quality gates to prevent deployment of poor-quality data
- Coordinate with aopwiki-snorql-extended repo for quality dashboard display
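A quality gate could be as simple as comparing report metrics against thresholds and failing the job when any limit is exceeded. The sketch below assumes hypothetical metric names and limits (`violations`, `broken_references`) and a hypothetical `evaluate_quality_gate` helper; suitable thresholds would need to be chosen from real data.

```python
def evaluate_quality_gate(summary, thresholds):
    """Return a list of human-readable failures; empty means the gate passes."""
    failures = []
    for name, limit in thresholds.items():
        value = summary.get(name)
        if value is None:
            # A metric absent from the report is treated as a failure.
            failures.append(f"{name}: metric missing from report")
        elif value > limit:
            failures.append(f"{name}: {value} exceeds limit {limit}")
    return failures


# Assumed thresholds, for illustration only.
THRESHOLDS = {"violations": 0, "broken_references": 5}

failures = evaluate_quality_gate({"violations": 2, "broken_references": 1}, THRESHOLDS)
exit_code = 1 if failures else 0
```

In a workflow step, passing `exit_code` to `sys.exit` would make the step fail on a non-zero value, so later deployment steps in the same job do not run.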
Cross-Repository Coordination
This issue coordinates with the SNORQL interface repository to create a comprehensive data quality solution. The SNORQL side will display the quality reports generated here.
Related: A corresponding issue will be created in aopwiki-snorql-extended for the display interface.