OWASP-BLT/OWASP-metadata

OWASP Metadata

A unified metadata aggregation system for OWASP projects and chapters. This project aims to unify and standardize data across the OWASP repository ecosystem without requiring major changes to existing repositories, by leveraging the existing Jekyll front matter in index.md files.

🌐 Live Demo

View the Live Dashboard →

Explore the metadata through the interactive web interface.

Purpose

The primary goal of this project is to:

  1. Aggregate Metadata: Collect and standardize metadata from OWASP repositories that use Jekyll-based index.md files with YAML front matter
  2. Enable Discovery: Power the OWASP Slack bot to guide new users toward projects they would be interested in based on their skills, interests, and location
  3. Provide Insights: Offer analytics and visualizations on metadata coverage across the OWASP ecosystem

How It Works

OWASP repositories typically include an index.md file with Jekyll front matter containing metadata such as:

---
title: Project Name
layout: col-sidebar
tags: security, web, tools
level: 3
type: tool
region: Global
pitch: A brief description of the project
---
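
As a rough illustration of how such a block can be read, here is a minimal sketch using only the Python standard library. The function name `extract_front_matter` is hypothetical and is not taken from `scripts/scrape_metadata.py`. It handles only flat `key: value` lines like the example above; nested YAML, lists, and multi-line values would need a real YAML parser.

```python
from typing import Dict


def extract_front_matter(text: str) -> Dict[str, str]:
    """Return the key/value pairs between the leading '---' delimiters.

    Hypothetical helper: handles flat 'key: value' lines only.
    """
    lines = text.lstrip("\ufeff").splitlines()
    # Front matter must start on the very first line of the file.
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter reached
            return fields
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and YAML comments
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the file as having no front matter


SAMPLE = """---
title: Project Name
layout: col-sidebar
tags: security, web, tools
level: 3
type: tool
region: Global
pitch: A brief description of the project
---

Body text of the page.
"""

metadata = extract_front_matter(SAMPLE)
print(metadata["title"])  # Project Name
print(metadata["tags"])   # security, web, tools
```

Values stay as strings here (for example `level` is `"3"`), so any type conversion is left to a later normalization step.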

This project:

  1. Scrapes all OWASP organization repositories via the GitHub API
  2. Extracts YAML front matter from each repository's index.md file
  3. Normalizes the data into consistent formats (CSV, JSON)
  4. Visualizes the data through a web-based explorer and analytics dashboard
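
Steps 1 and 2 above could be sketched against the GitHub REST API roughly as follows. This is an assumption-laden outline, not the code in `scripts/scrape_metadata.py`: the helper names, the `User-Agent` string, and the choice of `urllib` are all illustrative, and the organization name is passed in as a parameter. Only the REST endpoints and headers are GitHub's own.

```python
import json
import urllib.error
import urllib.request
from typing import Dict, Iterator, Optional

API_ROOT = "https://api.github.com"


def build_headers(token: Optional[str], raw: bool = False) -> Dict[str, str]:
    """Build request headers; the token is optional but raises rate limits."""
    accept = "application/vnd.github.raw+json" if raw else "application/vnd.github+json"
    headers = {"Accept": accept, "User-Agent": "metadata-scraper-sketch"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def repos_url(org: str, page: int, per_page: int = 100) -> str:
    """URL for one page of an organization's repository list."""
    return f"{API_ROOT}/orgs/{org}/repos?per_page={per_page}&page={page}"


def index_md_url(org: str, repo: str) -> str:
    """URL for the index.md file at the root of a repository."""
    return f"{API_ROOT}/repos/{org}/{repo}/contents/index.md"


def iter_repo_names(org: str, token: Optional[str]) -> Iterator[str]:
    """Yield repository names page by page until an empty page is returned."""
    page = 1
    while True:
        request = urllib.request.Request(repos_url(org, page), headers=build_headers(token))
        with urllib.request.urlopen(request) as response:
            batch = json.load(response)
        if not batch:
            return
        for repo in batch:
            yield repo["name"]
        page += 1


def fetch_index_md(org: str, repo: str, token: Optional[str]) -> Optional[str]:
    """Return the raw text of index.md, or None if the repository has none."""
    request = urllib.request.Request(index_md_url(org, repo), headers=build_headers(token, raw=True))
    try:
        with urllib.request.urlopen(request) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise


# Example use (performs network requests, so it is left commented out):
#   token = os.environ.get("GITHUB_TOKEN")
#   for name in iter_repo_names("OWASP", token):
#       text = fetch_index_md("OWASP", name, token)
```

Reading the token from the `GITHUB_TOKEN` environment variable matches the usage instructions in this README; everything else about how the real scraper issues requests is unknown here.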

Data Outputs

The scraper generates several data files in the data/ directory:

| File | Description |
|------|-------------|
| metadata.json | Complete metadata for all repositories in JSON format |
| metadata.csv | Full metadata in CSV format |
| metadata_matrix.json | Matrix showing which fields are present per repository |
| metadata_matrix.csv | Matrix in CSV format |
| metadata_summary.md | Summary of field usage across all repositories |
| metadata_checklist.csv | Checklist format for tracking metadata completeness |
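
To make the idea of a presence matrix and a field-usage summary concrete, here is a small sketch. The record layout (repository name mapped to its front matter fields) and the sample repository names are assumptions for illustration; the actual schema of the generated files may differ.

```python
from collections import Counter
from typing import Dict, List, Tuple

Records = Dict[str, Dict[str, str]]


def build_matrix(records: Records) -> Tuple[List[str], Dict[str, Dict[str, bool]]]:
    """Return the sorted field names and a repo -> field -> present? mapping."""
    fields = sorted({field for meta in records.values() for field in meta})
    matrix = {
        repo: {field: field in meta for field in fields}
        for repo, meta in records.items()
    }
    return fields, matrix


def field_usage(records: Records) -> Counter:
    """Count how many repositories define each field."""
    return Counter(field for meta in records.values() for field in meta)


# Hypothetical sample data, not real OWASP repositories.
records = {
    "www-project-example": {
        "title": "Example Project",
        "tags": "security, web",
        "level": "3",
        "type": "tool",
    },
    "www-chapter-example": {"title": "Example Chapter", "region": "Global"},
}

fields, matrix = build_matrix(records)
usage = field_usage(records)

print(fields)                                   # ['level', 'region', 'tags', 'title', 'type']
print(matrix["www-chapter-example"]["region"])  # True
print(matrix["www-chapter-example"]["level"])   # False
print(usage["title"])                           # 2
```

A matrix of booleans like this maps directly onto a CSV with one row per repository and one column per field, which is one plausible shape for a completeness checklist.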

Web Interface

The project includes multiple interactive web interfaces:

| Interface | Description | Link |
|-----------|-------------|------|
| Metadata Explorer | Interactive table for browsing, filtering, and searching repository metadata | View → |
| Analytics Dashboard | Visual analytics showing field usage, completeness rates, and trends | View → |
| Project Wayfinder | Visual diagram showing projects grouped by type and maturity level | View → |
| SDLC Integration Chart | Mermaid-based diagram mapping OWASP projects to SDLC phases | View → |

Features

  • 🌓 Dark/Light Theme Toggle - Switch between themes for comfortable viewing
  • 🔍 Advanced Filtering - Filter by project type, maturity level, and metadata fields
  • 📥 Export Functionality - Download data as CSV or diagrams as SVG
  • 📱 Responsive Design - Works on desktop and mobile devices
  • ⚡ Automated Updates - Data refreshes weekly via GitHub Actions

OWASP Slack Bot Integration

The standardized metadata from this project will be consumed by the OWASP Slack bot to:

  • Help new contributors find projects matching their skills and interests
  • Recommend relevant chapters based on user location
  • Provide quick access to project information and resources
  • Guide users to projects based on tags, type, and activity level
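
One way a consumer such as the bot might match interests against tags is sketched below. The record fields mirror the front matter example earlier in this README, but the function names, the scoring rule (count of overlapping tags), and the sample records are hypothetical; the bot's actual logic is not described here.

```python
from typing import Dict, Iterable, List, Optional, Set


def split_tags(raw: str) -> Set[str]:
    """Turn a comma-separated tag string into a set of lowercase tags."""
    return {tag.strip().lower() for tag in raw.split(",") if tag.strip()}


def recommend(
    records: Iterable[Dict[str, str]],
    interests: Iterable[str],
    project_type: Optional[str] = None,
    limit: int = 3,
) -> List[str]:
    """Return titles ranked by how many of the user's interests match their tags."""
    wanted = {interest.strip().lower() for interest in interests}
    scored = []
    for record in records:
        if project_type and record.get("type") != project_type:
            continue  # optional filter on project type
        overlap = len(wanted & split_tags(record.get("tags", "")))
        if overlap:
            scored.append((overlap, record.get("title", "")))
    # Highest overlap first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


# Hypothetical sample records for illustration only.
sample_records = [
    {"title": "Example Scanner", "type": "tool", "tags": "security, web, tools"},
    {"title": "Example Guide", "type": "documentation", "tags": "web, testing"},
    {"title": "Example Library", "type": "code", "tags": "crypto"},
]

print(recommend(sample_records, ["web", "security"]))
# ['Example Scanner', 'Example Guide']
print(recommend(sample_records, ["web"], project_type="documentation"))
# ['Example Guide']
```

Location-based chapter recommendations could follow the same pattern, filtering on a field such as `region` instead of `tags`.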

Project Structure

├── scripts/
│   └── scrape_metadata.py    # Main scraper script
├── data/                     # Generated metadata files
├── index.html                # Metadata explorer UI
├── charts.html               # Analytics dashboard
├── diagram.html              # Project Wayfinder diagram
├── mermaid-diagram.html      # SDLC integration diagram
├── app.js                    # Explorer application logic
├── charts.js                 # Analytics charts logic
├── diagram.js                # Project Wayfinder logic
├── mermaid-diagram.js        # SDLC diagram logic
├── mermaid.min.js            # Mermaid library for diagrams
├── styles.css                # Shared styles
└── charts.css                # Analytics-specific styles

Usage

Running the Scraper

# Set up environment
pip install -r requirements.txt

# Set GitHub token (optional, but recommended for higher rate limits)
export GITHUB_TOKEN=your_token_here

# Run the scraper
python scripts/scrape_metadata.py

Viewing the Data

Visit the Live Dashboard to explore the metadata interactively, or run locally by opening index.html in a browser.

For analytics and visualizations, visit the Analytics Dashboard.

To see how OWASP projects map to the Software Development Lifecycle, check out the SDLC Integration Chart.

Contributing

Contributions are welcome! This project helps improve metadata consistency across OWASP repositories. If you notice missing or inconsistent metadata in OWASP projects, consider contributing to those repositories by adding or updating their index.md front matter.

How to Contribute

  1. Fork this repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is part of the OWASP Foundation's open source initiatives and is licensed under the Apache License 2.0.


Made with ❤️ by the OWASP BLT Project
