Engineering Impact Dashboard

Deep analytical profiling of engineering contributors for any GitHub repository. Quantifies impact using six behavioral traits grounded in peer-reviewed research, assigns persona archetypes, and visualizes collaboration through a knowledge graph.

Live demo: https://posthog-eng-impact-dashboard.vercel.app

Motivation

Traditional engineering metrics — lines of code, commit counts, PR throughput — reward volume over value. A developer who ships 500 lines of throwaway code looks more productive than one who writes 50 lines that survive for years. Teams relying on these numbers make poor staffing decisions, misidentify bottlenecks, and lose their quiet force-multipliers.

This project replaces vanity metrics with behavioral signals that answer the questions engineering leads actually care about: Whose code sticks around after the first week? Who makes everyone else's code better through reviews? Who holds institutional knowledge across system boundaries? Run the pipeline against your own repository before a reorg, a performance cycle, or a hiring plan to see where impact actually lives. The approach is backed by the same research (DORA/Accelerate, GitClear, Bosu et al.) used by teams at Google, Microsoft, and Spotify to measure what matters.

Architecture

pipeline/                          web/
┌─────────────┐                    ┌──────────────────────┐
│ extract.py  │──→ raw_data/       │ Next.js 16 (App      │
│  (GitHub    │    extracted.json  │ Router, SSR)         │
│   GraphQL)  │                    │                      │
├─────────────┤                    │ Components:          │
│ sanitize.py │──→ cache/          │  Dashboard           │
│  (bot       │    reviewer_       │  KnowledgeGraph      │
│   detection)│    classifications │  ContributorList     │
├─────────────┤                    │  ProfileModal        │
│ analyze.py  │──→ computed/       │  ResearchModal       │
│  (6 traits, │    traits.json     │  MiniRadar/FullRadar │
│   K-means)  │                    │  StatsBar            │
├─────────────┤                    └──────────┬───────────┘
│ refine_     │──→ traits.json                │
│ personas.py │    (in-place)                 │
├─────────────┤                               │
│ graph.py    │──→ web/public/data/ ──────────┘
│  (nodes +   │    analysis.json
│   edges)    │    (read at build time)
└─────────────┘

Pipeline scripts (run in order)

| Script | Purpose | Output |
| --- | --- | --- |
| `extract.py` | Fetches git log, PR reviews, issues, profiles via GitHub GraphQL/REST | `raw_data/extracted.json` (~4 MB) |
| `sanitize.py` | 4-layer bot detection via GitHub API (account type, bio regex, /apps/, ghost) | `cache/reviewer_classifications.json` |
| `analyze.py` | Computes 6 traits per contributor with git blame sampling, PageRank, Shannon entropy | `computed/traits.json` |
| `refine_personas.py` | Two-pass persona correction: hard rules + soft signature scoring | Updates `traits.json` in-place |
| `graph.py` | Builds knowledge graph: co-authorship (0.40) + reviews (0.35) + Jaccard files (0.25) | `web/public/data/analysis.json` |
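
The edge weighting listed for `graph.py` can be illustrated with a short sketch. The weights (0.40, 0.35, 0.25) come from the table above; the function names, the argument shapes, and the assumption that the co-authorship and review signals are already scaled to [0, 1] are hypothetical and may differ from the actual script.

```python
# Weights as listed for graph.py in the pipeline table.
W_COAUTHOR, W_REVIEW, W_FILES = 0.40, 0.35, 0.25


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def edge_weight(coauthor_score: float, review_score: float,
                files_a: set[str], files_b: set[str]) -> float:
    """Blend three pairwise signals into a single edge weight.

    Assumption: coauthor_score and review_score are pre-normalized to [0, 1].
    """
    return (W_COAUTHOR * coauthor_score
            + W_REVIEW * review_score
            + W_FILES * jaccard(files_a, files_b))
```

Because the three weights sum to 1, two contributors with maximal scores on every signal get an edge weight of 1.0, and a pair that only shares files is capped at 0.25.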

Six behavioral traits

| Trait | Method | Research basis |
| --- | --- | --- |
| Code Survivability | 14-day churn window via git blame | GitClear (2024), 211M lines |
| Collaboration Index | Weighted composite: reviews + co-authors + cross-scope + issues | Bosu et al. (2015), MSR/IEEE |
| System Breadth | Shannon entropy H(D)/H_max over directory domains | Shannon (1948) |
| Focus Depth | Gini coefficient of commit distribution | Vasa et al. (2009), IEEE |
| Review Influence | PageRank (d=0.85) on reviewer→author graph | Brin & Page (1998) |
| Velocity Consistency | 1 - CV(weekly_commits) | Forsgren et al. (2018), DORA/Accelerate |
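
To make the formulas in the table concrete, here is a minimal standard-library sketch of four of the traits. It is not the code in `analyze.py`: the function names and input shapes are hypothetical, H_max is assumed to be the log of the number of distinct directories a contributor touched, CV is assumed to use the population standard deviation, and the consistency score is assumed to be clamped at 0.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def system_breadth(commit_dirs: list[str]) -> float:
    """Normalized Shannon entropy H(D)/H_max over touched directories."""
    counts = Counter(commit_dirs)
    if len(counts) < 2:
        return 0.0  # a single domain carries no entropy
    total = sum(counts.values())
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / math.log2(len(counts))  # assumption: H_max = log2(#domains seen)


def focus_depth(commits_per_area: list[int]) -> float:
    """Gini coefficient: 0 for an even spread, (n-1)/n when fully concentrated."""
    xs = sorted(commits_per_area)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n


def velocity_consistency(weekly_commits: list[int]) -> float:
    """1 - CV(weekly_commits), clamped at 0 (the clamp is an assumption)."""
    mu = mean(weekly_commits) if weekly_commits else 0
    if mu == 0:
        return 0.0
    return max(0.0, 1.0 - pstdev(weekly_commits) / mu)


def pagerank(edges: list[tuple[str, str]], d: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """PageRank by power iteration over directed (reviewer, author) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    if not nodes:
        return {}
    out: dict[str, list[str]] = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Nodes without outgoing edges spread their rank uniformly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        nxt = {n: (1 - d + d * dangling) / len(nodes) for n in nodes}
        for src in nodes:
            for dst in out[src]:
                nxt[dst] += d * rank[src] / len(out[src])
        rank = nxt
    return rank
```

For example, a contributor whose commits split evenly across two directories gets a breadth of 1.0, while one who commits exactly five times every week gets a velocity consistency of 1.0.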

Five persona archetypes

The Architect — high breadth + high survivability + feature-dominant
The Mentor — high review influence + high collaboration
The Firefighter — bursty velocity + fix-dominant
The Solo Maker — high focus + low collaboration + low review
The Operator — high velocity + chore/CI significant
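
One way to read the archetype definitions above is as signed trait signatures. The sketch below scores a contributor against each signature and picks the best match. It is illustrative only: the key names, the [0, 1] scaling of every input, the equal weighting, and the mapping of "high velocity" and "bursty velocity" onto a consistency score are assumptions, not the logic in `analyze.py` or `refine_personas.py`.

```python
# +1 means "high is a match", -1 means "low is a match".
# All keys are hypothetical names for trait scores and commit-type shares.
SIGNATURES: dict[str, dict[str, int]] = {
    "The Architect": {"breadth": 1, "survivability": 1, "feature_share": 1},
    "The Mentor": {"review_influence": 1, "collaboration": 1},
    "The Firefighter": {"velocity_consistency": -1, "fix_share": 1},
    "The Solo Maker": {"focus": 1, "collaboration": -1, "review_influence": -1},
    "The Operator": {"velocity_consistency": 1, "chore_share": 1},
}


def signature_score(profile: dict[str, float], signature: dict[str, int]) -> float:
    """Average agreement between a profile and a signature, in [0, 1]."""
    parts = [
        profile.get(key, 0.0) if sign > 0 else 1.0 - profile.get(key, 0.0)
        for key, sign in signature.items()
    ]
    return sum(parts) / len(parts)


def assign_persona(profile: dict[str, float]) -> str:
    """Return the archetype whose signature best matches the profile."""
    return max(SIGNATURES, key=lambda name: signature_score(profile, SIGNATURES[name]))
```

A profile with high review influence and collaboration and middling values elsewhere resolves to The Mentor under these assumptions.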

Frontend stack

Next.js 16 with App Router (server-side rendering)
React 19, TypeScript
recharts (radar charts with percentage tooltips)
react-force-graph-2d (knowledge graph with persona-colored nodes)
Tailwind CSS v4 (dark theme)

Prerequisites

Python 3.10+
Node.js 22+ with pnpm
GitHub personal access token with repo scope

Local setup

1. Pipeline

cd pipeline

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Set GitHub token
export GITHUB_TOKEN=ghp_your_token_here

# Run extraction (slow — GitHub API calls + git log parsing)
python extract.py

# Sanitize bot accounts
python sanitize.py

# Compute traits (slow — git blame sampling)
python analyze.py

# Refine persona assignments
python refine_personas.py

# Build knowledge graph → web/public/data/analysis.json
python graph.py
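
If you prefer a single command to five, the steps above can be wrapped in a small driver. This is a convenience sketch, not a file shipped with the repository: the function name `run_pipeline` and its `dry_run` flag are hypothetical. It assumes it is run from inside `pipeline/` with the virtual environment active.

```python
import os
import subprocess
import sys
from pathlib import Path

# Stage order taken from the steps above.
STAGES = ["extract.py", "sanitize.py", "analyze.py", "refine_personas.py", "graph.py"]


def run_pipeline(pipeline_dir: str = ".", dry_run: bool = False) -> list[list[str]]:
    """Run every stage in order and stop at the first failure.

    With dry_run=True the commands are only built and returned, not executed.
    """
    commands = [[sys.executable, stage] for stage in STAGES]
    if dry_run:
        return commands
    if not os.environ.get("GITHUB_TOKEN"):
        raise RuntimeError("GITHUB_TOKEN is not set")
    for cmd in commands:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, cwd=Path(pipeline_dir), check=True)
    return commands
```

Calling `run_pipeline(dry_run=True)` first is a cheap way to confirm the stage order before starting the slow extraction and blame-sampling steps.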

2. Frontend

cd web

# Install dependencies
pnpm install

# Development server
pnpm dev
# → http://localhost:3000

# Production build
pnpm build
pnpm start

3. Deploy to Vercel

cd web

# Link project (one-time)
vercel link --project posthog-eng-impact-dashboard

# Build locally and deploy (avoids uploading monorepo)
vercel build --prod
vercel deploy --prebuilt --prod

Project structure

eng-impact-dashboard/
├── pipeline/
│   ├── extract.py                 # GitHub data extraction
│   ├── sanitize.py                # Bot detection & classification
│   ├── analyze.py                 # 6-trait computation engine
│   ├── refine_personas.py         # Persona correction pass
│   ├── graph.py                   # Knowledge graph builder
│   ├── requirements.txt           # Python dependencies
│   ├── cache/                     # API response cache (gitignored)
│   ├── computed/                  # Trait results (gitignored)
│   └── raw_data/                  # Extracted data (gitignored)
└── web/
    ├── public/data/analysis.json  # Final output consumed by frontend
    ├── src/
    │   ├── app/                   # Next.js pages + layout
    │   ├── components/            # React components
    │   └── types.ts               # TypeScript interfaces
    ├── next.config.js
    ├── package.json
    └── tsconfig.json
