166 changes: 166 additions & 0 deletions docs/get-project-field-info.py
@@ -0,0 +1,166 @@
#!/usr/bin/env python3
"""
GitHub Project GraphQL Helper Script.

This script retrieves project field information from GitHub Projects using the GraphQL API.
It fetches custom fields, their IDs, and options for single-select fields, excluding
non-project fields.
"""

import requests
import json
import argparse
from typing import Dict, List, Tuple, Any
from pprint import pprint


def get_project_info(org: str, project_number: int, token: str) -> Tuple[str, Dict[str, Any]]:
    """
    Retrieve project information and custom fields from GitHub Projects.

    Args:
        org: GitHub organization name
        project_number: GitHub project number (integer)
        token: GitHub personal access token with appropriate permissions

    Returns:
        Tuple containing:
        - project_id: The GitHub project ID (string)
        - fields: Dictionary mapping field names to field configurations

    Raises:
        requests.RequestException: If the HTTP request fails
        KeyError: If the expected data structure is not found in the response
    """
    headers = {"Authorization": f"Bearer {token}"}

    query = '''
    query($org: String!, $number: Int!) {
        organization(login: $org){
            projectV2(number: $number) {
                id
            }
        }
    }
    '''

    variables = {
        "org": org,
        "number": int(project_number),
    }

    data = {
        "query": query,
        "variables": variables,
    }

    response = requests.post("https://api.github.com/graphql", headers=headers, json=data)
    response.raise_for_status()  # Raise exception for bad status codes
    response_json = json.loads(response.text)

    project_id = response_json['data']['organization']['projectV2']['id']

    query = '''
    query($node: ID!){
        node(id: $node) {
            ... on ProjectV2 {
                fields(first: 20) {
                    nodes {
                        ... on ProjectV2Field {
                            id
                            name
                        }
                        ... on ProjectV2IterationField {
                            id
                            name
                            configuration {
                                iterations {
                                    startDate
                                    id
                                }
                            }
                        }
                        ... on ProjectV2SingleSelectField {
                            id
                            name
                            options {
                                id
                                name
                            }
                        }
                    }
                }
            }
        }
    }
    '''

    variables = {
        "node": project_id,
    }

    data = {
        "query": query,
        "variables": variables,
    }

    fields_response = requests.post("https://api.github.com/graphql", headers=headers, json=data)
    fields_response.raise_for_status()  # Raise exception for bad status codes
    fields_response_json = json.loads(fields_response.text)

    # Standard GitHub project fields that should be excluded as they are not controlled by the projectv2 API
    not_project_fields = ['Title', 'Assignees', 'Labels', 'Linked pull requests', 'Reviewers', 'Repository', 'Milestone', 'Tracks']

    # Filter out standard project fields and create field mappings
    field_names = [
        {'name': field['name'], 'id': field['id']}
        for field in fields_response_json['data']['node']['fields']['nodes']
        if field['name'] not in not_project_fields
    ]

    fields = {
        field['name']: field
        for field in fields_response_json['data']['node']['fields']['nodes']
        if field['name'] not in not_project_fields
    }

    # Process options for single-select fields
    for field in fields.values():
        if 'options' in field:
            field['options'] = {option['name']: option['id'] for option in field['options']}

    return project_id, fields


def main() -> None:
    """
    Main function to parse command line arguments and execute the script.

    Parses command line arguments for GitHub organization, project number,
    and personal access token, then retrieves and displays project field information.
    """
    # Set up argument parser
    parser = argparse.ArgumentParser(description='GitHub Project GraphQL Helper')
    parser.add_argument('--token', '-t', required=True, help='GitHub personal access token')

Taking the token as plaintext on the command line is unsafe. It risks leaving the token behind in shell histories and making it visible in the output of process listings like ps aux.

Could you please rework this to instead look for the GH_TOKEN environment variable unconditionally?

Like this:

token = os.environ["GH_TOKEN"]

It'd also be helpful to recommend the gh CLI in the instructions, to further discourage people from manually copying and pasting long-lived tokens.

Something like this:

gh auth login
export GH_TOKEN=$(gh auth token)

python ./docs/get-project-field-info.py --org rapidsai --project 128
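
A minimal sketch of how the environment lookup could fail with a readable message instead of a bare `KeyError`. The helper name `read_token` is hypothetical and not part of the script under review; only the `GH_TOKEN` variable and the `gh` commands come from the comment above.

```python
import os
from typing import Mapping


def read_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the GitHub token from GH_TOKEN, or exit with a hint.

    `read_token` is a hypothetical helper used for illustration only.
    """
    token = env.get("GH_TOKEN", "").strip()
    if not token:
        # SystemExit with a string prints the message and exits non-zero.
        raise SystemExit(
            "GH_TOKEN is not set. Run `gh auth login`, then "
            "`export GH_TOKEN=$(gh auth token)`."
        )
    return token
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real process environment.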

    parser.add_argument('--org', '-o', required=True, help='GitHub organization name')
    parser.add_argument('--project', '-p', required=True, type=int, help='GitHub project number')

    args = parser.parse_args()

    try:
        # Use the provided arguments
        project_id, fields = get_project_info(org=args.org, project_number=args.project, token=args.token)

        pprint(project_id)
        pprint(fields)

    except requests.RequestException as e:
        print(f"HTTP request failed: {e}")
    except (KeyError, ValueError) as e:
        print(f"Data processing error: {e}")
    except Exception as e:
        print(f"Unexpected error: {e}")


if __name__ == "__main__":
    main()
143 changes: 143 additions & 0 deletions docs/project-automation-readme.md
@@ -0,0 +1,143 @@
# RAPIDS Project Automation

## Overview

This collection of GitHub Actions workflows automates the management of GitHub Projects, helping track development progress across RAPIDS repositories.

The automations can also copy field values from a PR to its linked issues, such as status, release information, and development stage, so project tracking stays consistent without manual updates.

## 🔍 The Challenge

GitHub Projects are powerful, but automating them is not straightforward:

- Each item has a **project-specific ID** that's different from its global node ID
- Different field types (text, date, single-select, iteration) require different GraphQL mutations
- Linking PRs to issues and keeping them in sync requires complex operations
- Work often begins in an Issue, but then is completed in a PR, requiring this synchronization
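
The point about field types can be made concrete. The following is a minimal sketch, assuming the `updateProjectV2ItemFieldValue` mutation, whose `value` input takes exactly one of the keys `text`, `number`, `date`, `singleSelectOptionId`, or `iterationId`. The helper name `build_field_value` and the `kind` labels are hypothetical and do not appear in the shared workflows.

```python
from typing import Any, Dict

# Maps an illustrative field "kind" (hypothetical labels) to the key expected
# inside the `value` input of updateProjectV2ItemFieldValue.
_VALUE_KEYS = {
    "text": "text",
    "number": "number",
    "date": "date",  # ISO date string, e.g. "2025-01-31"
    "single_select": "singleSelectOptionId",
    "iteration": "iterationId",
}


def build_field_value(kind: str, raw: Any) -> Dict[str, Any]:
    """Build the `value` payload for one project field update."""
    if kind not in _VALUE_KEYS:
        raise ValueError(f"unsupported field kind: {kind!r}")
    if kind == "number":
        raw = float(raw)  # the API expects a numeric value here, not a string
    return {_VALUE_KEYS[kind]: raw}
```

For a single-select field, `raw` is the option ID (not the option name), which is why the option IDs have to be looked up ahead of time.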

### PRs can be linked to multiple issues, and issues can be linked to multiple PRs, which complicates automation

The PR-to-Issue linkage exists within the PR's API, but not within the Issue's API. Additionally, if several PRs are linked to an issue, merging _any_ of them closes the issue.

The shared workflows follow a similar pattern: when `update-linked-issues` runs, it updates every issue linked to the PR that is in the same project as the target workflow (i.e., issues in multiple projects only have the matching project's fields updated).
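
A rough sketch of that filtering step, assuming the PR's linked issues are read through the `closingIssuesReferences` connection and each issue's `projectItems`. The query text, page sizes, and the helper name `linked_item_ids` are illustrative assumptions, not the contents of `project-update-linked-issues.yaml`.

```python
from typing import Any, Dict, List

# Assumed query shape: PR -> linked issues -> each issue's project items.
LINKED_ISSUES_QUERY = """
query($pr: ID!) {
  node(id: $pr) {
    ... on PullRequest {
      closingIssuesReferences(first: 10) {
        nodes {
          projectItems(first: 10) {
            nodes { id project { id } }
          }
        }
      }
    }
  }
}
"""


def linked_item_ids(response: Dict[str, Any], project_id: str) -> List[str]:
    """Return project item IDs of linked issues that belong to `project_id`."""
    issues = response["data"]["node"]["closingIssuesReferences"]["nodes"]
    return [
        item["id"]
        for issue in issues
        for item in issue["projectItems"]["nodes"]
        if item["project"]["id"] == project_id  # skip items in other projects
    ]
```

Issues that sit in several projects contribute only the item belonging to the target project, which matches the behavior described above.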

## 🧩 Workflow Architecture

By centralizing this logic in reusable workflows, all RAPIDS repositories maintain consistent project tracking without duplicating code.

Using a high-level tracker as an example, the automation is built as a set of reusable workflows that handle different aspects of project management (♻️ denotes a shared workflow):

```mermaid
flowchart LR
A[PR Event] --> B[project-high-level-tracker]
B --> E["get-project-id ♻️<br/>project-get-item-id.yaml"]
E --> H["update-week-field ♻️<br/>project-get-set-iteration-field.yaml"]
E --> I["set-opened-date-field ♻️<br/>project-set-text-date-numeric-field.yaml"]
E --> J["set-closed-date-field ♻️<br/>project-set-text-date-numeric-field.yaml"]
H --> K["update-linked-issues ♻️<br/>project-update-linked-issues.yaml"]
I --> K
J --> K
```

## 📁 Workflows

### Core Workflows

1. **project-get-item-id.yaml**
Gets the project-specific ID for an item (PR or issue) within a project - a critical first step for all operations.

2. **project-set-text-date-numeric-field.yaml**
Updates text, date, or numeric fields in the project.

3. **project-get-set-single-select-field.yaml**
Handles single-select (dropdown) fields like Status, Stage, etc.

4. **project-get-set-iteration-field.yaml**
Manages iteration fields for sprint/week tracking.

### Support Workflow

5. **project-update-linked-issues.yaml**
Synchronizes linked issues with the PR's field values.<br>
Unfortunately, we cannot go from Issue -> PR; the linkage exists only within the PR's API.


## 📋 Configuration

### Project Field IDs

The workflows require GraphQL node IDs for the project and its fields. These are provided as inputs:

```yaml
inputs:
  PROJECT_ID:
    description: "Project ID"
    default: "PVT_placeholder" # PVT = projectV2
  WEEK_FIELD_ID:
    description: "Week Field ID"
    default: "PVTIF_placeholder"
  # ... other field IDs
```

### Getting Project and Field IDs

To gather the required GraphQL node IDs for your project setup, use the included helper script:

```bash
python docs/get-project-field-info.py --token YOUR_GITHUB_TOKEN --org PROJECT_ORG_NAME --project PROJECT_NUMBER
```

For example, using the cuDF Python project (https://github.com/orgs/rapidsai/projects/128):
```bash
python docs/get-project-field-info.py --token ghp_placeholder --org rapidsai --project 128
```

The script will output:
1. The Project ID (starts with `PVT_`) - use this for the `PROJECT_ID` input
2. A dictionary of all fields in the project with their IDs and metadata:
   - Regular fields (text, date, etc.) have IDs starting with `PVTF_`
   - Single-select fields have IDs starting with `PVTSSF_` and include their options, mapped from option name to option ID
   - Iteration fields have IDs starting with `PVTIF_` and include configuration details

Example output:
```text
PVT_kwDOAp2shc4AiNzl

'Release': {'id': 'PVTSSF_lADOAp2shc4AiNzlzgg52UQ',
            'name': 'Release',
            'options': {'24.12': '582d2086',
                        '25.02': 'ee3d53a3',
                        '25.04': '0e757f49',
                        'Backlog': 'be6006c4'}},
'Status': {'id': 'PVTSSF_lADOAp2shc4AiNzlzgaxNac',
           'name': 'Status',
           'options': {'Blocked': 'b0c2860f',
                       'Done': '98236657',
                       'In Progress': '47fc9ee4',
                       'Todo': '1ba1e8b7'}},
...
```

Use these IDs to populate the workflow input variables in your caller workflow.
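
As a convenience, the script's return values could be turned into input names mechanically. This is only a sketch: the `<NAME>_FIELD_ID` naming is an assumption extrapolated from `WEEK_FIELD_ID` above, and `to_workflow_inputs` is a hypothetical helper, so check the generated names against the inputs your caller workflow actually declares.

```python
import re
from typing import Any, Dict


def to_workflow_inputs(project_id: str, fields: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
    """Map the helper script's output to candidate workflow input names.

    The `<NAME>_FIELD_ID` convention is assumed, not confirmed by the workflows.
    """
    inputs = {"PROJECT_ID": project_id}
    for name, field in fields.items():
        # "Opened Date" -> "OPENED_DATE"; any non-alphanumeric run becomes "_".
        key = re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_").upper()
        inputs[f"{key}_FIELD_ID"] = field["id"]
    return inputs
```

Single-select option IDs are not included here; they can be read directly from the script's output, for example `fields['Status']['options']['Done']`.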

## 📊 Sample Use Case

### Tracking PR Progress

The workflow automatically tracks where a PR is in the development lifecycle:

1. When a PR is opened against a release branch, it's tagged with the release
2. The stage is updated based on the current date relative to release milestones
3. Week iteration fields are updated to show the current sprint
4. All linked issues inherit these values
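
Step 3 can be sketched as follows, assuming iterations shaped like the `configuration.iterations` entries returned by the helper script's query (`id` plus an ISO `startDate`). The function name `current_iteration_id` is hypothetical, and the rule used here (the latest iteration that has already started) is a simplification that ignores iteration length, which that query does not fetch.

```python
from datetime import date
from typing import Dict, List, Optional


def current_iteration_id(iterations: List[Dict[str, str]], today: date) -> Optional[str]:
    """Pick the most recently started iteration on or before `today`."""
    started = [
        it for it in iterations
        if date.fromisoformat(it["startDate"]) <= today
    ]
    if not started:
        return None  # every known iteration starts in the future
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return max(started, key=lambda it: it["startDate"])["id"]
```

The returned ID is the value an iteration field update would need; when `None` comes back, the field is best left untouched.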


## 📚 Related Resources

- [GitHub GraphQL API Documentation](https://docs.github.com/en/graphql)
- [GitHub Projects API](https://docs.github.com/en/issues/planning-and-tracking-with-projects/automating-your-project/using-the-api-to-manage-projects)

---

> [!Note]
> These workflows are designed for the RAPIDS ecosystem but can be adapted for any organization using GitHub Projects for development tracking.