This is an app where Rohan's students can submit their ongoing projects and get tailored feedback from an LLM according to his rubric. It uses the GitHub API and Claude Opus to crawl and evaluate repository content (markdown files, code, repo structure, etc.) and provide detailed feedback.
I used Claude for a lot of the debugging in this project, and also for navigating the previously uncharted waters of JavaScript/Flask/Render.
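At its core, the flow is: pull file contents from a student's repo via the GitHub API, put them in a prompt alongside the rubric, and ask Claude for feedback. The snippet below is a minimal sketch of that idea rather than the app's actual code; the GitHub and Anthropic calls are real APIs, but the prompt, model choice, and function names are illustrative.

```python
# Minimal sketch of the core idea (not the app's actual code):
# fetch a file from a repo via the GitHub API, then ask Claude to grade it.
import base64
import requests
import anthropic

GITHUB_API_TOKEN = "your_github_token_here"

def fetch_readme(owner: str, repo: str) -> str:
    """Fetch a repo's README via the GitHub contents API."""
    resp = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/readme",
        headers={"Authorization": f"token {GITHUB_API_TOKEN}"},
    )
    resp.raise_for_status()
    return base64.b64decode(resp.json()["content"]).decode("utf-8")

def grade(text: str, criteria: str) -> str:
    """Ask Claude Opus for feedback on one rubric criterion."""
    client = anthropic.Anthropic(api_key="your_anthropic_key_here")
    message = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"Grade the following file against this criterion:\n"
                       f"{criteria}\n\n---\n{text}",
        }],
    )
    return message.content[0].text
```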
- Anthropic account for Claude API access
- Render account for deployment
- Clone the repository:

  ```bash
  git clone https://github.com/michaeladrouillard/grader.git
  cd grader
  ```

- Create a virtual environment and install dependencies:

  ```bash
  python -m venv venv
  source venv/bin/activate  # on Windows, use: venv\Scripts\activate
  pip install -r requirements.txt
  ```
- GitHub API key:
  - Go to GitHub Settings > Developer settings > Personal access tokens > Tokens (classic)
  - Generate a new token with the `repo` scope
- Anthropic API key:
  - Go to the Anthropic Console
  - Generate an API key
- Create a `config.py` file in the `src` directory:

  ```python
  GITHUB_API_TOKEN = 'your_github_token_here'
  ANTHROPIC_API_KEY = 'your_anthropic_key_here'
  ```

  You can also read these from environment variables via `os` so you don't have to hardcode them here (see the sketch just after this list). Either way, make sure `config.py` is in your `.gitignore`.
- Test locally:

  ```bash
  python src/repo_grader.py https://github.com/username/repositoryThatYouWantToEvaluateUsingThisApp
  ```
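If you prefer not to hardcode keys, a `config.py` along these lines reads them from environment variables instead; this fallback pattern is a suggestion, not what the repo ships with. The variable names match the ones you will add in the Render dashboard below, so the same code works locally and when deployed.

```python
# src/config.py -- sketch that prefers environment variables and falls back
# to placeholder values for quick local testing (replace or export real keys).
import os

GITHUB_API_TOKEN = os.environ.get("GITHUB_API_TOKEN", "your_github_token_here")
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", "your_anthropic_key_here")
```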
- Create a new Web Service on Render:
  - Go to the Render Dashboard
  - Click "New +" and select "Web Service"
  - Connect your GitHub repository
- Configure the Web Service:
  - Name: choose a name (e.g., "paper-grader")
  - Environment: Python 3
  - Build Command: `pip install -r requirements.txt`
  - Start Command: `python app.py`
  - Environment Variables: `GITHUB_API_TOKEN` (your GitHub token) and `ANTHROPIC_API_KEY` (your Anthropic API key)
- Deploy your service:
  - Click "Create Web Service"
  - Wait for deployment to complete
- Update the API endpoint in `docs/js/grader.js`:

  ```javascript
  const API_URL = 'https://your-render-service-name.onrender.com/api/grade';
  ```

- Deploy the frontend to GitHub Pages:
  - Go to repository Settings > Pages
  - Set the source to GitHub Actions
  - Commit and push your changes
  - Wait for the GitHub Action to complete
The grader should now be accessible at https://yourusername.github.io/grader!
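Before pointing students at the page, it can help to smoke-test the deployed backend directly. The request shape below is an assumption (I am guessing the endpoint takes a JSON body with the repo URL; check `app.py` for the actual field name and method):

```python
# Hypothetical smoke test for the deployed backend.
# The "repo_url" payload key is an assumption -- check app.py for the real schema.
import requests

API_URL = "https://your-render-service-name.onrender.com/api/grade"

resp = requests.post(API_URL, json={"repo_url": "https://github.com/username/some-repo"})
print(resp.status_code)
print(resp.json())
```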
The app uses Claude 3 Opus, and when I tested it on repos students submitted for the election forecasting assignment, it came out to about $0.70 per run. YMMV based on the repo/prompt/token count being ingested by the model, and you can track costs in the Anthropic Console.
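For a rough sense of where that number comes from: at the time of writing, Claude 3 Opus list pricing was $15 per million input tokens and $75 per million output tokens, so a run that ingests a few tens of thousands of tokens of repo content and returns a page or two of feedback lands in that ballpark. The token counts below are made-up examples, not measurements from the app:

```python
# Back-of-envelope cost estimate for one grading run.
# Prices are Claude 3 Opus list prices (USD per million tokens) at time of writing;
# the token counts are illustrative, not measured from the app.
INPUT_PRICE_PER_MTOK = 15.00
OUTPUT_PRICE_PER_MTOK = 75.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_MTOK \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_MTOK

print(f"${estimate_cost(40_000, 1_500):.2f}")  # ~$0.71 for this example
```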
Use an LLM for all of this; it will go much faster lol.
- Modify `rubric.json`:
  - Locate `src/data/rubric.json`
  - Follow this structure for each rubric item (`range.max` is the maximum possible points; set `critical` to true if failing the item should result in a zero overall):

    ```json
    {
      "rubric_items": [
        {
          "title": "Item Name",
          "range": {
            "min": 0,
            "max": 10
          },
          "values": {
            "0": "Poor or not done",
            "2": "Some issues",
            "4": "Acceptable",
            "6": "Exceeds expectations",
            "8": "Exceptional"
          },
          "criteria": "Detailed description of what to look for when grading this item",
          "critical": false
        }
      ]
    }
    ```
- Important Notes About Rubric Structure:
  - Each item MUST have a UNIQUE `title`
  - The `values` object must include all possible scores
  - Scores must be within `range.min` and `range.max`
  - `critical` items should use a max score of 1 (pass/fail). These are items where, if the student doesn't do them, they fail the assignment entirely.
- Modify `maxGrades` in `docs/js/grader.js`:

  ```javascript
  const maxGrades = {
    'Your Item Name': maximum_points,
    // Add all your rubric items here
  };
  ```

- Update the item categories:
  ```javascript
  const itemCategories = {
    critical: [
      // Your critical items
    ],
    documentation: [
      // Your documentation items
    ],
    // Add other categories as needed
  };
  ```

- Modify the category assignments in `src/repo_grader.py`:
  ```python
  # In the batch_grade_rubric method
  doc_items = [item for item in self.rubric
               if not item.get('critical', False) and
               item['title'].lower() in ['your', 'document', 'items']]
  tech_items = [item for item in self.rubric
                if not item.get('critical', False) and
                item['title'].lower() in ['your', 'technical', 'items']]
  ```

  The backend groups rubric items into batches based on which files the LLM should look at, which drastically reduces the number of tokens and LLM calls needed to grade a repo. The tech items, for example, are graded against .py, .ipynb, and similar files. Depending on your rubric you may have to play around with this, but it is the main juncture where you can reduce cost and latency.
Once you set this up, test locally.
- Inconsistent item names between rubric and code
- Forgetting to update both frontend and backend
- Not properly setting up the backend batching (e.g., during a test run the LLM wasn't grading a rubric item that checked whether the student included their name and date; that information would have been found in the docs (.qmd, .pdf, etc.), but the item ended up in a batch that was fed repo structure and metadata instead)
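A small pre-flight check can catch the first two of these issues before you burn API credits. This is a sketch, not part of the repo; it assumes the rubric structure documented above and that you copy the frontend's `maxGrades` mapping into the `max_grades` argument by hand:

```python
# Sketch of a pre-flight consistency check (not part of the repo).
# Assumes src/data/rubric.json follows the structure documented above and that
# max_grades is copied by hand from maxGrades in docs/js/grader.js.
import json

def check_rubric(rubric_path: str, max_grades: dict) -> list[str]:
    problems = []
    with open(rubric_path) as f:
        items = json.load(f)["rubric_items"]

    titles = [item["title"] for item in items]
    if len(titles) != len(set(titles)):
        problems.append("Duplicate rubric item titles")

    for item in items:
        lo, hi = item["range"]["min"], item["range"]["max"]
        for score in item["values"]:
            if not lo <= int(score) <= hi:
                problems.append(f"{item['title']}: score {score} outside [{lo}, {hi}]")
        if item["title"] not in max_grades:
            problems.append(f"{item['title']}: missing from maxGrades in docs/js/grader.js")
        elif max_grades[item["title"]] != hi:
            problems.append(f"{item['title']}: maxGrades value != range.max")

    return problems
```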
Feel free to submit issues and pull requests for any improvements :-) I only tested this for the election forecasting assignment, so there may be hiccups for other kinds of rubrics.