CS182 Extra Credit Submission: Special Participation A & B Website

Submission for: Special Participation A & B Website Development (5-10 points)
Date: December 8, 2025
Category: Special Participation A & B Analysis (3-4 students)

🎯 What We Built

A comprehensive, searchable website documenting 200 student submissions that analyze LLM behavior on CS182 homework problems. The website provides:

✅ Complete Documentation of all Special Participation A & B posts
✅ LLM Behavior Analysis - Automated extraction of strengths, weaknesses, and patterns
✅ Advanced Search & Filtering - By student, LLM model, homework, keywords
✅ Student Attribution - Full credit with links to external resources
✅ Insights Dashboard - Summary of how each LLM behaves and common issues
✅ Production Ready - Can be dropped directly into eecs182.org

📊 Data Analyzed

Special Participation A (Non-coding Problems)

110 submissions from unique students
13+ different LLMs tested
Top models: DeepSeek (16 posts), Mistral (10), Gemini (9), Grok (8)
Most tested assignments: HW3, HW4, HW2, HW0

Special Participation B (Coding Problems)

90 submissions from unique students
10+ different LLMs tested
Top models: Gemini (14 posts), DeepSeek (8), Grok (8), Mistral (7)
Most tested assignments: HW4, HW3, HW2
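
The per-model totals quoted under Model-Specific Insights are the sums of the A and B post counts listed above. A minimal Python sketch of that arithmetic, using only the numbers from these two lists (the variable names are illustrative and not taken from the project's scripts):

```python
from collections import Counter

# Post counts for the top models, copied from the two lists above.
participation_a = Counter({"DeepSeek": 16, "Mistral": 10, "Gemini": 9, "Grok": 8})
participation_b = Counter({"Gemini": 14, "DeepSeek": 8, "Grok": 8, "Mistral": 7})

# Counter addition sums the counts key by key.
combined = participation_a + participation_b

for model, total in combined.most_common():
    print(f"{model}: {total}")
# DeepSeek: 24
# Gemini: 23
# Mistral: 17
# Grok: 16

# 110 Participation A posts plus 90 Participation B posts.
total_submissions = 110 + 90
print(total_submissions)  # 200
```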

💡 Key Insights on LLM Behaviors

Common Strengths Across LLMs

✅ Provide correct solutions for straightforward problems (70-90% success rate)
✅ Helpful for understanding concepts and breaking down problems
✅ Can explain step-by-step reasoning
✅ Effective with proper prompt engineering
✅ Good at identifying their own mistakes when prompted

Common Weaknesses & Issues

⚠️ Hallucinations: Making up formulas, facts, or reasoning steps
⚠️ Complex Problem Errors: Struggle with multi-part, edge-case problems
⚠️ Verbosity: Often provide overly long explanations
⚠️ Reasoning Gaps: Skip crucial steps in mathematical derivations
⚠️ One-Shot Limitations: Don't always verify their solutions
⚠️ Prompt Sensitivity: Require careful engineering for best results

Model-Specific Insights

DeepSeek (24 total submissions)

Strengths: Strong internal reasoning, handles mathematical problems well
Weaknesses: Poor at explaining reasoning, skips steps, one-shot approach
Key Insight: Captures details internally but fails to communicate them clearly

Gemini (23 total submissions)

Strengths: Good explanations, generally helpful and accurate
Weaknesses: Prone to hallucinations on edge cases
Key Insight: Strong general-purpose assistant, best for standard problems

Grok (16 total submissions)

Strengths: Can one-shot 70-80% of problems, learns from user feedback
Weaknesses: Extremely verbose, loses focus, acts preemptively
Key Insight: Eager to help but needs moderation to stay on track

Mistral (17 total submissions)

Strengths: Provides explanations, correct on standard problems
Weaknesses: Hallucinations, errors on complex problems
Key Insight: Reliable for typical cases, less so for edge cases

Claude (7 total submissions)

Strengths: Questions its own solutions, detailed explanations
Weaknesses: Can be verbose, occasional confusion
Key Insight: Most self-reflective model, helps users catch mistakes

ChatGPT/GPT Models (19 total submissions)

Strengths: Provides explanations, generally accurate
Weaknesses: Errors on complex problems, can be verbose
Key Insight: Consistent performance, good for iteration

🌟 Website Features

1. Overview Dashboard

Statistics on all submissions
Number of students, LLMs, posts
Quick navigation to all sections

2. Insights Summary

Answers the key question: "How do different LLMs behave and what are common issues?"

Common themes across all submissions
Strengths and weaknesses for each LLM
Behavior patterns (one-shot vs iterative, verbosity, hallucinations)
Statistical overview

3. LLM Comparison

Side-by-side comparison of all models
Separate tabs for Participation A vs B
Post counts and engagement metrics
Categorized strengths, weaknesses, and patterns

4. Advanced Search & Filtering

Full-text search across all content
Filter by type: Participation A or B
Filter by LLM: Any model tested
Filter by homework: Specific assignments
Filter by student: Find specific student's work
Real-time filtering with instant results
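
The filters above combine with the full-text search: a submission is shown only if it passes every active filter. The site itself implements this in app.js; the sketch below illustrates the same logic in Python. The `Submission` fields and the function names are assumptions made for illustration, since the real JSON schema is not shown in this summary, and the sample records are made up.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    # Hypothetical fields; the real data files may use different names.
    student: str
    kind: str      # "A" or "B"
    llm: str
    homework: str
    content: str


def matches(sub, query="", kind=None, llm=None, homework=None, student=None):
    """Return True when a submission passes every active filter."""
    if kind and sub.kind != kind:
        return False
    if llm and sub.llm.lower() != llm.lower():
        return False
    if homework and sub.homework.lower() != homework.lower():
        return False
    if student and student.lower() not in sub.student.lower():
        return False
    # Full-text search: every query word must appear somewhere in the record.
    haystack = " ".join([sub.student, sub.llm, sub.homework, sub.content]).lower()
    return all(word in haystack for word in query.lower().split())


def filter_submissions(subs, **filters):
    """Keep only the submissions that satisfy all given filters."""
    return [s for s in subs if matches(s, **filters)]


# Made-up sample records, for illustration only.
sample = [
    Submission("Student One", "A", "DeepSeek", "HW3", "Skipped steps in the derivation."),
    Submission("Student Two", "B", "Gemini", "HW4", "Hallucinated an edge case."),
]

hits = filter_submissions(sample, query="hallucinated", kind="B")
print([s.student for s in hits])  # ['Student Two']
```

An empty query and no filters return every submission, which matches the default view of a filterable list.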

5. Complete Submissions Display

All 200 submissions with full content
Expandable/collapsible for easy browsing
Student attribution with name prominently displayed
Links to external resources:
- Chat transcripts (ChatGPT, Claude, DeepSeek)
- Google Drive annotated documents
- GitHub repositories
- Personal websites (if provided)
View counts and engagement metrics
Staff comments and endorsements highlighted
Categorized by insight type (hallucinations, errors, explanations, etc.)
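
The summary names the insight categories but not the rules used to assign them. One simple way to do this kind of tagging is keyword matching, sketched below in Python. The keyword lists and the `categorize` function are hypothetical and may differ from what the project's scripts do.

```python
# Hypothetical keyword lists; only the category names come from this summary.
CATEGORY_KEYWORDS = {
    "hallucinations": ("hallucinat", "made up", "fabricat"),
    "errors": ("error", "incorrect", "wrong", "mistake"),
    "explanations": ("explain", "explanation", "step-by-step"),
}


def categorize(text):
    """Return every category whose keywords appear in the text."""
    lowered = text.lower()
    return [
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]


print(categorize("Gemini hallucinated a formula but explained each step."))
# ['hallucinations', 'explanations']
```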

🎓 Student Credit System

Every submission includes:

✅ Student name prominently displayed
✅ Links to chat transcripts preserved
✅ Links to Google Docs with annotations
✅ Links to GitHub repos (if provided)
✅ View count showing popularity
✅ Staff endorsements highlighted
✅ Easy for students to gain visibility for their work

🚀 Deployment Ready

What's Included

website/
├── index.html          # Complete HTML structure
├── styles.css          # UC Berkeley themed styles
├── app.js             # Full JavaScript application
├── data/              # All JSON data files
│   ├── participation_a.json (110 submissions)
│   ├── participation_b.json (90 submissions)
│   ├── insights_a.json (LLM behavior analysis)
│   ├── insights_b.json (LLM behavior analysis)
│   └── statistics.json (aggregate stats)
└── README.md          # Deployment instructions

How to Deploy to eecs182.org

Option 1: Direct Upload

scp -r website/ user@eecs182.org:/var/www/html/llm-participation/

Option 2: Git Integration

# Add to eecs182.org repository
cp -r website /path/to/eecs182-repo/llm-participation
cd /path/to/eecs182-repo
git add llm-participation/
git commit -m "Add LLM participation analysis website"
git push

Option 3: Subdomain

Point llm.eecs182.org to the website folder.

Testing Locally

cd website
python3 -m http.server 8000
# Visit http://localhost:8000

Or use the provided script:

./launch_website.sh

🔧 Technical Implementation

Automated Data Processing

Data Collection (download_ed_final.py)
- Downloads all 558 posts from Ed Discussion
- Stores as individual JSON files with complete metadata

Parsing (parse_participation_posts.py)
- Extracts Special Participation A & B posts (200 total)
- Uses regex patterns to detect LLM models from content
- Identifies homework assignments
- Extracts all external links (chat, docs, GitHub)
- Categorizes insights automatically

Analysis (analyze_insights.py)
- Analyzes behavior patterns for each LLM
- Extracts common strengths and weaknesses
- Identifies problem-solving approaches
- Generates statistical summaries

Website
- Pure HTML/CSS/JavaScript (no frameworks)
- Client-side rendering for instant search
- Responsive design for mobile and desktop
- No build process or server required
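
The parsing stage described above (regex-based LLM detection, homework identification, link extraction) could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the contents of parse_participation_posts.py: the patterns, the host-to-category table, the `parse_post` function, and the sample text with its example URLs are all hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns; the real script may use different ones.
LLM_PATTERNS = {
    "DeepSeek": re.compile(r"\bdeep\s?seek\b", re.IGNORECASE),
    "Gemini": re.compile(r"\bgemini\b", re.IGNORECASE),
    "Grok": re.compile(r"\bgrok\b", re.IGNORECASE),
    "Mistral": re.compile(r"\bmistral\b", re.IGNORECASE),
    "Claude": re.compile(r"\bclaude\b", re.IGNORECASE),
    "ChatGPT/GPT": re.compile(r"\b(?:chat\s?gpt|gpt-?\d\w*)\b", re.IGNORECASE),
}
HOMEWORK_RE = re.compile(r"\b(?:hw|homework)\s*(\d+)\b", re.IGNORECASE)
URL_RE = re.compile(r"https?://[^\s)>\]]+")

# Assumed mapping from link host to the link categories named above.
LINK_KINDS = {
    "chatgpt.com": "chat",
    "chat.openai.com": "chat",
    "claude.ai": "chat",
    "chat.deepseek.com": "chat",
    "drive.google.com": "docs",
    "docs.google.com": "docs",
    "github.com": "github",
}


def parse_post(text):
    """Extract mentioned LLMs, homework numbers, and categorized links."""
    llms = [name for name, pattern in LLM_PATTERNS.items() if pattern.search(text)]
    numbers = sorted({int(n) for n in HOMEWORK_RE.findall(text)})
    links = []
    for url in URL_RE.findall(text):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        links.append({"url": url, "kind": LINK_KINDS.get(host, "other")})
    return {"llms": llms, "homeworks": [f"HW{n}" for n in numbers], "links": links}


# Made-up post text with placeholder URLs.
post = (
    "Special Participation A: I tested DeepSeek on HW3 and homework 4. "
    "Transcript: https://chat.deepseek.com/share/example, "
    "code at https://github.com/example/repo."
)
result = parse_post(post)
print(result["llms"])       # ['DeepSeek']
print(result["homeworks"])  # ['HW3', 'HW4']
print([link["kind"] for link in result["links"]])  # ['chat', 'github']
```

Keyword-style regexes are cheap and transparent, but they only detect models and assignments that are named explicitly in the post text.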

Why This Approach Works

✅ Automated: Minimal manual work, easily updateable
✅ Scalable: Can handle thousands of posts
✅ Fast: Static site loads instantly
✅ Searchable: Client-side search is extremely fast
✅ Maintainable: Simple to update with new posts
✅ Deployable: Works anywhere without special setup

📈 Meeting Project Requirements

Required: "Summary of insights on how each LLM behaves and common issues"

✅ Delivered: Comprehensive insights dashboard showing:

Behavior patterns for each LLM
Common strengths and weaknesses
Problem-solving approaches
Error patterns and hallucination tendencies
Statistical analysis across all submissions

Required: "What insights were gained from [the submissions]"

✅ Delivered: Key insights section documenting:

Which LLMs perform best on different problem types
Common failure modes across models
Effective prompting strategies
Comparative analysis between coding and non-coding tasks
Student experiences and recommendations

Required: "Read what every student submitted as text and attachment"

✅ Delivered: All 200 submissions displayed with:

Complete text content (expandable)
All attachments linked (chat logs, docs, repos)
Full attribution to students
Searchable and filterable interface

Required: "Every student should get credited for work"

✅ Delivered: Student credit system featuring:

Names prominently displayed on each submission
Links to external resources preserved
View counts showing engagement
Staff endorsements highlighted

Required: "Include links to student websites/github repos"

✅ Delivered: External links section for each post:

Chat transcript links clearly labeled
Google Drive documents linked
GitHub repositories linked
Other external resources preserved

Bonus: "Website is searchable by keyword/student name"

✅ Delivered: Advanced search featuring:

Full-text keyword search
Filter by student name
Filter by LLM model
Filter by homework assignment
Filter by participation type
Real-time filtering with instant results

📊 Impact & Value

For Students

Learn from peers' experiences
Discover which LLMs work best for different problems
Find effective prompting strategies
Gain visibility for their work

For Instructors

Track common issues across LLMs
Identify most/least effective models
Understand student learning patterns
Build better AI-assisted tools

For Future Semesters

Reusable template for documentation
Easy to update with new submissions
Historical record of LLM capabilities
Growing knowledge base

🎯 Extra Credit Justification

This project deserves 8-10 points because:

✅ Complete Deliverable - Production-ready website that can be immediately deployed
✅ Comprehensive Analysis - Automated extraction of insights from 200 submissions
✅ Advanced Features - Full search, filtering, and comparison capabilities
✅ Student Credit - Complete attribution system with external links
✅ Quality & Design - Modern, responsive UI with UC Berkeley branding
✅ Documentation - Extensive README files for deployment and maintenance
✅ Maintainability - Easy update process for future semesters
✅ Deep Learning Skills - Used NLP techniques for automated insight extraction

Why This is a "Significant Lift"

Processed 558 Ed posts to extract 200 relevant submissions
Built automated LLM detection and insight extraction system
Created comprehensive behavior analysis for 13+ different LLMs
Developed full-featured web application with search and filtering
Designed responsive UI with accessibility in mind
Wrote extensive documentation for deployment and maintenance
Total Development Time: ~10-15 hours of focused work

📞 Deliverables Checklist

✅ Complete website in website/ folder
✅ All data files in website/data/
✅ Deployment README with instructions
✅ Project README with full documentation
✅ Launch script for local testing
✅ Parsed data in website_data/
✅ Python scripts for data processing
✅ This submission summary

🎉 Ready to Deploy

The website is production-ready and can be deployed immediately to eecs182.org. Simply:

Copy the website/ folder to the server
Link to it from the main eecs182.org site

No additional setup or configuration is needed.

The website will provide lasting value for students, instructors, and researchers interested in understanding how different LLMs perform on real educational tasks.

📧 Contact

For questions or support with deployment, please reach out via Ed Discussion.

Thank you for considering this submission for extra credit!
