Overview
As PinchBench grows in popularity, we need abuse detection and prevention mechanisms. This issue tracks ideas and potential mitigations.
Potential Abuse Vectors
1. Fake/Inflated Scores
- Submitting fabricated results to boost a model's ranking
- Modifying benchmark tasks locally before running
- Cherry-picking only successful runs
2. Spam Submissions
- Flooding the API with junk submissions
- Creating many tokens to bypass per-token limits
- DoS via expensive database operations
3. Leaderboard Gaming
- Submitting the same high score repeatedly to dominate "recent" views
- Creating fake "verified" accounts
Ideas for Mitigation
Submission Validation
- Task hash verification: Include hash of task files in submission; reject if it doesn't match known benchmark version
- Timing sanity checks: Flag submissions whose execution time is suspiciously fast (faster than the model's known token generation speed would allow)
- Cost sanity checks: Flag submissions whose reported cost deviates substantially from the expected cost given the token counts
- Score variance detection: Alert if a model's score suddenly jumps significantly from its historical average
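The hash and timing checks above could look something like the following sketch. The submission field names (`task_hash`, `output_tokens`, `execution_seconds`) and the per-token latency floor are assumptions for illustration, not an existing PinchBench schema:

```python
import hashlib

def task_hash(task_files: list[bytes]) -> str:
    """Hash the concatenated task files to pin the benchmark version."""
    h = hashlib.sha256()
    for blob in task_files:
        h.update(blob)
    return h.hexdigest()

def validate_submission(sub: dict, known_hashes: dict,
                        min_seconds_per_token: float = 0.001) -> list[str]:
    """Return a list of flags; an empty list means the submission passes basic checks.

    `known_hashes` maps benchmark version -> expected task hash.
    `min_seconds_per_token` is a hypothetical floor on per-token latency.
    """
    flags = []
    if sub["task_hash"] not in known_hashes.values():
        flags.append("unknown_task_hash")
    # Timing sanity: the run cannot be faster than tokens * minimum latency.
    if sub["execution_seconds"] < sub["output_tokens"] * min_seconds_per_token:
        flags.append("implausibly_fast")
    return flags
```

Flagging rather than rejecting keeps the door open for the "show but flag" policy discussed under verification tiers.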
Rate Limiting
- Per-token submission limits (e.g., max 50/day)
- Per-IP registration limits (already have this)
- Cooldown between submissions for same model from same token
Verification Tiers
- Unverified: Anyone can submit, shown but flagged
- Verified: GitHub-linked accounts, higher trust
- Official: Our benchmark runs, marked as authoritative
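One way to encode the tiers so display logic stays in one place; the annotation fields here are a hypothetical policy, not settled behavior:

```python
from enum import IntEnum

class Tier(IntEnum):
    """Trust tiers, ordered so comparisons read as 'at least this trusted'."""
    UNVERIFIED = 0
    VERIFIED = 1
    OFFICIAL = 2

def leaderboard_annotations(tier: Tier) -> dict:
    """How a submission at this tier would be displayed."""
    return {
        "show": True,                     # all tiers are shown
        "flagged": tier < Tier.VERIFIED,  # unverified entries carry a flag
        "authoritative": tier == Tier.OFFICIAL,
    }
```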
Anomaly Detection
- Track submission patterns per token
- Flag accounts that only submit one model (potential shill accounts)
- Compare community submissions against official runs for same model
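The single-model (shill) heuristic above could be sketched like this; the minimum-submission threshold is an assumed tuning knob:

```python
def flag_single_model_accounts(submissions, min_count: int = 5) -> list[str]:
    """Flag tokens whose submissions all target a single model.

    `submissions` is an iterable of (token, model) pairs. A token is flagged
    only once it has at least `min_count` submissions, to avoid flagging
    new users who simply haven't submitted much yet.
    """
    by_token: dict[str, list[str]] = {}
    for token, model in submissions:
        by_token.setdefault(token, []).append(model)
    return [
        token
        for token, models in by_token.items()
        if len(models) >= min_count and len(set(models)) == 1
    ]
```

This is deliberately a flag, not a ban: an account dedicated to one model can be legitimate, which feeds into the false-positive question below.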
Transparency
- Public audit log of flagged/removed submissions
- Show submission history per user (helps community police)
Questions
- How aggressive should we be? False positives hurt legitimate users.
- Do we hide suspicious submissions or just flag them?
- Should we require GitHub verification for leaderboard inclusion?