Free, open-source Python scraper for MLB betting statistics.
Pulls pitcher game logs, batter game logs, park factors, bullpen usage, and Statcast advanced metrics (pitch velocity, spin, exit velocity, sprint speed, catcher framing) — all from free public sources.
- Pitcher game logs: IP, K, BB, HR, pitches, game score, ERA, WHIP
- Batter game logs: AB, H, 2B, 3B, HR, RBI, BB, K, SB, OBP, SLG, OPS
- Rolling averages over configurable windows (L3, L5, L8, L15)
- Pitcher splits (vs LHB/RHB, home/away, day/night)
- Batter vs pitcher head-to-head historical stats
- Park factors (batting, pitching, HR) for all 30 teams
- Bullpen usage tracker (appearances + pitch counts over last N days)
- Statcast: pitch arsenal (velocity, spin rate, movement, whiff%, put-away%)
- Statcast: exit velocity, launch angle, barrel%, hard-hit%, xBA, xwOBA
- Statcast: sprint speed leaderboard
- Statcast: catcher framing (strikes gained, runs saved)
- Raw pitch-by-pitch CSV download via Statcast search
- Async with rate limiting and in-memory caching
- No API key required
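
The rate limiting and in-memory caching mentioned in the list above can be pictured with a small sketch. This is not the scraper's actual implementation: `RateLimitedCache`, its `fetch` argument, and `min_interval` are hypothetical names chosen for illustration, and the real classes may work differently.

```python
import asyncio
import time


class RateLimitedCache:
    """Hypothetical helper: spaces out upstream calls and remembers results."""

    def __init__(self, fetch, min_interval=1.0):
        self._fetch = fetch                # async callable: key -> value
        self._min_interval = min_interval  # minimum seconds between requests
        self._cache = {}                   # in-memory only, lost on exit
        self._lock = asyncio.Lock()        # serializes upstream requests
        self._last_call = None

    async def get(self, key):
        if key in self._cache:
            return self._cache[key]        # cache hit: no request, no wait
        async with self._lock:
            if key in self._cache:         # another task may have filled it
                return self._cache[key]
            if self._last_call is not None:
                wait = self._min_interval - (time.monotonic() - self._last_call)
                if wait > 0:
                    await asyncio.sleep(wait)
            value = await self._fetch(key)
            self._last_call = time.monotonic()
            self._cache[key] = value
            return value
```

Create the helper inside a running coroutine and pass it any async function that takes a key. Repeated requests for the same key are then served from memory, and distinct keys are fetched at most once per `min_interval` seconds.
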

Install the dependencies:

```bash
pip install -r requirements.txt
```

Basic usage:

```python
import asyncio

from baseball_reference import BaseballReferenceScraper
from statcast import StatcastScraper


async def main():
    # Pitcher game logs
    async with BaseballReferenceScraper() as bbref:
        games = await bbref.get_pitcher_gamelog("scherma01", 2024)
        print(f"Max Scherzer: {len(games)} starts")

        # Rolling averages
        rolling = BaseballReferenceScraper.pitcher_rolling(games, windows=(3, 5, 10))
        for w, stats in rolling.items():
            print(f"L{w}: ERA={stats['era']:.2f} K/9={stats['k_per_9']:.1f}")

        # Park factors
        parks = await bbref.get_park_factors(2024)
        for p in parks[:5]:
            print(f"{p.team}: HR factor={p.hr_factor:.2f}")

    # Statcast advanced metrics
    async with StatcastScraper() as sc:
        exit_velo = await sc.get_exit_velo(2024, min_bbe=100)
        for batter in exit_velo[:5]:
            print(f"{batter.player_name}: avg EV={batter.avg_exit_velo} barrel%={batter.barrel_pct}")


asyncio.run(main())
```

Run the included example:

```bash
python example.py
```

Baseball Reference fields:

| Stat | Description |
|---|---|
| `PitcherGameLog.ip` | Innings pitched (decimal) |
| `PitcherGameLog.strikeouts` | Strikeouts |
| `PitcherGameLog.walks` | Walks |
| `PitcherGameLog.earned_runs` | Earned runs |
| `PitcherGameLog.pitches` | Total pitches |
| `PitcherGameLog.game_score` | Bill James game score |
| `PitcherGameLog.era` | Rolling ERA |
| `PitcherGameLog.k_per_9` | Strikeouts per 9 innings (computed) |
| `BatterGameLog.hr` | Home runs |
| `BatterGameLog.obp` | On-base percentage |
| `BatterGameLog.slg` | Slugging percentage |
| `BatterGameLog.ops` | On-base plus slugging (OPS) |
| `ParkFactor.hr_factor` | Park HR factor (1.0 = neutral) |
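
As a rough illustration of how rate stats such as ERA and K/9 can be derived from game logs over a rolling window, here is a minimal sketch. It does not reproduce the library's `pitcher_rolling`. The `Start` dataclass and `rolling_rates` function are hypothetical, and the sketch assumes innings are stored as a true decimal (6 1/3 innings as 6.333, not the box-score notation 6.1).

```python
from dataclasses import dataclass


@dataclass
class Start:
    """Hypothetical stand-in for one pitcher game log row."""
    ip: float          # innings pitched as a true decimal (assumption)
    strikeouts: int
    earned_runs: int


def rolling_rates(starts, window):
    """ERA and K/9 over the most recent `window` starts (oldest first)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = starts[-window:]
    ip = sum(s.ip for s in recent)
    if ip == 0:
        return None    # no innings pitched, so the rates are undefined
    return {
        "era": 9 * sum(s.earned_runs for s in recent) / ip,
        "k_per_9": 9 * sum(s.strikeouts for s in recent) / ip,
    }


starts = [Start(6.0, 7, 2), Start(5.0, 4, 3), Start(7.0, 9, 1)]
print(rolling_rates(starts, window=2))  # {'era': 3.0, 'k_per_9': 9.75}
```

Summing innings and events across the window before dividing weights each start by its length, which is generally preferable to averaging per-game rates.
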

Statcast fields:

| Stat | Description |
|---|---|
| `PitchData.velocity` | Average pitch velocity (mph) |
| `PitchData.spin_rate` | Average spin rate (rpm) |
| `PitchData.whiff_pct` | Swing-and-miss rate |
| `PitchData.put_away_pct` | Put-away rate on 2-strike counts |
| `ExitVeloData.avg_exit_velo` | Average exit velocity (mph) |
| `ExitVeloData.barrel_pct` | Barrel rate |
| `ExitVeloData.hard_hit_pct` | Hard-hit rate (95+ mph) |
| `ExitVeloData.xwoba` | Expected wOBA |
| `SprintSpeed.sprint_speed` | Sprint speed (ft/s) |
| `CatcherFraming.runs_saved` | Framing runs above average |
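
To make the hard-hit definition above concrete, the sketch below computes a hard-hit rate from a list of exit velocities using the 95 mph cutoff. The function name `hard_hit_pct` and the choice to return a percentage from 0 to 100 are assumptions for illustration. The scraper reads this value from Statcast leaderboards, and its scale there may differ.

```python
def hard_hit_pct(exit_velos_mph, threshold=95.0):
    """Percent of batted balls at or above the threshold exit velocity."""
    if not exit_velos_mph:
        return None    # no batted balls, so the rate is undefined
    hard = sum(1 for ev in exit_velos_mph if ev >= threshold)
    return 100 * hard / len(exit_velos_mph)


sample = [88.1, 95.0, 101.4, 72.3]
print(hard_hit_pct(sample))  # 2 of 4 at 95+ mph -> 50.0
```
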

Data sources:

- Baseball Reference: game logs, splits, park factors
- Baseball Savant / Statcast: Statcast leaderboards and pitch-by-pitch CSV

License: MIT. Use it however you want.