mtarcure/baseball-betting-scraper

Baseball Betting Scraper

Free, open-source Python scraper for MLB betting statistics.

Pulls pitcher game logs, batter game logs, park factors, bullpen usage, and Statcast advanced metrics (pitch velocity, spin, exit velocity, sprint speed, catcher framing) — all from free public sources.

Features

  • Pitcher game logs: IP, K, BB, HR, pitches, game score, ERA, WHIP
  • Batter game logs: AB, H, 2B, 3B, HR, RBI, BB, K, SB, OBP, SLG, OPS
  • Rolling averages over configurable windows (L3, L5, L8, L15)
  • Pitcher splits (vs LHB/RHB, home/away, day/night)
  • Batter vs pitcher head-to-head historical stats
  • Park factors (batting, pitching, HR) for all 30 teams
  • Bullpen usage tracker (appearances + pitch counts over last N days)
  • Statcast: pitch arsenal (velocity, spin rate, movement, whiff%, put-away%)
  • Statcast: exit velocity, launch angle, barrel%, hard-hit%, xBA, xwOBA
  • Statcast: sprint speed leaderboard
  • Statcast: catcher framing (strikes gained, runs saved)
  • Raw pitch-by-pitch CSV download via Statcast search
  • Async with rate limiting and in-memory caching
  • No API key required
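
The "async with rate limiting and in-memory caching" feature can be pictured with a small sketch. This is not the scraper's actual implementation: the class name `RateLimitedCache`, its `min_interval` parameter, and the `fetch` callback are all hypothetical, chosen only to show one common way to space out requests and avoid refetching the same page.

```python
import asyncio
import time


class RateLimitedCache:
    """Hypothetical helper: spaces out fetches and memoizes results by key."""

    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval   # minimum seconds between fetches
        self._cache = {}                   # in-memory cache, lost when the process exits
        self._lock = asyncio.Lock()        # serializes cache checks and fetches
        self._last_fetch = float("-inf")   # monotonic time of the previous fetch

    async def get(self, key, fetch):
        """Return the cached value for key, calling `await fetch(key)` on a miss."""
        async with self._lock:
            if key in self._cache:
                return self._cache[key]    # cache hit: no request, no delay
            # Sleep just long enough to honor the minimum spacing.
            wait = self._last_fetch + self.min_interval - time.monotonic()
            if wait > 0:
                await asyncio.sleep(wait)
            value = await fetch(key)
            self._last_fetch = time.monotonic()
            self._cache[key] = value
            return value
```

Create the helper inside a running coroutine and pass it any async callable that takes the key, for example `await limiter.get(url, download)`, where `download` is whatever coroutine performs the HTTP request. Holding the lock across the fetch keeps requests strictly sequential, which is the conservative choice when scraping public sites.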

Installation

pip install -r requirements.txt

Quick Start

import asyncio
from baseball_reference import BaseballReferenceScraper
from statcast import StatcastScraper

async def main():
    # Pitcher game logs
    async with BaseballReferenceScraper() as bbref:
        games = await bbref.get_pitcher_gamelog("scherma01", 2024)
        print(f"Max Scherzer: {len(games)} starts")

        # Rolling averages
        rolling = BaseballReferenceScraper.pitcher_rolling(games, windows=(3, 5, 10))
        for w, stats in rolling.items():
            print(f"L{w}: ERA={stats['era']:.2f} K/9={stats['k_per_9']:.1f}")

        # Park factors
        parks = await bbref.get_park_factors(2024)
        for p in parks[:5]:
            print(f"{p.team}: HR factor={p.hr_factor:.2f}")

    # Statcast advanced metrics
    async with StatcastScraper() as sc:
        exit_velo = await sc.get_exit_velo(2024, min_bbe=100)
        for batter in exit_velo[:5]:
            print(f"{batter.player_name}: avg EV={batter.avg_exit_velo} barrel%={batter.barrel_pct}")

asyncio.run(main())

Run the included example:

python example.py

Stats Available

Baseball Reference (baseball_reference.py)

| Stat | Description |
| --- | --- |
| PitcherGameLog.ip | Innings pitched (decimal) |
| PitcherGameLog.strikeouts | Strikeouts |
| PitcherGameLog.walks | Walks |
| PitcherGameLog.earned_runs | Earned runs |
| PitcherGameLog.pitches | Total pitches |
| PitcherGameLog.game_score | Bill James game score |
| PitcherGameLog.era | Rolling ERA |
| PitcherGameLog.k_per_9 | Strikeouts per 9 innings (computed) |
| BatterGameLog.hr | Home runs |
| BatterGameLog.obp | On-base percentage |
| BatterGameLog.slg | Slugging percentage |
| BatterGameLog.ops | OPS (on-base plus slugging) |
| ParkFactor.hr_factor | Park HR factor (1.0 = neutral) |
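
For readers who want to see how rolling ERA and K/9 figures like those above can be derived, here is a minimal sketch. It is an illustration under stated assumptions, not the code behind `pitcher_rolling`: the `Start` record and `rolling_rates` function are hypothetical, the input is assumed to be ordered oldest first, and innings are tracked as outs recorded so the sketch does not depend on how `ip` encodes partial innings.

```python
from dataclasses import dataclass


@dataclass
class Start:
    """Hypothetical stand-in for one pitcher game-log row."""
    outs: int          # outs recorded (3 per full inning)
    earned_runs: int
    strikeouts: int


def rolling_rates(starts, windows=(3, 5, 10)):
    """Aggregate ERA and K/9 over the most recent `w` starts for each window."""
    result = {}
    for w in windows:
        recent = starts[-w:]               # assumes oldest-first ordering
        outs = sum(s.outs for s in recent)
        if outs == 0:
            continue                       # no innings pitched, rates undefined
        innings = outs / 3
        result[w] = {
            # ERA = 9 * earned runs / innings pitched
            "era": 9 * sum(s.earned_runs for s in recent) / innings,
            # K/9 = 9 * strikeouts / innings pitched
            "k_per_9": 9 * sum(s.strikeouts for s in recent) / innings,
        }
    return result
```

The sketch sums runs and innings across the window before dividing, rather than averaging per-game ERAs, so a short outing does not count as much as a complete game. When fewer than `w` starts exist, the window simply covers every start available.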

Statcast / Baseball Savant (statcast.py)

| Stat | Description |
| --- | --- |
| PitchData.velocity | Average pitch velocity (mph) |
| PitchData.spin_rate | Average spin rate (rpm) |
| PitchData.whiff_pct | Swing-and-miss rate |
| PitchData.put_away_pct | Put-away rate on 2-strike counts |
| ExitVeloData.avg_exit_velo | Average exit velocity (mph) |
| ExitVeloData.barrel_pct | Barrel rate |
| ExitVeloData.hard_hit_pct | Hard-hit rate (95+ mph) |
| ExitVeloData.xwoba | Expected wOBA |
| SprintSpeed.sprint_speed | Sprint speed (ft/s) |
| CatcherFraming.runs_saved | Framing runs above average |
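
To make the exit-velocity fields concrete, the following sketch computes an average exit velocity and a hard-hit rate from a list of batted-ball exit velocities, using the 95 mph threshold given above. The function `exit_velo_summary` is hypothetical and not part of `statcast.py`; expressing the rate as a percentage (0 to 100) is also an assumption, since the scale of `hard_hit_pct` is not stated here.

```python
def exit_velo_summary(exit_velos_mph, hard_hit_threshold=95.0):
    """Average exit velocity and hard-hit rate (percent) for batted-ball events."""
    if not exit_velos_mph:
        raise ValueError("need at least one batted-ball event")
    n = len(exit_velos_mph)
    # A batted ball counts as hard-hit at or above the threshold (95+ mph).
    hard = sum(1 for ev in exit_velos_mph if ev >= hard_hit_threshold)
    return {
        "avg_exit_velo": sum(exit_velos_mph) / n,
        "hard_hit_pct": 100 * hard / n,
    }
```

Small samples make both numbers noisy, which is presumably why the Quick Start example filters with `min_bbe=100`.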

Data Sources

Baseball Reference (game logs and park factors) and Statcast / Baseball Savant (advanced metrics).

License

MIT — use it however you want.
