diff --git a/.claude/commands/rtw-help.md b/.claude/commands/rtw-help.md new file mode 100644 index 0000000..02e986d --- /dev/null +++ b/.claude/commands/rtw-help.md @@ -0,0 +1,48 @@ +--- +description: Show all RTW commands and domain primer +argument-hint: [topic] +model: haiku +allowed-tools: Read, Glob, Grep +--- + +# RTW Command Help + +Show available commands and optionally provide a domain primer. + +## Step 1: List All Commands + +Read the YAML frontmatter `description:` field from each file in `.claude/commands/rtw-*.md`. Organize into two categories: + +**Domain Workflow** (interactive, for trip planning): +- `/rtw-plan` — [read description from frontmatter] +- `/rtw-search` — [read description from frontmatter] +- `/rtw-analyze` — [read description from frontmatter] +- `/rtw-booking` — [read description from frontmatter] +- `/rtw-compare` — [read description from frontmatter] +- `/rtw-lookup` — [read description from frontmatter] + +**Developer Tools** (fast, for project health): +- `/rtw-init` — [read description from frontmatter] +- `/rtw-verify` — [read description from frontmatter] +- `/rtw-status` — [read description from frontmatter] +- `/rtw-setup` — [read description from frontmatter] +- `/rtw-help` — [read description from frontmatter] + +Show the typical workflow: +``` +Plan → Search → Verify (D-class) → Analyze → Book +``` + +## Step 2: Topic Deep Dive (if argument provided) + +If `$ARGUMENTS` contains a topic keyword, provide a brief primer: + +**"domain"** or **"basics"**: Explain oneworld Explorer RTW tickets, Rule 3015, ticket types (DONE/LONE), tariff conferences (TC1/TC2/TC3), and the booking process (call AA desk). + +**"carriers"**: List oneworld carriers relevant to RTW: BA, QF, CX, JL, AA, QR, IB, AY, MH, RJ, UL, FJ, LA. Note which have high/low YQ surcharges. + +**"ntp"**: Explain BA New Tier Points — how they're earned on RTW tickets, which segments earn most, and the NTP calculation formula.
+ +**"verify"** or **"dclass"**: Explain D-class availability, ExpertFlyer, what D0-D9 means, and how the verify command works. + +If no argument, just show the command list. diff --git a/.claude/commands/rtw-init.md b/.claude/commands/rtw-init.md new file mode 100644 index 0000000..7f07da4 --- /dev/null +++ b/.claude/commands/rtw-init.md @@ -0,0 +1,130 @@ +--- +description: First-time credential and environment setup +allowed-tools: Bash(python3 *), Bash(uv *), Bash(which *), Bash(echo *), AskUserQuestion, Read, Write +--- + +# RTW Optimizer — First-Time Initialization + +Check all required credentials and services, and guide the user through setting up anything missing. + +## Step 1: Check Python & Dependencies + +Run: `python3 --version` and confirm the reported version is 3.11 or newer. + +If Python is missing or too old, stop and explain the requirement. + +Run: `uv sync` to ensure all dependencies are installed. + +## Step 2: Check SerpAPI Key + +Run: `python3 -c "import os; key = os.environ.get('SERPAPI_API_KEY', ''); print('configured' if key else 'missing')"` + +**If configured**: Report "SerpAPI: configured" and move on. + +**If missing**: Explain to the user: + +``` +SerpAPI is used for Google Flights pricing searches. +Free tier: 100 searches/month at https://serpapi.com + +To set up: +1. Sign up at https://serpapi.com (free account) +2. Copy your API key from the dashboard +``` + +Use AskUserQuestion: +- header: "SerpAPI" +- question: "Do you have a SerpAPI key to configure?" +- options: + - label: "Yes, I have a key" + description: "I'll paste my API key" + - label: "Skip for now" + description: "I'll set it up later. Google Flights pricing won't work." +- multiSelect: false + +If "Yes": Ask the user to provide their key, then instruct them: + +``` +Add this to your shell profile (~/.zshrc or ~/.bashrc): + +export SERPAPI_API_KEY=your_key_here + +Then restart your terminal or run: source ~/.zshrc +``` + +Write the export line for them.
Run: +`python3 -c "import os; print('configured' if os.environ.get('SERPAPI_API_KEY') else 'Note: key will be available after restarting the terminal')"` + +If "Skip": Note that `/rtw-search` will work without pricing data (uses `--skip-availability`), and the `scrape` CLI command (`python3 -m rtw scrape`) won't work. + +## Step 3: Check ExpertFlyer Credentials + +Run: `python3 -m rtw login status --json 2>/dev/null || echo '{"has_credentials": false}'` + +Parse the JSON output. + +**If has_credentials is true**: Report "ExpertFlyer: configured (username)" and move on. + +**If missing**: Explain to the user: + +``` +ExpertFlyer is used for D-class seat availability checking. +Requires a paid subscription at https://www.expertflyer.com + +This is optional — the optimizer works without it, but you won't +be able to verify which flights actually have D-class seats available. +``` + +Use AskUserQuestion: +- header: "ExpertFlyer" +- question: "Do you have an ExpertFlyer account to configure?" +- options: + - label: "Yes, set up now" + description: "I have an ExpertFlyer account and want to store credentials" + - label: "Skip for now" + description: "I don't have ExpertFlyer. D-class verification won't work." +- multiSelect: false + +If "Yes": Run `python3 -m rtw login expertflyer`, which will interactively prompt for email and password, store them in the system keyring, and test the login. + +If the login test succeeds, also check that the Playwright package is importable: +Run: `python3 -c "from playwright.sync_api import sync_playwright; print('installed')" 2>/dev/null || echo "missing"` + +If the check reports "missing" (or Chromium has never been installed): +Run: `uv run playwright install chromium` + +If "Skip": Note that D-class verification (`python3 -m rtw verify`) won't work, but all other commands function normally.
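The credential check at the top of Step 3 can be sketched in Python. This is a hedged sketch: it assumes the CLI prints a JSON object with a `has_credentials` key, and the `except` branch mirrors the `|| echo` shell fallback above.

```python
import json
import subprocess

# Ask the CLI for credential status; fall back to "no credentials" when the
# rtw module is missing or the command fails (mirrors the shell fallback above).
try:
    out = subprocess.run(
        ["python3", "-m", "rtw", "login", "status", "--json"],
        capture_output=True, text=True, check=True,
    ).stdout
except (subprocess.CalledProcessError, FileNotFoundError):
    out = '{"has_credentials": false}'

status = json.loads(out)
print("ExpertFlyer:", "configured" if status.get("has_credentials") else "not configured")
```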
+ +## Step 4: Quick Smoke Test + +Run: `python3 -m rtw --help > /dev/null 2>&1 && echo "CLI: working" || echo "CLI: broken"` + +Run: `uv run pytest -x -q -m "not slow and not integration" 2>&1 | tail -3` + +## Step 5: Summary + +Present a status dashboard: + +``` +RTW Optimizer — Initialization Complete +======================================== +Python: [version] +Dependencies: installed +SerpAPI: [configured / not configured] +ExpertFlyer: [configured (email) / not configured] +Playwright: [installed / not installed] +Tests: [N passed] + +Available features: + [✓ or ✗] Route search with pricing (needs SerpAPI) + [✓ or ✗] D-class verification (needs ExpertFlyer + Playwright) + [✓] Itinerary validation (always available) + [✓] Cost estimation (always available) + [✓] NTP calculation (always available) + [✓] Booking script generation (always available) +``` + +Suggest next step based on what's configured: +- If everything configured: "You're all set! Run `/rtw-plan` to start planning your trip." +- If SerpAPI only: "Run `/rtw-plan` to start planning. Add ExpertFlyer later for D-class checks." +- If nothing configured: "The core optimizer works without external services. Run `/rtw-plan` to start." diff --git a/.claude/commands/rtw-setup.md b/.claude/commands/rtw-setup.md new file mode 100644 index 0000000..a7b5f20 --- /dev/null +++ b/.claude/commands/rtw-setup.md @@ -0,0 +1,60 @@ +--- +description: Set up development environment +allowed-tools: Bash(uv *), Bash(python3 *), Bash(which *), Read +--- + +# Environment Setup + +First-time setup wizard for the RTW Optimizer. Run each step and report status. + +## Step 1: Install Dependencies + +Run: `uv sync` + +If this fails, check: +- Is `uv` installed? (`which uv`) +- Is Python 3.11+ available? (`python3 --version`) + +Report: installed package count or error. + +## Step 2: Verify CLI + +Run: `python3 -m rtw --help` + +Confirm the CLI loads and shows the command list. 
If it fails, suggest `uv sync` or check Python version. + +Report: pass/fail with command count. + +## Step 3: Optional — Playwright for ExpertFlyer + +Check if Playwright is installed: `python3 -c "import playwright" 2>/dev/null` + +If not installed, mention: +"Optional: For D-class availability checking via ExpertFlyer, install Playwright:" +``` +uv run playwright install chromium +``` + +This is only needed for the `verify` command. Skip if not using ExpertFlyer. + +## Step 4: Quick Test Run + +Run: `uv run pytest -x -q -m "not slow and not integration" 2>&1 | tail -5` + +Report: tests passed/failed. + +## Summary + +``` +Setup Complete +============== +Dependencies: [installed] +CLI: [working — N commands] +Playwright: [installed / not installed (optional)] +Tests: [N passed] +``` + +Suggest next step: +- If no ExpertFlyer credentials: "Run `python3 -m rtw login expertflyer` to set up D-class checking" +- If no trip state: "Run `/rtw-plan` to start planning your RTW trip" +- If returning user: "Run `/rtw-status` to see where you left off" diff --git a/.claude/commands/rtw-status.md b/.claude/commands/rtw-status.md new file mode 100644 index 0000000..2bc5a9f --- /dev/null +++ b/.claude/commands/rtw-status.md @@ -0,0 +1,56 @@ +--- +description: Show project status and trip planning state +model: haiku +allowed-tools: Bash(git *), Bash(python3 -m rtw*), Bash(uv run pytest*), Bash(ls *), Bash(wc *), Bash(cat *), Read +--- + +# Project Status Dashboard + +Show a quick orientation dashboard. Keep output compact and scannable. + +## Step 1: Git Status + +Run: `git branch --show-current` and `git log --oneline -5` + +Show current branch and last 5 commits. + +## Step 2: Test Count + +Run: `uv run pytest --collect-only -q 2>&1 | tail -1` + +Show total test count. + +## Step 3: ExpertFlyer Credentials + +Run: `python3 -m rtw login status --json 2>/dev/null || echo '{"has_credentials": false}'` + +Show whether ExpertFlyer credentials are configured. 
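The dashboard inputs above are plain text; for instance, the Step 2 test-count line can be reduced to a number with a small sketch (the summary format here is an assumption based on pytest's `--collect-only -q` output, not taken from this project):

```python
import re

# Hypothetical last line from `uv run pytest --collect-only -q`
summary = "796 tests collected in 1.23s"

# Pull out the leading count; default to 0 if the line looks different
match = re.search(r"(\d+) tests? collected", summary)
count = int(match.group(1)) if match else 0
print(f"Tests: {count} collected")
```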
+ +## Step 4: Trip Planning State + +Check if `.claude/rtw-state.local.md` exists. If it does, read it and show: +- Current stage (planning, search-complete, analyzed, booking-ready) +- Origin, ticket type, cities + +If no state file, show "No active trip plan. Run `/rtw-plan` to start." + +## Step 5: Last Search + +Check if `~/.rtw/last_search.json` exists. If it does, show its age (file modification time) and a 1-line summary. + +If the file is older than 24 hours, note: "Search results are stale — consider re-running `/rtw-search`." + +## Report Format + +``` +RTW Optimizer Status +==================== +Branch: [branch-name] +Tests: [N] collected +ExpertFlyer: [configured / not configured] +Trip state: [stage or "no active plan"] +Last search: [age, summary or "none"] + +Recent commits: + [last 5 commits] +``` diff --git a/.claude/commands/rtw-verify.md b/.claude/commands/rtw-verify.md new file mode 100644 index 0000000..e84f0a6 --- /dev/null +++ b/.claude/commands/rtw-verify.md @@ -0,0 +1,43 @@ +--- +description: Run tests and lint checks +model: haiku +allowed-tools: Bash(uv run pytest*), Bash(uv run ruff*), Bash(ruff *) +--- + +# Project Verification + +Run the test suite and lint checks. Report a clear pass/fail summary. + +## Step 1: Run Tests + +Run: `uv run pytest -x -q` + +Capture the output. Note: +- Total tests passed/failed/skipped +- If any failures, show the first failure with file path and assertion + +## Step 2: Run Lint + +Run: `ruff check rtw/ tests/` + +Capture the output. Note: +- Total errors found (or clean) +- If errors, show the first 5 with file:line and rule code + +## Step 3: Report Summary + +Present a compact checklist: + +``` +Project Verification +==================== +Tests: [PASS NN passed] or [FAIL NN passed, NN failed] +Lint: [PASS clean] or [FAIL NN errors] +``` + +If everything passes, say "All clear — safe to commit." 
+ +If anything fails: +- Show the specific failures +- Suggest the fix command (e.g., `ruff check --fix rtw/` for auto-fixable lint) +- For test failures, suggest running the specific test file with `-x -v` for details diff --git a/.claude/rules/rules-engine.md b/.claude/rules/rules-engine.md new file mode 100644 index 0000000..6ed08cd --- /dev/null +++ b/.claude/rules/rules-engine.md @@ -0,0 +1,15 @@ +--- +paths: + - "rtw/rules/**" +--- + +# Rules Engine Guidelines + +- NEVER invent or guess fare rule constraints. All rules derive from IATA Rule 3015. +- Before modifying any rule, read `docs/01-fare-rules.md` for the authoritative source text. +- For optimization context, see `docs/12-rtw-optimization-guide.md`. +- Each rule is a function in a separate file: `segments.py`, `carriers.py`, `direction.py`, etc. +- Rules return a list of `RuleResult` with severity: `error` (blocks validation) or `warning` (informational). +- The validator (`rtw/validator.py`) builds a `ValidationContext` then calls each rule. Rules do NOT call each other. +- Continent assignments use `rtw/continents.py` overrides (e.g., Egypt = EU_ME, Guam = Asia). Never hardcode continent for an airport. +- Test rule changes with: `uv run pytest tests/test_rules/ -x` diff --git a/.claude/rules/testing.md b/.claude/rules/testing.md new file mode 100644 index 0000000..9f3d5de --- /dev/null +++ b/.claude/rules/testing.md @@ -0,0 +1,16 @@ +--- +paths: + - "tests/**" +--- + +# Testing Conventions + +- NEVER use mocks for API responses or domain logic. Tests use real data from `tests/fixtures/`. +- Mocks are ONLY acceptable for: system keyring access, ExpertFlyer HTTP sessions, and external service credentials. +- Test files mirror source structure: `rtw/cost.py` → `tests/test_cost.py`, `rtw/rules/segments.py` → `tests/test_rules/test_segments.py` +- Use `pytest.approx()` for floating-point comparisons (costs, distances, percentages). +- Fixtures live in `tests/fixtures/` as YAML files. 
Load them with `Path(__file__).parent / "fixtures" / "name.yaml"`. +- Mark slow tests with `@pytest.mark.slow`, integration tests with `@pytest.mark.integration`. +- Run focused: `uv run pytest tests/test_cost.py -x` (one file, stop on first failure). +- Run fast: `uv run pytest -m "not slow and not integration" -x` +- All models are Pydantic v2 — test serialization with `model_dump(mode="json")` and `model_validate(data)`. diff --git a/.claude/settings.json b/.claude/settings.json new file mode 100644 index 0000000..b0bdc47 --- /dev/null +++ b/.claude/settings.json @@ -0,0 +1,23 @@ +{ + "permissions": { + "allow": [ + "Bash(uv run pytest*)", + "Bash(uv run ruff*)", + "Bash(uv sync*)", + "Bash(python3 -m rtw*)", + "Bash(python3 -c *)", + "Bash(ruff *)", + "Bash(git *)", + "Bash(which *)", + "Bash(ls *)", + "Bash(cat *)", + "Bash(head *)", + "Bash(wc *)" + ], + "deny": [ + "Bash(rm -rf *)", + "Bash(git push --force*)", + "Bash(git reset --hard*)" + ] + } +} diff --git a/.gitignore b/.gitignore index f1f18b0..a1dd6f2 100644 --- a/.gitignore +++ b/.gitignore @@ -35,3 +35,8 @@ Thumbs.db # AI task management ai/ + +# Claude Code local files +.claude/rules/ralph-dev-* +.claude/settings.local.json +CLAUDE.local.md diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..203bc04 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,142 @@ +# RTW Optimizer + +oneworld Explorer round-the-world ticket optimizer. Validates itineraries against IATA Rule 3015, estimates costs + surcharges, calculates BA NTP, analyzes segment value, generates phone booking scripts, searches for optimal routes, and verifies D-class availability via ExpertFlyer. 
+ +## Tech Stack + +| Component | Version/Tool | +|-----------|-------------| +| Language | Python 3.11+ | +| CLI | Typer + Rich | +| Models | Pydantic v2 | +| Package mgr | uv (use `uv run`, `uv sync`) | +| Tests | pytest (796 tests) | +| Lint | ruff | +| Scraping | Playwright + httpx | + +## Quick Commands + +```bash +uv run pytest # Run all tests +uv run pytest tests/test_cost.py -x # Run one test file, stop on first failure +uv run pytest -m "not slow" -x # Skip slow/integration tests +ruff check rtw/ tests/ # Lint check +python3 -m rtw --help # Show all CLI commands +python3 -m rtw validate FILE.yaml # Validate itinerary +python3 -m rtw search --cities LHR,NRT,JFK --origin SYD --type DONE4 +python3 -m rtw verify # Verify D-class availability (needs ExpertFlyer) +``` + +## CLI Commands + +| Command | Purpose | +|---------|---------| +| `validate` | Check itinerary against Rule 3015 constraints | +| `cost` | Estimate base fare + YQ surcharges per segment | +| `ntp` | Calculate BA New Tier Points earnings | +| `value` | Per-segment value analysis (cost vs distance) | +| `booking` | Generate phone booking script + GDS commands | +| `analyze` | Full pipeline: validate + cost + NTP + value | +| `search` | Find valid RTW route options across carriers | +| `verify` | D-class availability check via ExpertFlyer | +| `continent` | Airport → continent/tariff conference lookup | +| `show` | Pretty-print itinerary segments | +| `new` | Output blank YAML itinerary template | +| `scrape` | Scrape flight prices (Google Flights via SerpAPI) | +| `config` | Manage settings (API keys, defaults) | +| `cache` | Manage scrape result cache | +| `login` | Manage ExpertFlyer credentials (keyring) | + +## Module Map + +| Module | Path | Purpose | +|--------|------|---------| +| CLI | `rtw/cli.py` | All Typer commands and display logic | +| Models | `rtw/models.py` | Itinerary, Segment, Ticket, CabinClass, TicketType | +| Validator | `rtw/validator.py` | Rule 3015 orchestrator — builds 
ValidationContext, runs rules | +| Rules | `rtw/rules/` | Individual rule files (segments, carriers, direction, continents, etc.) | +| Cost | `rtw/cost.py` | Fare lookup + YQ surcharge calculation | +| NTP | `rtw/ntp.py` | BA New Tier Points estimator | +| Value | `rtw/value.py` | Per-segment value rating (cost vs distance) | +| Booking | `rtw/booking.py` | Phone script + GDS command generator | +| Search | `rtw/search/` | Route search engine (models, scorer, display) | +| Verify | `rtw/verify/` | D-class verification (models, state, orchestrator) | +| Scraper | `rtw/scraper/` | Google Flights (SerpAPI) + ExpertFlyer scrapers | +| Continents | `rtw/continents.py` | Airport → continent mapping with overrides | +| Distance | `rtw/distance.py` | Great-circle distance calculator | +| Data | `rtw/data/` | YAML reference: carriers, fares, continents, hubs | +| Output | `rtw/output/` | Rich + plain text formatters | + +## Domain Vocabulary + +| Term | Meaning | +|------|---------| +| RTW | Round-the-world ticket (oneworld Explorer) | +| Rule 3015 | IATA fare rule governing RTW ticket construction | +| DONE4 / DONE3 | Business class, 4 or 3 continents | +| LONE4 / LONE3 | Economy class, 4 or 3 continents | +| NTP | New Tier Points — BA frequent flyer earning metric | +| YQ | Carrier-imposed fuel/insurance surcharge | +| D-class | Booking class for oneworld Explorer award-like fare | +| TC1 / TC2 / TC3 | IATA Tariff Conferences: Americas / Europe+Africa+Middle East / Asia+Pacific | +| SWP | South West Pacific sub-area within TC3 | +| Surface sector | Overland segment (not flown, counts toward routing but not fare) | +| Stopover | City where traveler stays >24 hours | +| Transfer | Connection in a city (<24 hours) | +| Backtrack | Returning to a previously visited tariff conference (restricted by Rule 3015) | +| ExpertFlyer | Third-party tool for checking airline seat availability | +| GDS | Global Distribution System (Amadeus/Sabre) used by booking agents | + +## 
Conventions + +- **Invocation**: Always use `python3 -m rtw`, never `rtw` directly +- **Testing**: NEVER use mocks for API responses — tests use real data from `tests/fixtures/`. Mocks only for credentials and external service calls. +- **Test structure**: Test files mirror source: `rtw/cost.py` → `tests/test_cost.py` +- **Models**: All data models are Pydantic v2 BaseModel. Use `model_dump(mode="json")` for serialization. +- **YAML**: Itinerary files use YAML format. See `python3 -m rtw new` for template. +- **Credentials**: ExpertFlyer in system keyring (`python3 -m rtw login expertflyer`), SerpAPI via env var (`export SERPAPI_API_KEY=...`). Run `/rtw-init` to set up both. +- **State files**: Search results saved to `~/.rtw/last_search.json`. Trip planning state in `.claude/rtw-state.local.md`. +- **Rules engine**: Each rule is a separate file in `rtw/rules/`. Rules return `RuleResult` with severity. Never invent fare rules — read `docs/01-fare-rules.md` for authoritative source. +- **Continent overrides**: Some airports have non-obvious continent assignments (e.g., Egypt = EU_ME, Guam = Asia). See `rtw/continents.py`. 
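A minimal sketch of the Pydantic v2 conventions above. The model below is a hypothetical stand-in, not the real `Segment` in `rtw/models.py`; field names are illustrative only.

```python
from pydantic import BaseModel

# Hypothetical stand-in for an itinerary segment; the real model differs.
class SegmentSketch(BaseModel):
    origin: str
    dest: str
    carrier: str
    segment_type: str = "stopover"

seg = SegmentSketch(origin="SYD", dest="HKG", carrier="CX")
data = seg.model_dump(mode="json")        # JSON-safe dict, per the convention
roundtrip = SegmentSketch.model_validate(data)
assert roundtrip == seg                   # serialization round-trips cleanly
```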
+ +## Reference Files + +| File | Content | +|------|---------| +| `docs/ARCHITECTURE.md` | Full architecture documentation (15KB) | +| `docs/01-fare-rules.md` | Authoritative IATA Rule 3015 fare rules | +| `docs/12-rtw-optimization-guide.md` | RTW trip optimization strategies | +| `rtw/data/carriers.yaml` | oneworld carrier list with alliance status | +| `rtw/data/fares.yaml` | Base fare table by origin and ticket type | +| `rtw/data/continents.yaml` | Airport → continent/TC mappings | + +## Slash Commands + +**Domain workflow** (interactive, multi-step): + +| Command | Description | Model | +|---------|-------------|-------| +| `/rtw-plan` | Plan an RTW trip interactively | opus | +| `/rtw-search` | Search for itinerary options | sonnet | +| `/rtw-analyze` | Run full analysis pipeline | sonnet | +| `/rtw-booking` | Generate phone booking script | sonnet | +| `/rtw-compare` | Compare fares across origin cities | sonnet | +| `/rtw-lookup` | Airport continent/TC lookup | haiku | + +**Developer tools** (fast, non-interactive): + +| Command | Description | Model | +|---------|-------------|-------| +| `/rtw-init` | First-time credential & environment setup | sonnet | +| `/rtw-verify` | Run tests + lint check | haiku | +| `/rtw-status` | Project status dashboard | haiku | +| `/rtw-setup` | Install dependencies & run smoke test | sonnet | +| `/rtw-help` | Command inventory + domain primer | haiku | + +**First time?** Run `/rtw-init` to configure SerpAPI + ExpertFlyer credentials and verify the environment. + +**Typical workflow**: `/rtw-plan` → `/rtw-search` → `python3 -m rtw verify` (D-class) → `/rtw-analyze` → `/rtw-booking` + +## Notes + +If `.claude/rules/` contains `ralph-dev-*` files, ignore them — they are from an unrelated project and not part of this codebase.
diff --git a/README.md b/README.md new file mode 100644 index 0000000..e35a8c0 --- /dev/null +++ b/README.md @@ -0,0 +1,411 @@ +# RTW Optimizer + +A command-line tool for optimizing [oneworld Explorer](https://www.oneworld.com/flights/round-the-world-fares) round-the-world tickets. Validates itineraries against IATA Rule 3015, estimates costs and carrier surcharges, calculates BA Avios/NTP earnings, rates per-segment value, searches for optimal routes with live pricing, verifies D-class seat availability, and generates phone booking scripts. + +## Why This Exists + +oneworld Explorer fares let you fly around the world on oneworld airlines (British Airways, Cathay Pacific, Qantas, JAL, American Airlines, Qatar, etc.) for a flat fare based on the number of continents visited. A business class ticket visiting 4 continents starts at ~$4,000 from Cairo or ~$10,500 from New York. + +The catch: these tickets are governed by [IATA Rule 3015](docs/01-fare-rules.md), a complex set of constraints around direction of travel, continent crossings, backtracking, carrier requirements, and segment limits. Building a valid itinerary by hand means juggling 15+ rules simultaneously while checking seat availability across a dozen airlines. + +This tool automates all of that. 
+ +## What It Does + +``` +Plan your trip Search routes Check availability Analyze costs Book it + /rtw-plan --> /rtw-search --> rtw verify --> /rtw-analyze --> /rtw-booking +``` + +| Feature | What It Does | +|---------|-------------| +| **Validate** | Check any itinerary against all Rule 3015 constraints with clear pass/fail per rule | +| **Cost** | Look up base fares by origin city + estimate YQ surcharges per carrier per segment | +| **NTP** | Calculate British Airways New Tier Points earnings for each segment | +| **Value** | Rate each segment's value (cost vs great-circle distance) as Excellent/Good/Low | +| **Search** | Generate valid RTW routes, score them, and optionally check live Google Flights pricing | +| **Verify** | Check D-class seat availability on ExpertFlyer — see exactly which flights have seats | +| **Booking** | Generate a phone script with GDS commands for calling the AA RTW desk | + +## Quick Start + +### Prerequisites + +- Python 3.11+ +- [uv](https://docs.astral.sh/uv/) (Python package manager) + +### Install + +```bash +git clone https://github.com/kavanaghpatrick/rtw-optimizer.git +cd rtw-optimizer +uv sync +``` + +### Verify It Works + +```bash +python3 -m rtw --help # Show all commands +uv run pytest -x -q # Run test suite (796 tests) +``` + +### Optional: API Keys + +Two optional services enable advanced features: + +| Service | What For | How to Set Up | +|---------|----------|--------------| +| [SerpAPI](https://serpapi.com) | Live Google Flights pricing in search results | `export SERPAPI_API_KEY=your_key` in `~/.zshrc` | +| [ExpertFlyer](https://www.expertflyer.com) | D-class seat availability checking | `python3 -m rtw login expertflyer` (stores in system keyring) | + +Without these, the core optimizer (validate, cost, NTP, value, booking) works fully. Search works but without live pricing. Verify requires ExpertFlyer. 
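To confirm the key is actually visible to Python (same variable name as in the table above), a quick one-off check:

```python
import os

# Empty or unset both count as "missing"
key = os.environ.get("SERPAPI_API_KEY", "")
print("SerpAPI:", "configured" if key else "missing")
```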
+ +If using ExpertFlyer, also install the browser automation driver: + +```bash +uv run playwright install chromium +``` + +## Usage + +### Search for Routes + +Find the best RTW itineraries visiting specific cities: + +```bash +# Quick search (no live pricing) +python3 -m rtw search --cities LHR,NRT,JFK --origin SYD --type DONE4 --skip-availability + +# With live Google Flights pricing (needs SerpAPI key) +python3 -m rtw search --cities LHR,NRT,JFK --origin SYD --type DONE4 + +# Auto-verify D-class on top results (needs ExpertFlyer) +python3 -m rtw search --cities HKG,LHR,JFK --origin SYD --verify-dclass --top 3 +``` + +### Validate an Itinerary + +Check an itinerary YAML file against Rule 3015: + +```bash +python3 -m rtw validate itinerary.yaml +``` + +Rules checked include: direction of travel, continent coverage, segment limits, carrier requirements, backtracking restrictions, surface sector rules, and more. + +### Full Analysis Pipeline + +Run validate + cost + NTP + value in one command: + +```bash +python3 -m rtw analyze itinerary.yaml +``` + +### Estimate Costs + +```bash +python3 -m rtw cost itinerary.yaml +``` + +Shows base fare, per-segment YQ surcharges, and total cost. Highlights high-YQ carriers and suggests lower-surcharge alternatives. + +### Compare Fares Across Origins + +The same RTW ticket costs wildly different amounts depending on where you start: + +```bash +python3 -c " +from rtw.cost import CostEstimator +from rtw.models import TicketType +e = CostEstimator() +for r in e.compare_origins(TicketType('DONE4'))[:5]: + print(f\"{r['origin']:>5} ({r['name']:<20}) \${r['fare_usd']:>8,.0f}\") +" +``` + +``` + CAI (Cairo ) $4,000 + JNB (Johannesburg ) $5,000 + CMB (Colombo ) $5,200 + OSL (Oslo ) $5,400 + NRT (Tokyo Narita ) $6,360 +``` + +A DONE4 (business, 4 continents) ticket from Cairo costs $4,000 vs $10,500 from New York -- a positioning flight to Cairo can save $6,500. 
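The positioning trade-off is simple arithmetic; here is a sketch using the fares above plus a hypothetical positioning-flight cost (the $900 figure is illustrative, not from the fare tables):

```python
# Fares from the origin comparison above; positioning cost is a made-up example.
fare_from_jfk = 10_500   # DONE4 priced from New York
fare_from_cai = 4_000    # DONE4 priced from Cairo
positioning = 900        # hypothetical one-way ticket to reposition to Cairo

net_savings = fare_from_jfk - (fare_from_cai + positioning)
print(f"Net savings after positioning: ${net_savings:,}")
```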
+ +### Verify D-Class Availability + +After running a search, verify which flights actually have D-class seats: + +```bash +python3 -m rtw verify # Verify all options from last search +python3 -m rtw verify --option 1 # Verify specific option +python3 -m rtw verify --quiet # Summary only, no per-flight detail +``` + +Output shows per-segment D-class status (D0-D9), per-flight availability with departure times and aircraft, and flags TIGHT segments (2 or fewer available flights). + +### Look Up Airports + +```bash +python3 -m rtw continent LHR NRT JFK SYD HKG DOH +``` + +``` + LHR: EU_ME (TC2) + NRT: Asia (TC3) + JFK: N_America (TC1) + SYD: SWP (TC3) + HKG: Asia (TC3) + DOH: EU_ME (TC2) +``` + +### Generate Booking Script + +```bash +python3 -m rtw booking itinerary.yaml +``` + +Generates a complete phone script for calling the AA RTW desk (1-800-433-7300), including what to say, each segment's details, and Amadeus GDS commands the agent can use directly. + +## Itinerary Format + +Itineraries are YAML files. 
Here's an example: + +```yaml +ticket: + type: DONE4 + cabin: business + origin: SYD + +segments: + - from: SYD + to: HKG + carrier: CX + type: stopover + + - from: HKG + to: LHR + carrier: CX + type: stopover + + - from: LHR + to: JFK + carrier: BA + type: stopover + + - from: JFK + to: LAX + carrier: AA + type: transfer + + - from: LAX + to: SYD + carrier: QF + type: stopover +``` + +Key fields: +- **type**: `stopover` (stay >24h) or `transfer` (<24h connection) or `surface` (overland, not flown) +- **carrier**: Two-letter IATA airline code (must be a oneworld member) +- **from/to**: IATA airport codes + +## Ticket Types + +| Type | Class | Continents | Example fare (Cairo) | +|------|-------|-----------|---------------------| +| DONE3 | Business | 3 | $3,500 | +| DONE4 | Business | 4 | $4,000 | +| DONE5 | Business | 5 | $4,500 | +| DONE6 | Business | 6 | $5,000 | +| LONE3 | Economy | 3 | $2,200 | +| LONE4 | Economy | 4 | $2,500 | +| LONE5 | Economy | 5 | $2,800 | +| LONE6 | Economy | 6 | $3,100 | + +Fares vary significantly by origin city. Use the cost comparison feature to find the cheapest starting point. + +## oneworld Carriers + +| Airline | Code | Hub | YQ Level | +|---------|------|-----|----------| +| British Airways | BA | LHR | High | +| Cathay Pacific | CX | HKG | Medium | +| Qantas | QF | SYD | High | +| Japan Airlines | JL | NRT/HND | Low | +| American Airlines | AA | DFW/JFK | Low | +| Qatar Airways | QR | DOH | Medium | +| Iberia | IB | MAD | Low | +| Finnair | AY | HEL | Low | +| Malaysia Airlines | MH | KUL | Low | +| Royal Jordanian | RJ | AMM | Low | +| SriLankan Airlines | UL | CMB | Low | +| Fiji Airways | FJ | NAN | Low | +| LATAM (Chile) | LA | SCL | Medium | + +Low-YQ carriers (JL, AA, AY, IB) can save hundreds of dollars per segment compared to high-YQ carriers (BA, QF). + +## Using with Claude Code + +This project includes a full [Claude Code](https://claude.ai/claude-code) integration. 
When you open the project in Claude Code, it automatically loads project context, domain knowledge, and 11 slash commands. + +### First-Time Setup + +``` +/rtw-init +``` + +This walks you through setting up SerpAPI and ExpertFlyer credentials, installing dependencies, and running a smoke test. + +### Slash Commands + +**Trip planning workflow:** + +| Command | What It Does | +|---------|-------------| +| `/rtw-plan` | Interactive trip planner — picks origin, cities, dates step by step | +| `/rtw-search` | Search for routes (accepts city codes or reads from saved plan) | +| `/rtw-analyze` | Full pipeline on an itinerary: validate + cost + NTP + value | +| `/rtw-booking` | Generate phone booking script with GDS commands | +| `/rtw-compare` | Compare ticket prices across origin cities | +| `/rtw-lookup` | Quick airport-to-continent lookup | + +**Developer tools:** + +| Command | What It Does | +|---------|-------------| +| `/rtw-init` | First-time credential and environment setup | +| `/rtw-verify` | Run tests + lint check | +| `/rtw-status` | Project status dashboard (branch, tests, credentials, trip state) | +| `/rtw-setup` | Install dependencies and run smoke test | +| `/rtw-help` | Show all commands with descriptions and domain primer | + +### Typical Workflow + +1. `/rtw-plan` -- Answer questions about origin, cities, dates, ticket type +2. `/rtw-search` -- Claude runs the search and shows ranked options +3. `rtw verify` -- Check D-class availability on the best options +4. `/rtw-analyze` -- Full cost/NTP/value analysis +5. `/rtw-booking` -- Generate the script to call AA and book it + +Claude understands the domain vocabulary (Rule 3015, NTP, YQ, D-class, tariff conferences) and can explain trade-offs, suggest alternatives, and help debug validation failures. + +## Project Structure + +``` +rtw/ +├── cli.py # All Typer CLI commands +├── models.py # Pydantic models (Itinerary, Segment, Ticket, etc.) 
+├── validator.py # Rule 3015 validation orchestrator +├── rules/ # Individual rule implementations +│ ├── segments.py # Segment count limits +│ ├── carriers.py # oneworld carrier requirements +│ ├── direction.py # Direction-of-travel rules +│ ├── continents.py # Continent crossing validation +│ └── ... +├── cost.py # Fare lookup + YQ calculation +├── ntp.py # BA New Tier Points estimator +├── value.py # Per-segment value analysis +├── booking.py # Phone script + GDS command generator +├── search/ # Route search engine +│ ├── models.py # Search-specific models +│ ├── generator.py # Route generation +│ ├── scorer.py # Route ranking +│ └── display.py # Search result formatting +├── verify/ # D-class availability verification +│ ├── models.py # DClassResult, FlightAvailability, etc. +│ ├── verifier.py # ExpertFlyer verification orchestrator +│ └── state.py # Search state persistence +├── scraper/ # External data sources +│ ├── serpapi_flights.py # Google Flights via SerpAPI +│ ├── expertflyer.py # ExpertFlyer scraper (Playwright) +│ └── cache.py # Response caching +├── continents.py # Airport → continent mapping +├── distance.py # Great-circle distance calculator +├── data/ # Reference YAML files +│ ├── carriers.yaml # oneworld carrier data +│ ├── fares.yaml # Base fare tables +│ └── continents.yaml # Airport-continent mappings +└── output/ # Rich + plain text formatters +``` + +## Development + +### Running Tests + +```bash +uv run pytest # All tests (796) +uv run pytest tests/test_cost.py -x # Single file, stop on failure +uv run pytest -m "not slow" -x # Skip slow tests +uv run pytest -k "test_validate" -v # Filter by name, verbose +``` + +### Linting + +```bash +ruff check rtw/ tests/ # Check for issues +ruff check --fix rtw/ tests/ # Auto-fix what's possible +``` + +### Adding a New Rule + +1. Create a new file in `rtw/rules/` (e.g., `my_rule.py`) +2. Implement a function that takes a `ValidationContext` and returns `list[RuleResult]` +3. 
Register it in `rtw/validator.py` +4. Add tests in `tests/test_rules/` +5. Reference the authoritative source in `docs/01-fare-rules.md` + +### Adding a New CLI Command + +1. Add a function in `rtw/cli.py` decorated with `@app.command()` +2. Use Typer for argument parsing and Rich for output +3. Add `--json`, `--plain`, `--verbose`, `--quiet` flags for consistency +4. Add tests using `typer.testing.CliRunner` + +## Key Concepts + +### Rule 3015 + +The IATA fare rule that governs round-the-world ticket construction. Key constraints: + +- **Direction**: Must travel consistently eastbound or westbound (no zigzagging) +- **Continents**: Must visit the exact number of continents your ticket covers (3, 4, 5, or 6) +- **Backtracking**: Cannot return to a tariff conference (TC1/TC2/TC3) once you've left it (with exceptions for the return to origin) +- **Segments**: Maximum 16 flown segments +- **Surface sectors**: Allowed but count toward routing constraints +- **Carriers**: All flown segments must be on oneworld member airlines + +See [docs/01-fare-rules.md](docs/01-fare-rules.md) for the complete rule reference. + +### Tariff Conferences + +The world is divided into three IATA Tariff Conferences: + +| Conference | Regions | +|-----------|---------| +| **TC1** | North America, South America, Caribbean, Hawaii | +| **TC2** | Europe, Middle East, Africa | +| **TC3** | Asia, South West Pacific (Australia, NZ), Japan, Indian subcontinent | + +Your ticket type (DONE**4**, LONE**3**, etc.) specifies how many of these conferences you must visit. + +### D-Class Availability + +oneworld Explorer tickets are booked in **D class** -- a special booking class that shows availability separately from regular economy/business. A flight might have plenty of business class seats for sale but zero D-class seats available. 
+ +The `verify` command checks ExpertFlyer to see the actual D-class inventory: +- **D9** = at least 9 seats available (9 is the display cap; wide open) +- **D5** = 5 seats +- **D0** = no seats (sold out in D class) + +### YQ Surcharges + +Airlines add fuel/insurance surcharges (YQ) on top of the base fare. These vary dramatically: +- **BA** London-New York: ~$500-800 per segment +- **JL** Tokyo-London: ~$50 per segment + +Choosing low-YQ carriers (JAL, American, Finnair, Iberia) over high-YQ carriers (British Airways, Qantas) can save thousands on a multi-segment RTW ticket. + +## License + +MIT diff --git a/pyproject.toml b/pyproject.toml index f7f4623..4deb5e1 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -13,6 +13,7 @@ dependencies = [ "fast-flights>=2.2", "playwright>=1.40", "keyring>=25.0", + "requests>=2.28", ] [project.optional-dependencies] diff --git a/rtw/cli.py b/rtw/cli.py index 9c9d581..7047044 100644 --- a/rtw/cli.py +++ b/rtw/cli.py @@ -4,15 +4,17 @@ booking script generation, and scraping for RTW itineraries.
""" -from __future__ import annotations - import difflib import logging import sys from pathlib import Path -from typing import Annotated +from typing import TYPE_CHECKING, Annotated, Optional import typer + +if TYPE_CHECKING: + from rtw.search.models import ScoredCandidate + from rtw.verify.models import VerifyOption, VerifyResult import yaml from pydantic import ValidationError @@ -46,9 +48,16 @@ no_args_is_help=True, ) +login_app = typer.Typer( + name="login", + help="Manage service logins.", + no_args_is_help=True, +) + app.add_typer(scrape_app, name="scrape") app.add_typer(config_app, name="config") app.add_typer(cache_app, name="cache") +app.add_typer(login_app, name="login") # --------------------------------------------------------------------------- @@ -546,6 +555,7 @@ def new_template( @scrape_app.command(name="prices") def scrape_prices( file: str = typer.Argument(help="Path to itinerary YAML file"), + backend: Annotated[str, typer.Option("--backend", "-b", help="Flight search backend: auto, serpapi, fast-flights, playwright")] = "auto", json: JsonFlag = False, plain: PlainFlag = False, verbose: VerboseFlag = False, @@ -553,6 +563,27 @@ def scrape_prices( ) -> None: """Search Google Flights prices for all segments.""" _setup_logging(verbose, quiet) + + # Validate backend + from rtw.scraper.google_flights import SearchBackend + try: + search_backend = SearchBackend(backend) + except ValueError: + valid = ", ".join(b.value for b in SearchBackend) + _error_panel(f"Invalid backend '{backend}'. Choose from: {valid}") + raise typer.Exit(code=2) + + if search_backend == SearchBackend.SERPAPI: + from rtw.scraper.serpapi_flights import serpapi_available + if not serpapi_available(): + _error_panel( + "SERPAPI_API_KEY not set.\n\n" + "1. Sign up at https://serpapi.com (free tier: 250 searches/mo)\n" + "2. Set the key: export SERPAPI_API_KEY=your_key_here\n\n" + "Or use --backend auto to try other backends." 
+ ) + raise typer.Exit(code=2) + try: itinerary = _load_itinerary(file) from rtw.scraper.batch import search_with_fallback @@ -560,7 +591,7 @@ def scrape_prices( import json as json_mod cache = ScrapeCache() - results = search_with_fallback(itinerary, cache) + results = search_with_fallback(itinerary, cache, backend=search_backend) if json: data = [] @@ -595,6 +626,21 @@ def scrape_prices( except typer.BadParameter: raise except Exception as exc: + from rtw.scraper.serpapi_flights import SerpAPIAuthError, SerpAPIQuotaError + if isinstance(exc, SerpAPIAuthError): + _error_panel( + "SerpAPI authentication failed.\n\n" + "Check your key at https://serpapi.com/manage-api-key\n" + "Or use --backend auto to try other backends." + ) + raise typer.Exit(code=2) + if isinstance(exc, SerpAPIQuotaError): + _error_panel( + "SerpAPI monthly quota exceeded.\n\n" + "Upgrade at https://serpapi.com/pricing\n" + "or use --backend auto to fall back to other search methods." + ) + raise typer.Exit(code=2) _error_panel(str(exc)) raise typer.Exit(code=2) @@ -677,6 +723,482 @@ def cache_clear() -> None: typer.echo("Scrape cache cleared.") +# --------------------------------------------------------------------------- +# Login commands +# --------------------------------------------------------------------------- + + +@login_app.command(name="expertflyer") +def login_expertflyer( + verbose: VerboseFlag = False, + quiet: QuietFlag = False, +) -> None: + """Store ExpertFlyer credentials for D-class availability checks. + + Prompts for email and password, stores them in the system keyring, + then tests the login by connecting to ExpertFlyer. + """ + _setup_logging(verbose, quiet) + + try: + import keyring + except ImportError: + _error_panel("keyring library not available. 
Install with: pip install keyring") + raise typer.Exit(code=1) + + # Check existing credentials + existing = keyring.get_password("expertflyer.com", "username") + if existing and not quiet: + typer.echo(f"Existing credentials found for: {existing}") + if not typer.confirm("Replace with new credentials?"): + typer.echo("Keeping existing credentials.") + return + + # Prompt for credentials + username = typer.prompt("ExpertFlyer email") + password = typer.prompt("ExpertFlyer password", hide_input=True) + + keyring.set_password("expertflyer.com", "username", username) + keyring.set_password("expertflyer.com", "password", password) + typer.echo("Credentials saved to system keyring.") + + # Test login + if not quiet: + typer.echo("Testing login...") + try: + from rtw.scraper.expertflyer import ExpertFlyerScraper + + with ExpertFlyerScraper() as scraper: + scraper._ensure_logged_in() + typer.echo("Login test successful.") + except Exception as exc: + typer.echo(f"Warning: login test failed ({exc}). 
Credentials saved anyway.", err=True) + + +@login_app.command(name="status") +def login_status( + json: JsonFlag = False, +) -> None: + """Check ExpertFlyer credential status.""" + import json as json_mod + + has_creds = False + username = None + try: + import keyring + + username = keyring.get_password("expertflyer.com", "username") + password = keyring.get_password("expertflyer.com", "password") + has_creds = bool(username and password) + except ImportError: + pass + + if json: + data = { + "has_credentials": has_creds, + "username": username, + } + typer.echo(json_mod.dumps(data, indent=2)) + return + + if has_creds: + typer.echo(f"ExpertFlyer credentials: configured ({username})") + else: + typer.echo("ExpertFlyer credentials: not configured") + typer.echo("Run `rtw login expertflyer` to set up.") + + +@login_app.command(name="clear") +def login_clear() -> None: + """Clear saved ExpertFlyer credentials.""" + try: + import keyring + + keyring.delete_password("expertflyer.com", "username") + keyring.delete_password("expertflyer.com", "password") + typer.echo("ExpertFlyer credentials cleared from keyring.") + except ImportError: + _error_panel("keyring library not available.") + raise typer.Exit(code=1) + except Exception: + typer.echo("No credentials to clear.") + + +# --------------------------------------------------------------------------- +# Verify command +# --------------------------------------------------------------------------- + + +def _scored_to_verify_option( + scored: "ScoredCandidate", option_id: int +) -> "VerifyOption": + """Convert a ScoredCandidate to a VerifyOption for D-class checking.""" + from rtw.verify.models import SegmentVerification, VerifyOption + + segments = [] + for i, seg in enumerate(scored.candidate.itinerary.segments): + seg_type = "SURFACE" if seg.is_surface else "FLOWN" + segments.append( + SegmentVerification( + index=i, + segment_type=seg_type, + origin=seg.from_airport, + destination=seg.to_airport, + carrier=seg.carrier, 
+ flight_number=seg.flight, + target_date=seg.date, + ) + ) + return VerifyOption(option_id=option_id, segments=segments) + + +def _display_verify_result(result: "VerifyResult", quiet: bool = False) -> None: + """Display verification result in rich or plain format.""" + from rtw.verify.models import DClassStatus + + try: + from rich.console import Console + from rich.table import Table + + console = Console(stderr=True) + table = Table( + title=f"Option {result.option_id} — D-Class Verification", + show_lines=False, + ) + table.add_column("#", justify="right", style="dim") + table.add_column("Route", style="bold") + table.add_column("Carrier") + table.add_column("Date") + table.add_column("D-Class", justify="center") + table.add_column("Seats", justify="center") + + for seg in result.segments: + route = f"{seg.origin}→{seg.destination}" + carrier = seg.carrier or "—" + date_str = str(seg.target_date) if seg.target_date else "—" + + if seg.segment_type == "SURFACE": + table.add_row( + str(seg.index + 1), route, "SURFACE", "—", "—", "—", + style="dim", + ) + continue + + if seg.dclass is None: + table.add_row( + str(seg.index + 1), route, carrier, date_str, "?", "?", + ) + continue + + status = seg.dclass.status + display = seg.dclass.display_code + seats = str(seg.dclass.seats) if status in ( + DClassStatus.AVAILABLE, DClassStatus.NOT_AVAILABLE + ) else "—" + + if status == DClassStatus.AVAILABLE: + style = "green" + elif status == DClassStatus.NOT_AVAILABLE: + style = "red" + elif status == DClassStatus.ERROR: + style = "yellow" + else: + style = "dim" + + # TIGHT badge: ≤2 flights with D availability + tight = "" + if seg.dclass.flights and seg.dclass.available_count <= 2: + tight = " [red bold]TIGHT[/red bold]" + + table.add_row( + str(seg.index + 1), route, carrier, date_str, + f"[{style}]{display}[/{style}]{tight}", + f"[{style}]{seats}[/{style}]", + ) + + # Per-flight sub-rows: show available flights (D>0) + if seg.dclass.flights and not quiet: + avail = 
seg.dclass.available_flights + for flt in avail: + flt_label = flt.flight_number or f"{flt.carrier or carrier}?" + flt_dep = flt.depart_time or "" + flt_aircraft = f" ({flt.aircraft})" if flt.aircraft else "" + stops_str = f" +{flt.stops}" if flt.stops else "" + table.add_row( + "", + f" [dim]{flt_label}{stops_str}{flt_aircraft}[/dim]", + "", + f" [dim]{flt_dep}[/dim]", + f"[green]D{flt.seats}[/green]", + "", + ) + # Show count of D0 flights + d0_count = seg.dclass.flight_count - seg.dclass.available_count + if d0_count > 0: + table.add_row( + "", f" [dim]({d0_count} more at D0)[/dim]", + "", "", "", "", + ) + + # Alternate date hint for unavailable segments + if ( + status == DClassStatus.NOT_AVAILABLE + and seg.dclass.best_alternate + ): + alt = seg.dclass.best_alternate + table.add_row( + "", "", "", f" [dim]Try {alt.date}[/dim]", + f"[cyan]D{alt.seats}[/cyan]", + f"[cyan]{alt.seats}[/cyan]", + ) + + console.print(table) + + # Summary line + if result.fully_bookable: + console.print( + f"[green bold]All {result.confirmed}/{result.total_flown} " + f"flown segments have D-class availability.[/green bold]" + ) + else: + console.print( + f"[yellow]{result.confirmed}/{result.total_flown} flown segments " + f"confirmed ({result.percentage:.0f}%).[/yellow]" + ) + + except ImportError: + # Plain text fallback + typer.echo(f"Option {result.option_id} — D-Class Verification", err=True) + for seg in result.segments: + route = f"{seg.origin}-{seg.destination}" + if seg.segment_type == "SURFACE": + typer.echo(f" {seg.index + 1}. {route}: SURFACE", err=True) + continue + if seg.dclass: + tight = " TIGHT" if seg.dclass.flights and seg.dclass.available_count <= 2 else "" + typer.echo( + f" {seg.index + 1}. 
{route} {seg.carrier or '??'}: " + f"{seg.dclass.display_code} ({seg.dclass.seats} seats){tight}", + err=True, + ) + # Per-flight sub-rows + if seg.dclass.flights and not quiet: + for flt in seg.dclass.available_flights: + flt_label = flt.flight_number or f"{flt.carrier or seg.carrier or '??'}?" + flt_dep = flt.depart_time or "" + stops_str = f" +{flt.stops}" if flt.stops else "" + typer.echo( + f" {flt_label}{stops_str} {flt_dep} D{flt.seats}", + err=True, + ) + d0_count = seg.dclass.flight_count - seg.dclass.available_count + if d0_count > 0: + typer.echo(f" ({d0_count} more at D0)", err=True) + if ( + seg.dclass.status == DClassStatus.NOT_AVAILABLE + and seg.dclass.best_alternate + ): + alt = seg.dclass.best_alternate + typer.echo( + f" Try {alt.date}: D{alt.seats}", + err=True, + ) + else: + typer.echo(f" {seg.index + 1}. {route}: not checked", err=True) + + pct = f"{result.percentage:.0f}%" if result.total_flown else "n/a" + typer.echo( + f" Result: {result.confirmed}/{result.total_flown} confirmed ({pct})", + err=True, + ) + + +def _display_verify_summary(results: list) -> None: + """Show a summary panel after batch verify.""" + try: + from rich.console import Console + from rich.panel import Panel + from rich.text import Text + + console = Console(stderr=True) + lines = Text() + + for vr in results: + label = f"Option {vr.option_id}: {vr.confirmed}/{vr.total_flown} D-class" + if vr.fully_bookable: + lines.append(f" {label} ", style="green bold") + lines.append("(fully bookable)\n", style="green") + elif vr.percentage >= 50: + lines.append(f" {label} ", style="yellow") + lines.append(f"({vr.percentage:.0f}%)\n", style="yellow") + else: + lines.append(f" {label} ", style="red") + lines.append(f"({vr.percentage:.0f}%)\n", style="red") + + console.print(Panel(lines, title="D-Class Summary", border_style="blue")) + + except ImportError: + typer.echo("--- D-Class Summary ---", err=True) + for vr in results: + status = "BOOKABLE" if vr.fully_bookable else 
f"{vr.percentage:.0f}%" + typer.echo( + f" Option {vr.option_id}: {vr.confirmed}/{vr.total_flown} ({status})", + err=True, + ) + + +@app.command() +def verify( + option_ids: Annotated[ + Optional[list[int]], typer.Argument(help="Option IDs to verify (1-based). Omit for top 3.") + ] = None, + booking_class: Annotated[str, typer.Option("--class", "-c", help="Booking class")] = "D", + no_cache: Annotated[bool, typer.Option("--no-cache", help="Skip cache")] = False, + json: JsonFlag = False, + plain: PlainFlag = False, + verbose: VerboseFlag = False, + quiet: QuietFlag = False, +) -> None: + """Verify D-class availability for saved search results. + + Uses ExpertFlyer to check booking class availability on each flown + segment. Requires a prior `rtw search` and `rtw login expertflyer`. + """ + _setup_logging(verbose, quiet) + + try: + from rtw.verify.state import SearchState + from rtw.scraper.expertflyer import ExpertFlyerScraper, _get_credentials + from rtw.scraper.cache import ScrapeCache + from rtw.verify.verifier import DClassVerifier + import json as json_mod + + # Load last search result + state = SearchState() + search_result = state.load() + if search_result is None: + _error_panel( + "No saved search results found.\n\n" + "Run `rtw search` first, then `rtw verify`." + ) + raise typer.Exit(code=1) + + age = state.state_age_minutes() + if age and age > 60 and not quiet: + typer.echo( + f"Warning: search results are {age:.0f} minutes old. " + "Consider re-running `rtw search`.", + err=True, + ) + + # Determine which options to verify + if option_ids is None: + ids = list(range(1, min(4, len(search_result.options) + 1))) + else: + ids = option_ids + + # Validate IDs + for oid in ids: + if oid < 1 or oid > len(search_result.options): + _error_panel( + f"Option {oid} does not exist. 
" + f"Available: 1-{len(search_result.options)}" + ) + raise typer.Exit(code=2) + + # Check credentials + if _get_credentials() is None: + _error_panel( + "No ExpertFlyer credentials found.\n\n" + "Run `rtw login expertflyer` to set up." + ) + raise typer.Exit(code=1) + + # Build verifier with context-managed scraper + with ExpertFlyerScraper() as scraper: + verifier = DClassVerifier( + scraper=scraper, + cache=ScrapeCache(), + booking_class=booking_class, + ) + + # Convert and verify with Rich progress + results = [] + use_rich_progress = not json and not quiet and not plain + + for oid in ids: + scored = search_result.options[oid - 1] + option = _scored_to_verify_option(scored, oid) + route_label = ( + f"{option.segments[0].origin}→...→{option.segments[-1].destination}" + if option.segments else "?" + ) + + if use_rich_progress: + try: + from rich.console import Console + from rich.status import Status + + console = Console(stderr=True) + status = Status( + f"Option {oid}: checking {route_label}...", + console=console, + spinner="dots", + ) + status.start() + + def _progress(current, total, seg, _s=status, _oid=oid): + label = seg.dclass.display_code if seg.dclass else "..." + _s.update( + f"Option {_oid}: {seg.origin}→{seg.destination} " + f"[{current}/{total}] {label}" + ) + + vr = verifier.verify_option( + option, progress_cb=_progress, no_cache=no_cache + ) + status.stop() + except ImportError: + # Fall back to plain echo + use_rich_progress = False + vr = verifier.verify_option(option, no_cache=no_cache) + elif not quiet and not json: + typer.echo(f"Verifying option {oid} ({route_label})...", err=True) + + def _progress(current, total, seg): + status = seg.dclass.display_code if seg.dclass else "..." 
+ typer.echo( + f" [{current}/{total}] {seg.origin}→{seg.destination}: {status}", + err=True, + ) + + vr = verifier.verify_option( + option, progress_cb=_progress, no_cache=no_cache + ) + else: + vr = verifier.verify_option(option, no_cache=no_cache) + + results.append(vr) + + if not json and not quiet: + _display_verify_result(vr) + typer.echo("", err=True) + + # Summary panel for batch verify + if not json and not quiet and len(results) > 1: + _display_verify_summary(results) + + if json: + data = [r.model_dump(mode="json") for r in results] + typer.echo(json_mod.dumps(data, indent=2)) + + except typer.Exit: + raise + except Exception as exc: + _error_panel(str(exc)) + raise typer.Exit(code=2) + + # --------------------------------------------------------------------------- # T049: Search command # --------------------------------------------------------------------------- @@ -693,6 +1215,9 @@ def search( top_n: Annotated[int, typer.Option("--top", "-n", help="Max results")] = 10, rank_by: Annotated[str, typer.Option("--rank-by", help="Ranking strategy")] = "availability", skip_availability: Annotated[bool, typer.Option("--skip-availability", help="Skip availability check")] = False, + nonstop: Annotated[bool, typer.Option("--nonstop", help="Show only nonstop flights")] = False, + backend: Annotated[str, typer.Option("--backend", "-b", help="Flight search backend: auto, serpapi, fast-flights, playwright")] = "auto", + verify_dclass: Annotated[bool, typer.Option("--verify-dclass", help="Auto-verify D-class on top results via ExpertFlyer")] = False, export: Annotated[int, typer.Option("--export", "-e", help="Export option N as YAML")] = 0, json: JsonFlag = False, plain: PlainFlag = False, @@ -713,6 +1238,26 @@ def search( _error_panel("Missing --origin. 
Example: --origin CAI") raise typer.Exit(code=2) + # Validate backend + from rtw.scraper.google_flights import SearchBackend + try: + search_backend = SearchBackend(backend) + except ValueError: + valid = ", ".join(b.value for b in SearchBackend) + _error_panel(f"Invalid backend '{backend}'. Choose from: {valid}") + raise typer.Exit(code=2) + + if search_backend == SearchBackend.SERPAPI: + from rtw.scraper.serpapi_flights import serpapi_available + if not serpapi_available(): + _error_panel( + "SERPAPI_API_KEY not set.\n\n" + "1. Sign up at https://serpapi.com (free tier: 250 searches/mo)\n" + "2. Set the key: export SERPAPI_API_KEY=your_key_here\n\n" + "Or use --backend auto to try other backends." + ) + raise typer.Exit(code=2) + try: from datetime import date as Date @@ -781,7 +1326,10 @@ def search( from rtw.scraper.cache import ScrapeCache from rtw.search.availability import AvailabilityChecker - checker = AvailabilityChecker(cache=ScrapeCache(), cabin=cabin) + max_stops = 0 if nonstop else None + checker = AvailabilityChecker( + cache=ScrapeCache(), cabin=cabin, max_stops=max_stops, backend=search_backend, + ) check_count = min(3, len(ranked)) if not quiet and not json: @@ -815,6 +1363,41 @@ def _progress(idx, total, seg_info, result): base_fare_usd=base_fare_usd, ) + # Save search state for `rtw verify` + from rtw.verify.state import SearchState + SearchState().save(result) + + # Phase 6.5: Optional D-class verification + if verify_dclass: + from rtw.scraper.expertflyer import ExpertFlyerScraper, _get_credentials + + if _get_credentials() is None: + if not quiet: + typer.echo( + "Skipping D-class verification: no ExpertFlyer credentials. 
" + "Run `rtw login expertflyer` first.", + err=True, + ) + else: + from rtw.verify.verifier import DClassVerifier + + verify_count = min(3, len(ranked)) + if not quiet and not json: + typer.echo( + f"\nVerifying D-class for top {verify_count} options...", + err=True, + ) + with ExpertFlyerScraper() as ef_scraper: + dclass_verifier = DClassVerifier( + scraper=ef_scraper, + cache=ScrapeCache(), + ) + for i in range(verify_count): + option = _scored_to_verify_option(ranked[i], i + 1) + vr = dclass_verifier.verify_option(option) + if not quiet and not json: + _display_verify_result(vr, quiet=quiet) + if json: typer.echo(format_search_json(result)) elif export > 0: @@ -837,6 +1420,21 @@ def _progress(idx, total, seg_info, result): except typer.Exit: raise except Exception as exc: + from rtw.scraper.serpapi_flights import SerpAPIAuthError, SerpAPIQuotaError + if isinstance(exc, SerpAPIAuthError): + _error_panel( + "SerpAPI authentication failed.\n\n" + "Check your key at https://serpapi.com/manage-api-key\n" + "Or use --backend auto to try other backends." + ) + raise typer.Exit(code=2) + if isinstance(exc, SerpAPIQuotaError): + _error_panel( + "SerpAPI monthly quota exceeded.\n\n" + "Upgrade at https://serpapi.com/pricing\n" + "or use --backend auto to fall back to other search methods." 
+ ) + raise typer.Exit(code=2) _error_panel(str(exc)) raise typer.Exit(code=2) diff --git a/rtw/output/search_formatter.py b/rtw/output/search_formatter.py index e5f5f13..8f51885 100644 --- a/rtw/output/search_formatter.py +++ b/rtw/output/search_formatter.py @@ -22,6 +22,25 @@ def _format_usd(amount: float) -> str: return f"${amount:,.0f}" +def _format_stops(stops: Optional[int]) -> str: + """Human-readable stops label.""" + if stops is None: + return "" + if stops == 0: + return "nonstop" + if stops == 1: + return "1 stop" + return f"{stops} stops" + + +def _format_duration(minutes: Optional[int]) -> str: + """Format minutes as 'XhMM' (e.g., '12h40').""" + if minutes is None: + return "" + hours, mins = divmod(minutes, 60) + return f"{hours}h{mins:02d}" + + def _fare_summary_rich(opt: ScoredCandidate) -> Optional[str]: """Build Rich-formatted fare comparison line for an option.""" fc = opt.fare_comparison @@ -155,22 +174,32 @@ def format_search_results_rich(result: SearchResult) -> str: table.add_column("#", style="dim", width=3) table.add_column("Route", style="cyan", width=10) table.add_column("Carrier", width=4) + table.add_column("Flight", width=8) table.add_column("Date", width=12) + table.add_column("Stops", width=12) table.add_column("Availability", width=12) for i, seg in enumerate(segs): route = f"{seg.from_airport}-{seg.to_airport}" carrier = seg.carrier or "??"
+ flight_str = "" date_str = "" avail_str = "-" + stops_str = "" if i < len(route_segs) and route_segs[i].availability: avail = route_segs[i].availability avail_str = f"[{_status_color(avail.status)}]{_status_label(avail.status)}[/]" if avail.date: date_str = avail.date.strftime("%b %d") + stops_str = _format_stops(avail.stops) + dur = _format_duration(avail.duration_minutes) + if dur: + stops_str = f"{stops_str} {dur}".strip() + if avail.flight_number: + flight_str = avail.flight_number - table.add_row(str(i + 1), route, carrier, date_str, avail_str) + table.add_row(str(i + 1), route, carrier, flight_str, date_str, stops_str, avail_str) console.print(table) fare_line = _fare_summary_rich(opt) @@ -227,12 +256,23 @@ def format_search_results_plain(result: SearchResult) -> str: carrier = seg.carrier or "??" avail = "-" date_str = "" + stops_str = "" + flight_str = "" if i < len(route_segs) and route_segs[i].availability: a = route_segs[i].availability avail = _status_label(a.status) if a.date: date_str = a.date.strftime("%b %d") - lines.append(f" {i + 1:>2}. {route:<10} {carrier:<4} {date_str:<10} {avail}") + stops_str = _format_stops(a.stops) + dur = _format_duration(a.duration_minutes) + if dur: + stops_str = f"{stops_str} {dur}".strip() + if a.flight_number: + flight_str = a.flight_number + lines.append( + f" {i + 1:>2}. 
{route:<10} {carrier:<4} {flight_str:<9}" + f"{date_str:<10} {avail:<12} {stops_str}" + ) fare_line = _fare_summary_plain(opt) if fare_line: lines.append(f" {fare_line}") @@ -262,6 +302,11 @@ def format_search_json(result: SearchResult) -> str: seg_data["date"] = a.date.isoformat() if a.price_usd: seg_data["price_usd"] = a.price_usd + if a.stops is not None: + seg_data["stops"] = a.stops + seg_data["source"] = a.source + seg_data["flight_number"] = a.flight_number + seg_data["duration_minutes"] = a.duration_minutes segs_data.append(seg_data) opt_data: dict = { diff --git a/rtw/scraper/batch.py b/rtw/scraper/batch.py index d6271ca..cb6ab25 100644 --- a/rtw/scraper/batch.py +++ b/rtw/scraper/batch.py @@ -13,7 +13,7 @@ from rtw.models import Itinerary from rtw.scraper.cache import ScrapeCache from rtw.scraper.expertflyer import ExpertFlyerScraper -from rtw.scraper.google_flights import FlightPrice, search_fast_flights +from rtw.scraper.google_flights import FlightPrice, SearchBackend, search_fast_flights logger = logging.getLogger(__name__) @@ -21,12 +21,14 @@ async def search_itinerary_prices( itinerary: Itinerary, cache: Optional[ScrapeCache] = None, + backend: SearchBackend = SearchBackend.AUTO, ) -> list[Optional[FlightPrice]]: """Search prices for all flown segments in an itinerary. Args: itinerary: The RTW itinerary to price. cache: Optional ScrapeCache for caching results. + backend: Which search backend to use. Returns: List of FlightPrice (or None) for each segment. 
Surface segments @@ -54,18 +56,14 @@ async def search_itinerary_prices( except Exception: pass # Invalid cache entry, re-fetch - # Try searching + # Try searching with cascade try: - price = search_fast_flights( - origin=seg.from_airport, - dest=seg.to_airport, - date=seg.date, - cabin=itinerary.ticket.cabin.value, + price = _search_segment_price( + seg.from_airport, seg.to_airport, seg.date, + itinerary.ticket.cabin.value, backend, ) if price is not None: - # Cache the result from dataclasses import asdict - cache.set(cache_key, asdict(price)) results.append(price) except Exception as exc: @@ -80,6 +78,56 @@ async def search_itinerary_prices( return results +def _search_segment_price(origin, dest, seg_date, cabin, backend): + """Search a single segment using the configured backend.""" + if backend == SearchBackend.AUTO: + search_fns = _auto_price_cascade() + elif backend == SearchBackend.SERPAPI: + search_fns = [("serpapi", _try_serpapi_price)] + elif backend == SearchBackend.FAST_FLIGHTS: + search_fns = [("fast-flights", _try_fast_flights_price)] + elif backend == SearchBackend.PLAYWRIGHT: + search_fns = [("playwright", _try_playwright_price)] + else: + search_fns = _auto_price_cascade() + + for name, fn in search_fns: + try: + result = fn(origin, dest, seg_date, cabin) + if result is not None: + return result + except Exception as exc: + if backend != SearchBackend.AUTO: + raise + logger.debug("Batch cascade %s failed for %s-%s: %s", name, origin, dest, exc) + + return None + + +def _auto_price_cascade(): + """Build cascade for batch pricing: serpapi -> fast-flights (no Playwright — too slow).""" + fns = [] + from rtw.scraper.serpapi_flights import serpapi_available + if serpapi_available(): + fns.append(("serpapi", _try_serpapi_price)) + fns.append(("fast-flights", _try_fast_flights_price)) + return fns + + +def _try_serpapi_price(origin, dest, seg_date, cabin): + from rtw.scraper.serpapi_flights import search_serpapi + return search_serpapi(origin=origin, 
dest=dest, date=seg_date, cabin=cabin) + + +def _try_fast_flights_price(origin, dest, seg_date, cabin): + return search_fast_flights(origin, dest, seg_date, cabin) + + +def _try_playwright_price(origin, dest, seg_date, cabin): + from rtw.scraper.google_flights import search_playwright_sync + return search_playwright_sync(origin, dest, seg_date, cabin) + + async def check_itinerary_availability( itinerary: Itinerary, booking_class: str = "D", @@ -132,6 +180,7 @@ async def check_itinerary_availability( def search_with_fallback( itinerary: Itinerary, cache: Optional[ScrapeCache] = None, + backend: SearchBackend = SearchBackend.AUTO, ) -> list[Optional[FlightPrice]]: """Synchronous wrapper for search_itinerary_prices. @@ -141,6 +190,7 @@ def search_with_fallback( Args: itinerary: The RTW itinerary to price. cache: Optional ScrapeCache for caching results. + backend: Which search backend to use. Returns: List of FlightPrice (or None) for each segment. @@ -152,12 +202,10 @@ def search_with_fallback( loop = None if loop is not None and loop.is_running(): - # Already in an async context - can't nest event loops - # Return empty results rather than crash logger.warning("Cannot run async search from within running event loop") return [None] * len(itinerary.segments) - return asyncio.run(search_itinerary_prices(itinerary, cache)) + return asyncio.run(search_itinerary_prices(itinerary, cache, backend)) except Exception as exc: logger.warning("search_with_fallback failed: %s", exc) diff --git a/rtw/scraper/expertflyer.py b/rtw/scraper/expertflyer.py index 4d4e3f2..9437337 100644 --- a/rtw/scraper/expertflyer.py +++ b/rtw/scraper/expertflyer.py @@ -1,150 +1,638 @@ -"""ExpertFlyer availability scraper. +"""ExpertFlyer D-class availability scraper. -Checks award/premium class availability on ExpertFlyer using Playwright. -Credentials are retrieved from macOS Keychain via the keyring library. 
+Checks booking class availability on ExpertFlyer using Playwright with +programmatic Auth0 login via credentials from macOS Keychain. -All functions degrade gracefully when credentials or services are unavailable. +Constructs results URLs directly (no form filling) and parses the +HTML table. + +Key discovery: ExpertFlyer results URL is directly constructable: + /air/availability/results?origin=LHR&destination=HKG&...&classFilter=D + +Table structure: single with per flight group, +each containing a connection header row and flight data rows. +D-class shown as "D9", "D5", "D0" etc in the Available Classes column. """ from __future__ import annotations +import datetime import logging -from datetime import date as Date -from typing import Optional +import random +import re +import time +from typing import TYPE_CHECKING, Optional +from urllib.parse import quote_plus + +if TYPE_CHECKING: + from rtw.verify.models import DClassResult logger = logging.getLogger(__name__) -_EXPERTFLYER_SERVICE = "expertflyer.com" +_EXPERTFLYER_BASE = "https://www.expertflyer.com" +_RESULTS_URL = f"{_EXPERTFLYER_BASE}/air/availability/results" +_LOGIN_URL = f"{_EXPERTFLYER_BASE}/auth/login" + +# Rate limiting +_MIN_QUERY_INTERVAL = 5 # seconds between queries +_DAILY_SOFT_LIMIT = 50 # warn after this many queries + +# Retry config +_MAX_RETRIES = 3 +_RETRY_BASE_DELAY = 3 # seconds +_RETRY_JITTER = 0.2 # 20% random jitter + +# Timeouts +_PAGE_LOAD_TIMEOUT = 30000 # ms +_RESULTS_TIMEOUT = 15000 # ms + +# Regex for booking class availability: letter + digit (e.g., D9, D0) +_CLASS_PATTERN = re.compile(r"\b([A-Z])(\d)\b") + +# Keyring service name +_KEYRING_SERVICE = "expertflyer.com" + + +class ScrapeError(Exception): + """Error during ExpertFlyer scraping.""" + + def __init__(self, message: str, error_type: str = "UNKNOWN") -> None: + super().__init__(message) + self.error_type = error_type + + +class SessionExpiredError(ScrapeError): + """Session has expired, requiring re-login.""" + + def 
__init__(self, message: str = "ExpertFlyer session expired") -> None: + super().__init__(message, error_type="SESSION_EXPIRED") + + +def _get_credentials() -> tuple[str, str] | None: + """Retrieve ExpertFlyer credentials from macOS Keychain.""" + try: + import keyring + + username = keyring.get_password(_KEYRING_SERVICE, "username") + password = keyring.get_password(_KEYRING_SERVICE, "password") + if username and password: + return username, password + except Exception: + pass + return None class ExpertFlyerScraper: - """Scrape ExpertFlyer for seat availability. + """Scrape ExpertFlyer for D-class availability. + + Uses programmatic Auth0 login with credentials from macOS Keychain. + Maintains a persistent browser context across multiple queries to + keep the session alive. - Requires ExpertFlyer credentials stored in macOS Keychain: - keyring set expertflyer.com + Usage: + scraper = ExpertFlyerScraper() + with scraper: + result = scraper.check_availability("LHR", "HKG", date, "CX") + + Or for single queries (auto-manages lifecycle): + scraper = ExpertFlyerScraper() + result = scraper.check_availability("LHR", "HKG", date, "CX") """ - def __init__(self) -> None: - self._username: Optional[str] = None - self._password: Optional[str] = None + def __init__(self, session_path: Optional[str] = None) -> None: + self._session_path = session_path # Legacy: not used for login anymore + self._last_call_time: float = 0 + self._query_count: int = 0 + self._playwright = None + self._browser = None + self._context = None + self._page = None + self._logged_in = False - def _get_credentials(self) -> tuple[Optional[str], Optional[str]]: - """Retrieve ExpertFlyer credentials from system keyring. + def __enter__(self) -> "ExpertFlyerScraper": + self._ensure_browser() + return self - Returns: - Tuple of (username, password), either may be None. 
- """ - if self._username is not None: - return self._username, self._password + def __exit__(self, *args) -> None: + self.close() - try: - import keyring - except ImportError: - logger.info("keyring library not available - cannot retrieve ExpertFlyer credentials") - return None, None + def close(self) -> None: + """Close browser and cleanup.""" + if self._browser: + try: + self._browser.close() + except Exception: + pass + self._browser = None + if self._playwright: + try: + self._playwright.stop() + except Exception: + pass + self._playwright = None + self._context = None + self._page = None + self._logged_in = False + + def _ensure_browser(self) -> None: + """Launch browser and context if not already running.""" + if self._page is not None: + return + + from playwright.sync_api import sync_playwright + + self._playwright = sync_playwright().start() + self._browser = self._playwright.chromium.launch(headless=True) + self._context = self._browser.new_context( + viewport={"width": 1400, "height": 900}, + user_agent=( + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) " + "AppleWebKit/537.36 (KHTML, like Gecko) " + "Chrome/120.0.0.0 Safari/537.36" + ), + ) + self._page = self._context.new_page() + + def _login(self) -> bool: + """Programmatic Auth0 login using Keychain credentials.""" + creds = _get_credentials() + if creds is None: + logger.error( + "No ExpertFlyer credentials. 
" + "Run: rtw login expertflyer" + ) + return False + + username, password = creds + page = self._page + logger.info("Logging in to ExpertFlyer...") + page.goto(_LOGIN_URL, timeout=_PAGE_LOAD_TIMEOUT) + time.sleep(3) + + if "auth.expertflyer.com" not in page.url: + # Already logged in or unexpected page + if "www.expertflyer.com" in page.url: + self._logged_in = True + return True + return False + + # Fill Auth0 login form try: - # Convention: username stored as the 'account' in keyring - # with service "expertflyer.com" - username = keyring.get_password(_EXPERTFLYER_SERVICE, "username") - password = keyring.get_password(_EXPERTFLYER_SERVICE, "password") + # Email field + page.wait_for_selector( + 'input[name="email"], input[name="username"], input[type="email"]', + timeout=10000, + ) + email_input = ( + page.query_selector('input[name="email"]') + or page.query_selector('input[name="username"]') + or page.query_selector('input[type="email"]') + ) + if not email_input: + logger.error("Email input not found on Auth0 page") + return False - if not username or not password: - logger.info("ExpertFlyer credentials not found in keyring") - return None, None + email_input.fill(username) - self._username = username - self._password = password - return username, password + # Click continue (Auth0 may split email/password screens) + submit = page.query_selector('button[type="submit"]') + if submit: + submit.click() + time.sleep(2) + + # Password field + pwd_input = ( + page.query_selector('input[name="password"]') + or page.query_selector('input[type="password"]') + ) + if not pwd_input: + logger.error("Password input not found") + return False + + pwd_input.fill(password) + + # Submit login + submit = page.query_selector('button[type="submit"]') + if submit: + submit.click() + time.sleep(5) + + # Verify login succeeded + if "www.expertflyer.com" in page.url: + self._logged_in = True + logger.info("ExpertFlyer login successful") + return True + + # May still be on auth page 
with redirect pending + time.sleep(3) + if "www.expertflyer.com" in page.url: + self._logged_in = True + logger.info("ExpertFlyer login successful (delayed redirect)") + return True + + logger.error("Login failed. URL: %s", page.url[:80]) + return False except Exception as exc: - logger.warning("Failed to retrieve ExpertFlyer credentials: %s", exc) - return None, None + logger.error("Login error: %s", exc) + return False - def credentials_available(self) -> bool: - """Check whether ExpertFlyer credentials are configured.""" - username, password = self._get_credentials() - return username is not None and password is not None + def _ensure_logged_in(self) -> None: + """Ensure we have a browser and are logged in.""" + self._ensure_browser() + if not self._logged_in: + if not self._login(): + raise SessionExpiredError("Failed to log in to ExpertFlyer") - async def check_availability( + def _check_session_expired(self, page) -> None: + """Raise SessionExpiredError if redirected to login.""" + url = page.url + if "auth.expertflyer.com" in url or "/login" in url: + self._logged_in = False + raise SessionExpiredError() + + def _build_results_url( self, origin: str, dest: str, - date: Date, - carrier: str, + date: datetime.date, booking_class: str = "D", - ) -> Optional[dict]: - """Check seat availability for a specific route and class. 
+ carrier: str = "", + ) -> str: + """Construct the ExpertFlyer results URL directly.""" + dt = date.strftime("%Y-%m-%dT00:00") + params = { + "origin": origin.upper(), + "destination": dest.upper(), + "departureDateTime": dt, + "alliance": "none", + "airLineCodes": carrier.upper() if carrier else "", + "excludeCodeshares": "false", + "classFilter": booking_class.upper(), + "pcc": "USA (Default)", + "resultsDisplay": "single", + } + qs = "&".join(f"{k}={quote_plus(str(v))}" for k, v in params.items()) + return f"{_RESULTS_URL}?{qs}" + + def _rate_limit_wait(self) -> None: + """Enforce minimum interval between queries.""" + elapsed = time.time() - self._last_call_time + if elapsed < _MIN_QUERY_INTERVAL: + wait = _MIN_QUERY_INTERVAL - elapsed + random.uniform(0.5, 2.0) + logger.debug("Rate limit: waiting %.1fs", wait) + time.sleep(wait) + + def check_availability( + self, + origin: str, + dest: str, + date: datetime.date, + carrier: str = "", + booking_class: str = "D", + ) -> Optional["DClassResult"]: + """Check D-class availability for a route on a date. + + Logs in automatically if needed. Reuses the browser context + across multiple calls for efficiency. Args: - origin: 3-letter IATA airport code. - dest: 3-letter IATA airport code. - date: Flight date. - carrier: 2-letter airline code. - booking_class: Booking class to check (default "D" for business award). + origin: 3-letter IATA origin airport. + dest: 3-letter IATA destination airport. + date: Target flight date. + carrier: 2-letter airline code (empty = all carriers). + booking_class: Booking class to check (default "D"). Returns: - Dict with availability info, or None if unavailable/failed. - Example: {"origin": "LHR", "dest": "NRT", "carrier": "JL", - "class": "D", "available": True, "seats": 2} + DClassResult with availability info, or None if login failed. + + Raises: + SessionExpiredError: If login fails or session expires mid-batch. + ScrapeError: On parse or navigation errors (after retries). 
""" - username, password = self._get_credentials() - if not username or not password: - logger.info("Skipping ExpertFlyer check - no credentials") + from rtw.verify.models import DClassResult, DClassStatus + + # Check credentials exist before starting + if _get_credentials() is None and not self._session_path: + logger.info("No ExpertFlyer credentials configured") return None - try: - from rtw.scraper import BrowserManager + self._rate_limit_wait() + self._query_count += 1 + if self._query_count == _DAILY_SOFT_LIMIT: + logger.warning( + "ExpertFlyer soft limit reached (%d queries). " + "Consider spacing out checks.", + _DAILY_SOFT_LIMIT, + ) - if not BrowserManager.available(): - logger.info("Playwright not available for ExpertFlyer scraping") - return None + url = self._build_results_url(origin, dest, date, booking_class, carrier) + last_error: Optional[Exception] = None - async with BrowserManager() as browser: - return await self._scrape_availability( - browser, origin, dest, date, carrier, booking_class, username, password + for attempt in range(1, _MAX_RETRIES + 1): + try: + self._ensure_logged_in() + result = self._fetch_and_parse( + url, origin, dest, date, carrier, booking_class ) + self._last_call_time = time.time() + return result + except SessionExpiredError: + if attempt < _MAX_RETRIES: + # Try re-login once + logger.warning("Session expired, attempting re-login...") + self._logged_in = False + continue + raise + except ScrapeError as exc: + last_error = exc + if attempt < _MAX_RETRIES: + delay = _RETRY_BASE_DELAY * (2 ** (attempt - 1)) + jitter = delay * random.uniform(-_RETRY_JITTER, _RETRY_JITTER) + wait = delay + jitter + logger.warning( + "ExpertFlyer attempt %d/%d failed: %s. 
Retrying in %.1fs", + attempt, + _MAX_RETRIES, + exc, + wait, + ) + time.sleep(wait) + except Exception as exc: + last_error = exc + if attempt < _MAX_RETRIES: + time.sleep(_RETRY_BASE_DELAY) - except Exception as exc: - logger.warning( - "ExpertFlyer check failed for %s %s-%s: %s", - carrier, - origin, - dest, - exc, - ) - return None + # All retries exhausted + self._last_call_time = time.time() + return DClassResult( + status=DClassStatus.ERROR, + seats=0, + carrier=carrier or "??", + origin=origin, + destination=dest, + target_date=date, + error_message=str(last_error), + ) - async def _scrape_availability( + def _fetch_and_parse( self, - browser, + url: str, origin: str, dest: str, - date: Date, + date: datetime.date, carrier: str, booking_class: str, - username: str, - password: str, - ) -> Optional[dict]: - """Internal: Perform the actual ExpertFlyer scrape. - - NOTE: This is a P2 stub. Full implementation would: - 1. Navigate to ExpertFlyer login page - 2. Authenticate with username/password - 3. Search for the route/date/carrier - 4. Parse the availability grid for the booking class - 5. Return structured availability data + ) -> "DClassResult": + """Navigate to results URL and parse the availability table. - Returns: - Dict with availability info, or None. + Uses the persistent page — no new browser launch per call. 
""" - logger.info( - "ExpertFlyer stub: would check %s %s-%s class %s on %s", - carrier, - origin, - dest, - booking_class, - date, + page = self._page + + logger.info("ExpertFlyer: fetching %s→%s on %s", origin, dest, date) + page.goto(url, timeout=_PAGE_LOAD_TIMEOUT) + + # Wait for table or detect session expiry + time.sleep(2) + self._check_session_expired(page) + + # Wait for results table + try: + page.wait_for_selector( + "table.w-full.bg-white.shadow-md", + timeout=_RESULTS_TIMEOUT, + ) + except Exception: + # Check if session expired during load + self._check_session_expired(page) + raise ScrapeError( + f"Results table not found for {origin}→{dest}", + error_type="PARSE_ERROR", + ) + + # Parse the results + return self._parse_results_table( + page, origin, dest, date, carrier, booking_class ) - # Stub: actual implementation would scrape the ExpertFlyer site + + def _parse_results_table( + self, + page, + origin: str, + dest: str, + date: datetime.date, + carrier: str, + booking_class: str, + ) -> "DClassResult": + """Parse the ExpertFlyer results table for per-flight D-class data.""" + from rtw.verify.models import DClassResult, DClassStatus, FlightAvailability + + flights: list[FlightAvailability] = [] + + try: + rows = page.query_selector_all("tr.hover\\:bg-sky-50") + for row in rows: + text = row.evaluate("el => (el.innerText || '')") + + # Extract D-class seats + d_match = re.search(rf"\b{booking_class}(\d)\b", text) + if d_match is None: + continue + seats = int(d_match.group(1)) + + # Extract carrier + flight number (carrier is first field in row) + flight_carrier = None + flight_num = None + fn_match = re.match(r"\s*([A-Z\d]{2})\s*\n\s*(\d{1,4})\b", text) + if fn_match: + flight_carrier = fn_match.group(1) + flight_num = f"{fn_match.group(1)}{fn_match.group(2)}" + + # Extract stops (digit after flight number area) + stops = 0 + stops_match = re.search(r"\b(\d)\s*\n", text) + if stops_match: + stops = int(stops_match.group(1)) + + # Extract 
departure/arrival times + times = re.findall( + r"(\d{2}/\d{2}/\d{2}\s+\d{1,2}:\d{2}\s+[AP]M)", text + ) + + # Extract airports (3-letter codes in the row) + airports = re.findall(r"\b([A-Z]{3})\b", text) + # Filter to likely IATA codes (exclude common non-airport 3-letter combos) + iata_airports = [ + a for a in airports + if a not in ("Daily", "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat") + ] + + # Extract aircraft type + aircraft = None + ac_match = re.search(r"\b(3\d{2}|7[2-8]\w|A\d{2}\w?|E\d{2})\b", text) + if ac_match: + aircraft = ac_match.group(1) + + flights.append(FlightAvailability( + carrier=flight_carrier or carrier or None, + flight_number=flight_num, + origin=iata_airports[0] if len(iata_airports) > 0 else origin, + destination=iata_airports[1] if len(iata_airports) > 1 else dest, + depart_time=times[0] if len(times) > 0 else None, + arrive_time=times[1] if len(times) > 1 else None, + aircraft=aircraft, + seats=seats, + booking_class=booking_class, + stops=stops, + )) + + except Exception as exc: + logger.warning("Per-row extraction failed, falling back to body text: %s", exc) + + # Fallback: if per-row extraction got nothing, use body-text regex + if not flights: + body_text = page.evaluate("() => document.body.innerText") + pattern = re.compile(rf"\b{booking_class}(\d)\b") + matches = pattern.findall(body_text) + + if not matches: + return DClassResult( + status=DClassStatus.NOT_AVAILABLE, + seats=0, + carrier=carrier or "??", + origin=origin, + destination=dest, + target_date=date, + ) + + seat_counts = [int(m) for m in matches] + best_seats = max(seat_counts) + flight_number = self._extract_flight_number(page, carrier) + + status = DClassStatus.AVAILABLE if best_seats > 0 else DClassStatus.NOT_AVAILABLE + return DClassResult( + status=status, + seats=best_seats, + flight_number=flight_number, + carrier=carrier or "??", + origin=origin, + destination=dest, + target_date=date, + ) + + # Deduplicate flights by flight_number + depart_time + seen = 
set() + unique_flights = [] + for f in flights: + key = (f.flight_number, f.depart_time) + if key not in seen: + seen.add(key) + unique_flights.append(f) + flights = unique_flights + + # Build result from per-flight data + best_seats = max(f.seats for f in flights) + # Set flight_number to the best-D-class flight + best_flight = max(flights, key=lambda f: (f.seats, -(len(f.depart_time or "")))) + flight_number = best_flight.flight_number + + status = DClassStatus.AVAILABLE if best_seats > 0 else DClassStatus.NOT_AVAILABLE + + return DClassResult( + status=status, + seats=best_seats, + flight_number=flight_number, + carrier=carrier or "??", + origin=origin, + destination=dest, + target_date=date, + flights=flights, + ) + + def _extract_flight_number(self, page, carrier: str) -> Optional[str]: + """Try to extract the first flight number from results.""" + try: + rows = page.query_selector_all("tr.hover\\:bg-sky-50") + if rows: + text = rows[0].evaluate( + "el => (el.innerText || '').substring(0, 100)" + ) + if carrier: + match = re.search( + rf"\b{re.escape(carrier)}\s*(\d{{1,4}})\b", text + ) + if match: + return f"{carrier}{match.group(1)}" + match = re.search(r"\b([A-Z]{2})\s*(\d{1,4})\b", text) + if match: + return f"{match.group(1)}{match.group(2)}" + except Exception: + pass return None + + +def parse_availability_html( + html: str, booking_class: str = "D" +) -> list[dict]: + """Parse ExpertFlyer results HTML for availability data. + + Standalone parser for testing with HTML fixtures. + + Returns list of dicts with: carrier, flight_number, origin, destination, + depart_time, arrive_time, aircraft, frequency, reliability, seats. 
+ """ + results = [] + tbody_pattern = re.compile( + r"<tbody[^>]*table-custom-hover-group[^>]*>(.*?)</tbody>", + re.DOTALL, + ) + row_pattern = re.compile( + r"<tr[^>]*hover:bg-sky-50[^>]*>(.*?)</tr>", + re.DOTALL, + ) + + for tbody_match in tbody_pattern.finditer(html): + tbody_html = tbody_match.group(1) + + for row_match in row_pattern.finditer(tbody_html): + row_html = row_match.group(1) + row_text = re.sub(r"<[^>]+>", " ", row_html) + row_text = re.sub(r"\s+", " ", row_text).strip() + + class_pattern = re.compile(rf"\b{booking_class}(\d)\b") + class_match = class_pattern.search(row_text) + seats = int(class_match.group(1)) if class_match else None + + carrier_match = re.search(r"\b([A-Z]{2})\b", row_text) + flight_match = re.search(r"\b([A-Z]{2})\s+(\d{1,4})\b", row_text) + + airports = re.findall( + r'cursor-pointer text-sky-600[^>]*>([A-Z]{3})<', row_html + ) + + times = re.findall( + r"(\d{2}/\d{2}/\d{2}\s+\d{1,2}:\d{2}\s+[AP]M)", row_text + ) + + aircraft_match = re.search(r"\b(\d{2}[A-Z0-9]|[A-Z]\d{2})\b", row_text) + + result = { + "carrier": carrier_match.group(1) if carrier_match else None, + "flight_number": ( + f"{flight_match.group(1)}{flight_match.group(2)}" + if flight_match + else None + ), + "origin": airports[0] if len(airports) > 0 else None, + "destination": airports[1] if len(airports) > 1 else None, + "depart_time": times[0] if len(times) > 0 else None, + "arrive_time": times[1] if len(times) > 1 else None, + "aircraft": aircraft_match.group(1) if aircraft_match else None, + "seats": seats, + "booking_class": booking_class, + } + results.append(result) + + return results diff --git a/rtw/scraper/google_flights.py b/rtw/scraper/google_flights.py index 04bc1b6..aa0eaec 100644 --- a/rtw/scraper/google_flights.py +++ b/rtw/scraper/google_flights.py @@ -10,15 +10,63 @@ import time from dataclasses import dataclass from datetime import date as Date +from enum import Enum from typing import Optional logger = logging.getLogger(__name__) +#
--------------------------------------------------------------------------- +# Constants +# --------------------------------------------------------------------------- + # Rate limiting: minimum seconds between scrape calls _RATE_LIMIT_SECONDS = 2.0 _last_call_time: float = 0.0 -# Oneworld carriers for filtering +# Retry / timeout constants +_MAX_ATTEMPTS = 2 +_RETRY_BACKOFF_S = 5.0 +_PAGE_LOAD_TIMEOUT_MS = 30000 +_CARD_WAIT_TIMEOUT_MS = 10000 +_CONSENT_VISIBILITY_TIMEOUT_MS = 500 +_MAX_EXPAND_CLICKS = 5 +_EXPAND_WAIT_MS = 2500 +_EXPAND_PHASE_TIMEOUT_MS = 15000 + +# CSS selectors — centralised so changes happen in one place +_SELECTORS = { + "flight_card": "li.pIav2d", + "show_more": "button[aria-label*='more flights'], button[aria-label*='More flights']", + "airline": ".sSHqwe", + "price": ".YMlIz", + "stops": ".EfT7Ae .ogfYpf", + "stops_alt": ".VG3hNb", + "departure": ".wtDjR .zxVSec", + "arrival": ".XWcVob .zxVSec", + "duration": ".gvkrdb", +} + +# Consent-dismiss selectors — tried in order, first visible wins +_CONSENT_SELECTORS = [ + 'button[aria-label="Accept all"]', + "button:has-text('Accept all')", + "button:has-text('Agree')", + "button:has-text('Alle akzeptieren')", + "button:has-text('Tout accepter')", + "button#agree", + "button#consent", +] + +# User-agent for browser context +_USER_AGENT = ( + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) " + "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36" +) + +# --------------------------------------------------------------------------- +# Oneworld carrier data +# --------------------------------------------------------------------------- + _ONEWORLD_CARRIERS = { "american", "british airways", "cathay pacific", "finnair", "iberia", "japan airlines", "jal", "malaysia airlines", "qantas", "qatar airways", @@ -29,7 +77,6 @@ "ul", "as", "fj", "wy", "s7", } -# Carrier name -> IATA code mapping _CARRIER_IATA = { "american": "AA", "british airways": "BA", "cathay pacific": "CX", "finnair": "AY", 
"iberia": "IB", "japan airlines": "JL", "jal": "JL", @@ -38,6 +85,42 @@ "alaska": "AS", "fiji airways": "FJ", "oman air": "WY", "s7 airlines": "S7", } +# --------------------------------------------------------------------------- +# Error types +# --------------------------------------------------------------------------- + + +class ScrapeFailureReason(str, Enum): + """Categorised reasons a scrape attempt can fail.""" + + TIMEOUT = "timeout" + CONSENT_BLOCKED = "consent_blocked" + NO_RESULTS = "no_results" + PARSE_ERROR = "parse_error" + BROWSER_ERROR = "browser_error" + + +class ScrapeError(Exception): + """Structured error from the Playwright scraper.""" + + def __init__(self, reason: ScrapeFailureReason, message: str, route: str = ""): + self.reason = reason + self.route = route + super().__init__(f"[{reason.value}] {route}: {message}") + +# --------------------------------------------------------------------------- +# Data model +# --------------------------------------------------------------------------- + + +class SearchBackend(str, Enum): + """Which flight search backend to use.""" + + AUTO = "auto" + SERPAPI = "serpapi" + FAST_FLIGHTS = "fast-flights" + PLAYWRIGHT = "playwright" + @dataclass class FlightPrice: @@ -49,7 +132,16 @@ class FlightPrice: price_usd: float cabin: str # "economy", "business", "first" date: Optional[Date] = None - source: str = "google_flights" # "fast_flights" or "playwright" + source: str = "google_flights" # "fast_flights", "playwright", or "serpapi" + stops: Optional[int] = None + # New fields (populated by SerpAPI, None for other backends) + flight_number: Optional[str] = None + duration_minutes: Optional[int] = None + airline_name: Optional[str] = None + +# --------------------------------------------------------------------------- +# Shared helpers +# --------------------------------------------------------------------------- def _rate_limit() -> None: @@ -62,6 +154,33 @@ def _rate_limit() -> None: _last_call_time = 
time.time() +def _extract_carrier_iata(carrier_text: str) -> str: + """Extract IATA code from carrier name text.""" + text = carrier_text.lower().strip() + for name, code in _CARRIER_IATA.items(): + if name in text: + return code + return carrier_text[:2].upper() if len(carrier_text) >= 2 else "??" + + +def _is_oneworld(carrier_text: str) -> bool: + """Check if any oneworld carrier appears in the text.""" + text = carrier_text.lower() + return any(ow in text for ow in _ONEWORLD_CARRIERS) + + +def _parse_price(text: str) -> Optional[float]: + """Extract USD price from text like '$5,026'.""" + m = re.search(r"\$([\d,]+)", text) + if m: + return float(m.group(1).replace(",", "")) + return None + +# --------------------------------------------------------------------------- +# fast-flights search (unchanged) +# --------------------------------------------------------------------------- + + def search_fast_flights( origin: str, dest: str, @@ -88,7 +207,6 @@ def search_fast_flights( _rate_limit() try: - # Map cabin names to fast-flights seat literals cabin_map = { "economy": "economy", "premium_economy": "premium-economy", @@ -110,7 +228,6 @@ def search_fast_flights( logger.info("No fast-flights results for %s-%s on %s", origin, dest, date) return None - # Take the cheapest result best = min(result.flights, key=lambda f: f.price or float("inf")) if best.price is None: return None @@ -126,53 +243,144 @@ def search_fast_flights( ) except Exception as exc: - # Truncate consent wall spam from error message msg = str(exc).split("\n")[0][:100] logger.debug("fast-flights failed for %s-%s: %s", origin, dest, msg) return None +# --------------------------------------------------------------------------- +# Playwright helpers +# --------------------------------------------------------------------------- -def _extract_carrier_iata(carrier_text: str) -> str: - """Extract IATA code from carrier name text.""" - text = carrier_text.lower().strip() - for name, code in 
_CARRIER_IATA.items(): - if name in text: - return code - return carrier_text[:2].upper() if len(carrier_text) >= 2 else "??" +def _dismiss_consent(page) -> bool: + """Try to dismiss cookie/consent dialogs. -def _is_oneworld(carrier_text: str) -> bool: - """Check if any oneworld carrier appears in the text.""" - text = carrier_text.lower() - return any(ow in text for ow in _ONEWORLD_CARRIERS) + Iterates through ``_CONSENT_SELECTORS`` and clicks the first visible button. + Returns True if a consent button was found and clicked, False otherwise. + """ + for selector in _CONSENT_SELECTORS: + try: + btn = page.locator(selector).first + if btn.is_visible(timeout=_CONSENT_VISIBILITY_TIMEOUT_MS): + btn.click() + page.wait_for_load_state("networkidle", timeout=8000) + logger.debug("Dismissed consent via: %s", selector) + return True + except Exception: + continue + logger.debug("No consent dialog found (or already dismissed)") + return False + + +def _expand_all_results(page) -> int: + """Click 'show more flights' until all results are visible. + + Returns the total number of flight cards after expansion. + """ + start = time.monotonic() + clicks = 0 + selector = _SELECTORS["show_more"] + + while clicks < _MAX_EXPAND_CLICKS: + elapsed_ms = (time.monotonic() - start) * 1000 + if elapsed_ms > _EXPAND_PHASE_TIMEOUT_MS: + logger.debug("Expansion phase timed out after %.1fs", elapsed_ms / 1000) + break + try: + btn = page.locator(selector).first + btn.wait_for(state="visible", timeout=3000) + btn.click() + clicks += 1 + page.wait_for_timeout(_EXPAND_WAIT_MS) + except Exception: + break # No more buttons visible + + count = len(page.locator(_SELECTORS["flight_card"]).all()) + logger.debug("Found %d flight cards after %d expansion clicks", count, clicks) + return count + + +def _parse_stops(card_element) -> Optional[int]: + """Extract the number of stops from a flight card. + + Returns 0 for nonstop, N for N stops, or None if unparseable. 
+ """ + # Strategy 1: use dedicated selector + try: + stops_el = card_element.locator(_SELECTORS["stops"]).first + stops_text = stops_el.inner_text(timeout=1000) + except Exception: + # Strategy 2: regex on full card text + try: + stops_text = card_element.inner_text() + except Exception: + return None + if not stops_text: + return None -def _parse_price(text: str) -> Optional[float]: - """Extract USD price from text like '$5,026'.""" - m = re.search(r"\$([\d,]+)", text) + text_lower = stops_text.lower() + if "nonstop" in text_lower: + return 0 + + m = re.search(r"(\d+)\s*stops?", text_lower) if m: - return float(m.group(1).replace(",", "")) + return int(m.group(1)) + return None -def search_playwright_sync( +def _parse_flight_card(card, origin: str, dest: str, date: Date, cabin: str) -> Optional[dict]: + """Parse a single flight result card into a dict. + + Returns dict with keys: price, carrier_text, carrier_code, stops. + Returns None if the card cannot be parsed (missing price, too few lines, etc). 
+ """ + try: + text = card.inner_text() + except Exception: + return None + + lines = [l.strip() for l in text.split("\n") if l.strip()] + if len(lines) < 4: + return None + + price = _parse_price(text) + if price is None: + return None + + # Lines layout: time, -, time, carrier(s), duration, route, stops, price + carrier_text = lines[3] if len(lines) > 3 else "" + carrier_code = _extract_carrier_iata(carrier_text) + stops = _parse_stops(card) + + return { + "price": price, + "carrier_text": carrier_text, + "carrier_code": carrier_code, + "stops": stops, + } + +# --------------------------------------------------------------------------- +# Playwright search (refactored) +# --------------------------------------------------------------------------- + + +def _search_playwright_impl( origin: str, dest: str, date: Date, cabin: str = "business", oneworld_only: bool = True, + max_stops: Optional[int] = None, ) -> Optional[FlightPrice]: - """Scrape Google Flights via Playwright (sync). + """Single-attempt Playwright search against Google Flights. - Returns the cheapest flight found, optionally filtered to oneworld carriers. + Raises ScrapeError on failure so the retry wrapper can decide whether to retry. 
""" - try: - from playwright.sync_api import sync_playwright - except ImportError: - logger.info("Playwright not installed") - return None + from playwright.sync_api import sync_playwright - _rate_limit() + route = f"{origin}-{dest}" cabin_query = {"business": "business+class", "first": "first+class"}.get( cabin.lower(), "" @@ -186,57 +394,67 @@ def search_playwright_sync( try: with sync_playwright() as p: browser = p.chromium.launch(headless=True) - page = browser.new_page() + context = browser.new_context( + viewport={"width": 1280, "height": 800}, + user_agent=_USER_AGENT, + ) + page = context.new_page() try: - page.goto(url, timeout=30000) + page.goto(url, timeout=_PAGE_LOAD_TIMEOUT_MS) - # Dismiss cookie consent if present + _dismiss_consent(page) + + # Wait for flight cards instead of flat timeout try: - btn = page.locator("button:has-text('Accept all')").first - if btn.is_visible(timeout=2000): - btn.click() - page.wait_for_load_state("networkidle", timeout=8000) + page.wait_for_selector( + _SELECTORS["flight_card"], timeout=_CARD_WAIT_TIMEOUT_MS + ) except Exception: - pass + raise ScrapeError( + ScrapeFailureReason.NO_RESULTS, + f"No flight cards appeared within {_CARD_WAIT_TIMEOUT_MS}ms", + route=route, + ) - # Wait for flight results to render - page.wait_for_timeout(4000) + # Expand hidden results before parsing + _expand_all_results(page) - results = page.locator("li.pIav2d").all() + results = page.locator(_SELECTORS["flight_card"]).all() if not results: - logger.info("No flight results found for %s-%s on %s", origin, dest, date) - return None + raise ScrapeError( + ScrapeFailureReason.NO_RESULTS, + "Flight card selector matched 0 elements", + route=route, + ) best: Optional[FlightPrice] = None - for result in results: - text = result.inner_text() - lines = [l.strip() for l in text.split("\n") if l.strip()] - if len(lines) < 4: + for card in results: + parsed = _parse_flight_card(card, origin, dest, date, cabin) + if parsed is None: continue - price = 
_parse_price(text) - if price is None: + # Carrier filter + if oneworld_only and not _is_oneworld(parsed["carrier_text"]): continue - # Lines layout: time, –, time, carrier(s), duration, route, stops, price - carrier_text = lines[3] if len(lines) > 3 else "" - - if oneworld_only and not _is_oneworld(carrier_text): + # Stops filter + if max_stops is not None and parsed["stops"] is not None and parsed["stops"] > max_stops: continue - carrier_code = _extract_carrier_iata(carrier_text) + fp = FlightPrice( + origin=origin.upper(), + dest=dest.upper(), + carrier=parsed["carrier_code"], + price_usd=parsed["price"], + cabin=cabin, + date=date, + source="playwright", + stops=parsed["stops"], + ) - if best is None or price < best.price_usd: - best = FlightPrice( - origin=origin.upper(), - dest=dest.upper(), - carrier=carrier_code, - price_usd=price, - cabin=cabin, - date=date, - source="playwright", - ) + if best is None or fp.price_usd < best.price_usd: + best = fp if best: logger.info( @@ -244,15 +462,68 @@ def search_playwright_sync( origin, dest, best.carrier, best.price_usd, ) else: - logger.info("No %sflights for %s-%s on %s", - "oneworld " if oneworld_only else "", origin, dest, date) + logger.info( + "No %sflights for %s-%s on %s", + "oneworld " if oneworld_only else "", origin, dest, date, + ) return best finally: page.close() + context.close() browser.close() + except ScrapeError: + raise # Let the retry wrapper handle it except Exception as exc: - logger.warning("Playwright search failed for %s-%s: %s", origin, dest, exc) + raise ScrapeError( + ScrapeFailureReason.BROWSER_ERROR, + str(exc)[:200], + route=route, + ) from exc + + +def search_playwright_sync( + origin: str, + dest: str, + date: Date, + cabin: str = "business", + oneworld_only: bool = True, + max_stops: Optional[int] = None, +) -> Optional[FlightPrice]: + """Scrape Google Flights via Playwright (sync) with retry. + + Returns the cheapest flight found, optionally filtered to oneworld carriers. 
+ Returns None if Playwright is not installed. + """ + try: + from playwright.sync_api import sync_playwright # noqa: F401 + except ImportError: + logger.info("Playwright not installed") return None + + _rate_limit() + + route = f"{origin}-{dest}" + last_error: Optional[ScrapeError] = None + + for attempt in range(1, _MAX_ATTEMPTS + 1): + try: + return _search_playwright_impl(origin, dest, date, cabin, oneworld_only, max_stops) + except ScrapeError as e: + if e.reason == ScrapeFailureReason.CONSENT_BLOCKED: + logger.warning("Consent blocked for %s, not retrying: %s", route, e) + return None + last_error = e + if attempt < _MAX_ATTEMPTS: + logger.warning( + "Attempt %d/%d failed for %s: %s. Retrying in %.0fs", + attempt, _MAX_ATTEMPTS, route, e, _RETRY_BACKOFF_S, + ) + time.sleep(_RETRY_BACKOFF_S) + + # Exhausted retries — log and return None for backward compatibility + logger.warning("Playwright search failed for %s after %d attempts: %s", + route, _MAX_ATTEMPTS, last_error) + return None diff --git a/rtw/scraper/serpapi_flights.py b/rtw/scraper/serpapi_flights.py new file mode 100644 index 0000000..655fa27 --- /dev/null +++ b/rtw/scraper/serpapi_flights.py @@ -0,0 +1,218 @@ +"""Google Flights search via SerpAPI (structured JSON API). + +Provides reliable flight pricing without browser automation or scraping risk. +Requires SERPAPI_API_KEY environment variable. Degrades gracefully to None when +the key is not set or the API is unavailable. 
+""" + +from __future__ import annotations + +import logging +import os +from datetime import date as Date +from typing import Optional + +import requests + +from rtw.scraper.google_flights import FlightPrice, _CARRIER_IATA, _rate_limit + +logger = logging.getLogger(__name__) + +# --------------------------------------------------------------------------- +# Constants +# --------------------------------------------------------------------------- + +_SERPAPI_BASE_URL = "https://serpapi.com/search" +_SERPAPI_ENGINE = "google_flights" +_SERPAPI_TIMEOUT_S = 15 + +_CABIN_MAP = { + "economy": 1, + "premium_economy": 2, + "business": 3, + "first": 4, +} + +# SerpAPI stops param: 0=any, 1=nonstop, 2=1-stop-or-fewer, 3=2-stops-or-fewer +_STOPS_MAP = { + 0: 1, # nonstop only + 1: 2, # 1 stop or fewer + 2: 3, # 2 stops or fewer +} + +# --------------------------------------------------------------------------- +# Exceptions +# --------------------------------------------------------------------------- + + +class SerpAPIError(Exception): + """Base exception for SerpAPI errors.""" + + +class SerpAPIAuthError(SerpAPIError): + """HTTP 401 — invalid or missing API key.""" + + +class SerpAPIQuotaError(SerpAPIError): + """HTTP 429 — monthly search quota exceeded.""" + + +# --------------------------------------------------------------------------- +# Public API +# --------------------------------------------------------------------------- + + +def serpapi_available() -> bool: + """Check if SERPAPI_API_KEY is set and non-empty.""" + return bool(os.environ.get("SERPAPI_API_KEY", "").strip()) + + +def search_serpapi( + origin: str, + dest: str, + date: Date, + cabin: str = "business", + max_stops: Optional[int] = None, + oneworld_only: bool = True, +) -> Optional[FlightPrice]: + """Search Google Flights via SerpAPI. + + Returns the cheapest flight found, or None if unavailable. + Raises SerpAPIAuthError on 401, SerpAPIQuotaError on 429. 
+ """ + api_key = os.environ.get("SERPAPI_API_KEY", "").strip() + if not api_key: + logger.debug("SERPAPI_API_KEY not set, skipping SerpAPI search") + return None + + _rate_limit() + + params: dict = { + "engine": _SERPAPI_ENGINE, + "api_key": api_key, + "departure_id": origin.upper(), + "arrival_id": dest.upper(), + "outbound_date": date.isoformat(), + "type": 2, # one-way + "travel_class": _CABIN_MAP.get(cabin.lower(), 3), + "currency": "USD", + "hl": "en", + "deep_search": "true", + } + + if oneworld_only: + params["include_airlines"] = "ONEWORLD" + + if max_stops is not None and max_stops in _STOPS_MAP: + params["stops"] = _STOPS_MAP[max_stops] + + try: + resp = requests.get(_SERPAPI_BASE_URL, params=params, timeout=_SERPAPI_TIMEOUT_S) + except requests.Timeout: + logger.warning("SerpAPI timeout for %s-%s", origin, dest) + return None + except requests.RequestException as exc: + logger.warning("SerpAPI network error for %s-%s: %s", origin, dest, exc) + return None + + if resp.status_code == 401: + raise SerpAPIAuthError("Invalid or missing SERPAPI_API_KEY") + if resp.status_code == 429: + raise SerpAPIQuotaError("Monthly SerpAPI search quota exceeded") + if resp.status_code >= 400: + logger.warning("SerpAPI HTTP %d for %s-%s", resp.status_code, origin, dest) + return None + + try: + data = resp.json() + except ValueError: + logger.warning("SerpAPI returned non-JSON for %s-%s", origin, dest) + return None + + # Check for API-level error in response body + if data.get("error"): + logger.warning("SerpAPI error for %s-%s: %s", origin, dest, data["error"]) + return None + + return _parse_serpapi_response(data, origin, dest, date, cabin) + + +# --------------------------------------------------------------------------- +# Response parsing +# --------------------------------------------------------------------------- + + +def _parse_serpapi_response( + data: dict, + origin: str, + dest: str, + date: Date, + cabin: str, +) -> Optional[FlightPrice]: + """Extract 
cheapest flight from SerpAPI response.""" + best = data.get("best_flights", []) + other = data.get("other_flights", []) + all_options = best + other + + if not all_options: + logger.info("SerpAPI: no flights for %s-%s on %s", origin, dest, date) + return None + + # Find cheapest option with a valid price + cheapest = None + for option in all_options: + price = option.get("price") + if price is None: + continue + if cheapest is None or price < cheapest.get("price", float("inf")): + cheapest = option + + if cheapest is None: + logger.info("SerpAPI: no priced flights for %s-%s on %s", origin, dest, date) + return None + + # Extract carrier from first flight leg + flights = cheapest.get("flights", []) + if not flights: + return None + + first_leg = flights[0] + airline_name = first_leg.get("airline", "") + flight_number = first_leg.get("flight_number", "") + carrier_code = _extract_carrier_iata_from_serpapi(airline_name) + + # Stops = number of layovers (0 = nonstop) + layovers = cheapest.get("layovers", []) + stops = len(layovers) + + # Duration + duration_minutes = cheapest.get("total_duration") + + logger.info( + "SerpAPI found %s-%s: %s $%.0f (%s)", + origin, dest, carrier_code, cheapest["price"], + "nonstop" if stops == 0 else f"{stops} stop{'s' if stops > 1 else ''}", + ) + + return FlightPrice( + origin=origin.upper(), + dest=dest.upper(), + carrier=carrier_code, + price_usd=float(cheapest["price"]), + cabin=cabin, + date=date, + source="serpapi", + stops=stops, + flight_number=flight_number or None, + duration_minutes=duration_minutes, + airline_name=airline_name or None, + ) + + +def _extract_carrier_iata_from_serpapi(airline_name: str) -> str: + """Map airline name from SerpAPI to IATA code.""" + text = airline_name.lower().strip() + for name, code in _CARRIER_IATA.items(): + if name in text: + return code + return airline_name[:2].upper() if len(airline_name) >= 2 else "??" 
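
The option-selection step in `_parse_serpapi_response` — merge `best_flights` with `other_flights`, then keep the lowest entry that actually carries a `price` — is small enough to sketch standalone. A minimal sketch; the dict shapes mirror only the fields the parser above reads, and the sample airlines/prices are invented for illustration:

```python
from typing import Optional


def pick_cheapest(data: dict) -> Optional[dict]:
    """Lowest-priced option across best_flights + other_flights.

    Options without a "price" key are skipped, as in the parser above.
    """
    options = data.get("best_flights", []) + data.get("other_flights", [])
    cheapest = None
    for option in options:
        price = option.get("price")
        if price is None:
            continue
        if cheapest is None or price < cheapest["price"]:
            cheapest = option
    return cheapest


# Hypothetical payload: one priced "best" option, plus one unpriced and one
# cheaper priced option under "other_flights".
sample = {
    "best_flights": [{"price": 3120, "flights": [{"airline": "Qatar Airways"}]}],
    "other_flights": [
        {"flights": [{"airline": "Finnair"}]},  # no price -> skipped
        {"price": 2890, "flights": [{"airline": "British Airways"}]},
    ],
}
print(pick_cheapest(sample)["price"])  # -> 2890
```

Pulling the selection out this way also makes the missing-price and ordering behavior unit-testable without spending SerpAPI quota.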
diff --git a/rtw/search/availability.py b/rtw/search/availability.py index 2f00197..ca78c70 100644 --- a/rtw/search/availability.py +++ b/rtw/search/availability.py @@ -8,6 +8,7 @@ from rtw.models import SegmentType from rtw.scraper.cache import ScrapeCache +from rtw.scraper.google_flights import SearchBackend from rtw.search.models import ( AvailabilityStatus, ScoredCandidate, @@ -23,9 +24,13 @@ class AvailabilityChecker: """Checks flight availability for candidate itinerary segments.""" - def __init__(self, cache: Optional[ScrapeCache] = None, cabin: str = "business"): + def __init__(self, cache: Optional[ScrapeCache] = None, cabin: str = "business", + max_stops: Optional[int] = None, + backend: SearchBackend = SearchBackend.AUTO): self._cache = cache or ScrapeCache() self._cabin = cabin + self._max_stops = max_stops + self._backend = backend def check_candidate( self, @@ -93,42 +98,92 @@ def _check_segment( price_usd=cached.get("price_usd"), carrier=cached.get("carrier"), date=seg_date, + source=cached.get("source"), + flight_number=cached.get("flight_number"), + duration_minutes=cached.get("duration_minutes"), ) - # Try scraper if seg_date is None: return SegmentAvailability(status=AvailabilityStatus.UNKNOWN) - try: - from rtw.scraper.google_flights import search_fast_flights, search_playwright_sync + result = self._search_with_cascade(from_apt, to_apt, seg_date, cabin) - result = search_fast_flights(from_apt, to_apt, seg_date, cabin) - if result is None: - result = search_playwright_sync(from_apt, to_apt, seg_date, cabin) - - if result is not None: - avail = SegmentAvailability( - status=AvailabilityStatus.AVAILABLE, - price_usd=result.price_usd, - carrier=result.carrier, - date=seg_date, - ) - # Store in cache - self._cache.set(cache_key, { - "status": avail.status.value, - "price_usd": avail.price_usd, - "carrier": avail.carrier, - }, ttl_hours=6) - return avail - else: - return SegmentAvailability( - status=AvailabilityStatus.UNKNOWN, - date=seg_date, - ) - 
- except Exception as exc: - logger.warning("Availability check failed for %s-%s: %s", from_apt, to_apt, exc) - return SegmentAvailability(status=AvailabilityStatus.UNKNOWN, date=seg_date) + if result is not None: + avail = SegmentAvailability( + status=AvailabilityStatus.AVAILABLE, + price_usd=result.price_usd, + carrier=result.carrier, + date=seg_date, + stops=result.stops, + source=result.source, + flight_number=result.flight_number, + duration_minutes=result.duration_minutes, + ) + self._cache.set(cache_key, { + "status": avail.status.value, + "price_usd": avail.price_usd, + "carrier": avail.carrier, + "source": avail.source, + "flight_number": avail.flight_number, + "duration_minutes": avail.duration_minutes, + }, ttl_hours=6) + return avail + + return SegmentAvailability(status=AvailabilityStatus.UNKNOWN, date=seg_date) + + def _search_with_cascade(self, from_apt, to_apt, seg_date, cabin): + """Search for flights using configured backend(s).""" + backend = self._backend + + if backend == SearchBackend.AUTO: + search_fns = self._auto_cascade_fns() + elif backend == SearchBackend.SERPAPI: + search_fns = [("serpapi", self._try_serpapi)] + elif backend == SearchBackend.FAST_FLIGHTS: + search_fns = [("fast-flights", self._try_fast_flights)] + elif backend == SearchBackend.PLAYWRIGHT: + search_fns = [("playwright", self._try_playwright)] + else: + search_fns = self._auto_cascade_fns() + + for name, fn in search_fns: + try: + result = fn(from_apt, to_apt, seg_date, cabin) + if result is not None: + return result + except Exception as exc: + if backend != SearchBackend.AUTO: + raise + logger.debug("Cascade %s failed for %s-%s: %s", name, from_apt, to_apt, exc) + + return None + + def _auto_cascade_fns(self): + """Build cascade function list for AUTO mode.""" + fns = [] + from rtw.scraper.serpapi_flights import serpapi_available + if serpapi_available(): + fns.append(("serpapi", self._try_serpapi)) + fns.append(("fast-flights", self._try_fast_flights)) + 
fns.append(("playwright", self._try_playwright)) + return fns + + def _try_serpapi(self, from_apt, to_apt, seg_date, cabin): + from rtw.scraper.serpapi_flights import search_serpapi + return search_serpapi( + origin=from_apt, dest=to_apt, date=seg_date, + cabin=cabin, max_stops=self._max_stops, + ) + + def _try_fast_flights(self, from_apt, to_apt, seg_date, cabin): + from rtw.scraper.google_flights import search_fast_flights + return search_fast_flights(from_apt, to_apt, seg_date, cabin) + + def _try_playwright(self, from_apt, to_apt, seg_date, cabin): + from rtw.scraper.google_flights import search_playwright_sync + return search_playwright_sync( + from_apt, to_apt, seg_date, cabin, max_stops=self._max_stops, + ) def _assign_dates( self, diff --git a/rtw/search/models.py b/rtw/search/models.py index c9e975a..b3fc23f 100644 --- a/rtw/search/models.py +++ b/rtw/search/models.py @@ -68,6 +68,12 @@ class SegmentAvailability(BaseModel): price_usd: Optional[float] = None carrier: Optional[str] = None date: Optional[datetime.date] = None + stops: Optional[int] = None + error_reason: Optional[str] = None + # New fields (populated when SerpAPI is the source) + source: Optional[str] = None + flight_number: Optional[str] = None + duration_minutes: Optional[int] = None class RouteSegment(BaseModel): diff --git a/rtw/verify/__init__.py b/rtw/verify/__init__.py new file mode 100644 index 0000000..04645f5 --- /dev/null +++ b/rtw/verify/__init__.py @@ -0,0 +1,24 @@ +"""D-class fare verification for oneworld Explorer RTW tickets. + +Uses ExpertFlyer to verify booking class availability on candidate +itinerary segments. Business class RTW tickets require D-class on +every flown segment. 
+""" + +from rtw.verify.models import ( + AlternateDateResult, + DClassResult, + DClassStatus, + SegmentVerification, + VerifyOption, + VerifyResult, +) + +__all__ = [ + "AlternateDateResult", + "DClassResult", + "DClassStatus", + "SegmentVerification", + "VerifyOption", + "VerifyResult", +] diff --git a/rtw/verify/models.py b/rtw/verify/models.py new file mode 100644 index 0000000..3c63803 --- /dev/null +++ b/rtw/verify/models.py @@ -0,0 +1,156 @@ +"""Pydantic models for D-class verification results.""" + +import datetime +from enum import Enum +from typing import Callable, Optional + +from pydantic import BaseModel, Field + + +class DClassStatus(str, Enum): + """Status of a D-class availability check.""" + + AVAILABLE = "available" + NOT_AVAILABLE = "not_available" + UNKNOWN = "unknown" + ERROR = "error" + CACHED = "cached" + + +class AlternateDateResult(BaseModel): + """D-class availability on an alternate date (±3 days).""" + + date: datetime.date + seats: int = Field(ge=0, le=9) + offset_days: int = Field(ge=-3, le=3) + + +class FlightAvailability(BaseModel): + """D-class availability for a single flight.""" + + carrier: Optional[str] = None + flight_number: Optional[str] = None + origin: Optional[str] = None + destination: Optional[str] = None + depart_time: Optional[str] = None + arrive_time: Optional[str] = None + aircraft: Optional[str] = None + seats: int = Field(default=0, ge=0, le=9) + booking_class: str = "D" + stops: int = Field(default=0, ge=0) + + +class DClassResult(BaseModel): + """Result of a D-class check for a single flight segment.""" + + status: DClassStatus + seats: int = Field(default=0, ge=0, le=9) + flight_number: Optional[str] = None + carrier: str + origin: str = Field(min_length=3, max_length=3) + destination: str = Field(min_length=3, max_length=3) + target_date: datetime.date + checked_at: datetime.datetime = Field( + default_factory=lambda: datetime.datetime.now(datetime.timezone.utc) + ) + from_cache: bool = False + error_message: 
Optional[str] = None + alternate_dates: list[AlternateDateResult] = Field(default_factory=list) + flights: list[FlightAvailability] = Field(default_factory=list) + + @property + def available(self) -> bool: + return self.status == DClassStatus.AVAILABLE and self.seats > 0 + + @property + def available_flights(self) -> list[FlightAvailability]: + """Flights with D-class seats > 0, sorted by seats desc then departure.""" + avail = [f for f in self.flights if f.seats > 0] + return sorted(avail, key=lambda f: (-f.seats, f.depart_time or "")) + + @property + def flight_count(self) -> int: + return len(self.flights) + + @property + def available_count(self) -> int: + return len(self.available_flights) + + @property + def display_code(self) -> str: + """Short display code: D9 (3 avl), D0, D?, D!""" + if self.status == DClassStatus.ERROR: + return "D!" + if self.status == DClassStatus.UNKNOWN: + return "D?" + if self.flights: + return f"D{self.seats} ({self.available_count} avl)" + return f"D{self.seats}" + + @property + def best_alternate(self) -> Optional[AlternateDateResult]: + """Best alternate date with highest seat count, or None.""" + available = [a for a in self.alternate_dates if a.seats > 0] + if not available: + return None + return max(available, key=lambda a: (a.seats, -abs(a.offset_days))) + + +class SegmentVerification(BaseModel): + """Verification result for one segment of an itinerary.""" + + index: int + segment_type: str # FLOWN, SURFACE, TRANSIT + origin: str = Field(min_length=3, max_length=3) + destination: str = Field(min_length=3, max_length=3) + carrier: Optional[str] = None + flight_number: Optional[str] = None + target_date: Optional[datetime.date] = None + dclass: Optional[DClassResult] = None + + +class VerifyOption(BaseModel): + """An itinerary option to verify D-class for.""" + + option_id: int + segments: list[SegmentVerification] = Field(default_factory=list) + + +class VerifyResult(BaseModel): + """Complete D-class verification result for 
one itinerary option.""" + + option_id: int + segments: list[SegmentVerification] = Field(default_factory=list) + + @property + def flown_segments(self) -> list[SegmentVerification]: + return [s for s in self.segments if s.segment_type == "FLOWN"] + + @property + def confirmed(self) -> int: + """Count of flown segments with D-class available.""" + return sum( + 1 + for s in self.flown_segments + if s.dclass and s.dclass.status == DClassStatus.AVAILABLE + ) + + @property + def total_flown(self) -> int: + return len(self.flown_segments) + + @property + def percentage(self) -> float: + if self.total_flown == 0: + return 0.0 + return self.confirmed / self.total_flown * 100 + + @property + def fully_bookable(self) -> bool: + if self.total_flown == 0: + return True # Vacuously true + return self.confirmed == self.total_flown + + +# Type alias for progress callbacks +ProgressCallback = Callable[[int, int, SegmentVerification], None] diff --git a/rtw/verify/session.py b/rtw/verify/session.py new file mode 100644 index 0000000..5df51a5 --- /dev/null +++ b/rtw/verify/session.py @@ -0,0 +1,149 @@ +"""ExpertFlyer session management via Playwright storage_state.""" + +import logging +import os +import time +from pathlib import Path +from typing import Optional + +logger = logging.getLogger(__name__) + +_DEFAULT_SESSION_PATH = Path.home() / ".rtw" / "expertflyer_session.json" +_SESSION_MAX_AGE_HOURS = 24 +_LOGIN_POLL_INTERVAL = 2 # seconds +_EXPERTFLYER_BASE = "https://www.expertflyer.com" + + +class SessionManager: + """Manages ExpertFlyer browser session persistence. + + Uses Playwright's storage_state to save/restore cookies and + localStorage after a manual login in a headed browser. 
+ """ + + def __init__( + self, + session_path: Optional[Path] = None, + max_age_hours: float = _SESSION_MAX_AGE_HOURS, + ) -> None: + self.session_path = session_path or _DEFAULT_SESSION_PATH + self.max_age_hours = max_age_hours + + def has_session(self) -> bool: + """Check if a valid (non-expired) session file exists.""" + if not self.session_path.exists(): + return False + age = self.session_age_hours() + if age is None: + return False + return age < self.max_age_hours + + def session_age_hours(self) -> Optional[float]: + """Return session file age in hours, or None if no file.""" + if not self.session_path.exists(): + return None + mtime = self.session_path.stat().st_mtime + age_seconds = time.time() - mtime + return age_seconds / 3600 + + def get_storage_state_path(self) -> Optional[Path]: + """Return session path if valid, None if expired or missing.""" + if self.has_session(): + return self.session_path + return None + + def clear_session(self) -> None: + """Delete the session file.""" + if self.session_path.exists(): + self.session_path.unlink() + logger.info("Session cleared: %s", self.session_path) + + def login_interactive(self, timeout_seconds: int = 120) -> bool: + """Launch a headed browser for manual ExpertFlyer login. + + Opens a Chromium window, navigates to ExpertFlyer, and waits + for the user to log in. Once login is detected (URL change from + auth.expertflyer.com back to www.expertflyer.com), the session + cookies are saved. + + Returns True if login succeeded, False on timeout. 
+ """ + try: + from playwright.sync_api import sync_playwright + except ImportError: + logger.error("Playwright not installed") + return False + + # Ensure parent directory exists + self.session_path.parent.mkdir(parents=True, exist_ok=True) + + with sync_playwright() as p: + browser = p.chromium.launch(headless=False) + context = browser.new_context( + viewport={"width": 1200, "height": 800}, + user_agent=( + "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) " + "AppleWebKit/537.36 (KHTML, like Gecko) " + "Chrome/120.0.0.0 Safari/537.36" + ), + ) + page = context.new_page() + + logger.info("Navigating to ExpertFlyer...") + page.goto(_EXPERTFLYER_BASE, timeout=30000) + time.sleep(2) + + # Wait for user to complete login + deadline = time.time() + timeout_seconds + logged_in = False + + try: + while time.time() < deadline: + time.sleep(_LOGIN_POLL_INTERVAL) + try: + url = page.url + # Login detected: URL is on www.expertflyer.com (not auth.) + # and page has session cookies + if ( + "www.expertflyer.com" in url + and "auth.expertflyer.com" not in url + and "/login" not in url + ): + # Check for authenticated cookies + # ExpertFlyer uses __txn_* tokens on www + # and auth0 cookies on auth subdomain + cookies = context.cookies() + auth_cookies = [ + c + for c in cookies + if ( + c["name"].startswith("__txn_") + or c["name"] == "auth0" + ) + and "expertflyer.com" in c.get("domain", "") + ] + if auth_cookies: + logged_in = True + break + except Exception: + # Page might be navigating + continue + except KeyboardInterrupt: + logger.info("Login cancelled by user") + + if logged_in: + # Save session + context.storage_state(path=str(self.session_path)) + # Set restrictive permissions + try: + os.chmod(self.session_path, 0o600) + except OSError: + pass + logger.info("Session saved: %s", self.session_path) + + try: + browser.close() + except Exception: + pass + + return logged_in diff --git a/rtw/verify/state.py b/rtw/verify/state.py new file mode 100644 index 0000000..c27ce72 
--- /dev/null +++ b/rtw/verify/state.py @@ -0,0 +1,78 @@ +"""Persistence for last search results, enabling `rtw verify` without re-searching.""" + +import json +import logging +import time +from pathlib import Path +from typing import TYPE_CHECKING, Optional + +from pydantic import ValidationError + +from rtw.search.models import SearchResult + +if TYPE_CHECKING: + from rtw.search.models import ScoredCandidate + +logger = logging.getLogger(__name__) + +_DEFAULT_STATE_PATH = Path.home() / ".rtw" / "last_search.json" + + +class SearchState: + """Saves and loads the most recent search result for verification.""" + + def __init__(self, state_path: Optional[Path] = None) -> None: + self.state_path = state_path or _DEFAULT_STATE_PATH + + def save(self, result: SearchResult) -> None: + """Serialize SearchResult to JSON file.""" + self.state_path.parent.mkdir(parents=True, exist_ok=True) + data = result.model_dump(mode="json") + data["_saved_at"] = time.time() + self.state_path.write_text( + json.dumps(data, indent=2, default=str), encoding="utf-8" + ) + logger.info("Search state saved: %s", self.state_path) + + def load(self) -> Optional[SearchResult]: + """Deserialize from file. Returns None if missing or corrupted.""" + if not self.state_path.exists(): + return None + try: + raw = json.loads(self.state_path.read_text(encoding="utf-8")) + # Remove our metadata key before validation + raw.pop("_saved_at", None) + return SearchResult.model_validate(raw) + except (json.JSONDecodeError, ValidationError, KeyError) as exc: + logger.warning("Failed to load search state: %s", exc) + return None + + def get_option(self, option_id: int) -> Optional["ScoredCandidate"]: + """Fetch a specific option by 1-based ID. + + Args: + option_id: 1-based index (as shown in CLI output). + + Returns: + ScoredCandidate or None if not found. 
+ """ + result = self.load() + if result is None: + return None + idx = option_id - 1 # Convert to 0-based + if 0 <= idx < len(result.options): + return result.options[idx] + return None + + def state_age_minutes(self) -> Optional[float]: + """Return age of the state file in minutes, or None.""" + if not self.state_path.exists(): + return None + mtime = self.state_path.stat().st_mtime + return (time.time() - mtime) / 60 + + @property + def option_count(self) -> int: + """Number of options in saved state, 0 if no state.""" + result = self.load() + return len(result.options) if result else 0 diff --git a/rtw/verify/verifier.py b/rtw/verify/verifier.py new file mode 100644 index 0000000..8cf092d --- /dev/null +++ b/rtw/verify/verifier.py @@ -0,0 +1,205 @@ +"""D-class verification orchestrator. + +Coordinates the scraper, cache, and progress reporting to verify +D-class availability across all flown segments of an itinerary option. +""" + +import logging +import time +from typing import Optional + +from rtw.scraper.cache import ScrapeCache +from rtw.scraper.expertflyer import ExpertFlyerScraper, SessionExpiredError +from rtw.verify.models import ( + DClassResult, + DClassStatus, + ProgressCallback, + SegmentVerification, + VerifyOption, + VerifyResult, +) + +logger = logging.getLogger(__name__) + +_CACHE_TTL_HOURS = 24 +_CACHE_KEY_PREFIX = "dclass" + + +class DClassVerifier: + """Verify D-class availability for itinerary segments. + + Checks each flown segment against ExpertFlyer, using the cache + to avoid redundant queries. Surface segments are skipped. 
+ """ + + def __init__( + self, + scraper: ExpertFlyerScraper, + cache: Optional[ScrapeCache] = None, + booking_class: str = "D", + ) -> None: + self.scraper = scraper + self.cache = cache or ScrapeCache() + self.booking_class = booking_class + self._session_expired = False + + def _cache_key(self, seg: SegmentVerification) -> str: + """Build cache key for a segment.""" + return ( + f"{_CACHE_KEY_PREFIX}_{seg.carrier}_{seg.origin}_" + f"{seg.destination}_{seg.target_date}_{self.booking_class}" + ) + + def _check_cache(self, seg: SegmentVerification) -> Optional[DClassResult]: + """Look up cached result for a segment.""" + if self.cache is None: + return None + key = self._cache_key(seg) + cached = self.cache.get(key) + if cached is None: + return None + try: + result = DClassResult.model_validate(cached) + result.from_cache = True + result.status = ( + DClassStatus.AVAILABLE + if result.seats > 0 + else DClassStatus.NOT_AVAILABLE + ) + return result + except Exception: + return None + + def _store_cache(self, seg: SegmentVerification, result: DClassResult) -> None: + """Cache a D-class result.""" + if self.cache is None: + return + key = self._cache_key(seg) + self.cache.set(key, result.model_dump(mode="json"), ttl_hours=_CACHE_TTL_HOURS) + + def verify_option( + self, + option: VerifyOption, + progress_cb: Optional[ProgressCallback] = None, + no_cache: bool = False, + ) -> VerifyResult: + """Verify D-class for all flown segments in one option. + + Surface segments are skipped (not sent to scraper). + On SessionExpiredError, remaining segments are marked UNKNOWN. + On individual segment errors, that segment is marked ERROR + and verification continues. 
+ """ + result = VerifyResult(option_id=option.option_id, segments=[]) + + for seg in option.segments: + # Copy segment for result + verified = seg.model_copy() + + if seg.segment_type == "SURFACE": + result.segments.append(verified) + if progress_cb: + progress_cb(len(result.segments), len(option.segments), verified) + continue + + if self._session_expired: + # Session died mid-batch — mark remaining as unknown + verified.dclass = DClassResult( + status=DClassStatus.UNKNOWN, + seats=0, + carrier=seg.carrier or "??", + origin=seg.origin, + destination=seg.destination, + target_date=seg.target_date, + error_message="Session expired during batch", + ) + result.segments.append(verified) + if progress_cb: + progress_cb(len(result.segments), len(option.segments), verified) + continue + + # Check cache first + if not no_cache: + cached = self._check_cache(seg) + if cached is not None: + verified.dclass = cached + result.segments.append(verified) + if progress_cb: + progress_cb( + len(result.segments), len(option.segments), verified + ) + continue + + # Call scraper + try: + start = time.time() + dclass = self.scraper.check_availability( + origin=seg.origin, + dest=seg.destination, + date=seg.target_date, + carrier=seg.carrier or "", + booking_class=self.booking_class, + ) + elapsed = time.time() - start + logger.debug( + "ExpertFlyer check %s→%s: %s (%.1fs)", + seg.origin, + seg.destination, + dclass.display_code if dclass else "None", + elapsed, + ) + + if dclass: + verified.dclass = dclass + self._store_cache(seg, dclass) + else: + verified.dclass = DClassResult( + status=DClassStatus.UNKNOWN, + seats=0, + carrier=seg.carrier or "??", + origin=seg.origin, + destination=seg.destination, + target_date=seg.target_date, + error_message="Scraper returned None (no session?)", + ) + + except SessionExpiredError as exc: + self._session_expired = True + verified.dclass = DClassResult( + status=DClassStatus.UNKNOWN, + seats=0, + carrier=seg.carrier or "??", + origin=seg.origin, 
+ destination=seg.destination, + target_date=seg.target_date, + error_message=str(exc), + ) + except Exception as exc: + verified.dclass = DClassResult( + status=DClassStatus.ERROR, + seats=0, + carrier=seg.carrier or "??", + origin=seg.origin, + destination=seg.destination, + target_date=seg.target_date, + error_message=str(exc), + ) + + result.segments.append(verified) + if progress_cb: + progress_cb(len(result.segments), len(option.segments), verified) + + return result + + def verify_batch( + self, + options: list[VerifyOption], + progress_cb: Optional[ProgressCallback] = None, + no_cache: bool = False, + ) -> list[VerifyResult]: + """Verify D-class for multiple itinerary options sequentially.""" + results = [] + for option in options: + result = self.verify_option(option, progress_cb, no_cache) + results.append(result) + return results diff --git a/scripts/validate_harness.py b/scripts/validate_harness.py new file mode 100644 index 0000000..97923e2 --- /dev/null +++ b/scripts/validate_harness.py @@ -0,0 +1,139 @@ +"""Validate Claude Code harness files for correctness.""" + +import json +import sys +from pathlib import Path + +ROOT = Path(__file__).parent.parent +PASS = "\033[32mPASS\033[0m" +FAIL = "\033[31mFAIL\033[0m" +errors = [] + + +def check(name: str, condition: bool, detail: str = ""): + if condition: + print(f" {PASS} {name}") + else: + msg = f"{name}: {detail}" if detail else name + errors.append(msg) + print(f" {FAIL} {name}" + (f" — {detail}" if detail else "")) + + +def main(): + print("Harness Validation") + print("=" * 40) + + # --- File existence --- + print("\n1. 
File Existence")
+    files = {
+        "CLAUDE.md": ROOT / "CLAUDE.md",
+        ".claude/settings.json": ROOT / ".claude" / "settings.json",
+        ".claude/rules/testing.md": ROOT / ".claude" / "rules" / "testing.md",
+        ".claude/rules/rules-engine.md": ROOT / ".claude" / "rules" / "rules-engine.md",
+        ".claude/commands/rtw-verify.md": ROOT / ".claude" / "commands" / "rtw-verify.md",
+        ".claude/commands/rtw-status.md": ROOT / ".claude" / "commands" / "rtw-status.md",
+        ".claude/commands/rtw-setup.md": ROOT / ".claude" / "commands" / "rtw-setup.md",
+        ".claude/commands/rtw-help.md": ROOT / ".claude" / "commands" / "rtw-help.md",
+    }
+    for name, path in files.items():
+        check(name, path.exists(), "file not found")
+
+    # --- CLAUDE.md checks ---
+    print("\n2. CLAUDE.md Content")
+    claude_md_path = ROOT / "CLAUDE.md"
+    # A missing file is already reported above; degrade to empty text so the
+    # remaining checks fail individually instead of crashing the script.
+    claude_md = claude_md_path.read_text() if claude_md_path.exists() else ""
+    lines = claude_md.splitlines()
+    check(f"Line count ({len(lines)} lines)", 100 <= len(lines) <= 170,
+          f"expected 100-170, got {len(lines)}")
+
+    required_sections = [
+        "## Tech Stack", "## Quick Commands", "## CLI Commands",
+        "## Module Map", "## Domain Vocabulary", "## Conventions",
+        "## Reference Files", "## Slash Commands",
+    ]
+    for section in required_sections:
+        check(f"Section: {section}", section in claude_md, "missing")
+
+    tech_items = ["Python 3.11", "Typer", "Rich", "Pydantic", "uv", "pytest", "ruff"]
+    for item in tech_items:
+        check(f"Tech stack: {item}", item in claude_md, "not mentioned")
+
+    check("Command: uv run pytest", "uv run pytest" in claude_md)
+    check("Command: ruff check", "ruff check" in claude_md)
+    check("Command: python3 -m rtw", "python3 -m rtw" in claude_md)
+    check("ralph-dev defensive line", "ralph-dev" in claude_md)
+
+    # --- settings.json checks ---
+    print("\n3. settings.json")
+    settings_path = ROOT / ".claude" / "settings.json"
+    settings = {}  # default so the checks below degrade gracefully if the file is absent or invalid
+    if settings_path.exists():
+        try:
+            settings = json.loads(settings_path.read_text())
+            check("Valid JSON", True)
+        except json.JSONDecodeError as e:
+            check("Valid JSON", False, str(e))
+
+    perms = settings.get("permissions", {})
+    allow = perms.get("allow", [])
+    deny = perms.get("deny", [])
+    check(f"Allow count ({len(allow)})", 10 <= len(allow) <= 14,
+          f"expected 10-14, got {len(allow)}")
+    check(f"Deny count ({len(deny)})", len(deny) == 3,
+          f"expected 3, got {len(deny)}")
+
+    allow_str = " ".join(allow)
+    check("Allow: pytest", "pytest" in allow_str)
+    check("Allow: ruff", "ruff" in allow_str)
+    check("Allow: git", "git" in allow_str)
+
+    deny_str = " ".join(deny)
+    check("Deny: rm -rf", "rm -rf" in deny_str)
+    check("Deny: push --force", "push --force" in deny_str)
+    check("Deny: reset --hard", "reset --hard" in deny_str)
+
+    # No bare Bash pattern
+    check("No bare Bash pattern", not any(p == "Bash" for p in allow))
+
+    # --- Command frontmatter checks ---
+    print("\n4. Command Frontmatter")
+    commands = ["rtw-init", "rtw-verify", "rtw-status", "rtw-setup", "rtw-help"]
+    for cmd in commands:
+        path = ROOT / ".claude" / "commands" / f"{cmd}.md"
+        if path.exists():
+            content = path.read_text()
+            check(f"{cmd}: has frontmatter", content.startswith("---"))
+            check(f"{cmd}: has description", "description:" in content)
+
+    # --- Path-scoped rules checks ---
+    print("\n5. Path-Scoped Rules")
+    for rule_name, expected_glob in [("testing.md", "tests/"), ("rules-engine.md", "rtw/rules/")]:
+        path = ROOT / ".claude" / "rules" / rule_name
+        if path.exists():
+            content = path.read_text()
+            check(f"{rule_name}: has paths frontmatter", "paths:" in content)
+            check(f"{rule_name}: correct glob", expected_glob in content)
+
+    # --- .gitignore checks ---
+    print("\n6. .gitignore")
+    gitignore_path = ROOT / ".gitignore"
+    gitignore = gitignore_path.read_text() if gitignore_path.exists() else ""
+    check("ralph-dev-* entry", "ralph-dev-*" in gitignore)
+    check("settings.local.json entry", "settings.local.json" in gitignore)
+    check("CLAUDE.local.md entry", "CLAUDE.local.md" in gitignore)
+    # Negative check: the team-shared CLAUDE.md must not appear as an exact ignore entry.
+    check("CLAUDE.md NOT ignored (no exact match)",
+          "\nCLAUDE.md\n" not in f"\n{gitignore}\n")
+
+    # --- Summary ---
+    print(f"\n{'=' * 40}")
+    if errors:
+        print(f"\033[31m{len(errors)} FAILED\033[0m checks:")
+        for e in errors:
+            print(f"  - {e}")
+        sys.exit(1)
+    else:
+        print("\033[32mAll checks passed!\033[0m")
+        sys.exit(0)
+
+
+if __name__ == "__main__":
+    main()
diff --git a/tests/fixtures/ef_results_lhr_hkg_d.html b/tests/fixtures/ef_results_lhr_hkg_d.html
new file mode 100644
index 0000000..d1819ac
--- /dev/null
+++ b/tests/fixtures/ef_results_lhr_hkg_d.html
@@ -0,0 +1,19 @@
+
+ExpertFlyer Results Fixture
+
+

Flight Availability Results

+
+
+ Departing LHR on 02/19/26 12:00 AM for HKG — Flying on booking class D +
+
Flight
Stops
Depart
Arrive
Aircraft
Frequency
Reliability
Available Classes
(Hover over the class code for details)
Actions
0 Connections
CX
252
0
LHR
02/19/26 11:00 AM
HKG
02/20/26 7:40 AM
77W
Daily
85% / 20m
D9
0 Connections
CX
238
0
LHR
02/19/26 4:50 PM
HKG
02/20/26 1:20 PM
351
Daily
86% / 19m
D9
0 Connections
CX
250
0
LHR
02/19/26 5:50 PM
HKG
02/20/26 2:35 PM
77W
Daily
92% / 15m
D9
0 Connections
BA
31
0
LHR
02/19/26 5:55 PM
HKG
02/20/26 2:50 PM
351
Tu, Th, St
65% / 30m
D5
0 Connections
IB
(
BA
)
3524
0
LHR
02/19/26 5:55 PM
HKG
02/20/26 2:50 PM
351
Tu, Th, St
65% / 30m
D3
0 Connections
CX
256
0
LHR
02/19/26 8:15 PM
HKG
02/20/26 4:55 PM
359
Daily
75% / 23m
D9
0 Connections
CX
254
0
LHR
02/19/26 10:05 PM
HKG
02/20/26 6:45 PM
77W
Daily
90% / 120m
D9
1 Connection
CX
(
BA
)
7114
0
LHR
02/19/26 7:10 AM
BRU
02/19/26 9:25 AM
319
Th
NA / NA
D9
CX
294
0
BRU
02/19/26 11:30 AM
HKG
02/20/26 6:00 AM
359
M, Tu, Th, St
NA / NA
D9
1 Connection
LH
(
VL
)
4205
0
LHR
02/19/26 7:40 AM
MUC
02/19/26 10:30 AM
32N
Su, Tu, W, Th, F, St
NA / NA
D9
CX
300
0
MUC
02/19/26 12:00 PM
HKG
02/20/26 6:00 AM
359
M, Tu, Th, St
NA / NA
D9
+ diff --git a/tests/test_cli_verify.py b/tests/test_cli_verify.py new file mode 100644 index 0000000..2f26cb5 --- /dev/null +++ b/tests/test_cli_verify.py @@ -0,0 +1,253 @@ +"""CLI tests for login, verify, and --verify-dclass commands.""" + +import datetime +import json +from unittest.mock import MagicMock, patch + +from typer.testing import CliRunner + +from rtw.cli import app + +runner = CliRunner() + + +class TestLoginHelp: + """Test login sub-app help output.""" + + def test_login_help(self): + result = runner.invoke(app, ["login", "--help"]) + assert result.exit_code == 0 + assert "expertflyer" in result.output + + def test_login_expertflyer_help(self): + result = runner.invoke(app, ["login", "expertflyer", "--help"]) + assert result.exit_code == 0 + assert "ExpertFlyer" in result.output + assert "credential" in result.output.lower() + + def test_login_status_help(self): + result = runner.invoke(app, ["login", "status", "--help"]) + assert result.exit_code == 0 + + def test_login_clear_help(self): + result = runner.invoke(app, ["login", "clear", "--help"]) + assert result.exit_code == 0 + + +class TestLoginStatus: + """Test login status command.""" + + @patch("rtw.cli.keyring", create=True) + def test_status_no_credentials(self, mock_keyring): + mock_keyring.get_password = MagicMock(return_value=None) + + # Need to patch the import inside the function + with patch.dict("sys.modules", {"keyring": mock_keyring}): + result = runner.invoke(app, ["login", "status"]) + assert result.exit_code == 0 + assert "not configured" in result.output + + @patch("rtw.cli.keyring", create=True) + def test_status_has_credentials(self, mock_keyring): + def fake_get(service, key): + return {"username": "test@test.com", "password": "secret"}.get(key) + + mock_keyring.get_password = fake_get + + with patch.dict("sys.modules", {"keyring": mock_keyring}): + result = runner.invoke(app, ["login", "status"]) + assert result.exit_code == 0 + assert "configured" in result.output + + 
@patch("rtw.cli.keyring", create=True) + def test_status_json(self, mock_keyring): + def fake_get(service, key): + return {"username": "user@test.com", "password": "pw"}.get(key) + + mock_keyring.get_password = fake_get + + with patch.dict("sys.modules", {"keyring": mock_keyring}): + result = runner.invoke(app, ["login", "status", "--json"]) + assert result.exit_code == 0 + data = json.loads(result.output) + assert data["has_credentials"] is True + assert data["username"] == "user@test.com" + + +class TestLoginClear: + """Test login clear command.""" + + def test_clear(self): + mock_keyring = MagicMock() + with patch.dict("sys.modules", {"keyring": mock_keyring}): + result = runner.invoke(app, ["login", "clear"]) + assert result.exit_code == 0 + assert "cleared" in result.output + + +class TestVerifyHelp: + """Test verify command help.""" + + def test_verify_help(self): + result = runner.invoke(app, ["verify", "--help"]) + assert result.exit_code == 0 + assert "D-class" in result.output + assert "ExpertFlyer" in result.output + + +class TestVerifyNoState: + """Test verify command without prior search.""" + + @patch("rtw.verify.state.SearchState") + def test_verify_no_state(self, mock_state_cls): + state = MagicMock() + state.load.return_value = None + mock_state_cls.return_value = state + + result = runner.invoke(app, ["verify"]) + assert result.exit_code == 1 + assert "No saved search" in result.output or result.exit_code == 1 + + +class TestVerifyNoCreds: + """Test verify command without ExpertFlyer credentials.""" + + @patch("rtw.scraper.expertflyer._get_credentials", return_value=None) + @patch("rtw.verify.state.SearchState") + def test_verify_no_creds(self, mock_state_cls, mock_creds): + from rtw.models import CabinClass, Itinerary, Ticket, TicketType + from rtw.search.models import ( + CandidateItinerary, Direction, ScoredCandidate, SearchQuery, SearchResult, + ) + + query = SearchQuery( + cities=["SYD", "HKG", "LHR", "JFK"], + origin="SYD", + 
date_from=datetime.date(2026, 9, 1), + date_to=datetime.date(2026, 10, 15), + cabin=CabinClass.BUSINESS, + ticket_type=TicketType.DONE4, + ) + ticket = Ticket(type=TicketType.DONE4, cabin=CabinClass.BUSINESS, origin="SYD") + itin = Itinerary( + ticket=ticket, + segments=[ + {"from": "SYD", "to": "HKG", "carrier": "CX"}, + ], + ) + candidate = CandidateItinerary(itinerary=itin, direction=Direction.EASTBOUND) + scored = ScoredCandidate(candidate=candidate, rank=1) + sr = SearchResult( + query=query, candidates_generated=1, options=[scored], base_fare_usd=6299.0, + ) + + state = MagicMock() + state.load.return_value = sr + state.state_age_minutes.return_value = 5.0 + mock_state_cls.return_value = state + + result = runner.invoke(app, ["verify"]) + assert result.exit_code == 1 + assert "credential" in result.output.lower() or "login" in result.output.lower() + + +class TestScoredToVerifyOption: + """Test the conversion helper.""" + + def test_conversion(self): + from rtw.cli import _scored_to_verify_option + from rtw.models import CabinClass, Itinerary, Ticket, TicketType + from rtw.search.models import CandidateItinerary, Direction, ScoredCandidate + + ticket = Ticket(type=TicketType.DONE4, cabin=CabinClass.BUSINESS, origin="SYD") + itin = Itinerary( + ticket=ticket, + segments=[ + {"from": "SYD", "to": "HKG", "carrier": "CX", "type": "stopover"}, + {"from": "HKG", "to": "BKK", "type": "surface"}, + {"from": "BKK", "to": "LHR", "carrier": "BA", "type": "stopover"}, + ], + ) + candidate = CandidateItinerary(itinerary=itin, direction=Direction.EASTBOUND) + scored = ScoredCandidate(candidate=candidate, rank=1) + + option = _scored_to_verify_option(scored, 1) + assert option.option_id == 1 + assert len(option.segments) == 3 + assert option.segments[0].segment_type == "FLOWN" + assert option.segments[0].origin == "SYD" + assert option.segments[0].destination == "HKG" + assert option.segments[0].carrier == "CX" + assert option.segments[1].segment_type == "SURFACE" + assert 
option.segments[2].segment_type == "FLOWN" + assert option.segments[2].carrier == "BA" + + +class TestDisplayVerifyResult: + """Test _display_verify_result shows per-flight sub-rows.""" + + def _make_verify_result(self): + from rtw.verify.models import ( + DClassResult, DClassStatus, FlightAvailability, + SegmentVerification, VerifyResult, + ) + flights = [ + FlightAvailability(carrier="CX", flight_number="CX252", seats=9, + depart_time="03/10/26 11:00 AM", aircraft="77W"), + FlightAvailability(carrier="CX", flight_number="CX254", seats=6, + depart_time="03/10/26 10:05 PM", aircraft="77W"), + FlightAvailability(carrier="CX", flight_number="CX256", seats=0, + depart_time="03/10/26 8:15 PM", aircraft="359"), + ] + dclass = DClassResult( + status=DClassStatus.AVAILABLE, seats=9, carrier="CX", + origin="LHR", destination="HKG", + target_date=datetime.date(2026, 3, 10), + flights=flights, + ) + seg = SegmentVerification( + index=0, segment_type="FLOWN", origin="LHR", destination="HKG", + carrier="CX", target_date=datetime.date(2026, 3, 10), dclass=dclass, + ) + return VerifyResult(option_id=1, segments=[seg]) + + def test_display_code_in_output(self, capsys): + from rtw.cli import _display_verify_result + result = self._make_verify_result() + _display_verify_result(result) + captured = capsys.readouterr() + # Rich output goes to stderr + assert "D9" in captured.err + assert "2 avl" in captured.err + + def test_per_flight_rows_shown(self, capsys): + from rtw.cli import _display_verify_result + result = self._make_verify_result() + _display_verify_result(result) + captured = capsys.readouterr() + assert "CX252" in captured.err + assert "CX254" in captured.err + + def test_d0_count_shown(self, capsys): + from rtw.cli import _display_verify_result + result = self._make_verify_result() + _display_verify_result(result) + captured = capsys.readouterr() + assert "1 more at D0" in captured.err + + def test_tight_badge(self, capsys): + from rtw.cli import _display_verify_result + 
result = self._make_verify_result() + _display_verify_result(result) + captured = capsys.readouterr() + # 2 available flights → TIGHT badge + assert "TIGHT" in captured.err + + def test_quiet_hides_subrows(self, capsys): + from rtw.cli import _display_verify_result + result = self._make_verify_result() + _display_verify_result(result, quiet=True) + captured = capsys.readouterr() + # Should still show summary but not per-flight detail + assert "D9" in captured.err + assert "CX252" not in captured.err diff --git a/tests/test_scraper/conftest.py b/tests/test_scraper/conftest.py new file mode 100644 index 0000000..9ca0b96 --- /dev/null +++ b/tests/test_scraper/conftest.py @@ -0,0 +1,12 @@ +"""Shared fixtures for scraper tests.""" + +import pytest + + +@pytest.fixture(autouse=True) +def _no_rate_limit(monkeypatch): + """Disable rate limiting in SerpAPI tests to avoid 2s delays.""" + try: + monkeypatch.setattr("rtw.scraper.serpapi_flights._rate_limit", lambda: None) + except AttributeError: + pass # Module not imported yet, that's fine diff --git a/tests/test_scraper/test_expertflyer.py b/tests/test_scraper/test_expertflyer.py index 89be191..898a670 100644 --- a/tests/test_scraper/test_expertflyer.py +++ b/tests/test_scraper/test_expertflyer.py @@ -1,104 +1,122 @@ """Tests for ExpertFlyer scraper module.""" -from datetime import date +import datetime +from pathlib import Path from unittest.mock import patch import pytest -from rtw.scraper.expertflyer import ExpertFlyerScraper +from rtw.scraper.expertflyer import ( + ExpertFlyerScraper, + ScrapeError, + SessionExpiredError, + parse_availability_html, +) class TestExpertFlyerScraper: - """Test ExpertFlyerScraper graceful degradation.""" + """Test ExpertFlyerScraper session-based approach.""" - def test_credentials_unavailable_when_keyring_missing(self): - """credentials_available() returns False when keyring is not importable.""" + def test_scraper_init_no_session(self): scraper = ExpertFlyerScraper() - with 
patch.dict("sys.modules", {"keyring": None}): - # Force re-check by clearing cached credentials - scraper._username = None - scraper._password = None - assert scraper.credentials_available() is False - - def test_credentials_unavailable_when_not_configured(self): - """credentials_available() returns False when credentials not in keyring.""" - scraper = ExpertFlyerScraper() - with patch("rtw.scraper.expertflyer.keyring", create=True) as mock_keyring: - mock_keyring.get_password.return_value = None - scraper._username = None - scraper._password = None - - # Patch the import inside the method - import types - - mock_kr = types.ModuleType("keyring") - mock_kr.get_password = lambda service, key: None + assert scraper._session_path is None + assert scraper._query_count == 0 - with patch.dict("sys.modules", {"keyring": mock_kr}): - scraper._username = None - scraper._password = None - assert scraper.credentials_available() is False + def test_scraper_init_with_session(self, tmp_path): + path = str(tmp_path / "session.json") + scraper = ExpertFlyerScraper(session_path=path) + assert scraper._session_path == path - @pytest.mark.asyncio - async def test_check_availability_no_credentials(self): - """check_availability returns None when no credentials available.""" + @patch("rtw.scraper.expertflyer._get_credentials", return_value=None) + def test_check_availability_no_credentials(self, mock_creds): + """Returns None when no credentials configured.""" scraper = ExpertFlyerScraper() + result = scraper.check_availability( + origin="LHR", + dest="HKG", + date=datetime.date(2026, 3, 10), + carrier="CX", + ) + assert result is None - # Ensure no credentials - import types - - mock_kr = types.ModuleType("keyring") - mock_kr.get_password = lambda service, key: None - - with patch.dict("sys.modules", {"keyring": mock_kr}): - scraper._username = None - scraper._password = None - result = await scraper.check_availability( - origin="LHR", - dest="NRT", - date=date(2025, 6, 15), - 
carrier="JL", - booking_class="D", - ) - assert result is None - - @pytest.mark.asyncio - async def test_check_availability_no_playwright(self): - """check_availability returns None when Playwright not available.""" + def test_build_results_url(self): scraper = ExpertFlyerScraper() - # Set fake credentials - scraper._username = "testuser" - scraper._password = "testpass" - - with patch("rtw.scraper.BrowserManager") as mock_bm: - mock_bm.available.return_value = False - result = await scraper.check_availability( - origin="LHR", - dest="NRT", - date=date(2025, 6, 15), - carrier="JL", - ) - assert result is None - - def test_scraper_init(self): - """ExpertFlyerScraper initializes with no credentials.""" + url = scraper._build_results_url( + origin="LHR", + dest="HKG", + date=datetime.date(2026, 3, 10), + booking_class="D", + carrier="CX", + ) + assert "origin=LHR" in url + assert "destination=HKG" in url + assert "classFilter=D" in url + assert "airLineCodes=CX" in url + assert "resultsDisplay=single" in url + assert "/air/availability/results" in url + + def test_build_results_url_no_carrier(self): scraper = ExpertFlyerScraper() - assert scraper._username is None - assert scraper._password is None + url = scraper._build_results_url( + origin="SYD", + dest="LAX", + date=datetime.date(2026, 5, 1), + ) + assert "airLineCodes=" in url + assert "origin=SYD" in url + + def test_session_expired_error(self): + err = SessionExpiredError() + assert err.error_type == "SESSION_EXPIRED" + assert isinstance(err, ScrapeError) + + def test_scrape_error(self): + err = ScrapeError("timeout", error_type="TIMEOUT") + assert err.error_type == "TIMEOUT" + assert "timeout" in str(err) + + +class TestParseAvailabilityHtml: + """Test the standalone HTML parser.""" + + @pytest.fixture + def fixture_html(self): + path = Path(__file__).parent.parent / "fixtures" / "ef_results_lhr_hkg_d.html" + if not path.exists(): + pytest.skip("ExpertFlyer fixture not found") + return 
path.read_text(encoding="utf-8") + + def test_parse_real_fixture(self, fixture_html): + results = parse_availability_html(fixture_html, "D") + assert len(results) >= 7 + # All results should have carrier + carriers = [r["carrier"] for r in results if r["carrier"]] + assert "CX" in carriers + + def test_d_class_seats(self, fixture_html): + results = parse_availability_html(fixture_html, "D") + seats = [r["seats"] for r in results if r["seats"] is not None] + assert 9 in seats # CX flights had D9 + assert 5 in seats # BA 31 had D5 + + def test_empty_html(self): + results = parse_availability_html("", "D") + assert results == [] @pytest.mark.integration - @pytest.mark.asyncio - async def test_real_availability_check(self): - """Integration test: check real ExpertFlyer availability.""" - scraper = ExpertFlyerScraper() - if not scraper.credentials_available(): - pytest.skip("ExpertFlyer credentials not configured") - - result = await scraper.check_availability( + def test_real_availability_check(self): + """Integration test: requires valid ExpertFlyer session.""" + session_path = Path.home() / ".rtw" / "expertflyer_session.json" + if not session_path.exists(): + pytest.skip("ExpertFlyer session not configured") + + scraper = ExpertFlyerScraper(session_path=str(session_path)) + result = scraper.check_availability( origin="LHR", - dest="NRT", - date=date(2025, 9, 1), - carrier="JL", + dest="HKG", + date=datetime.date(2026, 3, 15), + carrier="CX", ) - # Result may be None (stub) but should not raise - assert result is None or isinstance(result, dict) + assert result is not None + assert result.origin == "LHR" + assert result.destination == "HKG" diff --git a/tests/test_scraper/test_google_flights.py b/tests/test_scraper/test_google_flights.py index e8d67dc..6d49477 100644 --- a/tests/test_scraper/test_google_flights.py +++ b/tests/test_scraper/test_google_flights.py @@ -1,24 +1,42 @@ """Tests for Google Flights scraper module.""" from datetime import date -from 
unittest.mock import patch +from unittest.mock import MagicMock, patch import pytest -from rtw.scraper.google_flights import FlightPrice, search_fast_flights, search +from rtw.scraper.google_flights import ( + FlightPrice, + ScrapeError, + ScrapeFailureReason, + _CONSENT_SELECTORS, + _MAX_ATTEMPTS, + _RETRY_BACKOFF_S, + _SELECTORS, + _dismiss_consent, + _expand_all_results, + _extract_carrier_iata, + _is_oneworld, + _parse_flight_card, + _parse_price, + _parse_stops, + search_fast_flights, + search_playwright_sync, +) + + +# --------------------------------------------------------------------------- +# Existing tests (fixed import, no TestSearch) +# --------------------------------------------------------------------------- class TestFlightPrice: """Test the FlightPrice dataclass.""" def test_create_flight_price(self): - """FlightPrice can be created with required fields.""" fp = FlightPrice( - origin="LHR", - dest="NRT", - carrier="JL", - price_usd=2500.00, - cabin="business", + origin="LHR", dest="NRT", carrier="JL", + price_usd=2500.00, cabin="business", ) assert fp.origin == "LHR" assert fp.dest == "NRT" @@ -27,39 +45,52 @@ def test_create_flight_price(self): assert fp.cabin == "business" assert fp.source == "google_flights" assert fp.date is None + assert fp.stops is None def test_create_with_all_fields(self): - """FlightPrice with all optional fields.""" fp = FlightPrice( - origin="LAX", - dest="SYD", - carrier="QF", - price_usd=4000.00, - cabin="first", - date=date(2025, 6, 15), - source="fast_flights", + origin="LAX", dest="SYD", carrier="QF", + price_usd=4000.00, cabin="first", + date=date(2025, 6, 15), source="fast_flights", ) assert fp.date == date(2025, 6, 15) assert fp.source == "fast_flights" + def test_flight_price_with_stops(self): + fp = FlightPrice( + origin="DOH", dest="SYD", carrier="QR", + price_usd=5026.0, cabin="business", stops=0, + ) + assert fp.stops == 0 + + def test_flight_price_stops_default_none(self): + fp = FlightPrice( + 
origin="DOH", dest="SYD", carrier="QR", + price_usd=5026.0, cabin="business", + ) + assert fp.stops is None + + def test_flight_price_stops_backward_compat(self): + """Existing positional keyword creation still works.""" + fp = FlightPrice( + origin="LHR", dest="NRT", carrier="JL", + price_usd=2500.0, cabin="business", + date=date(2025, 6, 15), source="playwright", + ) + assert fp.stops is None + assert fp.source == "playwright" + class TestSearchFastFlights: """Test search_fast_flights graceful degradation.""" def test_returns_none_when_library_unavailable(self): - """Returns None when fast-flights library is not importable.""" with patch.dict("sys.modules", {"fast_flights": None}): - # When the import inside search_fast_flights tries 'from fast_flights import ...', - # it will get ImportError because we set the module to None result = search_fast_flights("LHR", "NRT", date(2025, 6, 15), "business") - # The function catches ImportError and returns None assert result is None def test_returns_none_on_exception(self): - """Returns None on any exception during search.""" - # Mock fast_flights to raise an exception import types - mock_module = types.ModuleType("fast_flights") mock_module.FlightData = type("FlightData", (), {"__init__": lambda self, **kw: None}) mock_module.Passengers = type("Passengers", (), {"__init__": lambda self, **kw: None}) @@ -71,50 +102,384 @@ def test_returns_none_on_exception(self): assert result is None -class TestSearch: - """Test combined search function.""" +class TestRateLimiting: + """Test rate limiting infrastructure.""" + + def test_rate_limit_constant_defined(self): + from rtw.scraper.google_flights import _RATE_LIMIT_SECONDS + assert _RATE_LIMIT_SECONDS == 2.0 + + def test_rate_limit_function_exists(self): + from rtw.scraper.google_flights import _rate_limit + assert callable(_rate_limit) + + +# --------------------------------------------------------------------------- +# New tests for scraper robustness +# 
--------------------------------------------------------------------------- + + +class TestParsePrice: + """Test _parse_price pure function.""" + + def test_parse_price_simple(self): + assert _parse_price("$1,234") == 1234.0 + + def test_parse_price_no_comma(self): + assert _parse_price("$500") == 500.0 + + def test_parse_price_large(self): + assert _parse_price("$12,345") == 12345.0 + + def test_parse_price_in_text(self): + assert _parse_price("From $5,026 round trip") == 5026.0 + + def test_parse_price_no_dollar(self): + assert _parse_price("5026 USD") is None + + def test_parse_price_empty(self): + assert _parse_price("") is None + + +class TestParseStops: + """Test _parse_stops with mocked card locators.""" + + def _make_card(self, stops_text, selector_works=True): + card = MagicMock() + if selector_works: + stops_el = MagicMock() + stops_el.inner_text.return_value = stops_text + card.locator.return_value.first = stops_el + else: + card.locator.return_value.first.inner_text.side_effect = Exception("no element") + card.inner_text.return_value = f"Airline\n10:00\n$5,000\n{stops_text}" + return card + + def test_parse_stops_nonstop(self): + assert _parse_stops(self._make_card("Nonstop")) == 0 + + def test_parse_stops_one_stop(self): + assert _parse_stops(self._make_card("1 stop")) == 1 + + def test_parse_stops_two_stops(self): + assert _parse_stops(self._make_card("2 stops")) == 2 + + def test_parse_stops_three_stops(self): + assert _parse_stops(self._make_card("3 stops")) == 3 + + def test_parse_stops_nonstop_case_insensitive(self): + assert _parse_stops(self._make_card("NONSTOP")) == 0 + assert _parse_stops(self._make_card("NonStop")) == 0 + + def test_parse_stops_unparseable(self): + assert _parse_stops(self._make_card("Some random text")) is None + + def test_parse_stops_empty_text(self): + assert _parse_stops(self._make_card("")) is None + + def test_parse_stops_selector_fails_regex_fallback(self): + card = self._make_card("1 stop", selector_works=False) + 
assert _parse_stops(card) == 1 + + +class TestParseFlightCard: + """Test _parse_flight_card with mocked card elements.""" + + def _make_card(self, lines, stops_text="Nonstop"): + card = MagicMock() + card.inner_text.return_value = "\n".join(lines) + # Mock stops locator + stops_el = MagicMock() + stops_el.inner_text.return_value = stops_text + card.locator.return_value.first = stops_el + return card + + def test_parse_card_complete(self): + lines = ["10:00 AM", "-", "4:00 PM", "Qatar Airways", "8h 00m", "DOH-NRT", "Nonstop", "$5,026"] + result = _parse_flight_card(self._make_card(lines), "DOH", "NRT", date(2026, 9, 1), "business") + assert result is not None + assert result["price"] == 5026.0 + assert result["carrier_text"] == "Qatar Airways" + assert result["carrier_code"] == "QR" + assert result["stops"] == 0 + + def test_parse_card_no_price(self): + lines = ["10:00 AM", "-", "4:00 PM", "Qatar Airways", "8h 00m", "DOH-NRT", "Nonstop"] + result = _parse_flight_card(self._make_card(lines), "DOH", "NRT", date(2026, 9, 1), "business") + assert result is None + + def test_parse_card_inner_text_exception(self): + card = MagicMock() + card.inner_text.side_effect = Exception("detached") + result = _parse_flight_card(card, "DOH", "NRT", date(2026, 9, 1), "business") + assert result is None + + def test_parse_card_short_lines(self): + lines = ["$5,026", "Something"] + result = _parse_flight_card(self._make_card(lines, ""), "DOH", "NRT", date(2026, 9, 1), "business") + # < 4 lines, returns None + assert result is None + + def test_parse_card_four_lines_with_carrier(self): + lines = ["10:00 AM", "-", "4:00 PM", "Japan Airlines", "$4,500"] + result = _parse_flight_card(self._make_card(lines), "NRT", "LHR", date(2026, 9, 1), "business") + assert result is not None + assert result["carrier_text"] == "Japan Airlines" + assert result["carrier_code"] == "JL" + + def test_parse_card_oneworld_carrier(self): + lines = ["10:00 AM", "-", "4:00 PM", "Qatar Airways", "8h", "route", 
"Nonstop", "$5,026"] + result = _parse_flight_card(self._make_card(lines), "DOH", "NRT", date(2026, 9, 1), "business") + assert result is not None + assert "Qatar Airways" in result["carrier_text"] + + +class TestDismissConsent: + """Test _dismiss_consent with mocked page.""" - @pytest.mark.asyncio - async def test_search_returns_none_gracefully(self): - """Combined search returns None when all methods fail.""" - # Patch fast-flights to fail, and BrowserManager to not be available - with ( - patch("rtw.scraper.google_flights.search_fast_flights", return_value=None), - patch("rtw.scraper.BrowserManager") as mock_bm, - ): - mock_bm.available.return_value = False - result = await search("LHR", "NRT", date(2025, 6, 15)) + def test_dismiss_consent_accept_all(self): + page = MagicMock() + btn = MagicMock() + btn.is_visible.return_value = True + page.locator.return_value.first = btn + assert _dismiss_consent(page) is True + btn.click.assert_called_once() + + def test_dismiss_consent_no_dialog(self): + page = MagicMock() + btn = MagicMock() + btn.is_visible.side_effect = Exception("not found") + page.locator.return_value.first = btn + assert _dismiss_consent(page) is False + + def test_dismiss_consent_click_exception(self): + page = MagicMock() + btn = MagicMock() + btn.is_visible.return_value = True + btn.click.side_effect = Exception("click failed") + page.locator.return_value.first = btn + # First selector's click fails, rest have is_visible fail + call_count = [0] + original_is_visible = btn.is_visible + + def side_effect(*args, **kwargs): + call_count[0] += 1 + if call_count[0] == 1: + return True # First selector visible + raise Exception("not found") # Rest not found + + btn.is_visible.side_effect = side_effect + # Click fails on first, rest not visible — returns False + assert _dismiss_consent(page) is False + + +class TestExpandAllResults: + """Test _expand_all_results with mocked page.""" + + def test_expand_clicks_show_more(self): + page = MagicMock() + btn = 
MagicMock()
+        btn.wait_for.side_effect = [None, None, Exception("not found")]
+        page.locator.return_value.first = btn
+        page.locator.return_value.all.return_value = [MagicMock()] * 15
+        count = _expand_all_results(page)
+        assert btn.click.call_count == 2
+        assert count == 15
+
+    def test_expand_no_button(self):
+        page = MagicMock()
+        btn = MagicMock()
+        btn.wait_for.side_effect = Exception("not found")
+        page.locator.return_value.first = btn
+        page.locator.return_value.all.return_value = [MagicMock()] * 8
+        count = _expand_all_results(page)
+        assert btn.click.call_count == 0
+        assert count == 8
+
+    def test_expand_max_clicks_cap(self):
+        page = MagicMock()
+        btn = MagicMock()
+        btn.wait_for.return_value = None  # Always visible
+        page.locator.return_value.first = btn
+        page.locator.return_value.all.return_value = [MagicMock()] * 50
+        count = _expand_all_results(page)
+        assert btn.click.call_count == 5  # _MAX_EXPAND_CLICKS
+
+    def test_expand_returns_card_count(self):
+        page = MagicMock()
+        btn = MagicMock()
+        btn.wait_for.side_effect = Exception("not found")
+        page.locator.return_value.first = btn
+        cards = [MagicMock() for _ in range(12)]
+        page.locator.return_value.all.return_value = cards
+        count = _expand_all_results(page)
+        assert count == 12
+
+    def test_expand_button_click_exception(self):
+        page = MagicMock()
+        btn = MagicMock()
+        btn.wait_for.return_value = None
+        btn.click.side_effect = Exception("click failed")
+        page.locator.return_value.first = btn
+        page.locator.return_value.all.return_value = [MagicMock()] * 10
+        count = _expand_all_results(page)
+        # Stops after first failed click
+        assert count == 10
+
+
+class TestScrapeError:
+    """Test ScrapeError and ScrapeFailureReason."""
+
+    def test_scrape_error_is_exception(self):
+        e = ScrapeError(ScrapeFailureReason.TIMEOUT, "timed out")
+        assert isinstance(e, Exception)
+
+    def test_scrape_error_reason(self):
+        e = ScrapeError(ScrapeFailureReason.CONSENT_BLOCKED, "blocked")
+        assert e.reason == ScrapeFailureReason.CONSENT_BLOCKED
+
+    def test_scrape_error_route(self):
+        e = ScrapeError(ScrapeFailureReason.TIMEOUT, "msg", route="LHR-NRT")
+        assert e.route == "LHR-NRT"
+
+    def test_scrape_error_str(self):
+        e = ScrapeError(ScrapeFailureReason.TIMEOUT, "timed out", route="LHR-NRT")
+        s = str(e)
+        assert "timeout" in s
+        assert "LHR-NRT" in s
+
+    def test_scrape_failure_reason_enum_values(self):
+        assert ScrapeFailureReason.TIMEOUT.value == "timeout"
+        assert ScrapeFailureReason.CONSENT_BLOCKED.value == "consent_blocked"
+        assert ScrapeFailureReason.NO_RESULTS.value == "no_results"
+        assert ScrapeFailureReason.PARSE_ERROR.value == "parse_error"
+        assert ScrapeFailureReason.BROWSER_ERROR.value == "browser_error"
+
+
+class TestRetryLogic:
+    """Test retry wrapper in search_playwright_sync."""
+
+    @patch("rtw.scraper.google_flights._search_playwright_impl")
+    @patch("rtw.scraper.google_flights.time.sleep")
+    def test_retry_succeeds_on_second_attempt(self, mock_sleep, mock_impl):
+        expected = FlightPrice(
+            origin="LHR", dest="NRT", carrier="JL",
+            price_usd=2500.0, cabin="business",
+        )
+        mock_impl.side_effect = [
+            ScrapeError(ScrapeFailureReason.TIMEOUT, "timeout", "LHR-NRT"),
+            expected,
+        ]
+        result = search_playwright_sync("LHR", "NRT", date(2026, 9, 1))
+        assert result == expected
+        assert mock_impl.call_count == 2
+
+    @patch("rtw.scraper.google_flights._search_playwright_impl")
+    @patch("rtw.scraper.google_flights.time.sleep")
+    def test_retry_exhausted_returns_none(self, mock_sleep, mock_impl):
+        mock_impl.side_effect = ScrapeError(
+            ScrapeFailureReason.TIMEOUT, "timeout", "LHR-NRT"
+        )
+        result = search_playwright_sync("LHR", "NRT", date(2026, 9, 1))
+        assert result is None
+        assert mock_impl.call_count == _MAX_ATTEMPTS
+
+    @patch("rtw.scraper.google_flights._search_playwright_impl")
+    @patch("rtw.scraper.google_flights.time.sleep")
+    def test_no_retry_on_consent_blocked(self, mock_sleep, mock_impl):
+        mock_impl.side_effect = ScrapeError(
+            ScrapeFailureReason.CONSENT_BLOCKED, "blocked", "LHR-NRT"
+        )
+        result = search_playwright_sync("LHR", "NRT", date(2026, 9, 1))
+        assert result is None
+        assert mock_impl.call_count == 1
+
+    @patch("rtw.scraper.google_flights._search_playwright_impl")
+    @patch("rtw.scraper.google_flights.time.sleep")
+    def test_retry_backoff_delay(self, mock_sleep, mock_impl):
+        mock_impl.side_effect = ScrapeError(
+            ScrapeFailureReason.TIMEOUT, "timeout", "LHR-NRT"
+        )
+        search_playwright_sync("LHR", "NRT", date(2026, 9, 1))
+        # Sleep called once between first and second attempt
+        mock_sleep.assert_called_with(_RETRY_BACKOFF_S)
+
+    @patch("rtw.scraper.google_flights._search_playwright_impl")
+    @patch("rtw.scraper.google_flights.time.sleep")
+    def test_no_retry_on_success(self, mock_sleep, mock_impl):
+        expected = FlightPrice(
+            origin="LHR", dest="NRT", carrier="JL",
+            price_usd=2500.0, cabin="business",
+        )
+        mock_impl.return_value = expected
+        result = search_playwright_sync("LHR", "NRT", date(2026, 9, 1))
+        assert result == expected
+        assert mock_impl.call_count == 1
+
+
+class TestSearchPlaywrightSync:
+    """Integration-style tests for the full playwright search flow."""
+
+    def test_playwright_not_installed(self):
+        with patch.dict("sys.modules", {"playwright": None, "playwright.sync_api": None}):
+            result = search_playwright_sync("LHR", "NRT", date(2026, 9, 1))
         assert result is None
 
-    @pytest.mark.asyncio
-    async def test_search_returns_fast_flights_result_first(self):
-        """Combined search returns fast-flights result when available."""
+    @patch("rtw.scraper.google_flights._search_playwright_impl")
+    @patch("rtw.scraper.google_flights.time.sleep")
+    def test_backward_compat_no_max_stops(self, mock_sleep, mock_impl):
+        """Call without max_stops kwarg works."""
         expected = FlightPrice(
-            origin="LHR",
-            dest="NRT",
-            carrier="JL",
-            price_usd=2500.00,
-            cabin="business",
-            source="fast_flights",
+            origin="LHR", dest="NRT", carrier="JL",
+            price_usd=2500.0, cabin="business",
         )
-        with patch("rtw.scraper.google_flights.search_fast_flights", return_value=expected):
-            result = await search("LHR", "NRT", date(2025, 6, 15))
-            assert result is not None
-            assert result.price_usd == 2500.00
-            assert result.source == "fast_flights"
+        mock_impl.return_value = expected
+        result = search_playwright_sync("LHR", "NRT", date(2026, 9, 1))
+        assert result == expected
 
 
-class TestRateLimiting:
-    """Test rate limiting infrastructure."""
+class TestConstants:
+    """Test module-level constants exist with correct values."""
 
-    def test_rate_limit_constant_defined(self):
-        """Rate limit constant is set to 2 seconds."""
-        from rtw.scraper.google_flights import _RATE_LIMIT_SECONDS
+    def test_selectors_keys(self):
+        for key in ("flight_card", "show_more", "airline", "price",
+                    "stops", "stops_alt", "departure", "arrival", "duration"):
+            assert key in _SELECTORS
 
-        assert _RATE_LIMIT_SECONDS == 2.0
+    def test_consent_selectors_count(self):
+        assert len(_CONSENT_SELECTORS) == 7
 
-    def test_rate_limit_function_exists(self):
-        """Rate limit function is importable."""
-        from rtw.scraper.google_flights import _rate_limit
+    def test_retry_constants(self):
+        assert _MAX_ATTEMPTS == 2
+        assert _RETRY_BACKOFF_S == 5.0
 
-        assert callable(_rate_limit)
+
+class TestCarrierHelpers:
+    """Test carrier extraction and oneworld check."""
+
+    def test_extract_carrier_iata_qatar(self):
+        assert _extract_carrier_iata("Qatar Airways") == "QR"
+
+    def test_extract_carrier_iata_jal(self):
+        assert _extract_carrier_iata("Japan Airlines") == "JL"
+
+    def test_extract_carrier_iata_unknown(self):
+        assert _extract_carrier_iata("Unknown Carrier") == "UN"
+
+    def test_is_oneworld_qatar(self):
+        assert _is_oneworld("Qatar Airways") is True
+
+    def test_is_oneworld_non_member(self):
+        assert _is_oneworld("Lufthansa") is False
+
+    def test_is_oneworld_ba(self):
+        assert _is_oneworld("British Airways") is True
+
+
+@pytest.mark.live
+class TestLiveSmoke:
+    """Live smoke tests — only run with `pytest -m live`."""
+
+    def test_live_smoke_placeholder(self):
+        """Placeholder — real live tests require Playwright install."""
+        pytest.skip("Live tests disabled by default")
diff --git a/tests/test_scraper/test_serpapi_flights.py b/tests/test_scraper/test_serpapi_flights.py
new file mode 100644
index 0000000..46d576a
--- /dev/null
+++ b/tests/test_scraper/test_serpapi_flights.py
@@ -0,0 +1,378 @@
+"""Tests for SerpAPI Google Flights integration."""
+
+from datetime import date
+from unittest.mock import patch, MagicMock
+
+import pytest
+import requests
+
+from rtw.scraper.serpapi_flights import (
+    SerpAPIAuthError,
+    SerpAPIError,
+    SerpAPIQuotaError,
+    _extract_carrier_iata_from_serpapi,
+    _parse_serpapi_response,
+    search_serpapi,
+    serpapi_available,
+    _CABIN_MAP,
+    _STOPS_MAP,
+)
+from rtw.scraper.google_flights import FlightPrice
+
+# ---------------------------------------------------------------------------
+# Response fixtures
+# ---------------------------------------------------------------------------
+
+RESPONSE_BASIC = {
+    "best_flights": [
+        {
+            "flights": [{"airline": "Qatar Airways", "flight_number": "QR 807"}],
+            "price": 3200,
+            "total_duration": 810,
+            "layovers": [],
+        }
+    ],
+    "other_flights": [
+        {
+            "flights": [{"airline": "Cathay Pacific", "flight_number": "CX 101"}],
+            "price": 2450,
+            "total_duration": 540,
+            "layovers": [],
+        }
+    ],
+}
+
+RESPONSE_BEST_ONLY = {
+    "best_flights": [
+        {
+            "flights": [{"airline": "British Airways", "flight_number": "BA 15"}],
+            "price": 1800,
+            "total_duration": 420,
+            "layovers": [],
+        }
+    ],
+    "other_flights": [],
+}
+
+RESPONSE_EMPTY = {"best_flights": [], "other_flights": []}
+
+RESPONSE_NO_ARRAYS = {"search_metadata": {"status": "success"}}
+
+RESPONSE_API_ERROR = {"error": "Invalid API key"}
+
+RESPONSE_NO_PRICES = {
+    "best_flights": [
+        {"flights": [{"airline": "Qatar Airways", "flight_number": "QR 807"}], "layovers": []}
+    ],
+    "other_flights": [],
+}
+
+RESPONSE_UNKNOWN_AIRLINE = {
+    "best_flights": [
+        {
+            "flights": [{"airline": "Zippy Air", "flight_number": "ZA 100"}],
+            "price": 999,
+            "total_duration": 300,
+            "layovers": [],
+        }
+    ],
+    "other_flights": [],
+}
+
+RESPONSE_TWO_STOPS = {
+    "best_flights": [
+        {
+            "flights": [
+                {"airline": "American Airlines", "flight_number": "AA 100"},
+                {"airline": "American Airlines", "flight_number": "AA 200"},
+                {"airline": "American Airlines", "flight_number": "AA 300"},
+            ],
+            "price": 1500,
+            "total_duration": 1200,
+            "layovers": [
+                {"name": "Dallas/Fort Worth", "duration": 120},
+                {"name": "Miami", "duration": 90},
+            ],
+        }
+    ],
+    "other_flights": [],
+}
+
+TEST_DATE = date(2025, 9, 15)
+
+
+# ---------------------------------------------------------------------------
+# TestSerpAPIAvailable
+# ---------------------------------------------------------------------------
+
+
+class TestSerpAPIAvailable:
+    def test_available_when_key_set(self, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "test_key_123")
+        assert serpapi_available() is True
+
+    def test_not_available_when_key_unset(self, monkeypatch):
+        monkeypatch.delenv("SERPAPI_API_KEY", raising=False)
+        assert serpapi_available() is False
+
+    def test_not_available_when_key_empty(self, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", " ")
+        assert serpapi_available() is False
+
+
+# ---------------------------------------------------------------------------
+# TestSearchSerpAPI
+# ---------------------------------------------------------------------------
+
+
+def _mock_response(status_code=200, json_data=None, raise_timeout=False, bad_json=False):
+    """Create a mock requests.Response."""
+    if raise_timeout:
+        raise requests.Timeout("Connection timed out")
+    resp = MagicMock()
+    resp.status_code = status_code
+    if bad_json:
+        resp.json.side_effect = ValueError("No JSON")
+    else:
+        resp.json.return_value = json_data or {}
+    return resp
+
+
+class TestSearchSerpAPI:
+    def test_returns_none_without_key(self, monkeypatch):
+        monkeypatch.delenv("SERPAPI_API_KEY", raising=False)
+        result = search_serpapi("SYD", "HKG", TEST_DATE)
+        assert result is None
+
+    @patch("rtw.scraper.serpapi_flights.requests.get")
+    def test_successful_search_picks_cheapest(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "test_key")
+        mock_get.return_value = _mock_response(json_data=RESPONSE_BASIC)
+        result = search_serpapi("SYD", "HKG", TEST_DATE)
+        assert result is not None
+        assert result.price_usd == 2450  # other_flights cheaper than best_flights
+
+    @patch("rtw.scraper.serpapi_flights.requests.get")
+    def test_successful_search_fields(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "test_key")
+        mock_get.return_value = _mock_response(json_data=RESPONSE_BASIC)
+        result = search_serpapi("SYD", "HKG", TEST_DATE, cabin="business")
+        assert result.origin == "SYD"
+        assert result.dest == "HKG"
+        assert result.carrier == "CX"
+        assert result.cabin == "business"
+        assert result.date == TEST_DATE
+        assert result.source == "serpapi"
+        assert result.stops == 0
+        assert result.flight_number == "CX 101"
+        assert result.duration_minutes == 540
+        assert result.airline_name == "Cathay Pacific"
+
+    @patch("rtw.scraper.serpapi_flights.requests.get")
+    def test_request_params_correct(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "my_key")
+        mock_get.return_value = _mock_response(json_data=RESPONSE_EMPTY)
+        search_serpapi("SYD", "LHR", TEST_DATE, cabin="business", oneworld_only=True)
+        args, kwargs = mock_get.call_args
+        params = kwargs.get("params", {})
+        assert params["engine"] == "google_flights"
+        assert params["departure_id"] == "SYD"
+        assert params["arrival_id"] == "LHR"
+        assert params["outbound_date"] == "2025-09-15"
+        assert params["type"] == 2
+        assert params["travel_class"] == 3
+        assert params["currency"] == "USD"
+        assert params["deep_search"] == "true"
+        assert params["include_airlines"] == "ONEWORLD"
+        assert params["api_key"] == "my_key"
+
+    @patch("rtw.scraper.serpapi_flights.requests.get")
+    def test_request_params_no_oneworld(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "test_key")
+        mock_get.return_value = _mock_response(json_data=RESPONSE_EMPTY)
+        search_serpapi("SYD", "LHR", TEST_DATE, oneworld_only=False)
+        params = mock_get.call_args[1]["params"]
+        assert "include_airlines" not in params
+
+    @patch("rtw.scraper.serpapi_flights.requests.get")
+    def test_request_params_nonstop(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "test_key")
+        mock_get.return_value = _mock_response(json_data=RESPONSE_EMPTY)
+        search_serpapi("SYD", "LHR", TEST_DATE, max_stops=0)
+        params = mock_get.call_args[1]["params"]
+        assert params["stops"] == 1  # 0 -> 1 in SerpAPI mapping
+
+    @patch("rtw.scraper.serpapi_flights.requests.get")
+    def test_auth_error_raises_on_401(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "bad_key")
+        mock_get.return_value = _mock_response(status_code=401)
+        with pytest.raises(SerpAPIAuthError):
+            search_serpapi("SYD", "LHR", TEST_DATE)
+
+    @patch("rtw.scraper.serpapi_flights.requests.get")
+    def test_quota_error_raises_on_429(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "test_key")
+        mock_get.return_value = _mock_response(status_code=429)
+        with pytest.raises(SerpAPIQuotaError):
+            search_serpapi("SYD", "LHR", TEST_DATE)
+
+    @patch("rtw.scraper.serpapi_flights.requests.get", side_effect=requests.Timeout("timeout"))
+    def test_timeout_returns_none(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "test_key")
+        result = search_serpapi("SYD", "LHR", TEST_DATE)
+        assert result is None
+
+    @patch("rtw.scraper.serpapi_flights.requests.get", side_effect=requests.ConnectionError("fail"))
+    def test_network_error_returns_none(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "test_key")
+        result = search_serpapi("SYD", "LHR", TEST_DATE)
+        assert result is None
+
+    @patch("rtw.scraper.serpapi_flights.requests.get")
+    def test_http_500_returns_none(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "test_key")
+        mock_get.return_value = _mock_response(status_code=500)
+        result = search_serpapi("SYD", "LHR", TEST_DATE)
+        assert result is None
+
+    @patch("rtw.scraper.serpapi_flights.requests.get")
+    def test_malformed_json_returns_none(self, mock_get, monkeypatch):
+        monkeypatch.setenv("SERPAPI_API_KEY", "test_key")
+        mock_get.return_value = _mock_response(bad_json=True)
+        result = search_serpapi("SYD", "LHR", TEST_DATE)
+        assert result is None
+
+
+# ---------------------------------------------------------------------------
+# TestParseSerpAPIResponse
+# ---------------------------------------------------------------------------
+
+
+class TestParseSerpAPIResponse:
+    def test_picks_cheapest_across_both_arrays(self):
+        result = _parse_serpapi_response(RESPONSE_BASIC, "SYD", "HKG", TEST_DATE, "business")
+        assert result.price_usd == 2450
+
+    def test_best_only_response(self):
+        result = _parse_serpapi_response(RESPONSE_BEST_ONLY, "SYD", "LHR", TEST_DATE, "business")
+        assert result.price_usd == 1800
+        assert result.carrier == "BA"
+
+    def test_empty_both_arrays(self):
+        result = _parse_serpapi_response(RESPONSE_EMPTY, "SYD", "LHR", TEST_DATE, "business")
+        assert result is None
+
+    def test_missing_arrays_returns_none(self):
+        result = _parse_serpapi_response(RESPONSE_NO_ARRAYS, "SYD", "LHR", TEST_DATE, "business")
+        assert result is None
+
+    def test_api_error_in_body(self):
+        result = _parse_serpapi_response(RESPONSE_API_ERROR, "SYD", "LHR", TEST_DATE, "business")
+        assert result is None
+
+    def test_no_prices_returns_none(self):
+        result = _parse_serpapi_response(RESPONSE_NO_PRICES, "SYD", "LHR", TEST_DATE, "business")
+        assert result is None
+
+    def test_extracts_flight_number(self):
+        result = _parse_serpapi_response(RESPONSE_BASIC, "SYD", "HKG", TEST_DATE, "business")
+        assert result.flight_number == "CX 101"
+
+    def test_extracts_duration(self):
+        result = _parse_serpapi_response(RESPONSE_BASIC, "SYD", "HKG", TEST_DATE, "business")
+        assert result.duration_minutes == 540
+
+    def test_extracts_stops_count(self):
+        result = _parse_serpapi_response(RESPONSE_BASIC, "SYD", "HKG", TEST_DATE, "business")
+        assert result.stops == 0
+
+    def test_two_stop_itinerary(self):
+        result = _parse_serpapi_response(RESPONSE_TWO_STOPS, "JFK", "MIA", TEST_DATE, "business")
+        assert result.stops == 2
+        assert result.duration_minutes == 1200
+
+
+# ---------------------------------------------------------------------------
+# TestExtractCarrierIATA
+# ---------------------------------------------------------------------------
+
+
+class TestExtractCarrierIATA:
+    def test_qatar_airways(self):
+        assert _extract_carrier_iata_from_serpapi("Qatar Airways") == "QR"
+
+    def test_british_airways(self):
+        assert _extract_carrier_iata_from_serpapi("British Airways") == "BA"
+
+    def test_cathay_pacific(self):
+        assert _extract_carrier_iata_from_serpapi("Cathay Pacific") == "CX"
+
+    def test_japan_airlines(self):
+        assert _extract_carrier_iata_from_serpapi("Japan Airlines") == "JL"
+
+    def test_unknown_airline_fallback(self):
+        assert _extract_carrier_iata_from_serpapi("Zippy Air") == "ZI"
+
+    def test_case_insensitive(self):
+        assert _extract_carrier_iata_from_serpapi("QATAR AIRWAYS") == "QR"
+
+
+# ---------------------------------------------------------------------------
+# TestSerpAPIExceptions
+# ---------------------------------------------------------------------------
+
+
+class TestSerpAPIExceptions:
+    def test_auth_error_is_serpapi_error(self):
+        assert issubclass(SerpAPIAuthError, SerpAPIError)
+
+    def test_quota_error_is_serpapi_error(self):
+        assert issubclass(SerpAPIQuotaError, SerpAPIError)
+
+    def test_serpapi_error_is_exception(self):
+        assert issubclass(SerpAPIError, Exception)
+
+    def test_auth_error_message(self):
+        exc = SerpAPIAuthError("bad key")
+        assert str(exc) == "bad key"
+
+    def test_quota_error_message(self):
+        exc = SerpAPIQuotaError("exceeded")
+        assert str(exc) == "exceeded"
+
+
+# ---------------------------------------------------------------------------
+# TestCabinClassMapping
+# ---------------------------------------------------------------------------
+
+
+class TestCabinClassMapping:
+    def test_economy_maps_to_1(self):
+        assert _CABIN_MAP["economy"] == 1
+
+    def test_business_maps_to_3(self):
+        assert _CABIN_MAP["business"] == 3
+
+    def test_first_maps_to_4(self):
+        assert _CABIN_MAP["first"] == 4
+
+    def test_unknown_cabin_defaults_to_3(self):
+        assert _CABIN_MAP.get("ultra_first", 3) == 3
+
+
+# ---------------------------------------------------------------------------
+# TestLiveIntegration (gated)
+# ---------------------------------------------------------------------------
+
+
+class TestLiveIntegration:
+    @pytest.mark.integration
+    @pytest.mark.slow
+    def test_live_search_known_route(self):
+        if not serpapi_available():
+            pytest.skip("SERPAPI_API_KEY not set")
+        result = search_serpapi("SYD", "HKG", date(2025, 12, 1), cabin="business")
+        # May return None if no flights, but should not crash
+        if result is not None:
+            assert result.source == "serpapi"
+            assert result.price_usd > 0
diff --git a/tests/test_search/test_availability.py b/tests/test_search/test_availability.py
index 9f5b4cb..2c96cbd 100644
--- a/tests/test_search/test_availability.py
+++ b/tests/test_search/test_availability.py
@@ -14,7 +14,7 @@
     TicketType,
 )
 from rtw.scraper.cache import ScrapeCache
-from rtw.scraper.google_flights import FlightPrice
+from rtw.scraper.google_flights import FlightPrice, SearchBackend
 from rtw.search.availability import AvailabilityChecker
 from rtw.search.models import (
     AvailabilityStatus,
@@ -183,3 +183,113 @@ def test_surface_segments_get_none_date(self, mock_search, tmp_cache):
         query = _make_query()
         dates = checker._assign_dates(cand, query)
         assert dates[1] is None
+
+
+class TestAvailabilityCascade:
+    @patch("rtw.scraper.serpapi_flights.serpapi_available", return_value=True)
+    @patch("rtw.scraper.serpapi_flights.search_serpapi")
+    def test_auto_mode_tries_serpapi_first(self, mock_serpapi, mock_avail, tmp_cache):
+        mock_serpapi.return_value = FlightPrice(
+            origin="AAA", dest="BBB", carrier="QR", price_usd=2000,
+            cabin="business", source="serpapi",
+        )
+        checker = AvailabilityChecker(cache=tmp_cache, backend=SearchBackend.AUTO)
+        result = checker._search_with_cascade("AAA", "BBB", FUTURE, "business")
+        assert result is not None
+        assert result.source == "serpapi"
+        mock_serpapi.assert_called_once()
+
+    @patch("rtw.scraper.google_flights.search_fast_flights")
+    @patch("rtw.scraper.serpapi_flights.serpapi_available", return_value=True)
+    @patch("rtw.scraper.serpapi_flights.search_serpapi", return_value=None)
+    def test_auto_mode_falls_back_on_serpapi_none(self, mock_serpapi, mock_avail, mock_ff, tmp_cache):
+        mock_ff.return_value = FlightPrice(
+            origin="AAA", dest="BBB", carrier="AA", price_usd=1500,
+            cabin="business", source="fast_flights",
+        )
+        checker = AvailabilityChecker(cache=tmp_cache, backend=SearchBackend.AUTO)
+        result = checker._search_with_cascade("AAA", "BBB", FUTURE, "business")
+        assert result is not None
+        assert result.source == "fast_flights"
+
+    @patch("rtw.scraper.google_flights.search_fast_flights")
+    @patch("rtw.scraper.serpapi_flights.serpapi_available", return_value=True)
+    @patch("rtw.scraper.serpapi_flights.search_serpapi", side_effect=Exception("API down"))
+    def test_auto_mode_falls_back_on_serpapi_exception(self, mock_serpapi, mock_avail, mock_ff, tmp_cache):
+        mock_ff.return_value = FlightPrice(
+            origin="AAA", dest="BBB", carrier="AA", price_usd=1500,
+            cabin="business", source="fast_flights",
+        )
+        checker = AvailabilityChecker(cache=tmp_cache, backend=SearchBackend.AUTO)
+        result = checker._search_with_cascade("AAA", "BBB", FUTURE, "business")
+        assert result is not None
+        assert result.source == "fast_flights"
+
+    @patch("rtw.scraper.serpapi_flights.serpapi_available", return_value=False)
+    @patch("rtw.scraper.google_flights.search_fast_flights")
+    def test_auto_mode_skips_serpapi_without_key(self, mock_ff, mock_avail, tmp_cache):
+        mock_ff.return_value = FlightPrice(
+            origin="AAA", dest="BBB", carrier="AA", price_usd=1500,
+            cabin="business", source="fast_flights",
+        )
+        checker = AvailabilityChecker(cache=tmp_cache, backend=SearchBackend.AUTO)
+        result = checker._search_with_cascade("AAA", "BBB", FUTURE, "business")
+        assert result is not None
+        assert result.source == "fast_flights"
+
+    @patch("rtw.scraper.serpapi_flights.search_serpapi", side_effect=Exception("fail"))
+    def test_explicit_serpapi_does_not_fall_back(self, mock_serpapi, tmp_cache):
+        checker = AvailabilityChecker(cache=tmp_cache, backend=SearchBackend.SERPAPI)
+        with pytest.raises(Exception, match="fail"):
+            checker._search_with_cascade("AAA", "BBB", FUTURE, "business")
+
+
+class TestCacheWithSerpAPI:
+    @patch("rtw.scraper.serpapi_flights.serpapi_available", return_value=True)
+    @patch("rtw.scraper.serpapi_flights.search_serpapi")
+    def test_serpapi_result_cached(self, mock_serpapi, mock_avail, tmp_cache):
+        mock_serpapi.return_value = FlightPrice(
+            origin="AAA", dest="BBB", carrier="QR", price_usd=2000,
+            cabin="business", source="serpapi", flight_number="QR 807",
+            duration_minutes=540,
+        )
+        checker = AvailabilityChecker(cache=tmp_cache, backend=SearchBackend.AUTO)
+        cand = _make_candidate(num_segs=1)
+        query = _make_query()
+        checker.check_candidate(cand, query)
+
+        # Second call should use cache
+        mock_serpapi.reset_mock()
+        checker.check_candidate(cand, query)
+        mock_serpapi.assert_not_called()
+
+    def test_old_cache_entry_missing_new_fields(self, tmp_cache):
+        """Old cache entries without source/flight_number/duration_minutes still work."""
+        tmp_cache.set(f"avail_AAA_BBB_{FUTURE.isoformat()}_business", {
+            "status": "available", "price_usd": 500, "carrier": "AA"
+        })
+        checker = AvailabilityChecker(cache=tmp_cache, cabin="business")
+        cand = _make_candidate(num_segs=1)
+        query = _make_query()
+        checker.check_candidate(cand, query)
+        avail = cand.candidate.route_segments[0].availability
+        assert avail.status == AvailabilityStatus.AVAILABLE
+        assert avail.source is None
+        assert avail.flight_number is None
+        assert avail.duration_minutes is None
+
+    @patch("rtw.scraper.serpapi_flights.serpapi_available", return_value=True)
+    @patch("rtw.scraper.serpapi_flights.search_serpapi")
+    def test_cached_serpapi_result_returned_for_any_backend(self, mock_serpapi, mock_avail, tmp_cache):
+        """A cached serpapi result is returned even with a different backend."""
+        tmp_cache.set(f"avail_AAA_BBB_{FUTURE.isoformat()}_business", {
+            "status": "available", "price_usd": 2000, "carrier": "QR",
+            "source": "serpapi", "flight_number": "QR 807", "duration_minutes": 540,
+        })
+        checker = AvailabilityChecker(cache=tmp_cache, cabin="business", backend=SearchBackend.FAST_FLIGHTS)
+        cand = _make_candidate(num_segs=1)
+        query = _make_query()
+        checker.check_candidate(cand, query)
+        avail = cand.candidate.route_segments[0].availability
+        assert avail.source == "serpapi"
+        mock_serpapi.assert_not_called()
diff --git a/tests/test_search/test_cli.py b/tests/test_search/test_cli.py
index 00297c5..2ff5b31 100644
--- a/tests/test_search/test_cli.py
+++ b/tests/test_search/test_cli.py
@@ -188,3 +188,101 @@ def test_bad_cabin_class(self):
         result = runner.invoke(app, [
             "search", "--cities", "LHR,NRT,JFK",
             "--origin", "SYD", "--cabin", "ultra_first",
         ])
         assert result.exit_code == 2
+
+
+class TestBackendFlag:
+    def test_invalid_backend_value(self):
+        result = runner.invoke(app, [
+            "search", "--cities", "LHR,NRT,JFK",
+            "--from", FUTURE, "--to", FUTURE_END,
+            "--origin", "SYD", "--backend", "nonexistent",
+        ])
+        assert result.exit_code == 2
+        assert "Invalid backend" in result.output
+
+    @patch("rtw.search.generator.generate_candidates")
+    @patch("rtw.search.query.parse_search_query")
+    def test_backend_auto_accepted(self, mock_parse, mock_gen):
+        from rtw.search.models import SearchQuery
+
+        mock_parse.return_value = SearchQuery(
+            cities=["LHR", "NRT", "JFK"], origin="SYD",
+            date_from=date.today() + timedelta(days=60),
+            date_to=date.today() + timedelta(days=120),
+            cabin=CabinClass.BUSINESS, ticket_type=TicketType.DONE3,
+        )
+        mock_gen.return_value = _mock_candidates(1)
+        result = runner.invoke(app, [
+            "search", "--cities", "LHR,NRT,JFK",
+            "--from", FUTURE, "--to", FUTURE_END,
+            "--origin", "SYD", "--backend", "auto",
+            "--skip-availability", "--plain",
+        ])
+        assert result.exit_code == 0
+
+    def test_backend_serpapi_no_key_error(self, monkeypatch):
+        monkeypatch.delenv("SERPAPI_API_KEY", raising=False)
+        result = runner.invoke(app, [
+            "search", "--cities", "LHR,NRT,JFK",
+            "--from", FUTURE, "--to", FUTURE_END,
+            "--origin", "SYD", "--backend", "serpapi",
+        ])
+        assert result.exit_code == 2
+
+    def test_backend_serpapi_no_key_error_message(self, monkeypatch):
+        monkeypatch.delenv("SERPAPI_API_KEY", raising=False)
+        result = runner.invoke(app, [
+            "search", "--cities", "LHR,NRT,JFK",
+            "--from", FUTURE, "--to", FUTURE_END,
+            "--origin", "SYD", "--backend", "serpapi",
+        ])
+        assert "SERPAPI_API_KEY" in result.output
+
+    @patch("rtw.search.generator.generate_candidates")
+    @patch("rtw.search.query.parse_search_query")
+    def test_backend_auto_no_key_silent(self, mock_parse, mock_gen, monkeypatch):
+        from rtw.search.models import SearchQuery
+
+        monkeypatch.delenv("SERPAPI_API_KEY", raising=False)
+        mock_parse.return_value = SearchQuery(
+            cities=["LHR", "NRT", "JFK"], origin="SYD",
+            date_from=date.today() + timedelta(days=60),
+            date_to=date.today() + timedelta(days=120),
+            cabin=CabinClass.BUSINESS, ticket_type=TicketType.DONE3,
+        )
+        mock_gen.return_value = _mock_candidates(1)
+        result = runner.invoke(app, [
+            "search", "--cities", "LHR,NRT,JFK",
+            "--from", FUTURE, "--to", FUTURE_END,
+            "--origin", "SYD", "--backend", "auto",
+            "--skip-availability", "--plain",
+        ])
+        assert result.exit_code == 0
+        assert "SERPAPI_API_KEY" not in result.output
+
+    @patch("rtw.search.generator.generate_candidates")
+    @patch("rtw.search.query.parse_search_query")
+    def test_short_flag_b(self, mock_parse, mock_gen):
+        from rtw.search.models import SearchQuery
+
+        mock_parse.return_value = SearchQuery(
+            cities=["LHR", "NRT", "JFK"], origin="SYD",
+            date_from=date.today() + timedelta(days=60),
+            date_to=date.today() + timedelta(days=120),
+            cabin=CabinClass.BUSINESS, ticket_type=TicketType.DONE3,
+        )
+        mock_gen.return_value = _mock_candidates(1)
+        result = runner.invoke(app, [
+            "search", "--cities", "LHR,NRT,JFK",
+            "--from", FUTURE, "--to", FUTURE_END,
+            "--origin", "SYD", "-b", "auto",
+            "--skip-availability", "--plain",
+        ])
+        assert result.exit_code == 0
+
+    def test_backend_flag_on_scrape_prices(self):
+        result = runner.invoke(app, [
+            "scrape", "prices", "--backend", "nonexistent", "dummy.yaml",
+        ])
+        assert result.exit_code == 2
+        assert "Invalid backend" in result.output
diff --git a/tests/test_verify/__init__.py b/tests/test_verify/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/tests/test_verify/test_integration.py b/tests/test_verify/test_integration.py
new file mode 100644
index 0000000..04b976b
--- /dev/null
+++ b/tests/test_verify/test_integration.py
@@ -0,0 +1,297 @@
+"""Integration test for the full D-class verify pipeline.
+
+Tests: SearchResult → save to state → load → convert to VerifyOption
+→ run DClassVerifier with mocked scraper → check VerifyResult.
+"""
+
+import datetime
+from io import StringIO
+from unittest.mock import MagicMock
+
+import pytest
+
+from rtw.models import CabinClass, Itinerary, Ticket, TicketType
+from rtw.scraper.expertflyer import SessionExpiredError
+from rtw.search.models import (
+    CandidateItinerary,
+    Direction,
+    ScoredCandidate,
+    SearchQuery,
+    SearchResult,
+)
+from rtw.verify.models import DClassResult, DClassStatus, SegmentVerification, VerifyOption
+from rtw.verify.state import SearchState
+from rtw.verify.verifier import DClassVerifier
+
+
+def _build_search_result():
+    """Build a realistic 4-segment SearchResult."""
+    query = SearchQuery(
+        cities=["SYD", "HKG", "LHR", "JFK"],
+        origin="SYD",
+        date_from=datetime.date(2026, 9, 1),
+        date_to=datetime.date(2026, 10, 15),
+        cabin=CabinClass.BUSINESS,
+        ticket_type=TicketType.DONE4,
+    )
+    ticket = Ticket(type=TicketType.DONE4, cabin=CabinClass.BUSINESS, origin="SYD")
+    itin = Itinerary(
+        ticket=ticket,
+        segments=[
+            {"from": "SYD", "to": "HKG", "carrier": "CX", "type": "stopover",
+             "date": "2026-09-01"},
+            {"from": "HKG", "to": "LHR", "carrier": "CX", "type": "stopover",
+             "date": "2026-09-05"},
+            {"from": "LHR", "to": "JFK", "carrier": "BA", "type": "stopover",
+             "date": "2026-09-12"},
+            {"from": "JFK", "to": "SYD", "carrier": "QF", "type": "final",
+             "date": "2026-09-20"},
+        ],
+    )
+    candidate = CandidateItinerary(itinerary=itin, direction=Direction.EASTBOUND)
+    scored = ScoredCandidate(candidate=candidate, composite_score=85.0, rank=1)
+
+    return SearchResult(
+        query=query,
+        candidates_generated=10,
+        options=[scored],
+        base_fare_usd=6299.0,
+    )
+
+
+class TestFullPipeline:
+    """End-to-end: save → load → convert → verify → result."""
+
+    def test_save_load_convert_verify(self, tmp_path):
+        """Full pipeline with all segments available."""
+        # Step 1: Save search result
+        state = SearchState(state_path=tmp_path / "state.json")
+        sr = _build_search_result()
+        state.save(sr)
+
+        # Step 2: Load it back
+        loaded = state.load()
+        assert loaded is not None
+        assert len(loaded.options) == 1
+
+        # Step 3: Convert to VerifyOption
+        from rtw.cli import _scored_to_verify_option
+
+        option = _scored_to_verify_option(loaded.options[0], 1)
+        assert option.option_id == 1
+        assert len(option.segments) == 4
+        assert all(s.segment_type == "FLOWN" for s in option.segments)
+        assert option.segments[0].origin == "SYD"
+        assert option.segments[0].carrier == "CX"
+        assert option.segments[0].target_date == datetime.date(2026, 9, 1)
+
+        # Step 4: Verify with mocked scraper
+        def _make_result(seg):
+            return DClassResult(
+                status=DClassStatus.AVAILABLE,
+                seats=9,
+                carrier=seg.carrier or "??",
+                origin=seg.origin,
+                destination=seg.destination,
+                target_date=seg.target_date or datetime.date(2026, 9, 1),
+            )
+
+        scraper = MagicMock()
+        scraper.check_availability.side_effect = [
+            _make_result(option.segments[0]),
+            _make_result(option.segments[1]),
+            _make_result(option.segments[2]),
+            _make_result(option.segments[3]),
+        ]
+        cache = MagicMock()
+        cache.get.return_value = None
+
+        verifier = DClassVerifier(scraper=scraper, cache=cache)
+        result = verifier.verify_option(option)
+
+        # Step 5: Check result
+        assert result.option_id == 1
+        assert result.total_flown == 4
+        assert result.confirmed == 4
+        assert result.fully_bookable is True
+        assert result.percentage == 100.0
+        assert scraper.check_availability.call_count == 4
+
+    def test_pipeline_with_surface_segment(self, tmp_path):
+        """Pipeline with a surface segment that gets skipped."""
+        query = SearchQuery(
+            cities=["SYD", "HKG", "BKK", "LHR"],
+            origin="SYD",
+            date_from=datetime.date(2026, 9, 1),
+            date_to=datetime.date(2026, 10, 15),
+            cabin=CabinClass.BUSINESS,
+            ticket_type=TicketType.DONE4,
+        )
+        ticket = Ticket(type=TicketType.DONE4, cabin=CabinClass.BUSINESS, origin="SYD")
+        itin = Itinerary(
+            ticket=ticket,
+            segments=[
+                {"from": "SYD", "to": "HKG", "carrier": "CX", "type": "stopover"},
+                {"from": "HKG", "to": "BKK", "type": "surface"},
+                {"from": "BKK", "to": "LHR", "carrier": "BA", "type": "stopover"},
+                {"from": "LHR", "to": "SYD", "carrier": "QF", "type": "final"},
+            ],
+        )
+        candidate = CandidateItinerary(itinerary=itin, direction=Direction.EASTBOUND)
+        scored = ScoredCandidate(candidate=candidate, rank=1)
+        sr = SearchResult(
+            query=query, candidates_generated=5, options=[scored], base_fare_usd=6299.0,
+        )
+
+        state = SearchState(state_path=tmp_path / "state.json")
+        state.save(sr)
+        loaded = state.load()
+
+        from rtw.cli import _scored_to_verify_option
+
+        option = _scored_to_verify_option(loaded.options[0], 1)
+        assert option.segments[1].segment_type == "SURFACE"
+
+        # Only 3 scraper calls (surface skipped)
+        scraper = MagicMock()
+        scraper.check_availability.side_effect = [
+            DClassResult(
+                status=DClassStatus.AVAILABLE, seats=9, carrier="CX",
+                origin="SYD", destination="HKG",
+                target_date=datetime.date(2026, 9, 1),
+            ),
+            DClassResult(
+                status=DClassStatus.NOT_AVAILABLE, seats=0, carrier="BA",
+                origin="BKK", destination="LHR",
+                target_date=datetime.date(2026, 9, 1),
+            ),
+            DClassResult(
+                status=DClassStatus.AVAILABLE, seats=5, carrier="QF",
+                origin="LHR", destination="SYD",
+                target_date=datetime.date(2026, 9, 1),
+            ),
+        ]
+        cache = MagicMock()
+        cache.get.return_value = None
+
+        verifier = DClassVerifier(scraper=scraper, cache=cache)
+        result = verifier.verify_option(option)
+
+        assert result.total_flown == 3  # Surface excluded
+        assert result.confirmed == 2  # BKK-LHR not available
+        assert result.fully_bookable is False
+        assert scraper.check_availability.call_count == 3
+
+    def test_pipeline_session_expired_midway(self, tmp_path):
+        """Session expires after first segment — remaining marked UNKNOWN."""
+        state = SearchState(state_path=tmp_path / "state.json")
+        sr = _build_search_result()
+        state.save(sr)
+        loaded = state.load()
+
+        from rtw.cli import _scored_to_verify_option
+
+        option = _scored_to_verify_option(loaded.options[0], 1)
+
+        scraper = MagicMock()
+        scraper.check_availability.side_effect = [
+            DClassResult(
+                status=DClassStatus.AVAILABLE, seats=9, carrier="CX",
+                origin="SYD", destination="HKG",
+                target_date=datetime.date(2026, 9, 1),
+            ),
+            SessionExpiredError("session expired"),
+        ]
+        cache = MagicMock()
+        cache.get.return_value = None
+
+        verifier = DClassVerifier(scraper=scraper, cache=cache)
+        result = verifier.verify_option(option)
+
+        statuses = [s.dclass.status for s in result.segments if s.dclass]
+        assert statuses[0] == DClassStatus.AVAILABLE
+        assert statuses[1] == DClassStatus.UNKNOWN
+        assert statuses[2] == DClassStatus.UNKNOWN
+        assert statuses[3] == DClassStatus.UNKNOWN
+        assert result.confirmed == 1
+        assert result.fully_bookable is False
+
+    def test_display_does_not_crash(self, tmp_path):
+        """_display_verify_result doesn't crash with various result shapes."""
+        from rtw.cli import _display_verify_result
+        from rtw.verify.models import VerifyResult
+
+        # Empty result
+        empty = VerifyResult(option_id=1, segments=[])
+        _display_verify_result(empty)  # Should not raise
+
+        # Result with all statuses
+        segments = [
+            SegmentVerification(
+                index=0, segment_type="FLOWN", origin="SYD", destination="HKG",
+                carrier="CX", target_date=datetime.date(2026, 9, 1),
+                dclass=DClassResult(
+                    status=DClassStatus.AVAILABLE, seats=9, carrier="CX",
+                    origin="SYD", destination="HKG",
+                    target_date=datetime.date(2026, 9, 1),
+                ),
+            ),
+            SegmentVerification(
+                index=1, segment_type="SURFACE", origin="HKG", destination="BKK",
+            ),
+            SegmentVerification(
+                index=2, segment_type="FLOWN", origin="BKK", destination="LHR",
+                carrier="BA", target_date=datetime.date(2026, 9, 5),
+                dclass=DClassResult(
+                    status=DClassStatus.NOT_AVAILABLE, seats=0, carrier="BA",
+                    origin="BKK", destination="LHR",
+                    target_date=datetime.date(2026, 9, 5),
+                ),
+            ),
+            SegmentVerification(
+                index=3, segment_type="FLOWN", origin="LHR", destination="SYD",
+                carrier="QF", target_date=datetime.date(2026, 9, 12),
+                dclass=DClassResult(
+                    status=DClassStatus.ERROR, seats=0, carrier="QF",
+                    origin="LHR", destination="SYD",
+                    target_date=datetime.date(2026, 9, 12),
+                    error_message="timeout",
+                ),
+            ),
+            SegmentVerification(
+                index=4, segment_type="FLOWN", origin="SYD", destination="SYD",
+                carrier="QF",
+                dclass=None,  # Not checked
+            ),
+        ]
+        mixed = VerifyResult(option_id=2, segments=segments)
+        _display_verify_result(mixed)  # Should not raise
+
+    def test_summary_does_not_crash(self):
+        """_display_verify_summary doesn't crash."""
+        from rtw.cli import _display_verify_summary
+        from rtw.verify.models import VerifyResult
+
+        results = [
+            VerifyResult(option_id=1, segments=[
+                SegmentVerification(
+                    index=0, segment_type="FLOWN", origin="SYD", destination="HKG",
+                    dclass=DClassResult(
+                        status=DClassStatus.AVAILABLE, seats=9, carrier="CX",
+                        origin="SYD", destination="HKG",
+                        target_date=datetime.date(2026, 9, 1),
+                    ),
+                ),
+            ]),
+            VerifyResult(option_id=2, segments=[
+                SegmentVerification(
+                    index=0, segment_type="FLOWN", origin="SYD", destination="LHR",
+                    dclass=DClassResult(
+                        status=DClassStatus.NOT_AVAILABLE, seats=0, carrier="BA",
+                        origin="SYD", destination="LHR",
+                        target_date=datetime.date(2026, 9, 1),
+                    ),
+                ),
+            ]),
+        ]
+        _display_verify_summary(results)  # Should not raise
diff --git a/tests/test_verify/test_models.py b/tests/test_verify/test_models.py
new file mode 100644
index 0000000..7816a83
--- /dev/null
+++ b/tests/test_verify/test_models.py
@@ -0,0 +1,367 @@
+"""Tests for D-class verification models."""
+
+import datetime
+
+import pytest
+
+from rtw.verify.models import (
+    AlternateDateResult,
+    DClassResult,
+    DClassStatus,
+    FlightAvailability,
+    SegmentVerification,
+    VerifyOption,
+    VerifyResult,
+)
+
+
+class TestDClassStatus:
+    def test_all_statuses(self):
+        assert set(DClassStatus) == {
+            DClassStatus.AVAILABLE,
+            DClassStatus.NOT_AVAILABLE,
+            DClassStatus.UNKNOWN,
+            DClassStatus.ERROR,
+            DClassStatus.CACHED,
+        }
+
+
+class TestDClassResult:
+    def test_available(self):
+        r
= DClassResult( + status=DClassStatus.AVAILABLE, + seats=5, + carrier="CX", + origin="LHR", + destination="HKG", + target_date=datetime.date(2026, 3, 10), + ) + assert r.available is True + assert r.display_code == "D5" + assert r.seats == 5 + + def test_not_available(self): + r = DClassResult( + status=DClassStatus.NOT_AVAILABLE, + seats=0, + carrier="CX", + origin="LHR", + destination="HKG", + target_date=datetime.date(2026, 3, 10), + ) + assert r.available is False + assert r.display_code == "D0" + + def test_unknown(self): + r = DClassResult( + status=DClassStatus.UNKNOWN, + seats=0, + carrier="CX", + origin="LHR", + destination="HKG", + target_date=datetime.date(2026, 3, 10), + ) + assert r.available is False + assert r.display_code == "D?" + + def test_error(self): + r = DClassResult( + status=DClassStatus.ERROR, + seats=0, + carrier="CX", + origin="LHR", + destination="HKG", + target_date=datetime.date(2026, 3, 10), + error_message="timeout", + ) + assert r.available is False + assert r.display_code == "D!" 
+ + def test_serialization_roundtrip(self): + r = DClassResult( + status=DClassStatus.AVAILABLE, + seats=9, + carrier="CX", + origin="LHR", + destination="HKG", + target_date=datetime.date(2026, 3, 10), + flight_number="CX252", + from_cache=True, + alternate_dates=[ + AlternateDateResult( + date=datetime.date(2026, 3, 11), seats=7, offset_days=1 + ), + ], + ) + data = r.model_dump(mode="json") + r2 = DClassResult.model_validate(data) + assert r2.status == DClassStatus.AVAILABLE + assert r2.seats == 9 + assert r2.carrier == "CX" + assert r2.flight_number == "CX252" + assert r2.from_cache is True + assert len(r2.alternate_dates) == 1 + assert r2.alternate_dates[0].seats == 7 + + def test_best_alternate_none(self): + r = DClassResult( + status=DClassStatus.NOT_AVAILABLE, + seats=0, + carrier="CX", + origin="LHR", + destination="HKG", + target_date=datetime.date(2026, 3, 10), + ) + assert r.best_alternate is None + + def test_best_alternate(self): + r = DClassResult( + status=DClassStatus.NOT_AVAILABLE, + seats=0, + carrier="CX", + origin="LHR", + destination="HKG", + target_date=datetime.date(2026, 3, 10), + alternate_dates=[ + AlternateDateResult( + date=datetime.date(2026, 3, 8), seats=0, offset_days=-2 + ), + AlternateDateResult( + date=datetime.date(2026, 3, 11), seats=3, offset_days=1 + ), + AlternateDateResult( + date=datetime.date(2026, 3, 12), seats=9, offset_days=2 + ), + ], + ) + best = r.best_alternate + assert best is not None + assert best.seats == 9 + assert best.offset_days == 2 + + +class TestAlternateDateResult: + def test_valid(self): + a = AlternateDateResult( + date=datetime.date(2026, 3, 11), seats=5, offset_days=1 + ) + assert a.seats == 5 + assert a.offset_days == 1 + + def test_seats_bounds(self): + with pytest.raises(Exception): + AlternateDateResult( + date=datetime.date(2026, 3, 11), seats=10, offset_days=1 + ) + + +class TestVerifyResult: + def _make_result(self, statuses): + """Helper: build VerifyResult with given D-class statuses.""" + 
segments = [] + for i, (stype, status, seats) in enumerate(statuses): + dclass = None + if status is not None: + dclass = DClassResult( + status=status, + seats=seats, + carrier="CX", + origin="LHR", + destination="HKG", + target_date=datetime.date(2026, 3, 10), + ) + segments.append( + SegmentVerification( + index=i, + segment_type=stype, + origin="LHR", + destination="HKG", + carrier="CX", + dclass=dclass, + ) + ) + return VerifyResult(option_id=1, segments=segments) + + def test_all_available(self): + r = self._make_result([ + ("FLOWN", DClassStatus.AVAILABLE, 9), + ("FLOWN", DClassStatus.AVAILABLE, 5), + ("FLOWN", DClassStatus.AVAILABLE, 3), + ]) + assert r.confirmed == 3 + assert r.total_flown == 3 + assert r.percentage == 100.0 + assert r.fully_bookable is True + + def test_partial_available(self): + r = self._make_result([ + ("FLOWN", DClassStatus.AVAILABLE, 9), + ("FLOWN", DClassStatus.NOT_AVAILABLE, 0), + ("FLOWN", DClassStatus.AVAILABLE, 3), + ]) + assert r.confirmed == 2 + assert r.total_flown == 3 + assert r.percentage == pytest.approx(66.67, abs=0.1) + assert r.fully_bookable is False + + def test_surface_segments_excluded(self): + r = self._make_result([ + ("FLOWN", DClassStatus.AVAILABLE, 9), + ("SURFACE", None, 0), + ("FLOWN", DClassStatus.AVAILABLE, 5), + ]) + assert r.confirmed == 2 + assert r.total_flown == 2 + assert r.percentage == 100.0 + assert r.fully_bookable is True + + def test_empty_vacuously_true(self): + r = VerifyResult(option_id=1, segments=[]) + assert r.confirmed == 0 + assert r.total_flown == 0 + assert r.percentage == 0.0 + assert r.fully_bookable is True + + def test_all_surface_vacuously_true(self): + r = self._make_result([ + ("SURFACE", None, 0), + ("SURFACE", None, 0), + ]) + assert r.total_flown == 0 + assert r.fully_bookable is True + + +class TestFlightAvailability: + def test_basic_creation(self): + f = FlightAvailability( + carrier="CX", + flight_number="CX252", + origin="LHR", + destination="HKG", + 
depart_time="03/10/26 11:00 AM", + arrive_time="03/11/26 7:00 AM", + aircraft="77W", + seats=9, + booking_class="D", + stops=0, + ) + assert f.carrier == "CX" + assert f.flight_number == "CX252" + assert f.seats == 9 + assert f.stops == 0 + + def test_defaults(self): + f = FlightAvailability() + assert f.carrier is None + assert f.flight_number is None + assert f.seats == 0 + assert f.booking_class == "D" + assert f.stops == 0 + + def test_seats_bounds(self): + with pytest.raises(Exception): + FlightAvailability(seats=10) + with pytest.raises(Exception): + FlightAvailability(seats=-1) + + def test_serialization_roundtrip(self): + f = FlightAvailability( + carrier="QF", flight_number="QF11", seats=6, aircraft="388", + ) + data = f.model_dump(mode="json") + f2 = FlightAvailability.model_validate(data) + assert f2.carrier == "QF" + assert f2.flight_number == "QF11" + assert f2.seats == 6 + + +class TestDClassResultWithFlights: + def _make_flights(self): + return [ + FlightAvailability(carrier="CX", flight_number="CX252", seats=9, + depart_time="03/10/26 11:00 AM"), + FlightAvailability(carrier="CX", flight_number="CX254", seats=6, + depart_time="03/10/26 10:05 PM"), + FlightAvailability(carrier="CX", flight_number="CX256", seats=0, + depart_time="03/10/26 8:15 PM"), + FlightAvailability(carrier="BA", flight_number="BA708", seats=9, + depart_time="03/10/26 6:30 AM"), + ] + + def test_flight_count(self): + r = DClassResult( + status=DClassStatus.AVAILABLE, seats=9, carrier="CX", + origin="LHR", destination="HKG", + target_date=datetime.date(2026, 3, 10), + flights=self._make_flights(), + ) + assert r.flight_count == 4 + + def test_available_count(self): + r = DClassResult( + status=DClassStatus.AVAILABLE, seats=9, carrier="CX", + origin="LHR", destination="HKG", + target_date=datetime.date(2026, 3, 10), + flights=self._make_flights(), + ) + assert r.available_count == 3 # CX252(9), CX254(6), BA708(9) + + def test_available_flights_sorted(self): + r = DClassResult( + 
status=DClassStatus.AVAILABLE, seats=9, carrier="CX", + origin="LHR", destination="HKG", + target_date=datetime.date(2026, 3, 10), + flights=self._make_flights(), + ) + avail = r.available_flights + assert len(avail) == 3 + # Sorted by seats desc, then departure time string asc + assert avail[0].seats == 9 + assert avail[1].seats == 9 + assert avail[2].seats == 6 + assert avail[2].flight_number == "CX254" # D6 last + # D0 flight (CX256) should be excluded + assert all(f.seats > 0 for f in avail) + + def test_display_code_with_flights(self): + r = DClassResult( + status=DClassStatus.AVAILABLE, seats=9, carrier="CX", + origin="LHR", destination="HKG", + target_date=datetime.date(2026, 3, 10), + flights=self._make_flights(), + ) + assert r.display_code == "D9 (3 avl)" + + def test_display_code_no_flights(self): + r = DClassResult( + status=DClassStatus.AVAILABLE, seats=9, carrier="CX", + origin="LHR", destination="HKG", + target_date=datetime.date(2026, 3, 10), + ) + assert r.display_code == "D9" + + def test_display_code_all_d0(self): + flights = [ + FlightAvailability(carrier="CX", flight_number="CX252", seats=0), + FlightAvailability(carrier="CX", flight_number="CX254", seats=0), + ] + r = DClassResult( + status=DClassStatus.NOT_AVAILABLE, seats=0, carrier="CX", + origin="LHR", destination="HKG", + target_date=datetime.date(2026, 3, 10), + flights=flights, + ) + assert r.display_code == "D0 (0 avl)" + + def test_serialization_with_flights(self): + r = DClassResult( + status=DClassStatus.AVAILABLE, seats=9, carrier="CX", + origin="LHR", destination="HKG", + target_date=datetime.date(2026, 3, 10), + flights=self._make_flights(), + ) + data = r.model_dump(mode="json") + r2 = DClassResult.model_validate(data) + assert r2.flight_count == 4 + assert r2.available_count == 3 + assert r2.flights[0].flight_number == "CX252" diff --git a/tests/test_verify/test_parser.py b/tests/test_verify/test_parser.py new file mode 100644 index 0000000..ecc3953 --- /dev/null +++ 
b/tests/test_verify/test_parser.py @@ -0,0 +1,121 @@ +"""Tests for ExpertFlyer HTML parser using real fixture data.""" + +from pathlib import Path + +import pytest + +from rtw.scraper.expertflyer import parse_availability_html + +FIXTURE_DIR = Path(__file__).parent.parent / "fixtures" + + +class TestParseAvailabilityHtml: + @pytest.fixture + def lhr_hkg_html(self): + """Real ExpertFlyer results page: LHR→HKG, D-class filtered.""" + path = FIXTURE_DIR / "ef_results_lhr_hkg_d.html" + if not path.exists(): + pytest.skip("HTML fixture not found") + return path.read_text(encoding="utf-8") + + def test_parses_flights(self, lhr_hkg_html): + results = parse_availability_html(lhr_hkg_html, "D") + assert len(results) > 0 + # We know from the capture there are 11 flight rows + assert len(results) >= 7 + + def test_d_class_values(self, lhr_hkg_html): + results = parse_availability_html(lhr_hkg_html, "D") + seats = [r["seats"] for r in results if r["seats"] is not None] + assert len(seats) > 0 + # All values should be 0-9 + for s in seats: + assert 0 <= s <= 9 + # We know D9, D5, D3 are in the results + assert 9 in seats + assert 5 in seats + + def test_carrier_extraction(self, lhr_hkg_html): + results = parse_availability_html(lhr_hkg_html, "D") + carriers = {r["carrier"] for r in results if r["carrier"]} + # CX and BA should be present + assert "CX" in carriers + + def test_airport_extraction(self, lhr_hkg_html): + results = parse_availability_html(lhr_hkg_html, "D") + # First result should have LHR origin + origins = [r["origin"] for r in results if r["origin"]] + assert "LHR" in origins + + def test_empty_html(self): + results = parse_availability_html("", "D") + assert results == [] + + def test_no_matching_class(self, lhr_hkg_html): + # Search for Z class (should find nothing) + results = parse_availability_html(lhr_hkg_html, "Z") + seats = [r["seats"] for r in results if r["seats"] is not None] + assert len(seats) == 0 + + # --- T012: Detailed accuracy tests --- + + def 
test_exact_flight_count(self, lhr_hkg_html): + """Fixture has exactly 11 flight rows.""" + results = parse_availability_html(lhr_hkg_html, "D") + assert len(results) == 11 + + def test_carrier_set(self, lhr_hkg_html): + """Known carriers in the LHR→HKG results.""" + results = parse_availability_html(lhr_hkg_html, "D") + carriers = {r["carrier"] for r in results if r["carrier"]} + assert "CX" in carriers + assert "BA" in carriers + + def test_specific_flight_numbers(self, lhr_hkg_html): + """Verify known CX flight numbers are extracted.""" + results = parse_availability_html(lhr_hkg_html, "D") + flight_numbers = {r["flight_number"] for r in results if r["flight_number"]} + assert "CX252" in flight_numbers + assert "CX238" in flight_numbers + assert "BA31" in flight_numbers + + def test_origin_airports(self, lhr_hkg_html): + """LHR should be the primary origin.""" + results = parse_availability_html(lhr_hkg_html, "D") + origins = [r["origin"] for r in results if r["origin"]] + assert origins[0] == "LHR" # First flight departs LHR + # Most flights originate from LHR + lhr_count = sum(1 for o in origins if o == "LHR") + assert lhr_count >= 4 + + def test_destination_airports(self, lhr_hkg_html): + """HKG should be the primary destination.""" + results = parse_availability_html(lhr_hkg_html, "D") + dests = [r["destination"] for r in results if r["destination"]] + hkg_count = sum(1 for d in dests if d == "HKG") + assert hkg_count >= 4 + + def test_seat_distribution(self, lhr_hkg_html): + """Known seat values: D9, D5, D3.""" + results = parse_availability_html(lhr_hkg_html, "D") + seats = [r["seats"] for r in results if r["seats"] is not None] + unique_seats = set(seats) + assert 9 in unique_seats # CX flights: D9 + assert 5 in unique_seats # BA 31: D5 + assert 3 in unique_seats # Codeshare flights: D3 + + def test_first_row_is_cx252(self, lhr_hkg_html): + """First parsed flight should be CX 252 LHR→HKG D9.""" + results = parse_availability_html(lhr_hkg_html, "D") + first 
= results[0] + assert first["carrier"] == "CX" + assert first["flight_number"] == "CX252" + assert first["origin"] == "LHR" + assert first["destination"] == "HKG" + assert first["seats"] == 9 + + def test_all_results_have_booking_class(self, lhr_hkg_html): + """Every result should have booking_class='D'.""" + results = parse_availability_html(lhr_hkg_html, "D") + for r in results: + assert r["booking_class"] == "D" diff --git a/tests/test_verify/test_session.py b/tests/test_verify/test_session.py new file mode 100644 index 0000000..0d2ee98 --- /dev/null +++ b/tests/test_verify/test_session.py @@ -0,0 +1,65 @@ +"""Tests for ExpertFlyer session management.""" + +import json +import os +import time + +import pytest + +from rtw.verify.session import SessionManager + + +class TestSessionManager: + def test_no_session(self, tmp_path): + sm = SessionManager(session_path=tmp_path / "session.json") + assert sm.has_session() is False + assert sm.session_age_hours() is None + assert sm.get_storage_state_path() is None + + def test_valid_session(self, tmp_path): + path = tmp_path / "session.json" + path.write_text(json.dumps({"cookies": []})) + sm = SessionManager(session_path=path, max_age_hours=24) + assert sm.has_session() is True + age = sm.session_age_hours() + assert age is not None + assert age < 1 # Just created + assert sm.get_storage_state_path() == path + + def test_expired_session(self, tmp_path): + path = tmp_path / "session.json" + path.write_text(json.dumps({"cookies": []})) + # Set mtime to 25 hours ago + old_time = time.time() - 25 * 3600 + os.utime(path, (old_time, old_time)) + sm = SessionManager(session_path=path, max_age_hours=24) + assert sm.has_session() is False + age = sm.session_age_hours() + assert age is not None + assert age > 24 + assert sm.get_storage_state_path() is None + + def test_clear_session(self, tmp_path): + path = tmp_path / "session.json" + path.write_text(json.dumps({"cookies": []})) + sm = SessionManager(session_path=path) + assert 
path.exists() + sm.clear_session() + assert not path.exists() + + def test_clear_nonexistent(self, tmp_path): + sm = SessionManager(session_path=tmp_path / "nope.json") + sm.clear_session() # Should not raise + + def test_custom_max_age(self, tmp_path): + path = tmp_path / "session.json" + path.write_text(json.dumps({"cookies": []})) + # Set mtime to 2 hours ago + old_time = time.time() - 2 * 3600 + os.utime(path, (old_time, old_time)) + # With 1-hour max age, should be expired + sm = SessionManager(session_path=path, max_age_hours=1) + assert sm.has_session() is False + # With 4-hour max age, should be valid + sm2 = SessionManager(session_path=path, max_age_hours=4) + assert sm2.has_session() is True diff --git a/tests/test_verify/test_state.py b/tests/test_verify/test_state.py new file mode 100644 index 0000000..2eed666 --- /dev/null +++ b/tests/test_verify/test_state.py @@ -0,0 +1,118 @@ +"""Tests for search state persistence.""" + +import datetime +import json + +import pytest + +from rtw.models import CabinClass, Itinerary, Ticket, TicketType +from rtw.search.models import SearchQuery, SearchResult, ScoredCandidate, CandidateItinerary, Direction +from rtw.verify.state import SearchState + + +def _make_search_result(n_options=3): + """Build a minimal SearchResult for testing.""" + query = SearchQuery( + cities=["SYD", "HKG", "LHR", "JFK"], + origin="SYD", + date_from=datetime.date(2026, 9, 1), + date_to=datetime.date(2026, 10, 15), + cabin=CabinClass.BUSINESS, + ticket_type=TicketType.DONE4, + ) + ticket = Ticket( + type=TicketType.DONE4, + cabin=CabinClass.BUSINESS, + origin="SYD", + ) + options = [] + for i in range(n_options): + itin = Itinerary( + ticket=ticket, + segments=[ + {"from": "SYD", "to": "HKG", "carrier": "CX"}, + {"from": "HKG", "to": "LHR", "carrier": "CX"}, + {"from": "LHR", "to": "JFK", "carrier": "BA"}, + {"from": "JFK", "to": "SYD", "carrier": "QF"}, + ], + ) + candidate = CandidateItinerary( + itinerary=itin, + 
direction=Direction.EASTBOUND, + ) + scored = ScoredCandidate( + candidate=candidate, + composite_score=90.0 - i * 10, + rank=i + 1, + ) + options.append(scored) + + return SearchResult( + query=query, + candidates_generated=10, + options=options, + base_fare_usd=6299.0, + ) + + +class TestSearchState: + def test_save_and_load(self, tmp_path): + state = SearchState(state_path=tmp_path / "state.json") + result = _make_search_result() + state.save(result) + loaded = state.load() + assert loaded is not None + assert len(loaded.options) == 3 + assert loaded.query.origin == "SYD" + assert loaded.base_fare_usd == 6299.0 + + def test_load_missing(self, tmp_path): + state = SearchState(state_path=tmp_path / "nope.json") + assert state.load() is None + + def test_load_corrupted(self, tmp_path): + path = tmp_path / "state.json" + path.write_text("not json at all") + state = SearchState(state_path=path) + assert state.load() is None + + def test_load_invalid_json(self, tmp_path): + path = tmp_path / "state.json" + path.write_text(json.dumps({"bad": "data"})) + state = SearchState(state_path=path) + assert state.load() is None + + def test_get_option(self, tmp_path): + state = SearchState(state_path=tmp_path / "state.json") + result = _make_search_result() + state.save(result) + # 1-based IDs + opt1 = state.get_option(1) + assert opt1 is not None + assert opt1.rank == 1 + opt3 = state.get_option(3) + assert opt3 is not None + assert opt3.rank == 3 + + def test_get_option_out_of_range(self, tmp_path): + state = SearchState(state_path=tmp_path / "state.json") + result = _make_search_result(n_options=2) + state.save(result) + assert state.get_option(0) is None # 0 is invalid (1-based) + assert state.get_option(3) is None # Only 2 options + assert state.get_option(-1) is None + + def test_state_age(self, tmp_path): + state = SearchState(state_path=tmp_path / "state.json") + result = _make_search_result() + state.save(result) + age = state.state_age_minutes() + assert age is not 
None + assert age < 1 # Just saved + + def test_option_count(self, tmp_path): + state = SearchState(state_path=tmp_path / "state.json") + assert state.option_count == 0 + result = _make_search_result(n_options=5) + state.save(result) + assert state.option_count == 5 diff --git a/tests/test_verify/test_verifier.py b/tests/test_verify/test_verifier.py new file mode 100644 index 0000000..2d52577 --- /dev/null +++ b/tests/test_verify/test_verifier.py @@ -0,0 +1,198 @@ +"""Tests for D-class verification orchestrator.""" + +import datetime +from unittest.mock import MagicMock, patch + +import pytest + +from rtw.scraper.expertflyer import SessionExpiredError +from rtw.verify.models import ( + DClassResult, + DClassStatus, + SegmentVerification, + VerifyOption, +) +from rtw.verify.verifier import DClassVerifier + + +def _make_segment(origin, dest, carrier="CX", stype="FLOWN", date=None): + return SegmentVerification( + index=0, + segment_type=stype, + origin=origin, + destination=dest, + carrier=carrier, + target_date=date or datetime.date(2026, 3, 10), + ) + + +def _make_dclass(status, seats, carrier="CX", origin="LHR", dest="HKG"): + return DClassResult( + status=status, + seats=seats, + carrier=carrier, + origin=origin, + destination=dest, + target_date=datetime.date(2026, 3, 10), + ) + + +class TestDClassVerifier: + def _make_verifier(self, scraper_results=None, cache_results=None): + """Create a verifier with mocked scraper and cache.""" + scraper = MagicMock() + if scraper_results is not None: + scraper.check_availability.side_effect = scraper_results + cache = MagicMock() + cache.get.return_value = cache_results + return DClassVerifier(scraper=scraper, cache=cache), scraper, cache + + def test_all_available(self): + results = [ + _make_dclass(DClassStatus.AVAILABLE, 9, origin="SYD", dest="HKG"), + _make_dclass(DClassStatus.AVAILABLE, 5, origin="HKG", dest="LHR"), + ] + verifier, scraper, cache = self._make_verifier(scraper_results=results) + + option = VerifyOption( 
+ option_id=1, + segments=[ + _make_segment("SYD", "HKG"), + _make_segment("HKG", "LHR"), + ], + ) + result = verifier.verify_option(option) + assert result.confirmed == 2 + assert result.total_flown == 2 + assert result.fully_bookable is True + assert scraper.check_availability.call_count == 2 + + def test_surface_skipped(self): + results = [ + _make_dclass(DClassStatus.AVAILABLE, 9), + ] + verifier, scraper, cache = self._make_verifier(scraper_results=results) + + option = VerifyOption( + option_id=1, + segments=[ + _make_segment("SYD", "HKG"), + _make_segment("HKG", "LHR", stype="SURFACE"), + ], + ) + result = verifier.verify_option(option) + assert result.confirmed == 1 + assert result.total_flown == 1 + assert result.fully_bookable is True + # Scraper only called once (surface skipped) + assert scraper.check_availability.call_count == 1 + + def test_partial_failure(self): + results = [ + _make_dclass(DClassStatus.AVAILABLE, 9), + _make_dclass(DClassStatus.NOT_AVAILABLE, 0), + ] + verifier, scraper, cache = self._make_verifier(scraper_results=results) + + option = VerifyOption( + option_id=1, + segments=[ + _make_segment("SYD", "HKG"), + _make_segment("HKG", "LHR"), + ], + ) + result = verifier.verify_option(option) + assert result.confirmed == 1 + assert result.total_flown == 2 + assert result.fully_bookable is False + + def test_session_expired_marks_remaining_unknown(self): + verifier, scraper, cache = self._make_verifier() + scraper.check_availability.side_effect = [ + _make_dclass(DClassStatus.AVAILABLE, 9), + SessionExpiredError("expired"), + ] + + option = VerifyOption( + option_id=1, + segments=[ + _make_segment("SYD", "HKG"), + _make_segment("HKG", "LHR"), + _make_segment("LHR", "JFK"), + ], + ) + result = verifier.verify_option(option) + statuses = [s.dclass.status for s in result.segments if s.dclass] + assert statuses[0] == DClassStatus.AVAILABLE + assert statuses[1] == DClassStatus.UNKNOWN + assert statuses[2] == DClassStatus.UNKNOWN + + def 
test_scrape_error_marks_error(self): + verifier, scraper, cache = self._make_verifier() + scraper.check_availability.side_effect = [ + _make_dclass(DClassStatus.AVAILABLE, 9), + Exception("timeout"), + _make_dclass(DClassStatus.AVAILABLE, 5), + ] + + option = VerifyOption( + option_id=1, + segments=[ + _make_segment("SYD", "HKG"), + _make_segment("HKG", "LHR"), + _make_segment("LHR", "JFK"), + ], + ) + result = verifier.verify_option(option) + statuses = [s.dclass.status for s in result.segments if s.dclass] + assert statuses[0] == DClassStatus.AVAILABLE + assert statuses[1] == DClassStatus.ERROR + assert statuses[2] == DClassStatus.AVAILABLE + + def test_progress_callback(self): + results = [ + _make_dclass(DClassStatus.AVAILABLE, 9), + _make_dclass(DClassStatus.AVAILABLE, 5), + ] + verifier, scraper, cache = self._make_verifier(scraper_results=results) + + progress_calls = [] + + def cb(current, total, seg): + progress_calls.append((current, total)) + + option = VerifyOption( + option_id=1, + segments=[ + _make_segment("SYD", "HKG"), + _make_segment("HKG", "LHR"), + ], + ) + verifier.verify_option(option, progress_cb=cb) + assert len(progress_calls) == 2 + assert progress_calls[0] == (1, 2) + assert progress_calls[1] == (2, 2) + + def test_batch_verify(self): + results = [ + _make_dclass(DClassStatus.AVAILABLE, 9), + _make_dclass(DClassStatus.AVAILABLE, 5), + ] + verifier, scraper, cache = self._make_verifier(scraper_results=results) + + options = [ + VerifyOption(option_id=1, segments=[_make_segment("SYD", "HKG")]), + VerifyOption(option_id=2, segments=[_make_segment("HKG", "LHR")]), + ] + results = verifier.verify_batch(options) + assert len(results) == 2 + assert results[0].option_id == 1 + assert results[1].option_id == 2 + + def test_empty_option(self): + verifier, scraper, cache = self._make_verifier() + option = VerifyOption(option_id=1, segments=[]) + result = verifier.verify_option(option) + assert result.total_flown == 0 + assert result.fully_bookable 
is True + assert scraper.check_availability.call_count == 0 diff --git a/uv.lock b/uv.lock index e9949e1..9adc364 100644 --- a/uv.lock +++ b/uv.lock @@ -29,6 +29,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/b9/fa/123043af240e49752f1c4bd24da5053b6bd00cad78c2be53c0d1e8b975bc/backports.tarfile-1.2.0-py3-none-any.whl", hash = "sha256:77e284d754527b01fb1e6fa8a1afe577858ebe4e9dad8919e34c862cb399bc34", size = 30181, upload-time = "2024-05-28T17:01:53.112Z" }, ] +[[package]] +name = "certifi" +version = "2026.1.4" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/e0/2d/a891ca51311197f6ad14a7ef42e2399f36cf2f9bd44752b3dc4eab60fdc5/certifi-2026.1.4.tar.gz", hash = "sha256:ac726dd470482006e014ad384921ed6438c457018f4b3d204aea4281258b2120", size = 154268, upload-time = "2026-01-04T02:42:41.825Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/e6/ad/3cc14f097111b4de0040c83a525973216457bbeeb63739ef1ed275c1c021/certifi-2026.1.4-py3-none-any.whl", hash = "sha256:9943707519e4add1115f44c2bc244f782c0249876bf51b6599fee1ffbedd685c", size = 152900, upload-time = "2026-01-04T02:42:40.15Z" }, +] + [[package]] name = "cffi" version = "2.0.0" @@ -74,6 +83,79 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/e1/5e/b666bacbbc60fbf415ba9988324a132c9a7a0448a9a8f125074671c0f2c3/cffi-2.0.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:6c6c373cfc5c83a975506110d17457138c8c63016b563cc9ed6e056a82f13ce4", size = 223437, upload-time = "2025-09-08T23:23:38.945Z" }, ] +[[package]] +name = "charset-normalizer" +version = "3.4.4" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/13/69/33ddede1939fdd074bce5434295f38fae7136463422fe4fd3e0e89b98062/charset_normalizer-3.4.4.tar.gz", hash = "sha256:94537985111c35f28720e43603b8e7b43a6ecfb2ce1d3058bbe955b73404e21a", size = 129418, upload-time = "2025-10-14T04:42:32.879Z" } +wheels = [ + { 
url = "https://files.pythonhosted.org/packages/ed/27/c6491ff4954e58a10f69ad90aca8a1b6fe9c5d3c6f380907af3c37435b59/charset_normalizer-3.4.4-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:6e1fcf0720908f200cd21aa4e6750a48ff6ce4afe7ff5a79a90d5ed8a08296f8", size = 206988, upload-time = "2025-10-14T04:40:33.79Z" }, + { url = "https://files.pythonhosted.org/packages/94/59/2e87300fe67ab820b5428580a53cad894272dbb97f38a7a814a2a1ac1011/charset_normalizer-3.4.4-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5f819d5fe9234f9f82d75bdfa9aef3a3d72c4d24a6e57aeaebba32a704553aa0", size = 147324, upload-time = "2025-10-14T04:40:34.961Z" }, + { url = "https://files.pythonhosted.org/packages/07/fb/0cf61dc84b2b088391830f6274cb57c82e4da8bbc2efeac8c025edb88772/charset_normalizer-3.4.4-cp311-cp311-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:a59cb51917aa591b1c4e6a43c132f0cdc3c76dbad6155df4e28ee626cc77a0a3", size = 142742, upload-time = "2025-10-14T04:40:36.105Z" }, + { url = "https://files.pythonhosted.org/packages/62/8b/171935adf2312cd745d290ed93cf16cf0dfe320863ab7cbeeae1dcd6535f/charset_normalizer-3.4.4-cp311-cp311-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:8ef3c867360f88ac904fd3f5e1f902f13307af9052646963ee08ff4f131adafc", size = 160863, upload-time = "2025-10-14T04:40:37.188Z" }, + { url = "https://files.pythonhosted.org/packages/09/73/ad875b192bda14f2173bfc1bc9a55e009808484a4b256748d931b6948442/charset_normalizer-3.4.4-cp311-cp311-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:d9e45d7faa48ee908174d8fe84854479ef838fc6a705c9315372eacbc2f02897", size = 157837, upload-time = "2025-10-14T04:40:38.435Z" }, + { url = 
"https://files.pythonhosted.org/packages/6d/fc/de9cce525b2c5b94b47c70a4b4fb19f871b24995c728e957ee68ab1671ea/charset_normalizer-3.4.4-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:840c25fb618a231545cbab0564a799f101b63b9901f2569faecd6b222ac72381", size = 151550, upload-time = "2025-10-14T04:40:40.053Z" }, + { url = "https://files.pythonhosted.org/packages/55/c2/43edd615fdfba8c6f2dfbd459b25a6b3b551f24ea21981e23fb768503ce1/charset_normalizer-3.4.4-cp311-cp311-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:ca5862d5b3928c4940729dacc329aa9102900382fea192fc5e52eb69d6093815", size = 149162, upload-time = "2025-10-14T04:40:41.163Z" }, + { url = "https://files.pythonhosted.org/packages/03/86/bde4ad8b4d0e9429a4e82c1e8f5c659993a9a863ad62c7df05cf7b678d75/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:d9c7f57c3d666a53421049053eaacdd14bbd0a528e2186fcb2e672effd053bb0", size = 150019, upload-time = "2025-10-14T04:40:42.276Z" }, + { url = "https://files.pythonhosted.org/packages/1f/86/a151eb2af293a7e7bac3a739b81072585ce36ccfb4493039f49f1d3cae8c/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:277e970e750505ed74c832b4bf75dac7476262ee2a013f5574dd49075879e161", size = 143310, upload-time = "2025-10-14T04:40:43.439Z" }, + { url = "https://files.pythonhosted.org/packages/b5/fe/43dae6144a7e07b87478fdfc4dbe9efd5defb0e7ec29f5f58a55aeef7bf7/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:31fd66405eaf47bb62e8cd575dc621c56c668f27d46a61d975a249930dd5e2a4", size = 162022, upload-time = "2025-10-14T04:40:44.547Z" }, + { url = "https://files.pythonhosted.org/packages/80/e6/7aab83774f5d2bca81f42ac58d04caf44f0cc2b65fc6db2b3b2e8a05f3b3/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:0d3d8f15c07f86e9ff82319b3d9ef6f4bf907608f53fe9d92b28ea9ae3d1fd89", size = 149383, upload-time = "2025-10-14T04:40:46.018Z" }, 
+ { url = "https://files.pythonhosted.org/packages/4f/e8/b289173b4edae05c0dde07f69f8db476a0b511eac556dfe0d6bda3c43384/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:9f7fcd74d410a36883701fafa2482a6af2ff5ba96b9a620e9e0721e28ead5569", size = 159098, upload-time = "2025-10-14T04:40:47.081Z" }, + { url = "https://files.pythonhosted.org/packages/d8/df/fe699727754cae3f8478493c7f45f777b17c3ef0600e28abfec8619eb49c/charset_normalizer-3.4.4-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:ebf3e58c7ec8a8bed6d66a75d7fb37b55e5015b03ceae72a8e7c74495551e224", size = 152991, upload-time = "2025-10-14T04:40:48.246Z" }, + { url = "https://files.pythonhosted.org/packages/1a/86/584869fe4ddb6ffa3bd9f491b87a01568797fb9bd8933f557dba9771beaf/charset_normalizer-3.4.4-cp311-cp311-win32.whl", hash = "sha256:eecbc200c7fd5ddb9a7f16c7decb07b566c29fa2161a16cf67b8d068bd21690a", size = 99456, upload-time = "2025-10-14T04:40:49.376Z" }, + { url = "https://files.pythonhosted.org/packages/65/f6/62fdd5feb60530f50f7e38b4f6a1d5203f4d16ff4f9f0952962c044e919a/charset_normalizer-3.4.4-cp311-cp311-win_amd64.whl", hash = "sha256:5ae497466c7901d54b639cf42d5b8c1b6a4fead55215500d2f486d34db48d016", size = 106978, upload-time = "2025-10-14T04:40:50.844Z" }, + { url = "https://files.pythonhosted.org/packages/7a/9d/0710916e6c82948b3be62d9d398cb4fcf4e97b56d6a6aeccd66c4b2f2bd5/charset_normalizer-3.4.4-cp311-cp311-win_arm64.whl", hash = "sha256:65e2befcd84bc6f37095f5961e68a6f077bf44946771354a28ad434c2cce0ae1", size = 99969, upload-time = "2025-10-14T04:40:52.272Z" }, + { url = "https://files.pythonhosted.org/packages/f3/85/1637cd4af66fa687396e757dec650f28025f2a2f5a5531a3208dc0ec43f2/charset_normalizer-3.4.4-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:0a98e6759f854bd25a58a73fa88833fba3b7c491169f86ce1180c948ab3fd394", size = 208425, upload-time = "2025-10-14T04:40:53.353Z" }, + { url = 
"https://files.pythonhosted.org/packages/9d/6a/04130023fef2a0d9c62d0bae2649b69f7b7d8d24ea5536feef50551029df/charset_normalizer-3.4.4-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b5b290ccc2a263e8d185130284f8501e3e36c5e02750fc6b6bdeb2e9e96f1e25", size = 148162, upload-time = "2025-10-14T04:40:54.558Z" }, + { url = "https://files.pythonhosted.org/packages/78/29/62328d79aa60da22c9e0b9a66539feae06ca0f5a4171ac4f7dc285b83688/charset_normalizer-3.4.4-cp312-cp312-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:74bb723680f9f7a6234dcf67aea57e708ec1fbdf5699fb91dfd6f511b0a320ef", size = 144558, upload-time = "2025-10-14T04:40:55.677Z" }, + { url = "https://files.pythonhosted.org/packages/86/bb/b32194a4bf15b88403537c2e120b817c61cd4ecffa9b6876e941c3ee38fe/charset_normalizer-3.4.4-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:f1e34719c6ed0b92f418c7c780480b26b5d9c50349e9a9af7d76bf757530350d", size = 161497, upload-time = "2025-10-14T04:40:57.217Z" }, + { url = "https://files.pythonhosted.org/packages/19/89/a54c82b253d5b9b111dc74aca196ba5ccfcca8242d0fb64146d4d3183ff1/charset_normalizer-3.4.4-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:2437418e20515acec67d86e12bf70056a33abdacb5cb1655042f6538d6b085a8", size = 159240, upload-time = "2025-10-14T04:40:58.358Z" }, + { url = "https://files.pythonhosted.org/packages/c0/10/d20b513afe03acc89ec33948320a5544d31f21b05368436d580dec4e234d/charset_normalizer-3.4.4-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:11d694519d7f29d6cd09f6ac70028dba10f92f6cdd059096db198c283794ac86", size = 153471, upload-time = "2025-10-14T04:40:59.468Z" }, + { url = 
"https://files.pythonhosted.org/packages/61/fa/fbf177b55bdd727010f9c0a3c49eefa1d10f960e5f09d1d887bf93c2e698/charset_normalizer-3.4.4-cp312-cp312-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:ac1c4a689edcc530fc9d9aa11f5774b9e2f33f9a0c6a57864e90908f5208d30a", size = 150864, upload-time = "2025-10-14T04:41:00.623Z" }, + { url = "https://files.pythonhosted.org/packages/05/12/9fbc6a4d39c0198adeebbde20b619790e9236557ca59fc40e0e3cebe6f40/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:21d142cc6c0ec30d2efee5068ca36c128a30b0f2c53c1c07bd78cb6bc1d3be5f", size = 150647, upload-time = "2025-10-14T04:41:01.754Z" }, + { url = "https://files.pythonhosted.org/packages/ad/1f/6a9a593d52e3e8c5d2b167daf8c6b968808efb57ef4c210acb907c365bc4/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:5dbe56a36425d26d6cfb40ce79c314a2e4dd6211d51d6d2191c00bed34f354cc", size = 145110, upload-time = "2025-10-14T04:41:03.231Z" }, + { url = "https://files.pythonhosted.org/packages/30/42/9a52c609e72471b0fc54386dc63c3781a387bb4fe61c20231a4ebcd58bdd/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:5bfbb1b9acf3334612667b61bd3002196fe2a1eb4dd74d247e0f2a4d50ec9bbf", size = 162839, upload-time = "2025-10-14T04:41:04.715Z" }, + { url = "https://files.pythonhosted.org/packages/c4/5b/c0682bbf9f11597073052628ddd38344a3d673fda35a36773f7d19344b23/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:d055ec1e26e441f6187acf818b73564e6e6282709e9bcb5b63f5b23068356a15", size = 150667, upload-time = "2025-10-14T04:41:05.827Z" }, + { url = "https://files.pythonhosted.org/packages/e4/24/a41afeab6f990cf2daf6cb8c67419b63b48cf518e4f56022230840c9bfb2/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:af2d8c67d8e573d6de5bc30cdb27e9b95e49115cd9baad5ddbd1a6207aaa82a9", size = 160535, upload-time = "2025-10-14T04:41:06.938Z" }, + { url = 
"https://files.pythonhosted.org/packages/2a/e5/6a4ce77ed243c4a50a1fecca6aaaab419628c818a49434be428fe24c9957/charset_normalizer-3.4.4-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:780236ac706e66881f3b7f2f32dfe90507a09e67d1d454c762cf642e6e1586e0", size = 154816, upload-time = "2025-10-14T04:41:08.101Z" }, + { url = "https://files.pythonhosted.org/packages/a8/ef/89297262b8092b312d29cdb2517cb1237e51db8ecef2e9af5edbe7b683b1/charset_normalizer-3.4.4-cp312-cp312-win32.whl", hash = "sha256:5833d2c39d8896e4e19b689ffc198f08ea58116bee26dea51e362ecc7cd3ed26", size = 99694, upload-time = "2025-10-14T04:41:09.23Z" }, + { url = "https://files.pythonhosted.org/packages/3d/2d/1e5ed9dd3b3803994c155cd9aacb60c82c331bad84daf75bcb9c91b3295e/charset_normalizer-3.4.4-cp312-cp312-win_amd64.whl", hash = "sha256:a79cfe37875f822425b89a82333404539ae63dbdddf97f84dcbc3d339aae9525", size = 107131, upload-time = "2025-10-14T04:41:10.467Z" }, + { url = "https://files.pythonhosted.org/packages/d0/d9/0ed4c7098a861482a7b6a95603edce4c0d9db2311af23da1fb2b75ec26fc/charset_normalizer-3.4.4-cp312-cp312-win_arm64.whl", hash = "sha256:376bec83a63b8021bb5c8ea75e21c4ccb86e7e45ca4eb81146091b56599b80c3", size = 100390, upload-time = "2025-10-14T04:41:11.915Z" }, + { url = "https://files.pythonhosted.org/packages/97/45/4b3a1239bbacd321068ea6e7ac28875b03ab8bc0aa0966452db17cd36714/charset_normalizer-3.4.4-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:e1f185f86a6f3403aa2420e815904c67b2f9ebc443f045edd0de921108345794", size = 208091, upload-time = "2025-10-14T04:41:13.346Z" }, + { url = "https://files.pythonhosted.org/packages/7d/62/73a6d7450829655a35bb88a88fca7d736f9882a27eacdca2c6d505b57e2e/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6b39f987ae8ccdf0d2642338faf2abb1862340facc796048b604ef14919e55ed", size = 147936, upload-time = "2025-10-14T04:41:14.461Z" }, + { url = 
"https://files.pythonhosted.org/packages/89/c5/adb8c8b3d6625bef6d88b251bbb0d95f8205831b987631ab0c8bb5d937c2/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:3162d5d8ce1bb98dd51af660f2121c55d0fa541b46dff7bb9b9f86ea1d87de72", size = 144180, upload-time = "2025-10-14T04:41:15.588Z" }, + { url = "https://files.pythonhosted.org/packages/91/ed/9706e4070682d1cc219050b6048bfd293ccf67b3d4f5a4f39207453d4b99/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:81d5eb2a312700f4ecaa977a8235b634ce853200e828fbadf3a9c50bab278328", size = 161346, upload-time = "2025-10-14T04:41:16.738Z" }, + { url = "https://files.pythonhosted.org/packages/d5/0d/031f0d95e4972901a2f6f09ef055751805ff541511dc1252ba3ca1f80cf5/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:5bd2293095d766545ec1a8f612559f6b40abc0eb18bb2f5d1171872d34036ede", size = 158874, upload-time = "2025-10-14T04:41:17.923Z" }, + { url = "https://files.pythonhosted.org/packages/f5/83/6ab5883f57c9c801ce5e5677242328aa45592be8a00644310a008d04f922/charset_normalizer-3.4.4-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a8a8b89589086a25749f471e6a900d3f662d1d3b6e2e59dcecf787b1cc3a1894", size = 153076, upload-time = "2025-10-14T04:41:19.106Z" }, + { url = "https://files.pythonhosted.org/packages/75/1e/5ff781ddf5260e387d6419959ee89ef13878229732732ee73cdae01800f2/charset_normalizer-3.4.4-cp313-cp313-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:bc7637e2f80d8530ee4a78e878bce464f70087ce73cf7c1caf142416923b98f1", size = 150601, upload-time = "2025-10-14T04:41:20.245Z" }, + { url = "https://files.pythonhosted.org/packages/d7/57/71be810965493d3510a6ca79b90c19e48696fb1ff964da319334b12677f0/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_aarch64.whl", hash = 
"sha256:f8bf04158c6b607d747e93949aa60618b61312fe647a6369f88ce2ff16043490", size = 150376, upload-time = "2025-10-14T04:41:21.398Z" }, + { url = "https://files.pythonhosted.org/packages/e5/d5/c3d057a78c181d007014feb7e9f2e65905a6c4ef182c0ddf0de2924edd65/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:554af85e960429cf30784dd47447d5125aaa3b99a6f0683589dbd27e2f45da44", size = 144825, upload-time = "2025-10-14T04:41:22.583Z" }, + { url = "https://files.pythonhosted.org/packages/e6/8c/d0406294828d4976f275ffbe66f00266c4b3136b7506941d87c00cab5272/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:74018750915ee7ad843a774364e13a3db91682f26142baddf775342c3f5b1133", size = 162583, upload-time = "2025-10-14T04:41:23.754Z" }, + { url = "https://files.pythonhosted.org/packages/d7/24/e2aa1f18c8f15c4c0e932d9287b8609dd30ad56dbe41d926bd846e22fb8d/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:c0463276121fdee9c49b98908b3a89c39be45d86d1dbaa22957e38f6321d4ce3", size = 150366, upload-time = "2025-10-14T04:41:25.27Z" }, + { url = "https://files.pythonhosted.org/packages/e4/5b/1e6160c7739aad1e2df054300cc618b06bf784a7a164b0f238360721ab86/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:362d61fd13843997c1c446760ef36f240cf81d3ebf74ac62652aebaf7838561e", size = 160300, upload-time = "2025-10-14T04:41:26.725Z" }, + { url = "https://files.pythonhosted.org/packages/7a/10/f882167cd207fbdd743e55534d5d9620e095089d176d55cb22d5322f2afd/charset_normalizer-3.4.4-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:9a26f18905b8dd5d685d6d07b0cdf98a79f3c7a918906af7cc143ea2e164c8bc", size = 154465, upload-time = "2025-10-14T04:41:28.322Z" }, + { url = "https://files.pythonhosted.org/packages/89/66/c7a9e1b7429be72123441bfdbaf2bc13faab3f90b933f664db506dea5915/charset_normalizer-3.4.4-cp313-cp313-win32.whl", hash = "sha256:9b35f4c90079ff2e2edc5b26c0c77925e5d2d255c42c74fdb70fb49b172726ac", 
size = 99404, upload-time = "2025-10-14T04:41:29.95Z" }, + { url = "https://files.pythonhosted.org/packages/c4/26/b9924fa27db384bdcd97ab83b4f0a8058d96ad9626ead570674d5e737d90/charset_normalizer-3.4.4-cp313-cp313-win_amd64.whl", hash = "sha256:b435cba5f4f750aa6c0a0d92c541fb79f69a387c91e61f1795227e4ed9cece14", size = 107092, upload-time = "2025-10-14T04:41:31.188Z" }, + { url = "https://files.pythonhosted.org/packages/af/8f/3ed4bfa0c0c72a7ca17f0380cd9e4dd842b09f664e780c13cff1dcf2ef1b/charset_normalizer-3.4.4-cp313-cp313-win_arm64.whl", hash = "sha256:542d2cee80be6f80247095cc36c418f7bddd14f4a6de45af91dfad36d817bba2", size = 100408, upload-time = "2025-10-14T04:41:32.624Z" }, + { url = "https://files.pythonhosted.org/packages/2a/35/7051599bd493e62411d6ede36fd5af83a38f37c4767b92884df7301db25d/charset_normalizer-3.4.4-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:da3326d9e65ef63a817ecbcc0df6e94463713b754fe293eaa03da99befb9a5bd", size = 207746, upload-time = "2025-10-14T04:41:33.773Z" }, + { url = "https://files.pythonhosted.org/packages/10/9a/97c8d48ef10d6cd4fcead2415523221624bf58bcf68a802721a6bc807c8f/charset_normalizer-3.4.4-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8af65f14dc14a79b924524b1e7fffe304517b2bff5a58bf64f30b98bbc5079eb", size = 147889, upload-time = "2025-10-14T04:41:34.897Z" }, + { url = "https://files.pythonhosted.org/packages/10/bf/979224a919a1b606c82bd2c5fa49b5c6d5727aa47b4312bb27b1734f53cd/charset_normalizer-3.4.4-cp314-cp314-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:74664978bb272435107de04e36db5a9735e78232b85b77d45cfb38f758efd33e", size = 143641, upload-time = "2025-10-14T04:41:36.116Z" }, + { url = "https://files.pythonhosted.org/packages/ba/33/0ad65587441fc730dc7bd90e9716b30b4702dc7b617e6ba4997dc8651495/charset_normalizer-3.4.4-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = 
"sha256:752944c7ffbfdd10c074dc58ec2d5a8a4cd9493b314d367c14d24c17684ddd14", size = 160779, upload-time = "2025-10-14T04:41:37.229Z" }, + { url = "https://files.pythonhosted.org/packages/67/ed/331d6b249259ee71ddea93f6f2f0a56cfebd46938bde6fcc6f7b9a3d0e09/charset_normalizer-3.4.4-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:d1f13550535ad8cff21b8d757a3257963e951d96e20ec82ab44bc64aeb62a191", size = 159035, upload-time = "2025-10-14T04:41:38.368Z" }, + { url = "https://files.pythonhosted.org/packages/67/ff/f6b948ca32e4f2a4576aa129d8bed61f2e0543bf9f5f2b7fc3758ed005c9/charset_normalizer-3.4.4-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ecaae4149d99b1c9e7b88bb03e3221956f68fd6d50be2ef061b2381b61d20838", size = 152542, upload-time = "2025-10-14T04:41:39.862Z" }, + { url = "https://files.pythonhosted.org/packages/16/85/276033dcbcc369eb176594de22728541a925b2632f9716428c851b149e83/charset_normalizer-3.4.4-cp314-cp314-manylinux_2_31_riscv64.manylinux_2_39_riscv64.whl", hash = "sha256:cb6254dc36b47a990e59e1068afacdcd02958bdcce30bb50cc1700a8b9d624a6", size = 149524, upload-time = "2025-10-14T04:41:41.319Z" }, + { url = "https://files.pythonhosted.org/packages/9e/f2/6a2a1f722b6aba37050e626530a46a68f74e63683947a8acff92569f979a/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:c8ae8a0f02f57a6e61203a31428fa1d677cbe50c93622b4149d5c0f319c1d19e", size = 150395, upload-time = "2025-10-14T04:41:42.539Z" }, + { url = "https://files.pythonhosted.org/packages/60/bb/2186cb2f2bbaea6338cad15ce23a67f9b0672929744381e28b0592676824/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_armv7l.whl", hash = "sha256:47cc91b2f4dd2833fddaedd2893006b0106129d4b94fdb6af1f4ce5a9965577c", size = 143680, upload-time = "2025-10-14T04:41:43.661Z" }, + { url = 
"https://files.pythonhosted.org/packages/7d/a5/bf6f13b772fbb2a90360eb620d52ed8f796f3c5caee8398c3b2eb7b1c60d/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:82004af6c302b5d3ab2cfc4cc5f29db16123b1a8417f2e25f9066f91d4411090", size = 162045, upload-time = "2025-10-14T04:41:44.821Z" }, + { url = "https://files.pythonhosted.org/packages/df/c5/d1be898bf0dc3ef9030c3825e5d3b83f2c528d207d246cbabe245966808d/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:2b7d8f6c26245217bd2ad053761201e9f9680f8ce52f0fcd8d0755aeae5b2152", size = 149687, upload-time = "2025-10-14T04:41:46.442Z" }, + { url = "https://files.pythonhosted.org/packages/a5/42/90c1f7b9341eef50c8a1cb3f098ac43b0508413f33affd762855f67a410e/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:799a7a5e4fb2d5898c60b640fd4981d6a25f1c11790935a44ce38c54e985f828", size = 160014, upload-time = "2025-10-14T04:41:47.631Z" }, + { url = "https://files.pythonhosted.org/packages/76/be/4d3ee471e8145d12795ab655ece37baed0929462a86e72372fd25859047c/charset_normalizer-3.4.4-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:99ae2cffebb06e6c22bdc25801d7b30f503cc87dbd283479e7b606f70aff57ec", size = 154044, upload-time = "2025-10-14T04:41:48.81Z" }, + { url = "https://files.pythonhosted.org/packages/b0/6f/8f7af07237c34a1defe7defc565a9bc1807762f672c0fde711a4b22bf9c0/charset_normalizer-3.4.4-cp314-cp314-win32.whl", hash = "sha256:f9d332f8c2a2fcbffe1378594431458ddbef721c1769d78e2cbc06280d8155f9", size = 99940, upload-time = "2025-10-14T04:41:49.946Z" }, + { url = "https://files.pythonhosted.org/packages/4b/51/8ade005e5ca5b0d80fb4aff72a3775b325bdc3d27408c8113811a7cbe640/charset_normalizer-3.4.4-cp314-cp314-win_amd64.whl", hash = "sha256:8a6562c3700cce886c5be75ade4a5db4214fda19fede41d9792d100288d8f94c", size = 107104, upload-time = "2025-10-14T04:41:51.051Z" }, + { url = 
"https://files.pythonhosted.org/packages/da/5f/6b8f83a55bb8278772c5ae54a577f3099025f9ade59d0136ac24a0df4bde/charset_normalizer-3.4.4-cp314-cp314-win_arm64.whl", hash = "sha256:de00632ca48df9daf77a2c65a484531649261ec9f25489917f09e455cb09ddb2", size = 100743, upload-time = "2025-10-14T04:41:52.122Z" }, + { url = "https://files.pythonhosted.org/packages/0a/4c/925909008ed5a988ccbb72dcc897407e5d6d3bd72410d69e051fc0c14647/charset_normalizer-3.4.4-py3-none-any.whl", hash = "sha256:7a32c560861a02ff789ad905a2fe94e3f840803362c84fecf1851cb4cf3dc37f", size = 53402, upload-time = "2025-10-14T04:42:31.76Z" }, +] + [[package]] name = "click" version = "8.3.1" @@ -230,6 +312,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/c0/d9/53a8b53e75279a953fae608bd01025d9afcf393406c0da1dda1b7f5693c5/hypothesis-6.151.5-py3-none-any.whl", hash = "sha256:c0e15c91fa0e67bc0295551ef5041bebad42753b7977a610cd7a6ec1ad04ef13", size = 543338, upload-time = "2026-02-03T19:33:54.583Z" }, ] +[[package]] +name = "idna" +version = "3.11" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/6f/6d/0703ccc57f3a7233505399edb88de3cbd678da106337b9fcde432b65ed60/idna-3.11.tar.gz", hash = "sha256:795dafcc9c04ed0c1fb032c2aa73654d8e8c5023a7df64a53f39190ada629902", size = 194582, upload-time = "2025-10-12T14:55:20.501Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/0e/61/66938bbb5fc52dbdf84594873d5b51fb1f7c7794e9c0f5bd885f30bc507b/idna-3.11-py3-none-any.whl", hash = "sha256:771a87f49d9defaf64091e6e6fe9c18d4833f140bd19464795bc32d966ca37ea", size = 71008, upload-time = "2025-10-12T14:55:18.883Z" }, +] + [[package]] name = "importlib-metadata" version = "8.7.1" @@ -660,6 +751,21 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/f1/12/de94a39c2ef588c7e6455cfbe7343d3b2dc9d6b6b2f40c4c6565744c873d/pyyaml-6.0.3-cp314-cp314t-win_arm64.whl", hash = "sha256:ebc55a14a21cb14062aa4162f906cd962b28e2e9ea38f9b4391244cd8de4ae0b", 
size = 149341, upload-time = "2025-09-25T21:32:56.828Z" }, ] +[[package]] +name = "requests" +version = "2.32.5" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "certifi" }, + { name = "charset-normalizer" }, + { name = "idna" }, + { name = "urllib3" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/c9/74/b3ff8e6c8446842c3f5c837e9c3dfcfe2018ea6ecef224c710c85ef728f4/requests-2.32.5.tar.gz", hash = "sha256:dbba0bac56e100853db0ea71b82b4dfd5fe2bf6d3754a8893c3af500cec7d7cf", size = 134517, upload-time = "2025-08-18T20:46:02.573Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/1e/db/4254e3eabe8020b458f1a747140d32277ec7a271daf1d235b70dc0b4e6e3/requests-2.32.5-py3-none-any.whl", hash = "sha256:2462f94637a34fd532264295e186976db0f5d453d1cdd31473c85a6a161affb6", size = 64738, upload-time = "2025-08-18T20:46:00.542Z" }, +] + [[package]] name = "rich" version = "14.3.2" @@ -685,6 +791,7 @@ dependencies = [ { name = "playwright" }, { name = "pydantic" }, { name = "pyyaml" }, + { name = "requests" }, { name = "rich" }, { name = "typer" }, ] @@ -711,6 +818,7 @@ requires-dist = [ { name = "pytest-asyncio", marker = "extra == 'dev'", specifier = ">=0.23" }, { name = "pytest-recording", marker = "extra == 'dev'", specifier = ">=0.13" }, { name = "pyyaml", specifier = ">=6.0" }, + { name = "requests", specifier = ">=2.28" }, { name = "rich", specifier = ">=13.0" }, { name = "ruff", marker = "extra == 'dev'", specifier = ">=0.5" }, { name = "typer", extras = ["all"], specifier = ">=0.12.0" }, @@ -862,6 +970,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/dc/9b/47798a6c91d8bdb567fe2698fe81e0c6b7cb7ef4d13da4114b41d239f65d/typing_inspection-0.4.2-py3-none-any.whl", hash = "sha256:4ed1cacbdc298c220f1bd249ed5287caa16f34d44ef4e9c3d0cbad5b521545e7", size = 14611, upload-time = "2025-10-01T02:14:40.154Z" }, ] +[[package]] +name = "urllib3" +version = "2.6.3" +source = { registry = "https://pypi.org/simple" } 
+sdist = { url = "https://files.pythonhosted.org/packages/c7/24/5f1b3bdffd70275f6661c76461e25f024d5a38a46f04aaca912426a2b1d3/urllib3-2.6.3.tar.gz", hash = "sha256:1b62b6884944a57dbe321509ab94fd4d3b307075e0c2eae991ac71ee15ad38ed", size = 435556, upload-time = "2026-01-07T16:24:43.925Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/39/08/aaaad47bc4e9dc8c725e68f9d04865dbcb2052843ff09c97b08904852d84/urllib3-2.6.3-py3-none-any.whl", hash = "sha256:bf272323e553dfb2e87d9bfd225ca7b0f467b919d7bbd355436d3fd37cb0acd4", size = 131584, upload-time = "2026-01-07T16:24:42.685Z" }, +] + [[package]] name = "vcrpy" version = "8.1.1"