Vikrant-Khedkar
Collaborator

@Vikrant-Khedkar Vikrant-Khedkar commented Oct 10, 2025

Changes:

  • Introduced scrape method for basic scraping of page content.
  • Added sitemap method to extract sitemap URLs for a given website.
  • Implemented agentic_scrapper method for running the Agentic Scraper workflow with flexible input handling.
  • Increased the HTTP client timeout from 60s to 120s so long-running requests are less likely to time out.

This enhances the functionality of the ScapeGraphClient, allowing for more versatile web scraping capabilities.


Note

Adds scrape, sitemap, and agentic_scrapper client methods and MCP tools with flexible input handling, and increases HTTP client timeout.

  • Server/Client (src/scrapegraph_mcp/server.py):
    • New API methods:
      • ScapeGraphClient.scrape(website_url, render_heavy_js?) -> POST /scrape.
      • ScapeGraphClient.sitemap(website_url) -> POST /sitemap.
      • ScapeGraphClient.agentic_scrapper(url, user_prompt?, output_schema?, steps?, ai_extraction?, persistent_session?, timeout_seconds?) -> POST /agentic-scrapper (supports per-request timeout).
    • New MCP tools:
      • scrape(website_url, render_heavy_js?) and sitemap(website_url) wrappers with HTTP error handling.
      • agentic_scrapper(...) wrapper with input normalization (accepts steps as string/list and output_schema as dict/JSON string) and robust error/timeout handling.
    • Networking: Increase HTTPX client timeout from 60s to 120s (set via httpx.Timeout).
    • Types/Imports: Add json and extended typing (Optional, List, Union) to support new features.
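
The input normalization described for agentic_scrapper (steps as string/list, output_schema as dict/JSON string) could look roughly like the sketch below. This is an illustration only, not the code from this PR: the helper names normalize_steps and normalize_output_schema are hypothetical, and the rules for splitting a string into steps (JSON array first, then one step per line) are an assumption.

```python
import json
from typing import Any, Dict, List, Optional, Union


def normalize_steps(steps: Optional[Union[str, List[str]]]) -> Optional[List[str]]:
    """Hypothetical helper: coerce `steps` into a list of non-empty strings."""
    if steps is None:
        return None
    if isinstance(steps, list):
        cleaned = [str(s).strip() for s in steps if str(s).strip()]
        return cleaned or None
    text = steps.strip()
    if not text:
        return None
    # Assumption: a string may itself be a JSON array, e.g. '["click login", "wait"]'.
    if text.startswith("["):
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            return normalize_steps(parsed)
    # Assumption: otherwise each non-empty line is one step.
    return [line.strip() for line in text.splitlines() if line.strip()]


def normalize_output_schema(
    schema: Optional[Union[str, Dict[str, Any]]]
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: accept a dict as-is, or decode a JSON object string."""
    if schema is None:
        return None
    if isinstance(schema, dict):
        return schema
    text = schema.strip()
    if not text:
        return None
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output_schema is not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("output_schema must decode to a JSON object")
    return parsed
```

Raising ValueError on a malformed schema, rather than silently dropping it, lets the tool wrapper report the problem back to the caller; whether the actual wrapper behaves this way is not stated in the PR.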

Written by Cursor Bugbot for commit c595975. This will update automatically on new commits.

Contributor

@VinciGit00 VinciGit00 left a comment

error

@Vikrant-Khedkar
Collaborator Author

pls also add the sitemap

We already added it in this one.

@VinciGit00 VinciGit00 merged commit 0358380 into main Oct 10, 2025
2 of 3 checks passed
@VinciGit00 VinciGit00 deleted the update-mcp branch October 10, 2025 09:52