Create a new guide for scaling the crawlers #476

Open
vdusek opened this issue Aug 30, 2024 · 2 comments
Labels
documentation Improvements or additions to documentation. t-tooling Issues with this label are in the ownership of the tooling team.

Comments

@vdusek
Collaborator

vdusek commented Aug 30, 2024

  • We could create a new documentation guide for scaling the crawlers (mainly covering the features of the _autoscaling subpackage).
  • The guide should include the following:
    • ConcurrencySettings - how users can configure the concurrency of requests.
    • Add a short explanation of the internal components:
      • Snapshotter,
      • AutoscaledPool.
  • Inspiration: https://crawlee.dev/docs/guides/scaling-crawlers
  • How to name it? I used "request scaling" here, but it is not precise enough.
@vdusek added the documentation and t-tooling labels Aug 30, 2024
@B4nan
Member

B4nan commented Aug 30, 2024

why a different name? also, you scale the crawler to be able to run more requests, you don't scale the requests, right?

i would rather have a python version of the very same page instead of a different page

@vdusek changed the title from: Create a new guide for "request scaling" (concurrency) to: Create a new guide for scaling the crawlers Aug 30, 2024
@vdusek
Collaborator Author

vdusek commented Aug 30, 2024

> why a different name? also, you scale the crawler to be able to run more requests, you don't scale the requests, right?
> i would rather have a python version of the very same page instead of a different page

Yeah... I first created the issue without looking at the JS Crawlee docs, then added the inspiration bullet but did not update the title and description. I will use the same naming.
