Scrapyd is an application for deploying and running Scrapy spiders. It enables you to deploy (upload) your projects and control their spiders using a JSON API.
Scrapyd doesn't include any provision for password protecting itself. This container packages Scrapyd with an nginx proxy in front of it providing basic HTTP authentication. The username and password are configured through environment variables.
For more about Scrapyd, see the Scrapyd documentation.
```
$ docker run -d -e USERNAME=my_username -e PASSWORD=hunter123 cdrx/scrapyd-authenticated
```
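Once the container is up (and assuming you also published Scrapyd's port, for example with `-p 6800:6800`), you can check that the proxy is enforcing authentication by hitting Scrapyd's `daemonstatus.json` endpoint with and without credentials. The host and port below are illustrative:

```
# without credentials, nginx should reject the request with HTTP 401
$ curl -i http://localhost:6800/daemonstatus.json

# with HTTP basic auth, the request is passed through to Scrapyd
$ curl -u my_username:hunter123 http://localhost:6800/daemonstatus.json
```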
You can then use the Scrapyd client to easily deploy the scraper from your machine to the running container.
Support for HTTP authentication is built into scrapyd-client. Add the `username` and `password` fields to your `scrapy.cfg` file and then deploy as you normally would.
```
[deploy]
url = http://scrapyd:6800/
username = my_username
password = hunter123
```
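With the credentials in `scrapy.cfg`, deployment and the JSON API work the same as against an unprotected Scrapyd, just with basic auth added. A rough sketch (the `myproject` and `myspider` names are placeholders, not part of this image):

```
# deploy the project defined in scrapy.cfg
$ pip install scrapyd-client
$ scrapyd-deploy -p myproject

# schedule a spider run through the authenticated API
$ curl -u my_username:hunter123 http://scrapyd:6800/schedule.json -d project=myproject -d spider=myspider
```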
If your scraper depends on third-party Python packages (Redis or MySQL drivers, for example), you can install them when the container launches by setting the PACKAGES environment variable.
```
$ docker run -d -e USERNAME=my_username -e PASSWORD=hunter123 -e PACKAGES=requests,simplejson cdrx/scrapyd-authenticated
```
This will make the container a bit slow to boot, so if you're starting and stopping the container regularly you would be better off forking this repository and adding the packages to `requirements.txt`.
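A rough sketch of that approach (the package names and image tag are only examples): add your dependencies to `requirements.txt` in your fork, then build and run your own image.

```
# requirements.txt in your fork
requests
simplejson

# build and run the customised image
$ docker build -t my-scrapyd-authenticated .
$ docker run -d -e USERNAME=my_username -e PASSWORD=hunter123 my-scrapyd-authenticated
```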
| Variable | Required | Example | Description |
|---|---|---|---|
| USERNAME | Yes | my_user | The username for authentication with the Scrapyd server |
| PASSWORD | Yes | hunter123 | The password for authentication with the Scrapyd server |
| PACKAGES | No | simplejson,requests | Comma-separated list of Python packages to install before starting Scrapyd |
To persist data between launches, you can mount the volume `/scrapyd` somewhere on your Docker host.
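For example (the host path `/var/scrapyd` is arbitrary):

```
$ docker run -d -e USERNAME=my_username -e PASSWORD=hunter123 -v /var/scrapyd:/scrapyd cdrx/scrapyd-authenticated
```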
If you have any problems with or questions about this image, please file an issue on the GitHub repository.
Pull requests welcome :-)