Releases: crawlab-team/crawlab

v0.6.3-dev

14 Jun 07:49
Pre-release

What's Changed

New Contributors

Full Changelog: v0.6.2...v0.6.3-dev

v0.6.3

26 Jul 08:10
2dbc737

v0.6.2

16 Jun 05:50
09a564d

Web Crawler Management Platform Crawlab v0.6.2 Official Release

Overview

Crawlab v0.6.2 is the latest iteration in the v0.6.x series. It brings bug fixes, feature enhancements, and improved support for environment variables.

Changelog

Bug Fixes

Feature Enhancements

Community

If Crawlab is useful in your daily development or at your company, please consider starring it on GitHub. If you run into problems, feel free to open an issue on GitHub. Contributions to Crawlab's development are also welcome. You can also join the Crawlab technical discussion group by adding the WeChat account tikazyq1, where you can discuss development, deployment and usage with other developers.

References

v0.6.1

23 Mar 05:07
8c083ed

What's Changed

New Contributors

Full Changelog: v0.6.0...v0.6.1

v0.6.0-1

27 Oct 04:45
c091d82

What's Changed

New Contributors

Full Changelog: v0.6.0...v0.6.0-1

v0.6.0

23 May 01:11

Change Log (v0.6.0)

Overview

As a major release, v0.6.0 consists of a number of large changes that enhance the performance, scalability, robustness and usability of Crawlab. This version is theoretically more robust than older versions, mainly in task execution, file synchronization and node management, yet we still recommend that users test it thoroughly with a variety of samples.

Enhancements

Backend

  • File Synchronization. Migrated file sync from MongoDB GridFS to SeaweedFS for better stability and robustness.
  • Node Communication. Migrated node communication from Redis-based RPC to gRPC. Worker nodes indirectly interact with MongoDB by making gRPC calls to the master node.
  • Task Queue. Migrated task queue from Redis list to MongoDB collection to allow more flexibility (e.g. priority queue).
  • Logging. Migrated log storage to SeaweedFS to resolve performance issues in MongoDB.
  • SDK Integration. Migrated results data ingestion from the native SDK to the task handler side.
  • Task Related. Abstracted task-related logic into Task Scheduler, Task Handler and Task Runners to increase decoupling and improve scalability and maintainability.
  • Componentization. Introduced a DI (dependency injection) framework and componentized modules, services and sub-systems.
  • Plugin Framework. Crawlab Plugin Framework (CPF) has been released. See more info [here](https://docs.crawlab.cn/en/guide/plugin/).
  • Git Integration. Git integration is implemented as a built-in feature.
  • Scrapy Integration. Scrapy integration is implemented as a plugin [spider-assistant](https://docs.crawlab.cn/en/guide/plugin/plugin-spider-assistant).
  • Dependency Integration. Dependency integration is implemented as a plugin [dependency](https://docs.crawlab.cn/en/guide/plugin/plugin-dependency).
  • Notifications. Notifications feature is implemented as a plugin [notification](https://docs.crawlab.cn/en/guide/plugin/plugin-notification).
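
To illustrate the Task Queue item above: a Redis list hands out tasks strictly in insertion order, whereas documents in a collection can carry a priority field and be fetched in sorted order. The following is a minimal in-memory sketch of that idea, not Crawlab's implementation. The `Task` fields `priority` and `created_at` and the `TaskQueue` class are hypothetical names chosen for illustration.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical fields; Crawlab's real task schema may differ.
    name: str
    priority: int = 5    # assumption: a lower number means "run sooner"
    created_at: int = 0  # insertion counter standing in for a timestamp


class TaskQueue:
    """In-memory stand-in for a collection-backed queue."""

    def __init__(self):
        self._docs = []
        self._clock = itertools.count()  # monotonically increasing insert order

    def enqueue(self, name, priority=5):
        self._docs.append(Task(name, priority, next(self._clock)))

    def dequeue(self):
        # In spirit: "sort by priority, then creation time, take the first
        # document and remove it".
        if not self._docs:
            return None
        task = min(self._docs, key=lambda t: (t.priority, t.created_at))
        self._docs.remove(task)
        return task


queue = TaskQueue()
queue.enqueue("nightly-crawl")
queue.enqueue("urgent-recrawl", priority=1)
queue.enqueue("weekly-report")

order = []
while (task := queue.dequeue()) is not None:
    order.append(task.name)
print(order)
```

The urgent task is dequeued first even though it was enqueued second; tasks with equal priority keep their insertion order.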

Frontend

  • Vue 3. Migrated the frontend to Vue 3, the latest version of the framework, to support more advanced features such as the Composition API and TypeScript.
  • UI Framework. Moved from Vue-Element-Admin to the Vue 3-based UI framework Element-Plus for more flexibility and functionality.
  • Advanced File Editor. Supports more advanced file editor features including drag-and-drop copying/moving of files, renaming, deleting, file editing, code highlighting, nav tabs, etc.
  • Customizable Table. Supports more advanced built-in operations such as column adjustment, batch operations, searching, filtering, sorting, etc.
  • Nav Tabs. Supports multiple nav tabs for viewing different pages.
  • Batch Creation. Supports batch creation of objects including spiders, projects, schedules, etc.
  • Detail Navigation. Sidebar navigation in detail pages.
  • Enhanced Dashboard. More stats charts in home page dashboard.

Miscellaneous

v0.6.0-beta.20211224

24 Dec 10:17
e4b1bb5

Change Log (v0.6.0-beta.20211224)

Overview

This is the third beta release for the next major version v0.6.0. With more features and optimizations coming in, the official v0.6.0 release is approaching.

Enhancement

  • Internationalization. Support Chinese.
  • CLI Upload Spider. #1020
  • Official Plugins. Allow users to install official plugins on Crawlab web UI.
  • More Documentation. Added documentation for plugins and CLI.

Bug Fixes

TODOs

  • Associated Tasks. There will be main tasks and their sub-tasks if task mode is "all nodes" or "selected nodes".
  • Crontab Editor. Frontend component that visualizes crontab editing.
  • Results Deduplication.
  • Environment Variables.
  • Frontend Utility Enhancement. Advanced features such as saved table customization.
  • Log Auto Cleanup.
  • More Documentation.
  • E2E Tests.
  • Frontend Output File Size Optimization.
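
The Associated Tasks item above can be pictured as a fan-out: one main task plus one sub-task per target node. Below is a minimal sketch of that planned behaviour under stated assumptions. The `create_tasks` helper, the dictionary-shaped tasks and the `parent_id` field are hypothetical and do not come from Crawlab; the mode strings simply mirror the wording in the list.

```python
import uuid


def create_tasks(spider, mode, all_nodes, selected_nodes=()):
    """Return (main_task, sub_tasks) for a run request."""
    if mode == "all nodes":
        targets = list(all_nodes)
    elif mode == "selected nodes":
        # Ignore selections that are not known nodes.
        targets = [n for n in selected_nodes if n in all_nodes]
    else:
        # Any other mode: a single main task with no sub-tasks.
        targets = []

    main = {"id": uuid.uuid4().hex, "spider": spider, "mode": mode, "parent_id": None}
    subs = [
        {"id": uuid.uuid4().hex, "spider": spider, "node": node, "parent_id": main["id"]}
        for node in targets
    ]
    return main, subs


nodes = ["master", "worker-1", "worker-2"]
main, subs = create_tasks("news", "selected nodes", nodes, ["worker-1", "worker-2"])
for sub in subs:
    print(sub["node"], "->", sub["parent_id"] == main["id"])
```

Linking each sub-task back to its main task through a parent identifier is one simple way to group per-node runs in a task list.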

What Next

The next version could be the official release of v0.6.0, but this has not been decided yet. More tests will be run against the current beta version to ensure robustness and production-ready deployment.

v0.6.0-beta.20211120

20 Nov 13:44
779e134

Change Log (v0.6.0-beta.20211120)

Overview

This is the second beta release for the next major version v0.6.0. With more features and optimizations coming in, the official v0.6.0 release is approaching.

Enhancement

Backend

  • Plugin Framework. Crawlab Plugin Framework (CPF) has been released. See more info here.
  • Git Integration. Git integration is implemented as a built-in feature.
  • Scrapy Integration. Scrapy integration is implemented as a plugin spider-assistant.
  • Dependency Integration. Dependency integration is implemented as a plugin dependency.
  • Notifications. Notifications feature is implemented as a plugin notification.
  • Documentation Site. Set up documentation site.

Frontend

  • Bug Fixing.

TODOs

  • Associated Tasks. There will be main tasks and their sub-tasks if task mode is "all nodes" or "selected nodes".
  • Crontab Editor. Frontend component that visualizes crontab editing.
  • Results Deduplication.
  • Environment Variables.
  • Internationalization. Support Chinese.
  • Frontend Utility Enhancement. Advanced features such as saved table customization.
  • Log Auto Cleanup.
  • More Documentation.

What Next

The next version could be the official release of v0.6.0, but this has not been decided yet. More tests will be run against the current beta version to ensure robustness and production-ready deployment.

v0.6.0-beta.20210803

03 Aug 14:53
833b78c

Change Log (v0.6.0-beta.20210803)

Overview

This is the beta release for the next major version v0.6.0. It is recommended NOT to use it in production, as it is not fully tested and thus not stable enough. Furthermore, more features, including those not ready in this beta release (e.g. Git, Scrapy, Notification), are planned to be integrated into the live version in the form of plugins.

Enhancement

As a major release, v0.6 (including beta versions) consists of a number of large changes that enhance the performance, scalability, robustness and usability of Crawlab. This beta version is theoretically more robust than older versions, mainly in task execution, file synchronization and node management, yet we still recommend that users test it thoroughly with a variety of samples.

Backend

  • File Synchronization. Migrated file sync from MongoDB GridFS to SeaweedFS for better stability and robustness.
  • Node Communication. Migrated node communication from Redis-based RPC to gRPC. Worker nodes indirectly interact with MongoDB by making gRPC calls to the master node.
  • Task Queue. Migrated task queue from Redis list to MongoDB collection to allow more flexibility (e.g. priority queue).
  • Logging. Migrated log storage to SeaweedFS to resolve performance issues in MongoDB.
  • SDK Integration. Migrated results data ingestion from the native SDK to the task handler side.
  • Task Related. Abstracted task-related logic into Task Scheduler, Task Handler and Task Runners to increase decoupling and improve scalability and maintainability.
  • Componentization. Introduced a DI (dependency injection) framework and componentized modules, services and sub-systems.
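
As an illustration of the dependency injection idea in the last item, the sketch below has a small container build and hand over a service's dependencies instead of letting the service construct them itself. It only shows the general pattern and says nothing about how Crawlab's backend wires its components. The `Container`, `NodeService` and `TaskService` names are hypothetical.

```python
class Container:
    """Tiny DI container: maps a name to a provider and caches instances."""

    def __init__(self):
        self._providers = {}
        self._instances = {}

    def provide(self, name, factory):
        self._providers[name] = factory

    def resolve(self, name):
        if name not in self._instances:
            # The factory receives the container so it can resolve its own deps.
            self._instances[name] = self._providers[name](self)
        return self._instances[name]


class NodeService:
    def active_nodes(self):
        return ["master", "worker-1"]


class TaskService:
    def __init__(self, node_service):
        # The dependency is injected, not created here.
        self.node_service = node_service

    def assignable_nodes(self):
        return self.node_service.active_nodes()


container = Container()
container.provide("node", lambda c: NodeService())
container.provide("task", lambda c: TaskService(c.resolve("node")))

task_service = container.resolve("task")
print(task_service.assignable_nodes())
```

Because `TaskService` only receives its dependency, a test could register a stub under `"node"` without touching the service code, which is the decoupling benefit the item refers to.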

Frontend

  • Vue 3. Migrated the frontend to Vue 3, the latest version of the framework, to support more advanced features such as the Composition API and TypeScript.
  • UI Framework. Moved from Vue-Element-Admin to the Vue 3-based UI framework Element-Plus for more flexibility and functionality.
  • Advanced File Editor. Supports more advanced file editor features including drag-and-drop copying/moving of files, renaming, deleting, file editing, code highlighting, nav tabs, etc.
  • Customizable Table. Supports more advanced built-in operations such as column adjustment, batch operations, searching, filtering, sorting, etc.
  • Nav Tabs. Supports multiple nav tabs for viewing different pages.
  • Batch Creation. Supports batch creation of objects including spiders, projects, schedules, etc.
  • Detail Navigation. Sidebar navigation in detail pages.
  • Enhanced Dashboard. More stats charts in home page dashboard.

TODOs

Because this is a beta release, some existing features such as Git and Scrapy integration may not be available. We are trying to include them in the official v0.6.0 release, as some of their core functionality is already in the code base, but they will be added to the stable version only once they are fully tested.

  • Plugin Framework. Advanced features will exist in the form of plugins, or pluggable modules.
  • Git Integration. To be included as a plugin.
  • Scrapy Integration. To be included as a plugin.
  • Notifications. To be included as a plugin.
  • Associated Tasks. There will be main tasks and their sub-tasks if task mode is "all nodes" or "selected nodes".
  • Crontab Editor. Frontend component that visualizes crontab editing.
  • Results Deduplication.
  • Environment Variables.
  • Internationalization. Support Chinese.
  • Frontend Utility Enhancement. Advanced features such as saved table customization.
  • Log Auto Cleanup.
  • Documentation.

What Next

This beta release is only a preview and a test ground for the core functionality in Crawlab v0.6, so we invite you to download it and run more tests. The official release is expected to be ready once major issues from the beta version are resolved and the Plugin Framework and other key features are developed and fully tested. With that in mind, a second beta version before the main release is also possible.

v0.5.1

31 Jul 05:43
dcb9351

Features / Enhancement

  • Added error message details.
  • Added Golang programming language support.
  • Added web driver installation scripts for Chrome Driver and Firefox.
  • Support system tasks. A "system task" is similar to a normal spider task; it allows users to view logs of general tasks such as installing languages.
  • Changed the method of installing languages from RPC to system tasks.
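
The system task idea above, running a general command as a task so that its output can be viewed as a log, can be sketched roughly as follows. This is an illustration under stated assumptions, not Crawlab's code: `run_system_task` is a hypothetical helper, and the command is a harmless placeholder rather than a real installation script.

```python
import subprocess
import sys


def run_system_task(cmd):
    """Run a command and return (exit_code, log_lines), merging stderr into the log."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep errors in the same log stream
        text=True,
    )
    return proc.returncode, proc.stdout.splitlines()


# Placeholder command standing in for e.g. a language installation script.
code, log = run_system_task(
    [sys.executable, "-c", "print('installing...'); print('done')"]
)
for line in log:
    print(line)
```

Capturing the exit code alongside the log lines lets a failed installation be reported in the same way as a failed spider run.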

Bug Fixes

  • Fixed 500 error on first repo download in the Spider Market page. #808
  • Fixed some translation issues.
  • Fixed 500 error in task detail page. #810
  • Fixed password reset issue. #811
  • Fixed issue where CSV could not be downloaded. #812
  • Fixed issue where Node.js could not be installed. #813
  • Fixed disabled status for batch adding schedules. #814