Handles the main solving work for CaptchaKraken. It is designed as a CLI so that it stays language-agnostic, since I want to build both a Python and a TypeScript solver on top of it.

JWriter20/CaptchaKraken-cli

CaptchaKraken CLI

AI-powered, fully local captcha-solving CLI that uses attention-based vision models to extract precise click coordinates for common web captchas.

Description

CaptchaKraken takes a screenshot of a captcha challenge, classifies the captcha type, highlights and numbers all interactable regions, and then plans the sequence of clicks needed to solve it.
It is designed to be:

  • CLI-first: run end‑to‑end solves from the command line.
  • Model-agnostic: pluggable attention models for coordinate extraction.
  • Debuggable: optional overlays and debug images to inspect detection and planning.
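As a rough illustration of what "pluggable" could mean here, the sketch below defines a small interface that any coordinate-extraction backend might satisfy. The names `CoordinateModel`, `Point`, `CenterModel`, and `locate` are hypothetical and are not taken from the CaptchaKraken codebase; the stand-in model simply returns the image center so that the example runs without any vision model installed.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Point:
    """A click target in screenshot pixel coordinates."""
    x: int
    y: int


class CoordinateModel(Protocol):
    """Hypothetical interface: any object with a matching `point` method plugs in."""

    def point(self, image: bytes, prompt: str) -> Point: ...


class CenterModel:
    """Stand-in backend that always answers with the image center (illustration only)."""

    def __init__(self, width: int, height: int) -> None:
        self.width = width
        self.height = height

    def point(self, image: bytes, prompt: str) -> Point:
        # A real backend would run a vision model on `image` here.
        return Point(self.width // 2, self.height // 2)


def locate(model: CoordinateModel, image: bytes, prompt: str) -> Point:
    """Caller code depends only on the protocol, not on a concrete model."""
    return model.point(image, prompt)
```

With this shape, swapping models means passing a different object to `locate`; the calling code does not change.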

High-level flow:

  1. Classify the captcha (checkbox vs image grid vs text prompt, etc.).
  2. Detect and number all interactable elements in the captcha (checkboxes, tiles, buttons).
  3. Plan actions using the point and detect tools to generate click coordinates.
  4. Output the sequence of actions (clicks) that can be replayed in a browser automation stack.
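The last two steps of the flow above can be sketched as follows, assuming detection has already produced numbered bounding boxes. Every name here (`CaptchaType`, `Element`, `ClickAction`, `plan_clicks`, `to_json`) and the JSON layout are assumptions made for illustration; the actual output format of the CLI may differ.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class CaptchaType(str, Enum):
    """Captcha categories named in this README (identifiers are hypothetical)."""
    CHECKBOX = "checkbox"
    IMAGE_GRID = "image_grid"
    TEXT = "text"


@dataclass(frozen=True)
class Element:
    """A numbered interactable region; (x, y) is the top-left corner in pixels."""
    number: int
    x: int
    y: int
    width: int
    height: int

    def center(self) -> tuple[int, int]:
        return (self.x + self.width // 2, self.y + self.height // 2)


@dataclass(frozen=True)
class ClickAction:
    """One replayable action in screenshot coordinates."""
    x: int
    y: int
    action: str = "click"


def plan_clicks(elements: list[Element], selected: list[int]) -> list[ClickAction]:
    """Turn the chosen element numbers into clicks at each element's center."""
    by_number = {e.number: e for e in elements}
    return [ClickAction(*by_number[n].center()) for n in selected]


def to_json(captcha_type: CaptchaType, actions: list[ClickAction]) -> str:
    """Serialize the plan so that a solver in any language can replay it."""
    payload = {"type": captcha_type.value, "actions": [asdict(a) for a in actions]}
    return json.dumps(payload)
```

A browser automation layer could then parse this JSON and issue one mouse click per entry, offset by the captcha's position on the page.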

Captcha support status

  • Checkbox captchas – end‑to‑end solving working.
  • Image selection / image grid captchas – end‑to‑end solving working.
  • Text captchas – basic plumbing present, solving still in progress.

Additional captcha types and more robust classification/solving strategies are under active development.
