scrape-do/sdo-domain-analyzer

SDO Domain Analyzer

Development

bin/dev        # Start Rails server (port 3020) + Tailwind watcher + Solid Queue worker

bin/dev launches three processes via foreman: the web server, the Tailwind CSS watcher, and the Solid Queue worker. The worker is required for background jobs (including the domain analysis scheduler) to run.

Domain Profile Analyzer Worker

The analyzer worker (AnalyzePendingDomainsJob) runs on a schedule every 30 minutes. It finds all domain profiles that have never been analyzed, or whose last analysis is older than DOMAIN_ANALYSIS_THRESHOLD_DAYS (default: 90 days), and runs DNS + WHOIS analysis on each one sequentially.

Concurrency: the job declares limits_concurrency to: 1, so only one pass runs at a time. If the scheduler fires while a previous run is still in progress, the new run does not execute alongside it.
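The selection rule described above (never analyzed, or last analyzed more than the threshold ago) can be sketched in plain Ruby. This is an illustrative sketch only: the helper names threshold_days and needs_analysis? are hypothetical and are not taken from the project, whose actual query is not shown here.

```ruby
# Hypothetical helpers illustrating the staleness rule; not the project's code.
DEFAULT_THRESHOLD_DAYS = 90
SECONDS_PER_DAY = 86_400

# Reads DOMAIN_ANALYSIS_THRESHOLD_DAYS, falling back to the documented default.
def threshold_days(env = ENV)
  Integer(env.fetch("DOMAIN_ANALYSIS_THRESHOLD_DAYS", DEFAULT_THRESHOLD_DAYS))
end

# A profile needs analysis if it was never analyzed, or if its last
# analysis is older than the threshold.
def needs_analysis?(analyzed_at, now: Time.now, days: threshold_days)
  return true if analyzed_at.nil?

  analyzed_at < now - (days * SECONDS_PER_DAY)
end
```

For example, needs_analysis?(nil) is true for a profile that has never been analyzed, while a profile analyzed ten days ago is skipped under the default threshold.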

Trigger manually

bin/rails runner "AnalyzePendingDomainsJob.perform_later"

Run the worker process standalone

bin/jobs                                      # uses config/queue.yml
SOLID_QUEUE_SKIP_RECURRING=true bin/jobs      # skip the scheduler (manual trigger only)

Configuration

Environment variable            Default  Effect
DOMAIN_ANALYSIS_THRESHOLD_DAYS  90       Days before a domain is considered stale
JOB_CONCURRENCY                 1        Number of Solid Queue worker processes
SCRAPE_DO_TOKEN                 (none)   scrape.do API key

The recurring schedule is defined in config/recurring.yml. To change the interval, edit the schedule: value for analyze_pending_domains.

Database setup

Solid Queue uses the primary database (no separate queue database).

bin/rails db:prepare   # create + migrate (includes Solid Queue tables)
bin/rails db:reset     # drop, create, migrate, seed

Tests

bin/rails test           # unit + integration
bin/rails test:system    # system tests (Capybara + Selenium)

Linting & Security

bin/rubocop -f github    # lint
bin/brakeman --no-pager  # security scan
bin/bundler-audit        # gem vulnerability audit

Deployment

Kamal + Docker. See config/deploy.yml.

API

Authentication

Generate an API token from the API Tokens page in the web UI (/api_tokens). Tokens are prefixed sdo_ and the plaintext is shown only once at creation time.

Pass the token as a Bearer token in the Authorization header on every request:

Authorization: Bearer sdo_<your_token>

Domain Check Endpoint

GET /api/v1/domains/:domain
Authorization: Bearer <token>

Returns the full domain profile as JSON. If the domain has never been analyzed (or the last analysis is older than DOMAIN_ANALYSIS_THRESHOLD_DAYS), analysis runs synchronously before the response is returned. If the domain hasn't been seen before, it is created automatically.

Example response:

{
  "domain": "example.com",
  "tld": "com",
  "trd": null,
  "root_domain": null,
  "category": null,
  "source": "manual",
  "analyzed_at": "2026-05-20T10:00:00.000Z",
  "is_disposable": false,
  "abuse_detected": false,
  "abuse_detected_at": null,
  "abuse_detected_reason": null,
  "blacklisted": false,
  "blacklisted_at": null,
  "blacklist_reason": null,
  "dns_records": { "a": ["93.184.216.34"], "mx": [], "ns": [...] },
  "dns_error": null,
  "dns_records_fetched_at": "2026-05-20T10:00:00.000Z",
  "raw_whois": "...",
  "raw_whois_error": null,
  "raw_whois_at": "2026-05-20T10:00:00.000Z",
  "first_seen_at": "2026-05-20T10:00:00.000Z",
  "last_seen_at": "2026-05-20T10:00:00.000Z",
  "created_at": "2026-05-20T10:00:00.000Z",
  "updated_at": "2026-05-20T10:00:00.000Z"
}
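As a small illustration of consuming this response, the sketch below lists which of the boolean reputation fields are set. The field names come from the example response; the helper name raised_flags is hypothetical.

```ruby
require "json"

# Boolean reputation fields taken from the example response.
FLAG_FIELDS = %w[is_disposable abuse_detected blacklisted].freeze

# Returns the names of the flags that are true in a domain profile JSON string.
def raised_flags(profile_json)
  profile = JSON.parse(profile_json)
  FLAG_FIELDS.select { |field| profile[field] == true }
end
```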

Error responses:

Status  Meaning
401     Missing or invalid token
422     Invalid domain name
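A minimal client sketch for the endpoint, using only Ruby's standard library. The base URL is an assumption (the development server's port 3020 on localhost), a 200 status is assumed for success, and the function names are hypothetical.

```ruby
require "net/http"
require "uri"
require "json"

# Assumption: the app is reachable here (bin/dev starts the server on port 3020).
BASE_URL = "http://localhost:3020"

# Builds the GET request with the Bearer token header.
def build_domain_request(domain, token, base_url: BASE_URL)
  path = "/api/v1/domains/#{URI.encode_www_form_component(domain)}"
  request = Net::HTTP::Get.new(URI.join(base_url, path))
  request["Authorization"] = "Bearer #{token}"
  request
end

# Maps the documented status codes to a parsed profile or an error.
def interpret_response(status, body)
  case status
  when 200 then JSON.parse(body) # assumed success status
  when 401 then raise ArgumentError, "missing or invalid API token"
  when 422 then raise ArgumentError, "invalid domain name"
  else raise "unexpected status #{status}"
  end
end

# Performs the request and returns the domain profile as a Hash.
def fetch_domain_profile(domain, token, base_url: BASE_URL)
  request = build_domain_request(domain, token, base_url: base_url)
  uri = request.uri
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
  interpret_response(response.code.to_i, response.body)
end
```

Because analysis may run synchronously before the response is returned, a caller may want a generous read timeout for domains that have not been analyzed recently.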

How does it work?

Disposable Email Service Detection

We scrape each domain by prefixing it with https://. Requests go through the scrape.do API so that CAPTCHAs do not block them. Scrape.do's free tier includes 1000 credits on sign-up, which is enough to scrape up to 1000 domains.

The text patterns are defined in app/interactions/scraping/detect_disposable_interaction.rb.
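The general idea of pattern-based detection can be sketched as follows. The patterns and method names here are hypothetical placeholders chosen for illustration; the project's real patterns are the ones in the file named above, and the fetch through scrape.do is left out.

```ruby
# Hypothetical example patterns; NOT the project's actual list.
EXAMPLE_PATTERNS = [
  /temporary e-?mail/i,
  /disposable e-?mail/i
].freeze

# The URL to scrape is the domain prefixed with https://.
def scrape_target_url(domain)
  "https://#{domain}"
end

# True if any pattern matches the scraped page text.
def looks_disposable?(page_text, patterns: EXAMPLE_PATTERNS)
  patterns.any? { |pattern| pattern.match?(page_text) }
end
```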

About

Domain reputation analyzer: DNS + WHOIS lookups, disposable email provider detection, and abuse/blacklist tracking, exposed via a token-authenticated JSON API. Rails 8, powered by Scrape.do.
