BBOT Fails in Docker-Based Integration Due to Daemonic Process Limitation #2354

Open
AnshSinghal opened this issue Mar 14, 2025 · 8 comments

Comments

@AnshSinghal

Hi,

I’m trying to integrate BBOT as a Docker-based analyzer for IntelOwl, but I’ve been running into persistent issues due to Python’s multiprocessing restrictions. Specifically, I keep encountering the error:

"daemonic processes are not allowed to have children"

This occurs because BBOT internally spawns child processes while running within a daemonized process. Since Python's multiprocessing module does not allow daemonic processes to create children, the scan fails when BBOT attempts to launch its internal modules.
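For reference, the restriction itself can be reproduced without BBOT. The snippet below is only a minimal sketch: the helper names are made up for illustration, and it assumes a POSIX system where the "fork" start method is available. It starts a daemonic process and has that process try to start a child of its own.

```python
import multiprocessing as mp


def _noop():
    """Target for the grandchild process; does nothing."""


def _try_to_spawn(conn):
    """Runs inside the daemonic process and reports what happened."""
    ctx = mp.get_context("fork")
    try:
        child = ctx.Process(target=_noop)
        child.start()  # multiprocessing refuses this inside a daemonic process
        child.join()
        conn.send("child started")
    except AssertionError as exc:
        conn.send(str(exc))
    finally:
        conn.close()


def spawn_from_daemon():
    """Start a daemonic worker and return the message it sends back."""
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    worker = ctx.Process(target=_try_to_spawn, args=(child_conn,), daemon=True)
    worker.start()
    message = parent_conn.recv()
    worker.join()
    return message


if __name__ == "__main__":
    print(spawn_from_daemon())
```

The message returned by the worker is the same text as the error I see from BBOT.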

What I’ve Tried:

I’ve attempted several workarounds to resolve this issue, including:

  1. Monkey-Patching Multiprocessing:

    • Overriding multiprocessing.Process.__init__ to force daemon=False.
    • Using a sitecustomize hook for early imports.
  2. Switching to billiard (Celery’s Fork of Multiprocessing):

    • Replaced Python’s standard multiprocessing with billiard, but the issue persists.
  3. Using Quart Instead of Flask:

    • Initially used Flask but switched to Quart (async-friendly) to test if async task handling changes behavior.
  4. Changing Python Versions:

    • Tested on Python 3.9, 3.10, 3.11, and 3.12 to check if version differences might help.
  5. Manually Installing BBOT Dependencies:

    • Installed required system dependencies (gcc, openssl, etc.) and pre-installed BBOT dependencies using bbot --install-all-deps -y.
    • Still encountered package manager detection failures inside the container.
  6. Forcing multiprocessing.set_start_method("spawn", force=True):

    • Modified startup behavior to avoid daemonic process issues.
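A possible reason the start-method change has no effect: the daemon flag belongs to the process that is already running, and multiprocessing checks it in Process.start() regardless of the start method. The following is a small diagnostic sketch (the function name is hypothetical) that could be called from inside the request handler to confirm whether the worker process is daemonic.

```python
import multiprocessing


def describe_current_process():
    """Return the name, daemon flag, and start method of this process."""
    proc = multiprocessing.current_process()
    return {
        "name": proc.name,
        "daemon": proc.daemon,
        # allow_none=True avoids fixing the start method as a side effect
        "start_method": multiprocessing.get_start_method(allow_none=True),
    }


if __name__ == "__main__":
    print(describe_current_process())
```

If "daemon" comes back True inside the handler, any multiprocessing child started from that process will fail, whichever start method is configured.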

Request for Help

Since modifying BBOT’s internal multiprocessing behavior is not an option from my side, I wanted to check:

  1. Is there a recommended way to run BBOT in a containerized environment without hitting the daemon process restriction?
  2. Would it be possible to modify BBOT’s multiprocessing logic to allow for compatibility with daemonized processes?
  3. Has anyone successfully deployed BBOT in an IntelOwl-like architecture within a Docker container?

Any guidance or potential solutions would be greatly appreciated! If needed, I’d be happy to contribute to any changes that could help make BBOT more container-friendly.

You can check my workarounds on this PR

@TheTechromancer
Collaborator

@AnshSinghal thanks for the report. We haven't run into this issue before. Can you post your steps to reproduce?

@AnshSinghal
Author

Thanks for your response! Here are the exact steps to reproduce the issue in a Docker-based environment:

1. Environment Details:

  • Base Image: python:3.11-slim-bullseye
  • BBOT Version: Latest version installed via pip install bbot
  • Docker Runtime: Running BBOT inside a Docker container as part of an IntelOwl analyzer
  • Python Versions Tested: 3.9, 3.10, 3.11

2. Steps to Reproduce:

Step 1: Create a Dockerfile

FROM python:3.11-slim-bullseye  

# Environment variables  
ENV PROJECT_PATH=/opt/deploy/bbot  
ENV USER=bbot-user  
ENV HOME=${PROJECT_PATH}  
ENV BBOT_HOME=${PROJECT_PATH}  

# Create a non-root user  
RUN useradd -ms /bin/bash ${USER}  

# Install system dependencies  
RUN apt-get update && apt-get install -y --no-install-recommends \  
    build-essential libssl-dev libffi-dev cargo openssl \  
    libpq-dev curl unzip git make bash tar p7zip-full p7zip && \  
    apt-get clean && apt-get autoremove -y && \  
    rm -rf /var/lib/apt/lists/* /tmp/* /usr/share/doc/* /usr/share/man/*  

# Upgrade pip and install Python packages  
RUN pip install --no-cache-dir --upgrade pip && \  
    pip install --no-cache-dir quart hypercorn bbot  

# Pre-install BBOT dependencies  
RUN bbot --install-all-deps -y  

# Set up project directory  
WORKDIR ${PROJECT_PATH}  

# Copy application files  
COPY --chown=${USER}:${USER} app.py entrypoint.sh ./  

# Make scripts executable  
RUN chmod u+x entrypoint.sh app.py  

# Expose port  
EXPOSE 5000  

# Entrypoint  
ENTRYPOINT ["./entrypoint.sh"]

Step 2: Create app.py (BBOT API Server in Quart)

import asyncio
import logging
from quart import Quart, request, jsonify
from hypercorn.config import Config
from hypercorn.asyncio import serve
from bbot.scanner import Scanner, Preset


app = Quart(__name__)
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

@app.route("/run", methods=["POST"])
async def run_scan():
    data = await request.get_json()
    target = data.get("target")
    presets = data.get("presets", ["web-basic"])
    modules = data.get("modules", ["httpx"])

    if not target:
        return jsonify({"error": "No target provided"}), 400

    logger.info(f"Received scan request for target: {target}")
    logger.info(f"Using presets: {presets}")
    logger.info(f"Using modules: {modules}")

    # Construct the Preset, excluding problematic modules.
    scan_preset = Preset(
        target,
        modules=modules,
        presets=presets,
        output_modules=["json"],
        exclude_modules=["ffuf_shortnames", "filedownload"]
    )
    scanner = Scanner(preset=scan_preset)
    
    try:
        results = []
        async for event in scanner.async_start():
            results.append(event)
        return jsonify({"results": results})
    except Exception as e:
        logger.error(f"BBOT scan failed: {str(e)}")
        return jsonify({"error": f"Scan failed: {str(e)}"}), 500
    finally:
        await scanner.stop()
        del scanner

if __name__ == "__main__":
    config = Config()
    config.bind = ["0.0.0.0:5000"]
    asyncio.run(serve(app, config))

Step 3: Build and Run Docker Container

docker build -t bbot_test .
docker run --rm -it -p 5000:5000 bbot_test

Step 4: Make an API Request to Start a Scan

curl -X POST http://localhost:5000/run -H "Content-Type: application/json" \
     -d '{"target": "example.com", "presets": ["web-basic"], "modules": ["httpx"]}'

3. Expected vs Actual Behavior

Expected:

BBOT should run inside the container, execute the scan, and return results successfully.

Actual:

BBOT scan fails with the following error:

[ERRR] Exception on request POST /run  
bbot_analyzer-1 | Traceback (most recent call last):  
bbot_analyzer-1 | File "/usr/local/lib/python3.11/site-packages/quart/app.py", line 1464, in handle_request  
bbot_analyzer-1 | return await self.full_dispatch_request(request_context)  
bbot_analyzer-1 | File "/usr/local/lib/python3.11/site-packages/bbot/scanner/modules/module.py", line 118, in start  
bbot_analyzer-1 | multiprocessing.process.py:118:start(): daemonic processes are not allowed to have children  

4. Key Findings:

  • BBOT’s internal modules (like httpx, sslcert, baddns, etc.) are being spawned as child processes inside a daemonized parent.
  • In Python, daemonic processes are not allowed to have children, leading to a failure when BBOT tries to fork subprocesses.
  • The issue persists even after:
    • Monkey-patching multiprocessing.Process.__init__ to force daemon=False.
    • Using multiprocessing.set_start_method("spawn", force=True).
    • Disabling problematic BBOT modules (ffuf_shortnames, filedownload).
    • Running BBOT using subprocess.Popen() instead of direct execution.

5. Possible Solutions & Request for Help

  • How can BBOT be modified to work inside a containerized environment without triggering this issue?
  • Would modifying BBOT’s multiprocessing logic to allow non-daemon processes be an option?
  • Has anyone successfully deployed BBOT inside Docker without facing this limitation?

Any insights or guidance would be greatly appreciated! Thanks in advance. 🙌

@TheTechromancer
Collaborator

TheTechromancer commented Mar 14, 2025

Ah I see. It seems like the main issue is that Quart's processes are daemonized. Normally, BBOT's main process isn't a daemon so this isn't an issue. But of course BBOT needs to spawn processes in order to work properly.

We regularly use BBOT inside Docker. We also have tests that run BBOT scans from inside a FastAPI endpoint similar to what you have above. So I'm pretty confident this is specific to Quart.

@TheTechromancer
Collaborator

TheTechromancer commented Mar 14, 2025

The simplest solution would be to call BBOT's CLI (e.g. with --json) which would offload the scan into its own process, instead of running inside the Quart process.

EDIT: I guess you've already tried that. Maybe try asyncio.create_subprocess_exec instead of subprocess.Popen? Still though, that may fail unless you can find a way to undaemonize Quart's worker processes.

EDIT 2: It's hypercorn. It's a known bug and is about to get fixed.
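As a rough illustration of the CLI approach with asyncio.create_subprocess_exec, here is a minimal sketch. It assumes the command prints one JSON object per line on stdout. The function name and the stand-in demo command are hypothetical, and whether this actually avoids the error under hypercorn is not confirmed in this thread.

```python
import asyncio
import json
import sys


async def run_and_collect_json(cmd):
    """Run cmd in its own OS process and parse each stdout line as JSON."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()
    if proc.returncode != 0:
        detail = stderr.decode(errors="replace")
        raise RuntimeError(f"command failed ({proc.returncode}): {detail}")
    events = []
    for line in stdout.decode().splitlines():
        line = line.strip()
        if line:
            events.append(json.loads(line))
    return events


# Stand-in command so the sketch runs anywhere. Replace it with the real
# BBOT invocation (the thread mentions --json); no BBOT flags are assumed here.
DEMO_CMD = [
    sys.executable,
    "-c",
    "import json; print(json.dumps({'type': 'SCAN', 'data': 'demo'}))",
]

if __name__ == "__main__":
    print(asyncio.run(run_and_collect_json(DEMO_CMD)))
```

The scan then lives in a separate process with its own non-daemonic main process, and the web handler only awaits the result.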

@AnshSinghal
Author


Thanks for your response!

Firstly, I realized I had shared the wrong app.py earlier. Below is the correct implementation I’m using:

import asyncio
from bbot.scanner import Scanner
from hypercorn.asyncio import serve
from hypercorn.config import Config
from quart import Quart, jsonify, request

app = Quart(__name__)

@app.route("/run", methods=["POST"])
async def run_scan():
    data = await request.get_json()
    target = data.get("target")
    presets = data.get("presets", ["web-basic"])
    modules = data.get("modules", ["httpx"])

    if not target:
        return jsonify({"error": "No target provided"}), 400

    # Pass a configuration override to disable problematic modules
    scanner = Scanner(
        target,
        modules=modules,
        presets=presets,
        output_modules=["json"],
        config={
            "modules": {
                "ffuf_shortnames": {"enabled": False},
                "filedownload": {"enabled": False}
            }
        }
    )

    results = []
    async for event in scanner.async_start():
        results.append(event)
    return {"results": results}

if __name__ == "__main__":
    config = Config()
    config.bind = ["0.0.0.0:5000"]
    asyncio.run(serve(app, config))

I originally tried running BBOT inside Flask, but I encountered another issue—Flask would terminate the process without any errors. My assumption is that Flask’s synchronous request handling struggles with the async execution of BBOT’s scanner, leading to process termination before the scan completes.

Regarding using subprocess or BBOT’s CLI directly, I explored that as well. However, in IntelOwl, analyzers need to return structured JSON responses with detailed scan results. Since BBOT's CLI runs as a separate process, capturing and handling its output properly inside IntelOwl would require extra layers of parsing, error handling, and state management.
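To give a sense of the parsing layer involved, here is a minimal sketch of turning line-delimited output into one structured response. It assumes each event is a JSON object on its own line and that events carry a "type" field, which is an assumption about the output format. The function name is hypothetical.

```python
import json
from collections import Counter


def build_response(raw_output):
    """Split raw CLI output into parsed events and lines that are not JSON."""
    events, unparsed = [], []
    for line in raw_output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            unparsed.append(line)  # e.g. log lines mixed into the output
            continue
        if isinstance(obj, dict):
            events.append(obj)
        else:
            unparsed.append(line)
    counts = Counter(event.get("type", "UNKNOWN") for event in events)
    return {"results": events, "counts": dict(counts), "unparsed": unparsed}


if __name__ == "__main__":
    sample = '{"type": "DNS_NAME", "data": "example.com"}\n[INFO] noise\n'
    print(build_response(sample))
```

Lines that fail to parse are kept in "unparsed" rather than dropped, so the analyzer can still report them.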

Would you be able to point me to specific code examples of BBOT being run inside Flask? If there’s a way to make it work, I’d be happy to test that approach. I checked the issue that you provided and it asks me to use Uvicorn.

@TheTechromancer
Collaborator

TheTechromancer commented Mar 14, 2025

There shouldn't be anything special required to run BBOT inside flask. That's definitely a strange error so if you can post the steps to reproduce, I'll take a look.

On another note, I saw you excluded a couple modules. Are they misbehaving? If so we should make issues for them so they can get fixed.

@AnshSinghal
Author

Hey! Thanks for your response. Above are the files to reproduce. Quart is not working, which is the same issue as in #191.
And yes, there is no error with the above modules. They were asking for sudo permissions, so I disabled them temporarily.

@AnshSinghal
Author

Also @TheTechromancer I think there is some problem with the iis_shortnames module:

[INFO] flirtatious_fox: Modules running (incoming:processing:outgoing) iis_shortnames(0:1:0)
INFO:bbot.scanner:flirtatious_fox: No events in queue (0 processed in the past 15 seconds)
[INFO] flirtatious_fox: Events produced so far: DNS_NAME: 56, URL: 38, OPEN_TCP_PORT: 32, STORAGE_BUCKET: 6, TECHNOLOGY: 5, SCAN: 1, ORG_STUB: 1, VULNERABILITY: 1
[INFO] flirtatious_fox: No events in queue (0 processed in the past 15 seconds)
VERBOSE:bbot.modules.iis_shortnames:node_count: 45 for node: https://www.youtube.com/
[VERB] iis_shortnames: node_count: 45 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 46 for node: https://www.youtube.com/
[VERB] iis_shortnames: node_count: 46 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 47 for node: https://www.youtube.com/
[VERB] iis_shortnames: node_count: 47 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 48 for node: https://www.youtube.com/
[VERB] iis_shortnames: node_count: 48 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 49 for node: https://www.youtube.com/
[VERB] iis_shortnames: node_count: 49 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 50 for node: https://www.youtube.com/
[VERB] iis_shortnames: node_count: 50 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 51 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
[VERB] iis_shortnames: node_count: 51 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 51 for node: https://www.youtube.com/
[VERB] iis_shortnames: iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
VERBOSE:bbot.modules.iis_shortnames:iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
[VERB] iis_shortnames: node_count: 51 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 50 for node: https://www.youtube.com/
[VERB] iis_shortnames: iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
[VERB] iis_shortnames: node_count: 50 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 51 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
[VERB] iis_shortnames: node_count: 51 for node: https://www.youtube.com/
[VERB] iis_shortnames: iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
VERBOSE:bbot.modules.iis_shortnames:node_count: 51 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
[VERB] iis_shortnames: node_count: 51 for node: https://www.youtube.com/
[VERB] iis_shortnames: iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
VERBOSE:bbot.modules.iis_shortnames:node_count: 49 for node: https://www.youtube.com/
[VERB] iis_shortnames: node_count: 49 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 50 for node: https://www.youtube.com/
[VERB] iis_shortnames: node_count: 50 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 51 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
[VERB] iis_shortnames: node_count: 51 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 51 for node: https://www.youtube.com/
[VERB] iis_shortnames: iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
VERBOSE:bbot.modules.iis_shortnames:iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
[VERB] iis_shortnames: node_count: 51 for node: https://www.youtube.com/
[VERB] iis_shortnames: iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
VERBOSE:bbot.modules.iis_shortnames:node_count: 50 for node: https://www.youtube.com/
[VERB] iis_shortnames: node_count: 50 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:node_count: 51 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
[VERB] iis_shortnames: node_count: 51 for node: https://www.youtube.com/
[VERB] iis_shortnames: iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
VERBOSE:bbot.modules.iis_shortnames:node_count: 51 for node: https://www.youtube.com/
VERBOSE:bbot.modules.iis_shortnames:iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
[VERB] iis_shortnames: node_count: 51 for node: https://www.youtube.com/
[VERB] iis_shortnames: iis_shortnames: max_node_count (50) exceeded for node: https://www.youtube.com/. Affected branch will be terminated.
VERBOSE:bbot.modules.iis_shortnames:node_count: 48 for node: https://www.youtube.com/
[VERB] iis_shortnames: node_count: 48 for node: https://www.youtube.com/
