Agentic SOC Orchestrator (MCP + Gemini) with Streamlit UI

An agentic SOC investigation workflow that orchestrates a multi-stage incident pipeline across:

Telemetry (cases + events)
Threat Intel (IOC enrichment)
Knowledge / Runbooks
Response controls (governed, human-in-the-loop actions)
Gemini LLM Analyst Notes (executive summary, hypotheses, safe query suggestions)

This repo includes two telemetry modes:

Dummy mode (synthetic demo): deterministic, easy to reproduce
IoT-23 mode (real IoT telemetry): Zeek-derived network telemetry from the IoT-23 dataset (processed into this project’s event schema)

The LLM does not get to “run tools freely.” Gemini only produces advisory outputs. Tool calls remain allowlisted + governed.

What this project demonstrates

Agentic SOC workflow with a staged state machine: triage → investigate → validate → recommend → respond
MCP tool orchestration across telemetry, threat intel, runbooks, and response services
Governed actions + audit trails (human-in-the-loop, logged)
LLM-assisted reasoning (Gemini):
- Executive summary (“what happened”)
- Hypotheses (“plausible narratives”)
- Suggested next queries (advisory; validated/allowlisted; not blindly executed)
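
The staged workflow above can be pictured as a small, strictly linear state machine. The following is a minimal sketch, assuming each stage advances only to the next one and that respond is terminal; the names Stage, NEXT_STAGE, advance, and run_pipeline are illustrative and are not taken from planner.py.

```python
from enum import Enum


class Stage(str, Enum):
    TRIAGE = "triage"
    INVESTIGATE = "investigate"
    VALIDATE = "validate"
    RECOMMEND = "recommend"
    RESPOND = "respond"


# Assumption: each stage may only advance to the next one; RESPOND is terminal.
NEXT_STAGE = {
    Stage.TRIAGE: Stage.INVESTIGATE,
    Stage.INVESTIGATE: Stage.VALIDATE,
    Stage.VALIDATE: Stage.RECOMMEND,
    Stage.RECOMMEND: Stage.RESPOND,
}


def advance(current: Stage) -> Stage:
    """Return the stage that follows `current`, or raise if it is terminal."""
    if current not in NEXT_STAGE:
        raise ValueError(f"{current.value} is a terminal stage")
    return NEXT_STAGE[current]


def run_pipeline() -> list[str]:
    """Walk the full pipeline and return the visited stage names in order."""
    stage = Stage.TRIAGE
    visited = [stage.value]
    while stage in NEXT_STAGE:
        stage = advance(stage)
        visited.append(stage.value)
    return visited
```

Keeping the transitions in a plain lookup table makes every allowed move explicit, which is one way to keep the planner deterministic and easy to audit.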

Screenshots

Streamlit UI

Architecture (high level)

The Streamlit UI calls the FastAPI orchestrator, which orchestrates the MCP tools:

telemetry.get_case, telemetry.search_events (+ optional pivots)
threatintel.enrich / enrich_batch
knowledge.get_runbook
response.request_action(s) (governed + audited)
Gemini produces LLM Analysis, embedded into the final report
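
For orientation, a tool call sent by a JSON-RPC MCP client might look like the sketch below. It assumes the JSON-RPC 2.0 envelope and the "tools/call" method from the MCP specification; the services in this repo may use a different method name or parameter shape, and the case_id argument name is hypothetical.

```python
import itertools
import json

# Monotonic request ids so responses can be matched to requests.
_request_ids = itertools.count(1)


def build_tool_call(tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",  # assumption: MCP-spec method name
        "params": {"name": tool, "arguments": arguments},
    }


# "case_id" is a hypothetical argument name used only for illustration.
payload = build_tool_call("telemetry.get_case", {"case_id": "1002"})
print(json.dumps(payload, indent=2))
```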

Outputs:

JSON response (structured result)
Markdown incident report
Logs: logs/agent.jsonl
Audit trail: logs/audit.jsonl

Repository layout

services/
  agent_orchestrator/
    app.py               # FastAPI API
    agent.py              # Core agent pipeline + guardrails + Gemini integration
    planner.py            # Deterministic planner (tools + state transitions)
    report.py             # Markdown report generator (includes LLM section)
    mcp_client.py         # JSON-RPC MCP client + logging + audit
    llm_gemini.py         # Gemini wrapper
    logging_config.py     # JSON logging config
    audit.py              # AuditTrail JSONL writer
  mcp_telemetry/          # Telemetry MCP service (dummy or IoT-23 backend)
  mcp_threatintel/        # Threat intel MCP service (demo)
  mcp_knowledge/          # Runbooks MCP service (demo)
  mcp_response/           # Response MCP service (demo / governed actions)
frontend/
  streamlit_app.py        # Streamlit UI
data/
  cases.json              # Dummy cases (demo mode)
  telemetry.jsonl         # Dummy telemetry events (demo mode)
  threat_intel.json       # Demo threat intel mappings
  runbooks.json           # Demo runbooks
  iot23/
    processed/            # IoT-23 processed outputs (cases.json + telemetry.jsonl)
    raw/
Dockerfile
Dockerfile.streamlit
docker-compose.yml
requirements.txt

Telemetry modes

A) Dummy mode (default)

Uses:

data/cases.json
data/telemetry.jsonl

B) IoT-23 mode (real IoT telemetry)

Uses processed files (generated by running the preprocessing script scripts/iot23_build_processed.py):

data/iot23/processed/cases.json
data/iot23/processed/telemetry.jsonl

Enable IoT-23 mode with:

TELEMETRY_BACKEND=iot23

Prerequisites

Python 3.11+
(Recommended) Docker + Docker Compose
(Optional) Kubernetes (Docker Desktop Kubernetes / local cluster)

Environment setup (.env)

Create a .env file in the repo root:

# Gemini
GEMINI_API_KEY=YOUR_KEY_HERE
GEMINI_MODEL=gemini-2.5-flash
ENABLE_LLM=true

# Telemetry backend
# dummy | iot23
TELEMETRY_BACKEND=dummy

# MCP endpoints (local)
MCP_TELEMETRY_URL=http://127.0.0.1:8011
MCP_THREATINTEL_URL=http://127.0.0.1:8012
MCP_RESPONSE_URL=http://127.0.0.1:8013
MCP_KNOWLEDGE_URL=http://127.0.0.1:8014

Notes:

.env is intentionally not committed (see .gitignore).
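Docker Compose reads .env from the project directory automatically for variable substitution in docker-compose.yml; containers only receive the values if the service declares env_file: .env (or lists them under environment).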

Run locally (no Docker)

1) Install deps

python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate

pip install -r requirements.txt

2) Start MCP services (in separate terminals)

Start each service (example pattern):

python -m uvicorn services.mcp_telemetry.app:app --host 127.0.0.1 --port 8011 --reload
python -m uvicorn services.mcp_threatintel.app:app --host 127.0.0.1 --port 8012 --reload
python -m uvicorn services.mcp_response.app:app --host 127.0.0.1 --port 8013 --reload
python -m uvicorn services.mcp_knowledge.app:app --host 127.0.0.1 --port 8014 --reload

3) Start FastAPI orchestrator

python -m uvicorn services.agent_orchestrator.app:app --host 127.0.0.1 --port 8020 --reload

Health:

http://127.0.0.1:8020/health

4) Start Streamlit UI (new terminal)

streamlit run frontend/streamlit_app.py --server.port 8502

Open:

http://localhost:8502

Run with Docker Compose

This setup runs:

FastAPI orchestrator (port 8020)
Streamlit UI (port 8502)

1) Build + run

docker compose up --build

2) Verify health

curl http://localhost:8020/health

Run an investigation

Option A: FastAPI directly

curl "http://localhost:8020/investigate/1002?simulate_response=true&requested_by=analyst"
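
The same request can be made from Python with the standard library. This is a sketch under the assumption that the endpoint accepts GET with the two query parameters shown in the curl command and returns JSON; the helper names and the 120-second timeout are illustrative choices.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def investigate_url(base_url: str, case_id, simulate_response: bool = True,
                    requested_by: str = "analyst") -> str:
    """Build the /investigate/{case_id} URL with its query string."""
    query = urlencode({
        "simulate_response": str(simulate_response).lower(),
        "requested_by": requested_by,
    })
    return f"{base_url.rstrip('/')}/investigate/{quote(str(case_id), safe='')}?{query}"


def investigate(base_url: str, case_id, **kwargs) -> dict:
    """Call the orchestrator and return the parsed JSON response."""
    # Generous timeout (an assumption): the pipeline includes an LLM call.
    with urlopen(investigate_url(base_url, case_id, **kwargs), timeout=120) as resp:
        return json.load(resp)
```

With the orchestrator running, investigate("http://localhost:8020", 1002) returns the structured result as a dict.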

Option B: Streamlit UI

Open http://localhost:8502, enter a Case ID, click Investigate.

How to confirm Gemini is running

1) Look for log markers

Check logs/agent.jsonl for:

llm_call_start
llm_call_done (with ok: true)
llm_analysis_done

Example:

{"logger":"agent.core","msg":"llm_call_done","duration_ms":11246,"ok":true}
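
To check for these markers without reading the log by eye, a short script can scan the JSONL lines. This sketch assumes each line is a JSON object with a msg field, as in the example above; the helper names are illustrative.

```python
import json

MARKERS = ("llm_call_start", "llm_call_done", "llm_analysis_done")


def find_llm_markers(lines) -> dict[str, list[dict]]:
    """Group log records by LLM marker, skipping blank or non-JSON lines."""
    found: dict[str, list[dict]] = {marker: [] for marker in MARKERS}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and record.get("msg") in found:
            found[record["msg"]].append(record)
    return found


def gemini_succeeded(lines) -> bool:
    """True if at least one llm_call_done record reports ok: true."""
    return any(r.get("ok") is True for r in find_llm_markers(lines)["llm_call_done"])
```

Both functions accept any iterable of strings, so an open file handle on logs/agent.jsonl can be passed in directly.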

2) Confirm in API response

The /investigate/{case_id} response includes:

result.summary.llm_analysis

And the Markdown report contains:

## LLM Analysis (Gemini)

Logs and audit trail

Runtime + tool calls: logs/agent.jsonl
Tool calls + state transitions (audit): logs/audit.jsonl

If you run in Docker/Kubernetes and want logs persisted:

Use a volume mount for /app/logs (recommended for demos)
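
For readers unfamiliar with JSONL audit trails, the idea is one self-contained JSON object appended per line. The sketch below is illustrative only: the function name append_audit and the ts/event field names are assumptions and may not match audit.py.

```python
import json
import time
from pathlib import Path


def append_audit(path, event: str, **fields) -> dict:
    """Append one audit record as a single JSON line and return it."""
    record = {"ts": time.time(), "event": event, **fields}
    path = Path(path)
    # Create logs/ on first use so a fresh checkout or empty volume works.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record
```

Append-only writes keep earlier entries untouched, and one record per line lets standard line-oriented tools filter the trail.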

Guardrails / Safety model

Deterministic state machine workflow
Allowlisted tool calls only
Response actions are governed
- simulate_response=true allows simulation without real execution
- confidence threshold gates execution
Audit logs for every tool call + state transition
LLM outputs are advisory, not executable commands
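
The guardrails above can be sketched as two small checks. Everything here is a hedged illustration: the tool names come from this README, but the 0.7 threshold, the decision labels, the order of the checks, and the {"tool", "args"} suggestion shape are assumptions rather than the behavior of agent.py.

```python
ALLOWED_TOOLS = frozenset({
    "telemetry.get_case",
    "telemetry.search_events",
    "telemetry.pivot",
    "telemetry.pivot_domain",
    "threatintel.enrich",
    "threatintel.enrich_batch",
    "knowledge.get_runbook",
    "response.request_action",
})

CONFIDENCE_THRESHOLD = 0.7  # hypothetical value


def filter_suggestions(suggestions: list[dict]) -> list[dict]:
    """Keep only LLM-suggested queries that name an allowlisted tool
    and carry a dict of arguments; everything else is dropped."""
    return [
        s for s in suggestions
        if s.get("tool") in ALLOWED_TOOLS and isinstance(s.get("args"), dict)
    ]


def decide_action(confidence: float, simulate_response: bool) -> str:
    """Gate a response action on confidence, then on simulation mode."""
    if confidence < CONFIDENCE_THRESHOLD:
        return "skip"              # below threshold: no action is requested
    if simulate_response:
        return "simulate"          # logged, but nothing is executed
    return "request_approval"      # human-in-the-loop approval is required
```

The point of the design is that nothing the LLM emits is executed as-is: suggestions pass through the allowlist, and actions pass through the confidence gate and a human.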

IoT-23 dataset notes

This project can run on processed IoT-23 telemetry (Zeek logs transformed into the same event schema used by the agent).

The agent extracts IOCs (IPs/domains/hashes) from events
Pivot tools (telemetry.pivot, telemetry.pivot_domain) help explore related activity
Confidence can start low if only network connection logs are present; it improves with richer telemetry (DNS/HTTP/SSL, auth logs, process logs, EDR signals, etc.)
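
IOC extraction of this kind is often done with regular expressions. The sketch below is a simplified illustration, not the project's extractor: it covers IPv4 addresses, domain-like strings, and MD5/SHA-1/SHA-256 hex digests, and the domain pattern will also match file names such as report.pdf, so real use needs extra filtering.

```python
import re

_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(rf"\b(?:{_OCTET}\.){{3}}{_OCTET}\b")
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}\b",
    re.IGNORECASE,
)
# 64, 40, or 32 hex characters: SHA-256, SHA-1, or MD5.
HASH_RE = re.compile(r"\b(?:[a-f0-9]{64}|[a-f0-9]{40}|[a-f0-9]{32})\b", re.IGNORECASE)


def extract_iocs(text: str) -> dict[str, set[str]]:
    """Return the de-duplicated IPs, domains, and hashes found in `text`."""
    return {
        "ips": set(IPV4_RE.findall(text)),
        "domains": {d.lower() for d in DOMAIN_RE.findall(text)},
        "hashes": {h.lower() for h in HASH_RE.findall(text)},
    }
```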

Roadmap: next improvements

Already implemented / in-progress ideas:

Better IOC extraction for domains (URLs/emails + normalization)
Action deduplication to avoid repeated response actions
Strict validator for LLM suggested_next_queries (allowlist tools + args schema)

Top additional improvements:

Richer tagging + confidence calibration
Add heuristics that leverage IoT-23 labels (Benign vs Malicious) and connection features (rare ports, burst patterns, beaconing) to improve tags + confidence scores.
Pivot explorer + timeline UI
Add a timeline view (group by minute/hour) and an IOC pivot explorer to navigate events interactively in Streamlit.
Observability + metrics
Add Prometheus metrics for tool latency, error rate, LLM latency, and pipeline stage durations (great for Docker/Kubernetes demos).
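
As a starting point for the timeline idea, events can be bucketed by minute or hour before plotting. This sketch assumes each event is a dict with a ts field holding epoch seconds (as in Zeek logs); the project's event schema may name or encode the timestamp differently.

```python
from collections import Counter
from datetime import datetime, timezone

_FORMATS = {
    "minute": "%Y-%m-%dT%H:%MZ",
    "hour": "%Y-%m-%dT%H:00Z",
}


def bucket_events(events, granularity: str = "minute") -> dict[str, int]:
    """Count events per UTC minute or hour, ordered by time bucket."""
    if granularity not in _FORMATS:
        raise ValueError("granularity must be 'minute' or 'hour'")
    fmt = _FORMATS[granularity]
    counts = Counter(
        datetime.fromtimestamp(float(e["ts"]), tz=timezone.utc).strftime(fmt)
        for e in events
    )
    return dict(sorted(counts.items()))
```

The resulting mapping can be passed to a Streamlit chart once converted to a table.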

Common issues

Streamlit can’t reach FastAPI in Docker

Inside Docker, use the Compose service name as the hostname:

AGENT_ORCH_URL=http://agent_orchestrator:8020

Confirm from inside the container:

docker compose exec streamlit_ui python -c "import os,requests; u=os.getenv('AGENT_ORCH_URL'); print(u, requests.get(u+'/health').json())"

Gemini “API key missing” in Docker

Make sure:

.env exists in repo root
GEMINI_API_KEY is present
env_file: .env is set for the service in docker-compose.yml

License

This project is licensed under the MIT License; see LICENSE.txt for details.

Contributing

Contributions are welcome!

1) Fork the repository
2) Create a feature branch: git checkout -b feature/your-feature
3) Make your changes (include clear documentation / comments)
4) Run tests (if applicable): pytest
5) Submit a Pull Request with a detailed description of what you changed and why

Support

For questions or issues:

Check logs:
- logs/agent.jsonl (runtime + tool calls)
- logs/audit.jsonl (audit trail + state transitions)
Open a GitHub issue and include:
- steps to reproduce
- expected vs actual behavior
- relevant log lines (redact secrets)
- your environment (OS, Python version, Docker version)
