Automated Recommendation Engine for Enhancing Internet Resilience

Overview

An agentic system that combines technical infrastructure data from the Internet Yellow Pages (IYP) Neo4j database with real-time web research to generate comprehensive policy recommendations for improving national internet resilience. The engine uses a two-phase AI approach: investigative research followed by strategic synthesis.

Purpose

The Internet Resilience Index (IRI) measures a country's internet resilience across four pillars, but understanding why scores are low and what to do requires both technical analysis and contextual understanding. This project automates that process by:

Querying infrastructure and network topology data from Neo4j
Conducting web research for policy context, recent events, and regulations
Synthesizing both data streams into actionable strategic reports
Providing prioritized recommendations with implementation roadmaps

Architecture

request_for_YPI/
├── generate_report.py        # Main orchestrator (two-phase agentic workflow)
├── src/
│   ├── agents/
│   │   └── graph.py          # LangGraph agent definition with tool routing
│   ├── tools/
│   │   ├── google.py         # Google Custom Search integration
│   │   ├── scraper.py        # Web page and PDF content extraction
│   │   └── neo4j.py          # Database query execution tool
│   └── utils/
│       ├── llm.py            # LLM configuration (fast/smart/reasoning modes)
│       ├── formatting.py     # Neo4j result formatting with Jinja2
│       ├── loaders.py        # File and YAML loaders
│       └── pdf_extractor.py  # PDF text extraction with PyMuPDF
├── prompt/
│   ├── render_document_thinking.txt  # Expert system prompt for reasoning phase
│   └── render_document_based.txt     # Legacy prompt template
├── infrastructure/           # Pillar 1: IXPs, data centers
├── preparation_marche/       # Pillar 2: Peering, competition, domains
├── performance/              # Pillar 3: Speed metrics
└── securite/                 # Pillar 4: MANRS, IPv6, DNSSEC, DDoS

Each indicator directory contains:

*.cypher - Progressive queries building complete analysis
*.md - Technical documentation and analysis plans
query_templates.yaml - Jinja2 templates for formatting Neo4j results

Two-Phase Agentic Architecture

Phase 1: Investigation (Agentic Research)

Uses LangGraph to orchestrate an autonomous agent that:

Executes Neo4j queries via the run_infrastructure_query tool
Searches the web for policy documents, news, and regulations via search_google
Reads and extracts content from web pages and PDFs via read_web_page
Decides autonomously which tools to use and when to stop researching

Model: Configurable (fast=Mistral Small, smart=Mistral Large)

Phase 2: Strategic Synthesis (Reasoning Mode)

Uses a reasoning model to:

Analyze the complete investigation history
Correlate quantitative metrics with qualitative context
Perform root cause analysis linking technical issues to policy gaps
Generate comprehensive reports following expert prompt structure
Provide prioritized, actionable recommendations

Model: Magistral (Mistral's reasoning model)

IRI Coverage

Fully Supported Indicators:

IXP Coverage, Peering Efficiency, Domain Analysis
Market Competition (HHI), Transit Dependency
MANRS Adoption, IPv6 Deployment, DNSSEC Analysis
DDoS Protection (CDN presence)

Not Supported:

Performance metrics (requires Ookla data)
HTTPS adoption (requires certificate data)
Economic indicators (requires external pricing data)

Quick Start

Installation

pip install -r requirements.txt

Database Setup

Use Internet Society's Public Instance:

URI = 'neo4j://iyp-bolt.ihr.live:7687'
AUTH = None

API Keys Configuration

Create a .env file:

MISTRAL_API_KEY=your_key
GOOGLE_API_KEY=your_key
GOOGLE_CX_ID=your_key
LANGCHAIN_API_KEY=your_key
LANGCHAIN_PROJECT=your_key
LANGCHAIN_ENDPOINT=your_key
LANGCHAIN_TRACING_V2="True"

Obtaining API Keys

Mistral AI: Create an account at https://console.mistral.ai and generate an API key from the dashboard. Requires a payment method for production use.

Google Custom Search: Visit https://console.cloud.google.com to create a project, enable the Custom Search API, and generate credentials. Then create a Custom Search Engine at https://programmablesearchengine.google.com to obtain your CX ID.

LangSmith (Optional): Sign up at https://smith.langchain.com for conversation tracing and debugging. Free tier available for development.

Usage

Generate a comprehensive report for an indicator:

python request_for_YPI/generate_report.py infrastructure/ixp_coverage --country=FR --mode=smart

Parameters:

indicator_input: Partial or full path to indicator folder
--country: ISO country code (default: FR)
--domain: Domain name for analysis (default: gouv.fr)
--asn: AS number (default: 16276)
--mode: Research phase model - 'fast' or 'smart' (default: smart)

Testing Queries

Test a single query:

python testfiles/request_testing.py request_for_YPI/securite/hygiene_routage/score_manrs/1.cypher --country=FR

Test query with LLM formatting preview:

python testfiles/run_query.py request_for_YPI/preparation_marche/localisation_trafic/efficacite_peering/1.cypher --country=FR

Run all queries validation suite:

python testfiles/unit_test_request.py

Report Structure

Generated reports include:

Executive Summary - Current state, key findings, resilience assessment
Detailed Technical Analysis - Quantitative metrics with qualitative context
Risk Assessment Matrix - Technical, operational, and strategic risks
Strategic Recommendations Framework
- Short-term (0-12 months)
- Medium-term (1-3 years)
- Long-term (3-5 years)
- Each with: complexity, cost, impact, stakeholders, KPIs
Prioritization Framework - Quick wins vs strategic investments
Implementation Roadmap - Quarterly breakdown with dependencies
Measurement & Monitoring Framework - KPIs and review schedule
Risk Mitigation & Contingency Planning
Funding Strategy
International Best Practices & Case Studies

Key Concepts

IRI: Internet Resilience Index - Composite score measuring resilience across Infrastructure, Market Preparation, Performance, and Security pillars
IYP: Internet Yellow Pages - Graph database mapping global internet topology, peering relationships, and routing security
Agentic Workflow: AI agent autonomously decides which data sources to query and when sufficient information has been gathered
Two-Phase Architecture: Separate research phase (breadth) from synthesis phase (depth)
Query Pattern: Overview → Metrics → Gaps → Recommendations

LLM Modes

The system supports three operational modes:

fast: Mistral Small - Quick processing for web scraping and summarization
smart: Mistral Large - Balanced performance for research and analysis
reasoning: Magistral - Advanced reasoning for strategic synthesis (Phase 2 only)

Development

Adding new indicators:

Create directory under appropriate pillar
Write progressive .cypher queries (1.cypher, 2.cypher, etc.)
Document analysis approach in .md file
Add formatting templates in query_templates.yaml
Test with validation suite: python testfiles/unit_test_request.py

Output

Reports are saved in Markdown format in the indicator directory:

report_{indicator_name}_countryCode-{CC}_domainName-{domain}_hostingASN-{asn}.md

Dependencies

Core libraries:

neo4j - Database connectivity
langchain / langgraph - Agentic framework
langchain-mistralai - LLM integration
trafilatura - Web content extraction
PyMuPDF - PDF text extraction
Jinja2 / PyYAML - Template rendering
requests / beautifulsoup4 - HTTP and parsing

Status

Active Development | Version 0.3 | Last Updated: January 2026

Name		Name	Last commit message	Last commit date
Latest commit History 165 Commits
data		data
rapport		rapport
request_for_YPI		request_for_YPI
schema_rapport		schema_rapport
testfiles		testfiles
web		web
.gitignore		.gitignore
README.md		README.md
generate_llm_brief.py		generate_llm_brief.py
ipv6_report.py		ipv6_report.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Automated Recommendation Engine for Enhancing Internet Resilience

Overview

Purpose

Architecture

Two-Phase Agentic Architecture

Phase 1: Investigation (Agentic Research)

Phase 2: Strategic Synthesis (Reasoning Mode)

IRI Coverage

Quick Start

Installation

Database Setup

API Keys Configuration

Obtaining API Keys

Usage

Testing Queries

Report Structure

Key Concepts

LLM Modes

Development

Output

Dependencies

Status

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Automated Recommendation Engine for Enhancing Internet Resilience

Overview

Purpose

Architecture

Two-Phase Agentic Architecture

Phase 1: Investigation (Agentic Research)

Phase 2: Strategic Synthesis (Reasoning Mode)

IRI Coverage

Quick Start

Installation

Database Setup

API Keys Configuration

Obtaining API Keys

Usage

Testing Queries

Report Structure

Key Concepts

LLM Modes

Development

Output

Dependencies

Status

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages