An agentic system that combines technical infrastructure data from the Internet Yellow Pages (IYP) Neo4j database with real-time web research to generate comprehensive policy recommendations for improving national internet resilience. The engine uses a two-phase AI approach: investigative research followed by strategic synthesis.
The Internet Resilience Index (IRI) measures a country's internet resilience across four pillars, but understanding why a score is low and what to do about it requires both technical analysis and policy context. This project automates that process by:
- Querying infrastructure and network topology data from Neo4j
- Conducting web research for policy context, recent events, and regulations
- Synthesizing both data streams into actionable strategic reports
- Providing prioritized recommendations with implementation roadmaps
request_for_YPI/
├── generate_report.py # Main orchestrator (two-phase agentic workflow)
├── src/
│ ├── agents/
│ │ └── graph.py # LangGraph agent definition with tool routing
│ ├── tools/
│ │ ├── google.py # Google Custom Search integration
│ │ ├── scraper.py # Web page and PDF content extraction
│ │ └── neo4j.py # Database query execution tool
│ └── utils/
│ ├── llm.py # LLM configuration (fast/smart/reasoning modes)
│ ├── formatting.py # Neo4j result formatting with Jinja2
│ ├── loaders.py # File and YAML loaders
│ └── pdf_extractor.py # PDF text extraction with PyMuPDF
├── prompt/
│ ├── render_document_thinking.txt # Expert system prompt for reasoning phase
│ └── render_document_based.txt # Legacy prompt template
├── infrastructure/ # Pillar 1: IXPs, data centers
├── preparation_marche/ # Pillar 2: Peering, competition, domains
├── performance/ # Pillar 3: Speed metrics
└── securite/ # Pillar 4: MANRS, IPv6, DNSSEC, DDoS
Each indicator directory contains:
- *.cypher - Progressive queries building complete analysis
- *.md - Technical documentation and analysis plans
- query_templates.yaml - Jinja2 templates for formatting Neo4j results
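Because the queries are numbered and meant to run in order, a loader has to sort them numerically rather than alphabetically. The sketch below shows one way to do that with the standard library. The function name `list_progressive_queries` is hypothetical and is not taken from the project's `loaders.py`.

```python
import re
from pathlib import Path


def list_progressive_queries(indicator_dir):
    """Return the numbered .cypher files of an indicator directory in run order.

    Hypothetical helper: assumes files are named 1.cypher, 2.cypher, and so on.
    """
    numbered = []
    for path in Path(indicator_dir).glob("*.cypher"):
        match = re.fullmatch(r"(\d+)", path.stem)
        if match:  # skip files whose name is not a plain number
            numbered.append((int(match.group(1)), path))
    # Sort on the integer so that 10.cypher comes after 2.cypher, not before it.
    return [path for _, path in sorted(numbered)]
```

A plain `sorted()` on the file names would place `10.cypher` before `2.cypher`, which would break the progressive order once an indicator has ten or more queries.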
Uses LangGraph to orchestrate an autonomous agent that:
- Executes Neo4j queries via the run_infrastructure_query tool
- Searches the web for policy documents, news, and regulations via search_google
- Reads and extracts content from web pages and PDFs via read_web_page
- Decides autonomously which tools to use and when to stop researching
Model: Configurable (fast=Mistral Small, smart=Mistral Large)
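The research loop described above can be illustrated in plain Python. This is a simplified stand-in, not the project's LangGraph graph: the three tools are stubs that only echo their input, and `decide` is a hypothetical callback playing the role of the LLM that picks the next tool or stops.

```python
from typing import Callable, Dict, List, Optional, Tuple


# Stub tools standing in for the real ones; each takes and returns a string.
def run_infrastructure_query(query: str) -> str:
    return f"[neo4j rows for: {query}]"


def search_google(query: str) -> str:
    return f"[search results for: {query}]"


def read_web_page(url: str) -> str:
    return f"[page text of: {url}]"


TOOLS: Dict[str, Callable[[str], str]] = {
    "run_infrastructure_query": run_infrastructure_query,
    "search_google": search_google,
    "read_web_page": read_web_page,
}

# A decision is (tool name, argument), or None when research should stop.
Decision = Optional[Tuple[str, str]]


def research(decide: Callable[[List[str]], Decision], max_steps: int = 10) -> List[str]:
    """Call tools chosen by `decide` until it returns None or max_steps is hit."""
    history: List[str] = []
    for _ in range(max_steps):
        decision = decide(history)
        if decision is None:  # the agent judges it has gathered enough
            break
        tool_name, argument = decision
        history.append(TOOLS[tool_name](argument))
    return history


def scripted_policy(history: List[str]) -> Decision:
    """Toy policy: follow a fixed three-step plan, then stop."""
    plan = [
        ("run_infrastructure_query", "ixp_coverage/1.cypher"),
        ("search_google", "IXP policy France"),
        ("read_web_page", "https://example.org/report"),
    ]
    return plan[len(history)] if len(history) < len(plan) else None
```

Calling `research(scripted_policy)` returns three observations, one per tool. The `max_steps` cap is an added safeguard in this sketch so that a policy that never stops cannot loop forever.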
Uses a reasoning model to:
- Analyze the complete investigation history
- Correlate quantitative metrics with qualitative context
- Perform root cause analysis linking technical issues to policy gaps
- Generate comprehensive reports following expert prompt structure
- Provide prioritized, actionable recommendations
Model: Magistral (Mistral's reasoning model)
Fully Supported Indicators:
- IXP Coverage, Peering Efficiency, Domain Analysis
- Market Competition (HHI), Transit Dependency
- MANRS Adoption, IPv6 Deployment, DNSSEC Analysis
- DDoS Protection (CDN presence)
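For reference, the Herfindahl-Hirschman Index (HHI) used for market competition is the sum of squared market shares. The sketch below computes it from shares given in percent. How the project derives each operator's share from IYP data is not stated here, so the input list is purely illustrative.

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman Index on the 0-10000 scale.

    `shares_percent` holds one market share per operator. Shares are
    rescaled to sum to 100 before squaring, so raw counts also work.
    """
    total = sum(shares_percent)
    if total <= 0:
        raise ValueError("shares must sum to a positive value")
    return sum((100.0 * share / total) ** 2 for share in shares_percent)
```

With shares of 50, 30, and 20 percent the index is 2500 + 900 + 400 = 3800, while a monopoly scores 10000 and four equal operators score 2500.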
Not Supported:
- Performance metrics (requires Ookla data)
- HTTPS adoption (requires certificate data)
- Economic indicators (requires external pricing data)
pip install -r requirements.txt
- Use Internet Society's Public Instance:
URI = 'neo4j://iyp-bolt.ihr.live:7687'
AUTH = None
Create a .env file:
MISTRAL_API_KEY=your_key
GOOGLE_API_KEY=your_key
GOOGLE_CX_ID=your_key
LANGCHAIN_API_KEY=your_key
LANGCHAIN_PROJECT=your_project_name
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_TRACING_V2="True"
Mistral AI: Create an account at https://console.mistral.ai and generate an API key from the dashboard. Requires a payment method for production use.
Google Custom Search: Visit https://console.cloud.google.com to create a project, enable the Custom Search API, and generate credentials. Then create a Custom Search Engine at https://programmablesearchengine.google.com to obtain your CX ID.
LangSmith (Optional): Sign up at https://smith.langchain.com for conversation tracing and debugging. Free tier available for development.
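Before a first run it can help to check that the .env file defines the keys the tools need. The sketch below parses KEY=value text with the standard library and reports what is missing. Treating the Mistral and Google keys as required and the LangSmith keys as optional is an assumption based on the descriptions above, and the project itself may load the file differently.

```python
# Assumption: these three keys are mandatory; the LANGCHAIN_* keys are optional.
REQUIRED_KEYS = ("MISTRAL_API_KEY", "GOOGLE_API_KEY", "GOOGLE_CX_ID")


def parse_env_text(text):
    """Parse KEY=value lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes such as in LANGCHAIN_TRACING_V2="True".
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, in declaration order."""
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(parse_env_text(open(".env").read()))` returns an empty list when all three required keys are set.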
Generate a comprehensive report for an indicator:
python request_for_YPI/generate_report.py infrastructure/ixp_coverage --country=FR --mode=smart
Parameters:
- indicator_input: Partial or full path to indicator folder
- --country: ISO country code (default: FR)
- --domain: Domain name for analysis (default: gouv.fr)
- --asn: AS number (default: 16276)
- --mode: Research phase model - 'fast' or 'smart' (default: smart)
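The parameters listed above map naturally onto `argparse`. The following is a sketch of such a parser with the documented defaults, not the actual code of `generate_report.py`. Parsing `--asn` as an integer and restricting `--mode` with `choices` are assumptions of this example.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command-line parameters."""
    parser = argparse.ArgumentParser(
        description="Generate a resilience report for one indicator."
    )
    parser.add_argument(
        "indicator_input", help="Partial or full path to indicator folder"
    )
    parser.add_argument("--country", default="FR", help="ISO country code")
    parser.add_argument("--domain", default="gouv.fr", help="Domain name for analysis")
    # Assumption: the AS number is handled as an integer.
    parser.add_argument("--asn", type=int, default=16276, help="AS number")
    parser.add_argument(
        "--mode",
        choices=("fast", "smart"),
        default="smart",
        help="Research phase model",
    )
    return parser
```

Calling `build_parser().parse_args(["infrastructure/ixp_coverage", "--country=FR"])` yields a namespace whose `mode`, `domain`, and `asn` fall back to the defaults.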
Test a single query:
python testfiles/request_testing.py request_for_YPI/securite/hygiene_routage/score_manrs/1.cypher --country=FR
Test query with LLM formatting preview:
python testfiles/run_query.py request_for_YPI/preparation_marche/localisation_trafic/efficacite_peering/1.cypher --country=FR
Run all queries validation suite:
python testfiles/unit_test_request.py
Generated reports include:
- Executive Summary - Current state, key findings, resilience assessment
- Detailed Technical Analysis - Quantitative metrics with qualitative context
- Risk Assessment Matrix - Technical, operational, and strategic risks
- Strategic Recommendations Framework
  - Short-term (0-12 months)
  - Medium-term (1-3 years)
  - Long-term (3-5 years)
  - Each with: complexity, cost, impact, stakeholders, KPIs
- Prioritization Framework - Quick wins vs strategic investments
- Implementation Roadmap - Quarterly breakdown with dependencies
- Measurement & Monitoring Framework - KPIs and review schedule
- Risk Mitigation & Contingency Planning
- Funding Strategy
- International Best Practices & Case Studies
- IRI: Internet Resilience Index - Composite score measuring resilience across Infrastructure, Market Readiness, Performance, and Security pillars
- IYP: Internet Yellow Pages - Graph database mapping global internet topology, peering relationships, and routing security
- Agentic Workflow: AI agent autonomously decides which data sources to query and when sufficient information has been gathered
- Two-Phase Architecture: Separate research phase (breadth) from synthesis phase (depth)
- Query Pattern: Overview → Metrics → Gaps → Recommendations
The system supports three operational modes:
- fast: Mistral Small - Quick processing for web scraping and summarization
- smart: Mistral Large - Balanced performance for research and analysis
- reasoning: Magistral - Advanced reasoning for strategic synthesis (Phase 2 only)
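One simple way to express the three modes is a lookup table with a guard for the Phase 2 restriction. The model identifiers below are assumptions chosen for illustration and may differ from what the project's `llm.py` configures. The function name `model_for` is hypothetical as well.

```python
# Assumed model identifiers; check the Mistral documentation for current names.
MODE_TO_MODEL = {
    "fast": "mistral-small-latest",
    "smart": "mistral-large-latest",
    "reasoning": "magistral-medium-latest",
}


def model_for(mode, phase):
    """Return the model name for a mode, enforcing that reasoning is Phase 2 only."""
    if mode not in MODE_TO_MODEL:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "reasoning" and phase != 2:
        raise ValueError("reasoning mode is reserved for Phase 2")
    # Whether fast/smart are also allowed in Phase 2 is left open in this sketch.
    return MODE_TO_MODEL[mode]
```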
Adding new indicators:
- Create directory under appropriate pillar
- Write progressive .cypher queries (1.cypher, 2.cypher, etc.)
- Document analysis approach in .md file
- Add formatting templates in query_templates.yaml
- Test with validation suite: python testfiles/unit_test_request.py
Reports are saved in Markdown format in the indicator directory:
report_{indicator_name}_countryCode-{CC}_domainName-{domain}_hostingASN-{asn}.md
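As a worked example of the naming pattern, the helper below fills it in with an f-string. The function name `report_filename` is hypothetical. Only the pattern itself comes from the line above.

```python
def report_filename(indicator_name, country, domain, asn):
    """Fill in the documented report file name pattern."""
    return (
        f"report_{indicator_name}"
        f"_countryCode-{country}"
        f"_domainName-{domain}"
        f"_hostingASN-{asn}.md"
    )
```

With the default parameters, `report_filename("ixp_coverage", "FR", "gouv.fr", 16276)` gives `report_ixp_coverage_countryCode-FR_domainName-gouv.fr_hostingASN-16276.md`.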
Core libraries:
- neo4j - Database connectivity
- langchain / langgraph - Agentic framework
- langchain-mistralai - LLM integration
- trafilatura - Web content extraction
- PyMuPDF - PDF text extraction
- Jinja2 / PyYAML - Template rendering
- requests / beautifulsoup4 - HTTP and parsing
Active Development | Version 0.3 | Last Updated: January 2026