The Jaguar Conservation Agent is a Semantic Graph RAG (Retrieval-Augmented Generation) system that combines OpenAI GPT with RDF knowledge graphs to provide intelligent conservation insights. This system demonstrates:
- Conversational AI using Microsoft Agent Framework DevUI that queries knowledge graphs with SPARQL
- Ontology-Aware Knowledge Extraction that transforms unstructured text into structured RDF data
This is true semantic Graph RAG using formal ontologies (RDFS/OWL), not labeled property graphs (LPG). The system showcases why ontologies are essential for intelligent knowledge representation and extraction.
- Knowledge Graph Integration: Direct SPARQL queries against GraphDB triple store
- LLM-Driven Query Generation: AI automatically generates SPARQL from natural language
- Hybrid Intelligence: Combines structured graph data with LLM reasoning
- Real-time Data Retrieval: Live queries against jaguar conservation database
- Context-Aware Responses: Maintains conversation context across queries
- Semantic Entity Disambiguation: Distinguishes wildlife jaguars from cars and guitars
- Concept Understanding: Uses formal ontologies to understand domain context
- Automated RDF Generation: Creates valid Turtle syntax aligned with ontology
- Relationship Inference: Discovers implicit connections between entities
- Zero Post-Processing: Extracts clean, structured data without manual cleanup
- Name: JaguarQueryAgent
- Role: Graph RAG specialist for jaguar conservation
- Domain: Jaguar population, conservation, habitats, threats
- Architecture: Microsoft Agent Framework with DevUI integration
- SPARQL Query Generation: Convert natural language to SPARQL queries
- Graph Data Interpretation: Understand complex ontology structures
- Natural Language Responses: Generate human-readable answers from graph data
- Conversation Context: Maintain chat history and context
- Markdown Formatting: Rich text responses with code blocks and formatting
```
┌─────────────────────────────────────────┐
│        Microsoft Agent Framework        │
│   ┌─────────────────────────────────┐   │
│   │          DevUI Server           │   │
│   │ - Auto-opening Browser          │   │
│   │ - Interactive Chat Interface    │   │
│   │ - Built-in Debugging            │   │
│   │ - Real-time Responses           │   │
│   └─────────────────────────────────┘   │
└────────────────────┬────────────────────┘
                     │
┌────────────────────▼────────────────────┐
│             Agent Framework             │
│   ┌─────────────────────────────────┐   │
│   │       Jaguar Query Agent        │   │
│   │ ┌─────────────────────────────┐ │   │
│   │ │        System Prompt        │ │   │
│   │ │ - Graph RAG Context         │ │   │
│   │ │ - SPARQL Guidelines         │ │   │
│   │ │ - Response Formatting       │ │   │
│   │ └─────────────────────────────┘ │   │
│   │                │                │   │
│   │ ┌─────────────────────────────┐ │   │
│   │ │        OpenAI Client        │ │   │
│   │ │ - GPT-4 Integration         │ │   │
│   │ │ - Function Calling          │ │   │
│   │ │ - Thread Management         │ │   │
│   │ └─────────────────────────────┘ │   │
│   │                │                │   │
│   │ ┌─────────────────────────────┐ │   │
│   │ │        GraphDB Tool         │ │   │
│   │ │ - SPARQL Execution          │ │   │
│   │ │ - Query Validation          │ │   │
│   │ │ - Result Processing         │ │   │
│   │ └─────────────────────────────┘ │   │
│   └─────────────────────────────────┘   │
└────────────────────┬────────────────────┘
                     │
┌────────────────────▼────────────────────┐
│            External Systems             │
│  ┌───────────────┐   ┌───────────────┐  │
│  │  OpenAI API   │   │    GraphDB    │  │
│  │ - GPT-4       │   │ - RDF Store   │  │
│  │ - Responses   │   │ - SPARQL      │  │
│  └───────────────┘   └───────────────┘  │
└─────────────────────────────────────────┘
```
The agent uses a carefully crafted system prompt optimized for Graph RAG:
- Graph RAG Focus: Always use the GraphDB tool for jaguar-related queries
- SPARQL Generation: Convert natural language to valid SPARQL queries
- Ontology Awareness: Base queries on the provided jaguar ontology
- Response Formatting: Use markdown with code blocks for SPARQL
- Data Attribution: Always mention that information comes from the jaguar database
- Form simple queries first, add complexity only if needed
- Show the SPARQL query once in each response
- Use bold for emphasis when appropriate
- Use bullet points or numbered lists for multiple items
- Break up long responses into paragraphs
- Be concise but comprehensive in answers
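The guidelines above could be captured in a single prompt constant; a minimal sketch is shown below. The wording is illustrative, not the repository's actual prompt text.

```python
# Hypothetical sketch of the agent's system prompt; the actual prompt used in
# the repository may be worded differently.
SYSTEM_PROMPT = """You are JaguarQueryAgent, a Graph RAG specialist for jaguar conservation.

For every jaguar-related question, call the query_jaguar_database tool with a
valid SPARQL query based on the jaguar ontology. Form a simple query first and
add complexity only if needed. Show the SPARQL query once per response, inside
a code block. Format answers in markdown (bold for emphasis, bullet points or
numbered lists for multiple items), break long responses into paragraphs, and
always attribute the information to the jaguar database."""
```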
The agent's primary tool for knowledge graph retrieval:
Function Name: query_jaguar_database
Purpose: Execute SPARQL queries against the jaguar conservation knowledge graph
Parameters:
- sparql_query: the SPARQL query string generated by the LLM
Ontology Coverage:
- Classes: Jaguar, Habitat, Location, Threat, ConservationEffort, Organization
- Properties: hasGender, wasKilled, rescuedBy, facesThreat, hasLocation, etc.
- Relationships: Complex conservation data relationships
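A minimal sketch of what such a tool function might look like, assuming GraphDB's standard SPARQL endpoint (`/repositories/<id>`) and the default local URL; the repository's actual implementation may differ:

```python
import json
import urllib.parse
import urllib.request

GRAPHDB_URL = "http://localhost:7200"      # assumed default GraphDB port
REPOSITORY = "jaguar_conservation"         # assumed repository id

def query_jaguar_database(sparql_query: str) -> list[dict]:
    """Execute a SPARQL SELECT query against the jaguar knowledge graph."""
    data = urllib.parse.urlencode({"query": sparql_query}).encode()
    request = urllib.request.Request(
        f"{GRAPHDB_URL}/repositories/{REPOSITORY}",
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return flatten_results(json.load(response))

def flatten_results(results_json: dict) -> list[dict]:
    """Flatten SPARQL JSON results into simple {variable: value} rows."""
    return [
        {var: binding[var]["value"] for var in binding}
        for binding in results_json["results"]["bindings"]
    ]
```

The flattened rows are what the LLM ultimately interprets, so keeping them as plain dictionaries makes the tool output easy to serialize into the model's context.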
```sparql
# Count total jaguars in database
SELECT (COUNT(?jaguar) AS ?count) WHERE {
  ?jaguar a :Jaguar .
}

# Find jaguars by gender with labels
SELECT ?jaguar ?label ?gender WHERE {
  ?jaguar a :Jaguar .
  OPTIONAL { ?jaguar rdfs:label ?label . }
  OPTIONAL { ?jaguar :hasGender ?gender . }
}

# Find jaguars that were killed, with cause of death
SELECT ?jaguar ?label ?causeOfDeath WHERE {
  ?jaguar a :Jaguar .
  ?jaguar :wasKilled true .
  OPTIONAL { ?jaguar rdfs:label ?label . }
  OPTIONAL { ?jaguar :causeOfDeath ?causeOfDeath . }
}

# Find conservation efforts by organization
SELECT ?effort ?org ?description WHERE {
  ?effort a :ConservationEffort .
  ?effort :conductedBy ?org .
  ?effort rdfs:label ?description .
  ?org a :Organization .
}
```

```python
# main.py starts the DevUI server
from agent_framework.devui import serve
from src.agents.jaguar_query_agent import create_jaguar_query_agent

query_agent = create_jaguar_query_agent()
serve(entities=[query_agent], auto_open=True)
```

- Auto-opening Browser: DevUI launches at http://localhost:8000
- Interactive Chat: User types queries in the DevUI interface
- Real-time Responses: Immediate feedback and conversation flow
- LLM Analysis: OpenAI GPT analyzes the user's natural language query
- SPARQL Generation: AI generates appropriate SPARQL query for GraphDB
- Ontology Mapping: Maps user intent to jaguar ontology classes/properties
- Tool Call Detection: Agent Framework detects need for GraphDB tool
- SPARQL Execution: The query_jaguar_database tool executes the query against GraphDB
- Result Processing: Raw SPARQL results are processed and formatted
- LLM Interpretation: OpenAI GPT interprets GraphDB results
- Natural Language Response: Generates human-readable response
- Markdown Formatting: Applies formatting with code blocks for SPARQL
- Thread Management: Agent Framework maintains conversation context
- DevUI History: Built-in conversation history in DevUI interface
- State Persistence: Conversation state preserved across requests
```python
# DevUI handles all state management automatically;
# no manual state configuration is required.
serve(entities=[query_agent], auto_open=True)
```

- Agent Framework: Microsoft Agent Framework manages conversation context
- OpenAI Responses API: Server-side thread persistence
- DevUI Interface: Built-in conversation history and state management
- Agent Framework: Maintains conversation context for LLM
- DevUI: Built-in conversation history display
- Persistence: Thread state preserved across requests automatically
```
# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key
OPENAI_RESPONSES_MODEL_ID=gpt-4

# GraphDB Configuration
GRAPHDB_URL=http://localhost:7200
GRAPHDB_REPOSITORY=jaguar_conservation
```

```python
# Hardcoded in create_jaguar_query_agent()
settings = OpenAISettings(
    api_key=os.getenv("OPENAI_API_KEY", ""),
    model_id=os.getenv("OPENAI_RESPONSES_MODEL_ID", "gpt-4"),
)
```

The agent applies Graph RAG-specific formatting:
- SPARQL Display: Show generated query in code blocks
- Data Attribution: Always mention jaguar database source
- Markdown Formatting: Use bold, bullet points, code blocks
- Structured Responses: Break complex answers into paragraphs
- SPARQL Errors: Invalid query syntax, ontology mismatches
- GraphDB Errors: Connection failures, query timeouts
- LLM Errors: OpenAI API failures, rate limits
- Web Errors: HTTP request/response errors in the web layer
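DevUI surfaces errors automatically, but a tool wrapper can still translate low-level failures into categories the LLM can relay. A hypothetical sketch (the function name and message wording are illustrative, not from the repository):

```python
def run_query_safely(execute, sparql_query: str) -> dict:
    """Run a SPARQL query via `execute`, returning either the results or an
    error category plus message the agent can surface in its reply."""
    try:
        return {"ok": True, "results": execute(sparql_query)}
    except (ConnectionError, TimeoutError) as exc:
        return {"ok": False, "category": "GraphDB error", "message": str(exc)}
    except ValueError as exc:   # e.g. malformed SPARQL rejected by the store
        return {"ok": False, "category": "SPARQL error", "message": str(exc)}
    except Exception as exc:    # API failures, rate limits, anything else
        return {"ok": False, "category": "LLM/other error", "message": str(exc)}
```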
```python
# DevUI handles error display automatically.
# Errors are shown in the DevUI interface;
# no manual error handling is required.
```

```bash
# Start the DevUI application
python3 main.py
# DevUI automatically opens at http://localhost:8000
```

Try these Graph RAG queries in the DevUI interface:
Basic Queries:
- "How many jaguars are in the database?"
- "Show me all female jaguars"
- "What conservation efforts are being conducted?"
Complex Queries:
- "Find jaguars that were killed and their causes of death"
- "Which organizations are conducting conservation efforts?"
- "Show me jaguars by location and their monitoring dates"
```python
# Direct agent usage (if needed)
import asyncio

from src.agents.jaguar_query_agent import create_jaguar_query_agent

agent = create_jaguar_query_agent()
thread = agent.get_new_thread()

# Run a query
response = asyncio.run(agent.run(
    "How many jaguars are in the database?",
    thread=thread,
    store=True,
))
print(response.text)
```

The system includes a Jupyter notebook (text2knowledge.ipynb) that demonstrates ontology-driven extraction:
Challenge: Extract jaguar conservation data from a corpus containing:
- Wildlife jaguars (Panthera onca)
- Jaguar cars (E-Type, XK-E)
- Fender Jaguar guitars
Solution: By providing GPT-5 with the jaguar conservation ontology, the LLM:
- Understands the domain context from formal class definitions
- Disambiguates entities based on semantic structure
- Extracts only wildlife-related information
- Generates RDF Turtle aligned with the ontology
- Infers relationships between entities from context
1. Load Ontology (jaguar_ontology.ttl)
        ↓
2. Load Mixed-Content Corpus (jaguar_corpus.txt)
        ↓
3. GPT-5 Semantic Analysis
   - Understands domain from ontology classes/properties
   - Identifies relevant entities (wildlife jaguars only)
   - Maps relationships to ontology structure
        ↓
4. Generate RDF Turtle
   - Valid syntax
   - Aligned with ontology
   - Proper URIs and datatypes
        ↓
5. Import to GraphDB
   - Use "Import → Text snippet"
   - Data integrates seamlessly with existing graph
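The semantic-analysis step boils down to placing the ontology in front of the corpus in a single prompt. A hypothetical prompt builder is sketched below; the notebook's actual prompt wording may differ.

```python
def build_extraction_prompt(ontology_ttl: str, corpus: str) -> str:
    """Combine the ontology and the corpus into one ontology-aware
    extraction prompt for the LLM."""
    return (
        "You are a knowledge extraction assistant.\n"
        "Use ONLY the classes and properties defined in this ontology:\n\n"
        f"{ontology_ttl}\n\n"
        "Extract entities and relationships from the text below that belong to\n"
        "the ontology's domain, and output them as valid RDF Turtle. Ignore\n"
        "entities that merely share a name with ontology classes (for example,\n"
        "cars or guitars named Jaguar).\n\n"
        f"Text:\n{corpus}"
    )
```

Because the ontology precedes the corpus, the model sees the formal class and property definitions before it ever reads the ambiguous text, which is what enables the disambiguation described above.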
This CANNOT be done with LPG databases because:

❌ No Formal Semantics
- LPG labels are just strings ("Jaguar", "OCCURS_IN")
- No machine-readable domain definitions
- LLM has no semantic guidance

❌ No Class Hierarchies
- No RDFS/OWL inheritance
- No taxonomic structure
- No reasoning capabilities

❌ No Property Constraints
- No domain/range definitions
- No cardinality rules
- No validation mechanisms

✅ RDF/Ontologies Provide
- Formal class definitions (ont:Jaguar rdfs:subClassOf ont:Animal)
- Property semantics (ont:hasGender rdfs:domain ont:Jaguar)
- Hierarchical structure for LLM understanding
- Validation and reasoning capabilities
- W3C standards for interoperability
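As an illustration, axioms of the kind listed above might look like this in Turtle. This is a hypothetical fragment with an example namespace; the actual jaguar_ontology.ttl may use different URIs and additional axioms.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ont:  <http://example.org/jaguar-ontology#> .

ont:Animal a rdfs:Class .

ont:Jaguar a rdfs:Class ;
    rdfs:subClassOf ont:Animal ;
    rdfs:comment "The wildlife species Panthera onca, not the car or the guitar." .

ont:hasGender a rdf:Property ;
    rdfs:domain ont:Jaguar ;
    rdfs:range  xsd:string .
```

It is exactly this machine-readable structure (subclass axioms, domain/range constraints, human-readable comments) that gives the LLM the semantic guidance an LPG label cannot.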
Input Corpus (mixed content):

```
El Jefe is an adult, male jaguar that was seen in Arizona...
The Jaguar E-Type is a British sports car manufactured by Jaguar Cars...
The Fender Jaguar is an electric guitar characterized by...
```

Output RDF (wildlife only):

```turtle
:ElJefe a ont:Jaguar ;
    rdfs:label "El Jefe" ;
    ont:hasGender "Male" ;
    ont:occursIn :Arizona ;
    ont:monitoredByOrg :AZGFD .
```

No cars. No guitars. Just semantically relevant conservation data.
- Multi-Modal Queries: Combine text and image analysis
- Temporal Queries: Time-based conservation trend analysis
- Geospatial Queries: Location-based jaguar population mapping
- Predictive Analytics: Conservation outcome predictions
- Multi-Domain Ontologies: Extract from diverse sources
- Streaming Extraction: Real-time knowledge mining
- Active Learning: Improve extraction with user feedback
- Cross-Lingual Extraction: Process multilingual corpora
- Dynamic Ontology Updates: Real-time ontology evolution
- Federated Queries: Query multiple knowledge graphs
- Semantic Reasoning: Advanced SPARQL reasoning capabilities
- Graph Visualization: Interactive knowledge graph exploration
- Multi-Agent Workflows: Specialized agents for different tasks
- Streaming Responses: Real-time Graph RAG query processing
- Human-in-the-Loop: Query refinement and validation
- Memory Management: Long-term conversation summarization
- DevUI Interface: Test queries through the DevUI
- SPARQL Validation: Verify generated queries in GraphDB
- Response Quality: Check natural language response accuracy
```
# Test basic counting
"How many jaguars are in the database?"

# Test filtering
"Show me all male jaguars"

# Test relationships
"Which jaguars were rescued and by which organization?"

# Test complex queries
"Find conservation efforts in Arizona with their success rates"
```

- Simple Queries: 2-3 seconds (count, basic filters)
- Complex Queries: 3-5 seconds (joins, aggregations)
- No Tool Calls: 1-2 seconds (general conversation)
- Query Caching: Cache frequent SPARQL patterns
- Index Optimization: Optimize GraphDB indexes
- Response Streaming: Stream partial results
- Async Processing: Non-blocking GraphDB queries
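Query caching, for instance, could be sketched as a small memoizing wrapper around the SPARQL executor (the factory name is illustrative; results are assumed stable for the cache's lifetime):

```python
from functools import lru_cache

def make_cached_executor(execute, maxsize: int = 128):
    """Wrap a SPARQL executor so repeated identical queries hit a cache
    instead of GraphDB. `execute` takes a SPARQL string and returns rows."""
    @lru_cache(maxsize=maxsize)
    def cached(sparql_query: str):
        return execute(sparql_query)
    return cached
```

Because LLM-generated SPARQL for common questions ("How many jaguars...?") tends to repeat verbatim, even a simple exact-string cache can shave the GraphDB round trip off frequent queries.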
- Query Validation: Validate SPARQL syntax before execution
- Parameter Escaping: Properly escape user inputs
- GraphDB Security: Leverage GraphDB's built-in security
- Read-Only Access: Agent only reads from GraphDB
- No Data Storage: No user data persisted beyond conversation
- Secure APIs: Use HTTPS for all external communications
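Parameter escaping in this context means sanitizing any user-derived string before it is embedded in a SPARQL literal. A minimal sketch (function names are illustrative, not from the repository):

```python
def escape_sparql_literal(value: str) -> str:
    """Escape characters with special meaning inside a SPARQL string literal."""
    replacements = {
        "\\": "\\\\",   # backslash first, so later escapes are not doubled
        '"': '\\"',
        "\n": "\\n",
        "\r": "\\r",
    }
    for char, escaped in replacements.items():
        value = value.replace(char, escaped)
    return value

def label_filter_query(label: str) -> str:
    """Build a SELECT query filtering on a user-supplied label, safely quoted."""
    return (
        "SELECT ?jaguar WHERE { ?jaguar rdfs:label "
        f'"{escape_sparql_literal(label)}" . }}'
    )
```

Without this, a stray quote in user input could terminate the literal early and inject arbitrary query fragments; read-only access limits the blast radius, but escaping keeps the generated SPARQL valid in the first place.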