ENTAERA is a multi-agent AI system with intelligent query routing across multiple AI providers. The system automatically selects the best AI agent for each query based on content analysis.
```bash
# Install dependencies
pip install aiohttp python-dotenv

# Configure environment
cp .env.example .env
# Edit .env with your API keys

# Run system check
python check.py

# Start the agent
python agent.py
```

The system integrates three AI providers:
- Ollama (Local) - llama3.1:8b model running locally
- Google Gemini - Gemini 2.5 Flash for creative tasks
- Perplexity - Sonar model for real-time web research
Queries are automatically routed to the optimal agent based on:
- Keyword analysis
- Context classification
- Priority scoring
| Agent | Provider | Specialization | Keywords |
|---|---|---|---|
| Assistant | Ollama | General queries | help, how, what, explain |
| Code Assistant | Ollama | Programming | code, function, debug, program |
| Data Analyst | Ollama | Analysis | analyze, compare, evaluate |
| Creative Writer | Gemini 2.5 | Creative content | write, create, story, poem |
| Research Assistant | Perplexity | Real-time data | latest, news, today, current |
Query local Ollama instance.
Parameters:
- prompt (str): User query
- system_prompt (str, optional): System instruction
Returns: str - AI response
Example:
```python
result = await query_ollama("What is Python?")
```

Query Google Gemini API with automatic key rotation.
Parameters:
- prompt (str): User query
- system_prompt (str, optional): System instruction
Returns: str - AI response
Features:
- Automatic key rotation on rate limits
- Fallback to Ollama on errors
- Uses Gemini 2.5 Flash model
Query Perplexity API for real-time web search.
Parameters:
- prompt (str): User query
- system_prompt (str, optional): System instruction
Returns: str - AI response with citations
Features:
- Real-time web search
- Sonar model (fast online search)
- Fallback to Ollama on errors
Automatically select the best agent for a query.
Parameters:
query(str): User input
Returns: tuple(agent_id, agent_config)
Algorithm:
- Tokenize and lowercase query
- Score against agent keywords
- Apply priority boosts (e.g., +10 for time-sensitive queries)
- Return highest scoring agent
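
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the code in agent.py: the function name `select_agent`, the agent IDs other than `writer`, the `TIME_SENSITIVE` set, and the rule that the +10 boost applies only to the research agent are all assumptions. The keywords are copied from the routing table.

```python
import re

# Hypothetical agent table; keywords come from the routing table,
# most agent IDs are illustrative.
AGENTS = {
    "assistant": {"provider": "ollama", "keywords": ["help", "how", "what", "explain"]},
    "coder": {"provider": "ollama", "keywords": ["code", "function", "debug", "program"]},
    "analyst": {"provider": "ollama", "keywords": ["analyze", "compare", "evaluate"]},
    "writer": {"provider": "gemini", "keywords": ["write", "create", "story", "poem"]},
    "research": {"provider": "perplexity", "keywords": ["latest", "news", "today", "current"]},
}

# Assumed set of words that mark a query as time-sensitive.
TIME_SENSITIVE = {"latest", "news", "today", "current"}


def select_agent(query: str):
    """Return (agent_id, agent_config) for the highest-scoring agent."""
    # Tokenize and lowercase the query.
    tokens = set(re.findall(r"[a-z0-9']+", query.lower()))
    best_id, best_score = "assistant", 0  # general assistant is the default
    for agent_id, config in AGENTS.items():
        # One point per agent keyword that appears in the query.
        score = sum(1 for kw in config["keywords"] if kw in tokens)
        # Assumed priority boost for time-sensitive research queries.
        if agent_id == "research" and tokens & TIME_SENSITIVE:
            score += 10
        if score > best_score:
            best_id, best_score = agent_id, score
    return best_id, AGENTS[best_id]
```

With this sketch, a query that matches no keyword falls through to the general assistant, and a single time-sensitive word is enough to outweigh several ordinary keyword hits.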
Process a query end-to-end with routing.
Parameters:
query(str): User input
Returns: str - AI response
Flow:
- Select optimal agent
- Route to appropriate provider
- Handle errors with fallbacks
- Return response
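
A rough sketch of the routing and fallback part of this flow is shown below. It is an assumption-laden illustration, not the actual implementation: `process_query` and `PROVIDERS` are hypothetical names, the two provider functions are stubs standing in for the real `query_ollama` and `query_gemini`, and agent selection is left out so the provider is passed in directly.

```python
import asyncio


async def query_ollama(prompt: str) -> str:
    # Stub standing in for the local provider call.
    return f"[ollama] {prompt}"


async def query_gemini(prompt: str) -> str:
    # Stub that simulates a failing cloud call.
    raise RuntimeError("simulated API failure")


# Hypothetical lookup from provider name to query function.
PROVIDERS = {"ollama": query_ollama, "gemini": query_gemini}


async def process_query(query: str, provider: str) -> str:
    """Send the query to `provider`, falling back to Ollama on any error."""
    handler = PROVIDERS.get(provider, query_ollama)
    try:
        return await handler(query)
    except Exception:
        if handler is query_ollama:
            raise  # nothing left to fall back to
        return await query_ollama(query)


print(asyncio.run(process_query("Write a haiku", "gemini")))
# prints: [ollama] Write a haiku
```

Because the stubbed Gemini call always fails, the example exercises the fallback path and the response comes from the Ollama stub.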
Set the following in .env. Only the Ollama values are required; the cloud keys are optional:

```bash
# Ollama (Local)
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b

# Google Gemini (Optional - falls back to Ollama)
GEMINI_API_KEY=your_key_here
GEMINI_API_KEY_2=your_key_here
GEMINI_API_KEY_3=your_key_here

# Perplexity (Optional - falls back to Ollama)
PERPLEXITY_API_KEY=your_key_here
```

Modify AGENTS dictionary in agent.py:
```python
AGENTS = {
    "agent_id": {
        "name": "Agent Name",
        "provider": "ollama|gemini|perplexity",
        "description": "What this agent does",
        "keywords": ["keyword1", "keyword2"],
        "system_prompt": "System instruction for this agent"
    }
}
```

All cloud API functions automatically fall back to Ollama:
```python
try:
    # Try cloud API
    response = await api_call()
except Exception:
    # Fallback to local Ollama
    response = await query_ollama(prompt)
```

Gemini implements automatic key rotation:
- Maintains 3 API keys
- Rotates on 429 (rate limit) errors
- Falls back to Ollama when all keys exhausted
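
The rotation behaviour can be illustrated with a small synchronous sketch. Everything named here (`KeyRotator`, `RateLimited`, `fake_request`) is hypothetical and chosen for illustration; the real code is asynchronous and reacts to HTTP 429 responses from the Gemini API.

```python
class RateLimited(Exception):
    """Stand-in for an HTTP 429 (rate limit) response."""


class KeyRotator:
    def __init__(self, keys):
        # Ignore keys that are unset or empty.
        self.keys = [k for k in keys if k]
        self.index = 0

    def call(self, request):
        """Try request(key) with each key in turn, starting at the current one.

        Returns the first successful result, or None when every key is
        rate limited (the caller would then fall back to Ollama).
        """
        for _ in range(len(self.keys)):
            key = self.keys[self.index]
            try:
                return request(key)
            except RateLimited:
                # Rotate to the next key and retry.
                self.index = (self.index + 1) % len(self.keys)
        return None


def fake_request(key):
    # Simulated API: only the third key still has quota.
    if key != "key-3":
        raise RateLimited(key)
    return f"ok with {key}"


rotator = KeyRotator(["key-1", "key-2", "key-3"])
print(rotator.call(fake_request))  # prints: ok with key-3
```

The rotator remembers the last working key, so later calls start from it instead of retrying keys that were just rate limited.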
All API calls have 30-second timeouts:

```python
timeout=aiohttp.ClientTimeout(total=30)
```

When running `python agent.py`:
| Command | Description |
|---|---|
| `/agents` | List all available agents |
| `/status` | Check API connection status |
| `/quit` | Exit the system |
Add a new specialized agent:
```python
AGENTS["custom"] = {
    "name": "Custom Agent",
    "provider": "ollama",
    "description": "Custom functionality",
    "keywords": ["custom", "special"],
    "system_prompt": "You are a custom specialist."
}
```

Boost priority for specific queries:
```python
if agent_id == "custom":
    if any(k in query_lower for k in ["urgent", "critical"]):
        score += 20  # High priority boost
```

Change provider for existing agent:
```python
AGENTS["writer"]["provider"] = "ollama"  # Use local instead of Gemini
```

| Provider | Typical Response Time |
|---|---|
| Ollama | 0.5-2 seconds |
| Gemini | 1-3 seconds |
| Perplexity | 2-5 seconds |
- 80% of queries handled by free local Ollama
- Cloud APIs used only for specialized tasks
- Estimated monthly cost: <$5 for typical usage
```bash
python check.py
```

Verifies:
- Ollama installation and model availability
- Environment configuration
- API keys presence
- Python dependencies
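
For orientation, checks of this kind could look roughly like the sketch below. It is not the contents of check.py: `run_checks` and the result keys are hypothetical, and the sketch only tests for the `ollama` executable, environment variables, and importable packages. Verifying model availability would need a running Ollama instance and is omitted.

```python
import importlib.util
import os
import shutil


def run_checks(env=os.environ):
    """Return a dict mapping a check name to True (passed) or False."""
    return {
        # Is the ollama executable on PATH?
        "ollama_installed": shutil.which("ollama") is not None,
        # Environment configuration and API key presence.
        "ollama_host_set": bool(env.get("OLLAMA_HOST")),
        "gemini_key_set": bool(env.get("GEMINI_API_KEY")),
        "perplexity_key_set": bool(env.get("PERPLEXITY_API_KEY")),
        # Python dependencies (python-dotenv is imported as "dotenv").
        "aiohttp_installed": importlib.util.find_spec("aiohttp") is not None,
        "dotenv_installed": importlib.util.find_spec("dotenv") is not None,
    }


for name, ok in run_checks().items():
    print(f"{'OK  ' if ok else 'MISS'} {name}")
```

Passing the environment as a parameter keeps the sketch easy to test with a plain dictionary instead of the real process environment.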
Test individual providers:
```python
import asyncio
from agent import query_ollama, query_gemini, query_perplexity

# Test Ollama
result = asyncio.run(query_ollama("Hello"))

# Test Gemini
result = asyncio.run(query_gemini("Write a haiku"))

# Test Perplexity
result = asyncio.run(query_perplexity("What's the date today?"))
```

Ollama Not Found
```bash
# Install Ollama from https://ollama.ai
# Pull model
ollama pull llama3.1:8b
```

API Key Errors
- Verify keys in `.env`
- Check key validity at provider websites
- The system will fall back to Ollama automatically
Import Errors
```bash
pip install aiohttp python-dotenv
```

See CONTRIBUTING.md for development guidelines.
See LICENSE file for details.
Documentation Version: 1.0
Last Updated: November 2, 2025