A multi-model conversation system that intelligently routes user queries to specialized AI models based on task type, maintaining conversation context for each model. Now with local Ollama models and a beautiful desktop GUI!
- Intelligent Task Classification: Automatically classifies user queries into different task types
- Multi-Model Routing: Routes each query to the most appropriate AI model for its task (sketched after this list)
- Conversation Context: Maintains separate conversation histories for each model
- Local AI Models: Uses Ollama for cost-effective local AI processing
- Beautiful Desktop GUI: Modern chat interface inspired by leading chat apps
- Command Line Interface: Traditional CLI for power users
- Easy Setup: Simple installation and configuration
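Under the hood, routing is a two-step lookup: classify the query into a task type, then map that task to a model in the active profile. The snippet below is a simplified, hypothetical sketch; the real classifier in app.py is more involved, and `classify_task` with its keyword rules is illustrative only.

```python
# Hypothetical sketch of classify-then-route; the real logic lives in
# app.py and is more sophisticated than this keyword matcher.
import json

def classify_task(query: str) -> str:
    """Toy keyword classifier, for illustration only."""
    q = query.lower()
    if any(k in q for k in ("function", "code", "bug", "script")):
        return "coding_generation"
    if any(k in q for k in ("solve", "equation", "calculate")):
        return "mathematical_reasoning"
    if q.startswith(("what", "who", "when", "where")):
        return "question_answering"
    return "text_generation"

def route(query: str, profile: str = "regular_sized_models") -> str:
    """Return the model name that should handle this query."""
    with open("models_mapping.json") as f:
        mapping = json.load(f)[profile]
    return mapping[classify_task(query)]

print(route("Solve the equation 2x + 5 = 13"))  # -> qwen2-math:7b
```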
```bash
# On any OS (macOS, Windows, Linux)
python gui_app.py

# Or on macOS/Linux
./start_gui.sh
```

Features:
- 🎨 Beautiful modern interface
- 💬 Real-time chat experience
- 📊 Live model status display
- 🗑️ Easy conversation management
- ⌨️ Enter to send, Shift+Enter for a new line

```bash
python main.py
```

Features:
- ⚡ Fast and lightweight
- 🔧 Full control and debugging
- 📜 Scriptable and automatable
- 🖥️ Works on any terminal
JOAT uses a single comprehensive installation by default (the regular profile). You can optionally switch to the small profile for lower resource usage. The mapping lives in `models_mapping.json`.
Regular profile models:
- coding_generation: yi-coder:9b
- text_generation: mistral:7b-instruct
- mathematical_reasoning: qwen2-math:7b
- commonsense_reasoning: mistral:7b-instruct
- question_answering: llama3-chatqa:8b
- dialogue_systems: qwen2.5:7b-instruct
- summarization: llama3.2:3b-instruct
- sentiment_analysis: llama3.1:8b-chat
- visual_question_answering: llava:7b
- video_question_answering: qwen2.5vl:7b
Small profile models (optional):
- coding_generation: deepseek-coder:1.3b
- text_generation: llama3.2:1b
- mathematical_reasoning: deepscaler:1.5b
- commonsense_reasoning: tinyllama:1.1b
- question_answering: phi3.5:3.8b
- dialogue_systems: qwen2.5:3b-instruct
- summarization: phi3.5:3.8b
- sentiment_analysis: llama3.2:1b
- visual_question_answering: moondream:1.8b
- video_question_answering: qwen2.5vl:3b
Each model in the profiles is selected to specialize per task (coding, math, QA, etc.), balancing quality and performance. You can switch profiles at runtime if needed.
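For example, switching profiles at runtime might look like the following, using the `JOATSystem` constructor shown in the profile-selection notes further down. The `from app import JOATSystem` path is an assumption about the module layout; adjust it to match your checkout.

```python
# Assumed import path; the constructor signature is taken from the
# profile-selection notes later in this README.
from app import JOATSystem

# Start with the lightweight profile, e.g. on a low-RAM machine...
joat = JOATSystem(models_mapping_file="models_mapping.json",
                  profile_key="small_sized_models")

# ...and recreate the system with the regular profile when resources allow.
joat = JOATSystem(models_mapping_file="models_mapping.json",
                  profile_key="regular_sized_models")
```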
Install Ollama:

```bash
# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Windows
# Download from https://ollama.ai/
```

Start Ollama:

```bash
brew services start ollama   # macOS
# or
ollama serve                 # any platform
```

Install the Python dependencies:

```bash
pip install -r requirements.txt
```

Set up the models:

```bash
python setup_comprehensive_models.py
```

Desktop GUI (Recommended):

```bash
python gui_app.py
# or on macOS/Linux
./start_gui.sh
```

Command Line:

```bash
python main.py
```

The app will automatically download models as needed, but you can pre-install them:
```bash
# Default comprehensive (regular profile) interactive installer
python setup_comprehensive_models.py

# Optional minimal one-shot installer (installs the regular profile without prompting)
python setup_ollama.py
```

```bash
# Pull any model from the mapping manually
ollama pull yi-coder:9b
ollama pull qwen2.5:7b-instruct
# ... etc.
```

The two profiles:
- `small_sized_models`: optimized for speed and low memory.
- `regular_sized_models`: balanced quality and performance (default install); a script to pre-pull an entire profile is sketched after this list.
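If you prefer to script the pre-install step, here is a minimal sketch that pulls every model in one profile. It assumes the `ollama` CLI is on your PATH and that `models_mapping.json` sits in the working directory.

```python
# Minimal sketch: pre-pull every model in one profile of models_mapping.json.
import json
import subprocess

def pull_profile(profile: str = "regular_sized_models") -> None:
    with open("models_mapping.json") as f:
        models = set(json.load(f)[profile].values())  # dedupe shared models
    for model in sorted(models):
        subprocess.run(["ollama", "pull", model], check=True)

if __name__ == "__main__":
    pull_profile()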
- Launch: `python gui_app.py` (or `./start_gui.sh` on macOS/Linux)
- Type your query in the input box
- Press Enter or click Send
- Watch the AI route your query to the best model
- See which model was used in the response
```bash
python main.py
```

Example session:

```
You: Write a Python function to calculate Fibonacci numbers
🤖 [Response from yi-coder:9b]

You: Solve the equation 2x + 5 = 13
🤖 [Response from qwen2-math:7b]

You: What is the capital of France?
🤖 [Response from llama3-chatqa:8b]
```

Edit `models_mapping.json` to customize which models handle which tasks per profile:
```json
{
  "regular_sized_models": {
    "coding_generation": "yi-coder:9b",
    "text_generation": "mistral:7b-instruct",
    "mathematical_reasoning": "qwen2-math:7b",
    "commonsense_reasoning": "mistral:7b-instruct",
    "question_answering": "llama3-chatqa:8b",
    "dialogue_systems": "qwen2.5:7b-instruct",
    "summarization": "llama3.2:3b-instruct",
    "sentiment_analysis": "llama3.1:8b-chat",
    "visual_question_answering": "llava:7b",
    "video_question_answering": "qwen2.5vl:7b"
  },
  "small_sized_models": {
    "coding_generation": "deepseek-coder:1.3b",
    "text_generation": "llama3.2:1b",
    "mathematical_reasoning": "deepscaler:1.5b",
    "commonsense_reasoning": "tinyllama:1.1b",
    "question_answering": "phi3.5:3.8b",
    "dialogue_systems": "qwen2.5:3b-instruct",
    "summarization": "phi3.5:3.8b",
    "sentiment_analysis": "llama3.2:1b",
    "visual_question_answering": "moondream:1.8b",
    "video_question_answering": "qwen2.5vl:3b"
  }
}
```

Profile selection:
- Default: auto-detected. The app uses `regular_sized_models` only if all regular-profile models in `models_mapping.json` are installed; otherwise it falls back to `small_sized_models` (a sketch of this check follows this list).
- Force via env var: `export JOAT_PROFILE=small_sized_models` (or `regular_sized_models`).
- Programmatic override: `JOATSystem(models_mapping_file="models_mapping.json", profile_key="small_sized_models")`.
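The auto-detection rule above could be implemented roughly as follows. This is a sketch of the described behavior, not necessarily the app's exact code; it scans `ollama list` output for each regular-profile model.

```python
# Sketch of the auto-detection rule: prefer the regular profile only if
# every regular-profile model is already installed locally.
import json
import os
import subprocess

def detect_profile(mapping_file: str = "models_mapping.json") -> str:
    forced = os.environ.get("JOAT_PROFILE")  # env var override wins
    if forced:
        return forced
    with open(mapping_file) as f:
        regular = set(json.load(f)["regular_sized_models"].values())
    installed = subprocess.run(["ollama", "list"],
                               capture_output=True, text=True).stdout
    if all(model in installed for model in regular):
        return "regular_sized_models"
    return "small_sized_models"
```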
See the Advanced Guide for troubleshooting and performance tips.
The JOAT source code is licensed under the Apache License 2.0. You can find the full license text in the LICENSE file.
The AI models used by this project have their own licenses. When you use JOAT, you are also subject to the license terms of the underlying models, which include:
- Llama Community Licenses: apply to the Llama-family models (e.g., `llama3-chatqa`, `llama3.1`, `llama3.2`).
- Apache 2.0: applies to models like `mistral`.
It is your responsibility to review and comply with the terms of each model's license and its Acceptable Use Policy.
```
joat/
├── main.py                        # CLI version
├── app.py                         # Core application logic
├── gui_app.py                     # The GUI application window
├── start_gui.sh                   # GUI launch script for macOS/Linux
├── ollama_client.py               # Ollama API client
├── setup_ollama.py                # Model setup script (installs regular by default)
├── setup_comprehensive_models.py  # Interactive installer (regular profile)
├── models_mapping.json            # Task-to-model mapping (profiles)
├── requirements.txt               # Python dependencies
├── docs/                          # Documentation files
└── README.md                      # This file
```
- Modern Design: Clean, modern chat-like interface
- Real-time Status: Live Ollama and model status
- Model Information: Shows which models are available
- Conversation History: Maintains chat history
- Keyboard Shortcuts: Enter to send, Shift+Enter for new line
- Responsive Layout: Adapts to window size
- Error Handling: Graceful error messages
- Click "🗑️ Clear History" to clear all conversations
- Each model maintains its own conversation context (see the sketch below)
- Automatic model switching based on task type
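Conceptually, per-model context is just a separate message list per model, replayed on each request. Below is a minimal sketch against Ollama's `/api/chat` endpoint; the app's `ollama_client.py` may be implemented differently.

```python
# Illustrative sketch of per-model conversation context using Ollama's
# /api/chat endpoint (requires `pip install requests`).
from collections import defaultdict
import requests

histories = defaultdict(list)  # model name -> list of chat messages

def chat(model: str, user_text: str) -> str:
    histories[model].append({"role": "user", "content": user_text})
    resp = requests.post("http://localhost:11434/api/chat",
                         json={"model": model,
                               "messages": histories[model],
                               "stream": False})
    resp.raise_for_status()
    reply = resp.json()["message"]["content"]
    histories[model].append({"role": "assistant", "content": reply})
    return reply

def clear_history() -> None:
    """What 'Clear History' does, conceptually: drop every model's context."""
    histories.clear()
```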
- Type `clear` to clear history
- Type `history` to see conversation history
- Type `status` to check system status
- Type `quit` to exit (a minimal loop handling these commands is sketched below)
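A bare-bones loop handling these commands might look like this; `system` stands in for the app object, and its method names here are hypothetical, not the actual `app.py` API.

```python
# Minimal REPL sketch for the CLI commands listed above.
def repl(system) -> None:
    while True:
        text = input("You: ").strip()
        if text == "quit":
            break
        elif text == "clear":
            system.clear_history()         # hypothetical method
        elif text == "history":
            print(system.get_history())    # hypothetical method
        elif text == "status":
            print(system.get_status())     # hypothetical method
        elif text:
            print("🤖", system.ask(text))  # hypothetical method
```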
- SSD Storage: Models load faster from SSD
- RAM: 16GB+ recommended for smooth operation
- GPU: Optional but speeds up inference (if supported by your hardware and Ollama)
- Model Selection: Use smaller models for faster responses
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
Enjoy your local AI assistant! 🤖✨