Conversational AI (local voice assistant)

A compact, privacy-first Python demo that turns speech into natural conversation using a local LLM and local speech tools. It is designed for local development and experimentation, and supports voice interruption: you can speak while the assistant is talking.

Highlights

  • Voice-first interaction (speech-to-text → LLM → text-to-speech)
  • Supports fast local LLMs (e.g., Ollama models)
  • Designed to run locally — no cloud required
  • Windows-friendly instructions (cmd/Batch examples)

Recommended repository name & short description

  • Repository name: conversational-ai
  • Short description: "Local Python voice assistant with speech-to-text, LLM integration, and text-to-speech — Windows-friendly demo."

You can use those when creating the repo on GitHub.


Files in this project

This README describes the project files found in the workspace root. Brief summaries help you quickly find where things live.

conversational_ai/
├── __init__.py         # package marker / module init
├── audio_handler.py    # audio capture & VAD (voice activity detection)
├── config.py           # config loader (env / defaults)
├── llm_handler.py      # LLM integration (e.g., Ollama client wrapper)
├── main.py             # main application loop / orchestrator
├── speech_to_text.py   # speech recognition wrapper (Whisper or local alternative)
├── text_to_speech.py   # text-to-speech output
└── README.md           # this file

If you add requirements.txt, .env.example, or other assets, document them here as well.
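
As an illustration of what a wrapper like llm_handler.py might do, here is a minimal sketch that talks to a local Ollama server over its HTTP API using only the standard library. The function names build_generate_request and generate are hypothetical and are not taken from this repository; the actual module may be structured differently or use a client library instead.

```python
import json
import urllib.request


def build_generate_request(prompt, model="llama3:latest",
                           base_url="http://localhost:11434"):
    """Build (but do not send) a non-streaming request to Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the reply text (requires a running Ollama server)."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Keeping request construction separate from the network call makes the wrapper easy to unit-test without a running model.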


Quick start (Windows cmd)

These commands use Windows-style activation and paths. Run them in Command Prompt. They also work in PowerShell, where venv\Scripts\activate runs the PowerShell activation script (your execution policy may need to allow scripts).

  1. Clone the repo

     git clone https://github.com/maitrisavaliya/conversational-ai.git
     cd conversational-ai

  2. Create a virtual environment

     python -m venv venv
     venv\Scripts\activate

  3. Install dependencies

     pip install -r requirements.txt

  4. (Optional) Pull a local LLM model with Ollama

     ollama pull llama3:latest

  5. Run the app

     python main.py

When the assistant starts you should see a ready message and can speak into your microphone. Say exit, quit, or goodbye to stop.
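
The turn-taking described above (listen, send the transcript to the LLM, speak the reply, stop on an exit word) can be sketched roughly as follows. This is not the code in main.py: listen, think, and speak are hypothetical callables standing in for the speech-to-text, LLM, and text-to-speech modules, and voice interruption is left out for brevity.

```python
EXIT_WORDS = {"exit", "quit", "goodbye"}


def is_exit_command(text):
    """Return True if the transcript is a stop word, ignoring case and trailing punctuation."""
    return text.strip().strip(".!?,").lower() in EXIT_WORDS


def run_conversation(listen, think, speak):
    """Loop transcript -> LLM reply -> speech until an exit word is heard.

    listen() returns a transcript string, think(text) returns a reply,
    and speak(text) plays it. All three are placeholders (assumptions).
    Returns the number of completed turns.
    """
    turns = 0
    while True:
        heard = listen()
        if not heard or not heard.strip():
            continue  # nothing recognized; keep listening
        if is_exit_command(heard):
            speak("Goodbye!")
            return turns
        speak(think(heard))
        turns += 1
```

Passing the three stages in as callables keeps the loop testable with plain functions, with no microphone or model required.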


Configuration

Use environment variables or an .env file to customize the app. Example keys (add to .env or export in your environment):

# LLM
OLLAMA_MODEL_NAME=llama3:latest
OLLAMA_BASE_URL=http://localhost:11434

# ASR (speech-to-text)
WHISPER_MODEL_SIZE=base.en

# Audio
AUDIO_SAMPLE_RATE=16000
AUDIO_CHANNELS=1
AUDIO_DEVICE_INDEX=

# TTS
TTS_RATE=180
TTS_VOLUME=1.0

If your project uses a different config format, adapt these keys accordingly in config.py.
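
For reference, a loader for the keys above could look something like the sketch below. The Settings class and load_settings function are hypothetical names, not necessarily what config.py contains, and the defaults simply mirror the example values. It reads only from the process environment; if you keep the values in an .env file, something such as python-dotenv would need to load that file first.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    ollama_model_name: str
    ollama_base_url: str
    whisper_model_size: str
    audio_sample_rate: int
    audio_channels: int
    audio_device_index: Optional[int]  # None means "use the default input device"
    tts_rate: int
    tts_volume: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the example keys from env, falling back to the example values."""

    def get(key: str, default: str) -> str:
        # Treat missing and blank values the same way.
        value = env.get(key, "").strip()
        return value if value else default

    device = get("AUDIO_DEVICE_INDEX", "")
    return Settings(
        ollama_model_name=get("OLLAMA_MODEL_NAME", "llama3:latest"),
        ollama_base_url=get("OLLAMA_BASE_URL", "http://localhost:11434"),
        whisper_model_size=get("WHISPER_MODEL_SIZE", "base.en"),
        audio_sample_rate=int(get("AUDIO_SAMPLE_RATE", "16000")),
        audio_channels=int(get("AUDIO_CHANNELS", "1")),
        audio_device_index=int(device) if device else None,
        tts_rate=int(get("TTS_RATE", "180")),
        tts_volume=float(get("TTS_VOLUME", "1.0")),
    )
```

Calling load_settings() with no arguments reads os.environ; passing a plain dict instead is convenient in tests.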


Troubleshooting (Windows)

  • If you see audio errors, verify microphone access and that your audio drivers are working.
  • Whisper requires FFmpeg. On Windows, download FFmpeg and add the bin folder to your PATH.
  • If ollama commands fail, ensure Ollama is installed and running (ollama serve).

Common checks:

where ffmpeg
python -c "import sounddevice; print(sounddevice.query_devices())"
ollama list
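
The FFmpeg and Ollama checks can also be run from Python, for example at startup. This is a minimal sketch with hypothetical helper names (ffmpeg_on_path, ollama_reachable). It assumes the Ollama server exposes its model list at /api/tags on the configured base URL.

```python
import shutil
import urllib.error
import urllib.request


def ffmpeg_on_path():
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def ollama_reachable(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an Ollama server answers at base_url, False otherwise."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar errors.
        return False


if __name__ == "__main__":
    print("ffmpeg on PATH:", ffmpeg_on_path())
    print("Ollama reachable:", ollama_reachable())
```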

How to publish to GitHub (cmd)

  1. Create the repository on GitHub using the name and description suggested above.

  2. Push your local project:

git init
git add .
git commit -m "Initial: Add conversational AI demo"
git branch -M main
git remote add origin https://github.com/<your-username>/conversational-ai.git
git push -u origin main

Replace <your-username> with your GitHub username.


Contributing

Contributions are welcome. Suggested small improvements:

  • Add a requirements.txt listing precise dependencies
  • Provide a .env.example with default keys
  • Add unit tests for critical modules

Please open an issue or a pull request.


License

This project is provided under the MIT License.
