WIP: Adding support for Local Models #26
Draft: charlieduzstuf wants to merge 13 commits into jennyzzt:main from charlieduzstuf:main
Conversation
…dling
- Add HuggingFaceChatClient for running models from the Hugging Face Hub
- Introduce LocalModelClient base class for common local model functionality
- Update README with detailed local model configuration options
- Add huggingface_hub dependency to requirements.txt
- Improve environment variable handling for local models
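The class bodies are not reproduced in this excerpt, so the following is only a minimal sketch of what a LocalModelClient base class and a HuggingFaceChatClient built on transformers might look like; the constructor parameters and method names here are assumptions, not the PR's actual code.

```python
from abc import ABC, abstractmethod


class LocalModelClient(ABC):
    """Shared behaviour for clients that run models locally (sketch)."""

    def __init__(self, model_name: str, max_new_tokens: int = 512):
        self.model_name = model_name
        self.max_new_tokens = max_new_tokens

    @abstractmethod
    def chat(self, prompt: str) -> str:
        """Return the model's reply for a single prompt."""


class HuggingFaceChatClient(LocalModelClient):
    """Runs a Hugging Face Hub model locally via transformers (sketch)."""

    def __init__(self, model_name: str, max_new_tokens: int = 512):
        super().__init__(model_name, max_new_tokens)
        # Import lazily so the dependency is only required when this client is used.
        from transformers import pipeline

        self._pipe = pipeline("text-generation", model=model_name)

    def chat(self, prompt: str) -> str:
        outputs = self._pipe(prompt, max_new_tokens=self.max_new_tokens)
        # pipeline("text-generation") returns a list of dicts with "generated_text".
        return outputs[0]["generated_text"]
```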
Implement GGUF model discovery that searches a specified directory. When GGUF_MODEL_DIR is set, the system automatically finds and loads GGUF models by name, without requiring full paths. Update documentation to reflect the new GGUF model discovery feature and clarify which environment variables are optional.
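The discovery helper itself is not shown here; below is a minimal sketch of how name-based lookup against a GGUF_MODEL_DIR directory could work. The function name and matching rules are assumptions.

```python
import os
from pathlib import Path
from typing import Optional


def find_gguf_model(model_name: str, model_dir: Optional[str] = None) -> Path:
    """Locate a .gguf file by (partial) name inside GGUF_MODEL_DIR (sketch)."""
    directory = Path(model_dir or os.environ.get("GGUF_MODEL_DIR", "."))
    if not directory.is_dir():
        raise FileNotFoundError(f"GGUF model directory not found: {directory}")

    candidates = sorted(directory.rglob("*.gguf"))
    # Prefer an exact filename stem match, then fall back to a substring match.
    for path in candidates:
        if path.stem == model_name:
            return path
    for path in candidates:
        if model_name.lower() in path.name.lower():
            return path
    raise FileNotFoundError(f"No GGUF model matching '{model_name}' in {directory}")
```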
…lization
- Move the GGUF model search function below the imports and conditional imports
- Expand the AVAILABLE_LLMS list with additional model variants
- Rename the chat namespace to _openai_compatible_chat for clarity
- Simplify Hugging Face model parsing logic and update environment variable names
Add section comments grouping the different model providers in the AVAILABLE_LLMS list, making it easier to read and maintain.
Add support for multiple new LLM providers (anthropic, bedrock, vertex_ai, deepseek, openrouter, ollama, localai, llama-cpp) and implement automatic GGUF model discovery with a configurable local models directory. Include path detection logic and provider-specific model name formatting.
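The provider-specific formatting code is not reproduced in this excerpt; a rough sketch of the idea, using LiteLLM-style "provider/model" prefixes, is shown below. The prefix scheme and the set of prefixed providers are assumptions.

```python
def format_model_name(provider: str, model: str) -> str:
    """Prefix a bare model name with its provider where required (sketch)."""
    # Providers whose models are typically addressed as "<provider>/<model>".
    prefixed = {"openrouter", "ollama", "deepseek", "bedrock", "vertex_ai"}
    if provider in prefixed and not model.startswith(f"{provider}/"):
        return f"{provider}/{model}"
    # Other providers (e.g. anthropic) take the raw model name.
    return model


# Example: format_model_name("ollama", "llama3") -> "ollama/llama3"
```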
Add support for a --model_dir parameter that allows GGUF model discovery in a specified directory when using the 'local' provider. This enables interactive model selection when no specific model is provided and improves error handling for missing models or directories.
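How the flag is wired up is not shown here; the following is a minimal sketch of a --model_dir argument plus interactive selection when no model name is given. The function and argument names are assumptions made for illustration.

```python
import argparse
from pathlib import Path
from typing import Optional


def choose_gguf_model(model_dir: str, model: Optional[str] = None) -> Path:
    """Pick a GGUF model from model_dir, prompting the user if none was given (sketch)."""
    directory = Path(model_dir)
    if not directory.is_dir():
        raise SystemExit(f"Model directory does not exist: {directory}")

    candidates = sorted(directory.rglob("*.gguf"))
    if not candidates:
        raise SystemExit(f"No .gguf files found under {directory}")

    if model:
        matches = [p for p in candidates if model.lower() in p.name.lower()]
        if not matches:
            raise SystemExit(f"No GGUF model matching '{model}' in {directory}")
        return matches[0]

    # No model specified: list the discovered files and let the user pick one.
    for i, path in enumerate(candidates, 1):
        print(f"{i}. {path.name}")
    index = int(input("Select a model number: "))
    return candidates[index - 1]


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_dir", required=True, help="Directory containing .gguf files")
    parser.add_argument("--model", default=None, help="Optional (partial) model name")
    args = parser.parse_args()
    print(choose_gguf_model(args.model_dir, args.model))
```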
Uncomment the example usage and update it with a more specific prompt to demonstrate the LLM client functionality. This helps verify that the client setup works as expected.
Update response content access to use the direct 'content' attribute instead of a nested structure. Add model path construction for the llama-2 model.
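A brief illustration of the two changes, with hypothetical attribute paths and filename, since the actual client and file layout are not shown in this excerpt:

```python
import os

# Before (nested structure, hypothetical): reply = response.choices[0].message.content
# After (direct attribute, hypothetical):  reply = response.content

# Hypothetical llama-2 path construction from the configured model directory.
model_path = os.path.join(
    os.environ.get("GGUF_MODEL_DIR", "models"),
    "llama-2-7b-chat.Q4_K_M.gguf",
)
```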
I'm doing some of this myself, and as an experiment I'm also giving part of the task to the AI itself. So far the result seems a bit sloppy and more like pseudo-code, but I'll give it more of a kick in the coming week(s).