This repo contains code for dynamically loading and unloading LLMs on a localized edge device.

The script provides a Flask-based system that manages LLM tasks by continuously loading and unloading models on a localized edge device (a Raspberry Pi 4B running Ollama), focusing on efficient model usage and logging.
Quick start:

1. Install [Ollama](https://github.com/ollama/ollama) and pull a model: `ollama run <model_name>`
2. Always run the code inside a virtual environment.
3. Install the dependencies: `pip install -r requirements.txt`
4. Run `llm_basic_scheduling_switch_7.py` in one terminal.
5. Run `test.py` in another terminal.
6. Check the CSV logs.
Below is a breakdown of the main components:

- **Model Management:**
  - Loads and unloads models based on usage.
  - Maintains an active model to reduce loading times.
  - Automatically unloads models idle beyond a configurable `MODEL_TIMEOUT` (see the sketch below).
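A minimal sketch of that bookkeeping, assuming the script talks to Ollama's HTTP API (an empty prompt warms a model and `keep_alive: 0` evicts it, which are standard Ollama behaviors; the `loaded_models` dict and the timeout value are illustrative, not necessarily what `llm_basic_scheduling_switch_7.py` actually uses):

```python
import threading
import time

import requests

BASE_URL = "http://localhost:11434"   # Ollama's default local endpoint (assumed)
MODEL_TIMEOUT = 300                   # example idle timeout in seconds

loaded_models: dict[str, float] = {}  # model name -> last-used timestamp
model_lock = threading.Lock()

def load_model(model_name: str) -> None:
    """Warm a model (an empty prompt makes Ollama load it) and record the time."""
    requests.post(f"{BASE_URL}/api/generate",
                  json={"model": model_name, "prompt": ""}, timeout=60)
    with model_lock:
        loaded_models[model_name] = time.time()

def unload_idle_models() -> None:
    """Evict models idle longer than MODEL_TIMEOUT (keep_alive=0 unloads in Ollama)."""
    now = time.time()
    with model_lock:
        for name in [m for m, t in loaded_models.items() if now - t > MODEL_TIMEOUT]:
            requests.post(f"{BASE_URL}/api/generate",
                          json={"model": name, "prompt": "", "keep_alive": 0},
                          timeout=60)
            del loaded_models[name]
```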
- **Task Queue:**
  - Tasks (e.g., arithmetic operations) are added to a queue and processed sequentially, as sketched below.
  - Ensures tasks are assigned to the appropriate models.
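Under the hood this is naturally Python's thread-safe `queue.Queue`; a sketch (the task fields mirror the JSON payload shown in the curl example further down):

```python
import queue

task_queue: queue.Queue = queue.Queue()

def add_task(task_type: str, model_name: str, prompt: str) -> None:
    """Enqueue a task; the background processor consumes tasks in FIFO order."""
    task_queue.put({"task_type": task_type,
                    "model_name": model_name,
                    "prompt": prompt})
```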
- **System Resource Monitoring:**
  - Tracks CPU usage, memory usage, and system load averages using `psutil`.
  - Logs these metrics for performance analysis.
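The monitoring boils down to a few `psutil` calls; a sketch of what `monitor_resources` plausibly returns (the exact field names are assumptions):

```python
import psutil

def monitor_resources() -> dict:
    """Snapshot CPU %, memory %, and the 1/5/15-minute load averages."""
    load_1m, load_5m, load_15m = psutil.getloadavg()
    return {
        "cpu_percent": psutil.cpu_percent(interval=None),
        "memory_percent": psutil.virtual_memory().percent,
        "load_1m": load_1m,
        "load_5m": load_5m,
        "load_15m": load_15m,
    }
```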
- **Logging:**
  - Logs detailed task metrics, including timestamps, task latencies, resource usage, and model states, into a CSV file (`llm_metrics.csv`).
  - Supports debugging with optional console logs.
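A sketch of the CSV logging (the column layout here is illustrative; check `llm_metrics.csv` for the actual columns):

```python
import csv
import os
from datetime import datetime

LOG_FILE = "llm_metrics.csv"

def log_to_csv(row: dict) -> None:
    """Append one metrics row, writing the header the first time."""
    is_new = not os.path.exists(LOG_FILE)
    with open(LOG_FILE, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["timestamp", *row.keys()])
        if is_new:
            writer.writeheader()
        writer.writerow({"timestamp": datetime.now().isoformat(), **row})
```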
- **REST API:**
  - Provides an endpoint (`/perform_task`) to add tasks via HTTP POST requests.
  - Accepts JSON payloads specifying the task type, model, and prompt.
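A sketch of the endpoint, reusing the `task_queue` from the queue sketch above (the exact validation and response shape are assumptions):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/perform_task", methods=["POST"])
def perform_task():
    """Validate the JSON payload and enqueue the task for the background worker."""
    data = request.get_json(silent=True) or {}
    missing = [k for k in ("task_type", "model_name", "prompt") if k not in data]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400
    task_queue.put(data)
    return jsonify({"status": "queued"}), 202
```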
- **Concurrency:**
  - Processes tasks and manages models concurrently using threads and thread-safe mechanisms like `Lock`.
- **Error Handling:**
  - Handles failed model loading or task execution gracefully (see the sketch below).
  - Logs failures and returns appropriate error messages to clients.
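For example, task execution can be wrapped so that a failed load or a failed generation becomes a structured error instead of a crash (a sketch reusing `BASE_URL` and `load_model` from the model-management sketch; `run_task` is a hypothetical name):

```python
import requests

def run_task(task: dict) -> dict:
    """Execute one task, converting any failure into an error result."""
    try:
        load_model(task["model_name"])  # from the model-management sketch
        resp = requests.post(f"{BASE_URL}/api/generate",
                             json={"model": task["model_name"],
                                   "prompt": task["prompt"],
                                   "stream": False},
                             timeout=120)
        resp.raise_for_status()
        return {"ok": True, "response": resp.json().get("response", "")}
    except Exception as exc:  # failed model load or task execution
        return {"ok": False, "error": str(exc)}
```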
- **Constants:**
  - `BASE_URL`: Base URL for model interactions.
  - `MODEL_TIMEOUT`: Duration (in seconds) after which idle models are unloaded.
  - `DEBUG`: Enables debug logs for troubleshooting.
  - `LOG_FILE`: Path to the CSV log file.
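Put together, the configuration block looks roughly like this (values are illustrative; the real ones live in `llm_basic_scheduling_switch_7.py`):

```python
BASE_URL = "http://localhost:11434"  # Ollama's default local endpoint (assumed)
MODEL_TIMEOUT = 300                  # seconds a model may sit idle before unloading
DEBUG = True                         # emit debug_log messages to the console
LOG_FILE = "llm_metrics.csv"         # destination for per-task metrics
```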
- **Key Functions:**
  - `debug_log`: Logs messages if debugging is enabled.
  - `log_to_csv`: Writes task and system performance metrics to a CSV file.
  - `monitor_resources`: Captures system resource usage metrics.
  - `load_model`: Loads a specified model and tracks its state.
  - `unload_idle_models`: Unloads models that haven't been used recently.
  - `process_tasks`: Main loop that processes queued tasks and logs their performance (sketched after this list).
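A sketch of how `process_tasks` ties the earlier pieces together (queue, execution, monitoring, logging, eviction; all names come from the sketches above):

```python
import time

def process_tasks() -> None:
    """Worker loop: pull a task, run it, log metrics, evict idle models."""
    while True:
        task = task_queue.get()        # blocks until a task arrives
        start = time.time()
        result = run_task(task)        # from the error-handling sketch
        metrics = monitor_resources()  # from the monitoring sketch
        log_to_csv({"task_type": task["task_type"],
                    "model_name": task["model_name"],
                    "latency_s": round(time.time() - start, 3),
                    "success": result["ok"],
                    **metrics})
        unload_idle_models()
        task_queue.task_done()
```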
- **REST Endpoint:**
  - `/perform_task`: Accepts task details and adds them to the queue.
- **Background Thread:**
  - A daemon thread (`processor_thread`) runs the `process_tasks` function continuously (see the sketch below).
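Startup then amounts to launching the worker before the Flask server (a sketch; port 5000 matches the curl example below):

```python
import threading

processor_thread = threading.Thread(target=process_tasks, daemon=True)
processor_thread.start()  # daemon=True: the worker dies with the main process

app.run(host="0.0.0.0", port=5000)
```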
A typical request flows through the system as follows:

- **Add a Task:** A client sends a POST request to `/perform_task`:

```sh
curl -X POST http://localhost:5000/perform_task \
  -H "Content-Type: application/json" \
  -d '{"task_type": "arithmetic", "model_name": "qwen2.5:0.5b-instruct", "prompt": "What is 2+2?"}'
```
- **Process Task:**
  - The task is added to the queue.
  - The system ensures the model is loaded, executes the task, and logs the result.
- **Resource Monitoring:**
  - During task execution, CPU usage, memory usage, and load averages are monitored.
- **Unload Idle Models:**
  - Models unused for `MODEL_TIMEOUT` seconds are unloaded to conserve resources.
This system is ideal for scenarios requiring efficient AI model usage, such as dynamically handling multiple models with varying tasks while monitoring and logging system performance.