A modern, React-based web application for training FLUX LoRA models with integrated captioning, dataset management, and HuggingFace publishing.
Features • Quick Start • Installation • Usage • API • Contributing
- 🖼️ Advanced Image Captioning - Support for Florence-2 and Qwen2.5-VL models
- 📦 Automated Dataset Creation - Smart image processing with latent caching
- 🚀 FLUX LoRA Training - Full integration with kohya-ss/sd-scripts
- 📊 Real-time Monitoring - Live training metrics and progress tracking
- ☁️ HuggingFace Publishing - Direct upload to HuggingFace model hub
- 💾 Model Management - Download from CivitAI and HuggingFace with preview support
- Responsive Design - Works on desktop and mobile devices
- Dark Mode - Easy on the eyes during long training sessions
- Real-time Updates - WebSocket-based live training logs
- Drag & Drop - Intuitive file uploads
- Progress Tracking - Visual progress bars and metrics charts
```mermaid
graph TB
    subgraph "Frontend - React"
        UI[React UI]
        Store[Zustand State]
        API_Client[API Client]
    end

    subgraph "Backend - FastAPI"
        Server[FastAPI Server]
        Vision[Vision Models]
        Training[Training Manager]
        Downloads[Download Manager]
    end

    subgraph "External Services"
        HF[HuggingFace]
        Civit[CivitAI]
        SD[sd-scripts]
    end

    UI --> Store
    Store --> API_Client
    API_Client -->|HTTP/WS| Server
    Server --> Vision
    Server --> Training
    Server --> Downloads
    Training -->|subprocess| SD
    Downloads -->|API| HF
    Downloads -->|API| Civit

    style UI fill:#61dafb
    style Server fill:#009688
    style SD fill:#ff6b6b
```
```bash
# Clone the repository
git clone git@github.com:ComfyAssets/kiko-trainer.git
cd kiko-trainer

# Start with Docker Compose
docker compose up -d --build

# Access the application
# Frontend: http://localhost:8080
# API: http://localhost:8001
```

```bash
# Clone and setup
git clone git@github.com:ComfyAssets/kiko-trainer.git
cd kiko-trainer

# Setup backend
./setup.sh  # This clones sd-scripts and installs dependencies

# Start backend
source venv/bin/activate
uvicorn backend.server:app --host 0.0.0.0 --port 8001 --reload

# In a new terminal - setup frontend
cd web
npm install
npm run dev

# Access at http://localhost:5173
```

- Python 3.8+
- Node.js 16+
- CUDA-capable GPU (for training)
- 16GB+ VRAM recommended
- Git
- Clone the repository

  ```bash
  git clone git@github.com:ComfyAssets/kiko-trainer.git
  cd kiko-trainer
  ```

- Run the setup script

  ```bash
  ./setup.sh
  ```

  This will:

  - Clone kohya-ss/sd-scripts (required dependency)
  - Create a Python virtual environment
  - Install all Python dependencies

- GPU setup (if needed)

  ```bash
  # For CUDA 12.1
  pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
  ```

- Install the frontend

  ```bash
  cd web
  npm install
  ```

```mermaid
graph LR
    A[Upload Images] --> B[Caption Images]
    B --> C[Create Dataset]
    C --> D[Configure Training]
    D --> E[Start Training]
    E --> F[Monitor Progress]
    F --> G[Publish to HF]

    style A fill:#4ade80
    style G fill:#60a5fa
```
- Navigate to the Models tab
- Enter your CivitAI API key (get from CivitAI Account)
- Download FLUX models and required components
- Go to Setup tab
- Upload your images via drag & drop
- Choose a captioning model:
  - Florence-2: Fast, good quality
  - Qwen2.5-VL: Slower, best quality
- Configure caption settings and generate
- Set output folder name
- Choose resolution (512, 768, 1024)
- Enable latent caching for faster training
- Create dataset
- Navigate to Training tab
- Select base model and dataset
- Configure hyperparameters:
  - Learning rate: 1e-4 (recommended)
  - Steps: 500-2000
  - Batch size: Based on VRAM
- Start training
- Go to Publish tab
- Select trained LoRA
- Enter HuggingFace credentials
- Configure repository settings
- Publish to HuggingFace
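The batch size guidance above ("based on VRAM") can be made concrete. This helper is a hypothetical heuristic, not part of kiko-trainer: FLUX is a large model, so batch size 1 is the safe default below roughly 24 GB, and larger batches only make sense on high-memory GPUs.

```python
def recommended_batch_size(vram_gb: int) -> int:
    """Rough starting batch size for FLUX LoRA training by GPU VRAM.

    Hypothetical heuristic (not part of the trainer): start small and
    increase only if training runs without out-of-memory errors.
    """
    if vram_gb < 16:
        raise ValueError("16GB+ VRAM is recommended for FLUX LoRA training")
    if vram_gb < 24:
        return 1
    if vram_gb < 48:
        return 2
    return 4

print(recommended_batch_size(16))  # -> 1
```

Treat the thresholds as starting points; actual headroom depends on resolution, optimizer, and whether latent caching and gradient checkpointing are enabled.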
The application supports multiple vision-language models:
| Model | Size | Quality | Speed | VRAM |
|---|---|---|---|---|
| Florence-2 | 0.7B | Good | Fast | 4GB |
| Qwen2.5-VL-3B | 3B | Better | Medium | 8GB |
| Qwen2.5-VL-7B | 7B | Best | Slow | 16GB |
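The table above implies a simple selection rule: pick the highest-quality model that fits your GPU. A minimal sketch (the VRAM numbers come from the table; the helper itself is a hypothetical convenience, not part of the kiko-trainer API):

```python
# VRAM requirements (GB) from the table above, ordered worst to best quality.
CAPTION_MODELS = {
    "Florence-2": 4,
    "Qwen2.5-VL-3B": 8,
    "Qwen2.5-VL-7B": 16,
}

def best_caption_model(vram_gb: int) -> str:
    """Pick the highest-quality captioning model that fits in VRAM."""
    fitting = [m for m, need in CAPTION_MODELS.items() if need <= vram_gb]
    if not fitting:
        raise ValueError("At least 4GB of VRAM is needed for Florence-2")
    # dicts preserve insertion order, so the last fitting entry is best
    return fitting[-1]

print(best_caption_model(8))  # -> Qwen2.5-VL-3B
```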
```yaml
# Example configuration
learning_rate: 1e-4
network_dim: 32
network_alpha: 16
batch_size: 1
max_train_steps: 1000
save_every_n_steps: 100
```

Real-time training metrics include:
- Loss curves
- Learning rate schedule
- Memory usage
- Training speed (it/s)
- Sample generation preview
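The reported speed (it/s) and step counter are enough to estimate remaining training time. A small sketch (the function name and signature are illustrative, not part of the trainer):

```python
def eta_seconds(current_step: int, max_steps: int, its_per_sec: float) -> float:
    """Estimate remaining training time from the reported speed in it/s."""
    if its_per_sec <= 0:
        raise ValueError("speed must be positive")
    return (max_steps - current_step) / its_per_sec

# e.g. 600 of 1000 steps done at 2.0 it/s -> 200 seconds remaining
print(eta_seconds(600, 1000, 2.0))
```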
| Endpoint | Method | Description |
|---|---|---|
| `/api/health` | GET | Health check |
| `/api/models` | GET | List available models |
| `/api/caption` | POST | Caption images |
| `/api/create_dataset` | POST | Create training dataset |
| `/api/train/prepare` | POST | Prepare training scripts |
| `/api/train/start` | POST | Start training |
| `/api/train/logs` | GET | Get training logs |
| `/api/loras` | GET | List trained LoRAs |
| `/api/publish` | POST | Publish to HuggingFace |

WebSocket endpoints:

- `/api/train/ws` - Real-time training logs
- `/api/metrics/stream` - Live metrics updates
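A minimal client sketch for the HTTP endpoints above, using only the standard library. The paths come from the table; request and response payload shapes are not documented here, so check the interactive FastAPI docs at `/docs` on a running server before relying on them.

```python
import json
import urllib.request

class KikoClient:
    """Minimal REST client sketch for the kiko-trainer API (assumed shapes)."""

    def __init__(self, base_url: str = "http://localhost:8001"):
        self.base_url = base_url.rstrip("/")

    def url(self, path: str) -> str:
        """Join the base URL with an endpoint path from the table above."""
        return self.base_url + path

    def get(self, path: str) -> dict:
        with urllib.request.urlopen(self.url(path)) as resp:
            return json.loads(resp.read())

    def health(self) -> dict:
        return self.get("/api/health")

client = KikoClient()
print(client.url("/api/health"))  # -> http://localhost:8001/api/health
```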
```
kiko-trainer/
├── backend/                  # FastAPI backend
│   ├── server.py             # Main API server
│   ├── train_utils.py        # Training utilities
│   └── civitai_downloader.py # Model downloads
├── web/                      # React frontend
│   ├── src/
│   │   ├── components/       # React components
│   │   ├── pages/            # Application pages
│   │   └── store/            # State management
│   └── package.json
├── sd-scripts/               # Training engine (cloned)
├── models/                   # Downloaded models
├── datasets/                 # Training datasets
├── outputs/                  # Trained LoRAs
├── docker-compose.yml        # Docker configuration
└── setup.sh                  # Setup script
```
```yaml
version: "3.9"
services:
  api:
    build: .
    ports:
      - "8001:8001"
    volumes:
      - ./models:/app/models
      - ./outputs:/app/outputs
    environment:
      - HF_HUB_ENABLE_HF_TRANSFER=1
  web:
    build: ./web
    ports:
      - "8080:80"
    depends_on:
      - api
```

```bash
# Build and start
docker compose up -d --build

# View logs
docker compose logs -f

# Stop services
docker compose down
```

```bash
# Backend tests
source venv/bin/activate
pytest

# Frontend tests
cd web
npm test
npm run type-check
```

- Backend: Black, Flake8, MyPy
- Frontend: ESLint, Prettier, TypeScript
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- kohya-ss/sd-scripts - Training engine
- FLUX - Base model architecture
- Florence-2 - Vision model
- Qwen2.5-VL - Advanced vision model
**CUDA not detected**

```bash
# Install CUDA-compatible PyTorch
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```

**Out of memory**

- Reduce batch size to 1
- Enable gradient checkpointing
- Use 8-bit optimization
- Consider using a smaller vision model
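The first three memory-saving tips map onto sd-scripts command-line flags. The flag names below do exist in kohya-ss/sd-scripts, but treat the exact combination as a sketch and verify against the version `setup.sh` cloned:

```python
def low_vram_flags(batch_size=1):
    """sd-scripts flags implementing the memory-saving tips above.

    Flag names are from kohya-ss/sd-scripts; check your cloned version.
    """
    return [
        "--train_batch_size=%d" % batch_size,  # reduce batch size to 1
        "--gradient_checkpointing",            # trade compute for memory
        "--optimizer_type=AdamW8bit",          # 8-bit optimizer states
    ]

print(" ".join(low_vram_flags()))
```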
**Port already in use**

```bash
# Change backend port
uvicorn backend.server:app --port 8002

# Change frontend port
cd web && VITE_PORT=3001 npm run dev
```

- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Discord: Join our Discord
- Multi-GPU training support
- SDXL LoRA training
- Automatic dataset augmentation
- Training queue management
- Mobile app for monitoring
- Integration with ComfyUI/A1111