Zero-dependency local text-to-speech CLI. Wraps Chatterbox ONNX models in a standalone Rust binary for macOS, Linux, and Windows.

cbx

A single-binary CLI for local text-to-speech.

This project wraps Resemble AI's Chatterbox ONNX models in a standalone executable. The neural network inference is handled by ONNX Runtime; the Rust code provides a CLI interface, model management, and voice caching.

Based on Resemble AI's Chatterbox, a state-of-the-art open-source TTS model.


Why cbx?

cbx offers a zero-dependency way to use Chatterbox:

  • Single binary - Download one file, run it. No Python, no virtual environments, no pip.
  • Built-in model management - Commands to download, list, and clean up model files.
  • Voice profile caching - Encode a reference voice once, reuse it without re-processing.
  • Cross-platform - Same CLI on macOS, Linux, and Windows.

The official Chatterbox repository has more features (multilingual, GPU acceleration, fine-tuning). Use cbx when you want a simple, portable tool for basic TTS.


Audio Samples

Listen to what Chatterbox can produce:

cbx uses the same underlying model, so output quality is identical.


Installation

Quick Install (Recommended)

macOS / Linux:

curl -fsSL https://raw.githubusercontent.com/srv1n/cbx/main/packaging/scripts/install.sh | bash

Windows (PowerShell):

irm https://raw.githubusercontent.com/srv1n/cbx/main/packaging/scripts/install.ps1 | iex

The installer downloads the appropriate binary for your platform from GitHub Releases. Models are downloaded separately on first use.
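
If you would rather read the install script before it runs, the one-liner above can be split into a download step and a run step. The following is a minimal sketch for macOS / Linux; the fetch_script helper is a hypothetical name, not part of cbx, and it assumes the script behaves the same when run from a file as when piped into bash.

```shell
#!/bin/sh
# fetch_script URL
# Download a script to a temp file and print the file's path,
# so it can be inspected before it is executed.
# (Hypothetical helper for illustration; not shipped with cbx.)
fetch_script() {
    url=$1
    tmp=$(mktemp) || return 1
    # -f: fail on HTTP errors, -sS: quiet but report errors, -L: follow redirects
    if curl -fsSL "$url" -o "$tmp"; then
        printf '%s\n' "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Typical use: script=$(fetch_script https://raw.githubusercontent.com/srv1n/cbx/main/packaging/scripts/install.sh), then less "$script", then bash "$script".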

Other Options


Quick Start

1. Generate speech

cbx speak --text "Hello from cbx." --voice-wav ./your-voice.wav --out-wav ./output.wav

This will:

  • Download the required model files (first run only, ~1-2 GB depending on variant)
  • Encode your reference voice
  • Generate speech and save to output.wav

2. (Optional) Use the bundled default voice

If you don't have a reference WAV file, install the pre-packaged default voice:

macOS / Linux:

curl -fsSL https://raw.githubusercontent.com/srv1n/cbx/main/packaging/scripts/install_default_voice.sh | bash

Windows:

irm https://raw.githubusercontent.com/srv1n/cbx/main/packaging/scripts/install_default_voice.ps1 | iex

Then generate speech without specifying a voice:

cbx speak --text "Hello from cbx." --out-wav ./output.wav

3. (Optional) Cache a voice profile for faster runs

Voice encoding takes a few seconds. Cache it once to skip this step on future runs:

cbx voice add --name myvoice --voice-wav ./your-voice.wav
cbx speak --voice myvoice --text "Much faster now." --out-wav ./output.wav

Commands

| Command | Description |
| --- | --- |
| cbx speak | Generate speech from text |
| cbx download | Pre-download model files |
| cbx sizes | Show download sizes for each model variant |
| cbx models | List cached models |
| cbx voice add | Create a cached voice profile |
| cbx voice list | List cached voice profiles |
| cbx voice remove | Delete a voice profile |
| cbx clean | Remove cached models |

Run cbx --help or cbx <command> --help for details.


Model Variants

Chatterbox is published in several ONNX variants. Use --dtype to select:

| Variant | Notes |
| --- | --- |
| fp16 (default) | Good balance of size and speed |
| fp32 | Largest, most compatible |
| quantized, q4, q4f16 | Smaller downloads, speed varies by platform |

Check available sizes without downloading:

cbx sizes

Download a specific variant:

cbx download --dtype fp16

Voice Profiles

cbx supports two ways to use reference voices:

Direct path (slower): Pass --voice-wav each time. The voice is re-encoded on every run.

cbx speak --voice-wav ./voice.wav --text "Hello" --out-wav ./out.wav

Cached profile (faster): Encode once, reuse many times.

cbx voice add --name myvoice --voice-wav ./voice.wav
cbx speak --voice myvoice --text "Hello" --out-wav ./out.wav

Voice profiles are tied to the model variant (--dtype). If you switch variants, create a new profile for that variant.
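
A cached profile pays off most when you generate many clips in a row. The sketch below loops over a text file and runs one cbx speak per non-empty line, using only the --voice, --text, and --out-wav flags shown above. The speak_lines function, the CBX_BIN override, and the line-NNN.wav naming are assumptions made for illustration, not cbx features.

```shell
#!/bin/sh
# speak_lines VOICE INPUT_FILE OUT_DIR
# Run one `cbx speak` per non-empty line of INPUT_FILE, reusing a cached voice.
# Prints the number of clips generated.
# CBX_BIN is a hypothetical override so the loop can be tried with a stub binary.
speak_lines() {
    voice=$1
    input=$2
    out_dir=$3
    mkdir -p "$out_dir" || return 1
    n=0
    # `|| [ -n "$line" ]` also handles a last line without a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue    # skip blank lines
        n=$((n + 1))
        out=$(printf '%s/line-%03d.wav' "$out_dir" "$n")
        # </dev/null keeps the command from consuming the rest of INPUT_FILE
        "${CBX_BIN:-cbx}" speak --voice "$voice" --text "$line" --out-wav "$out" </dev/null || return 1
    done < "$input"
    printf '%s\n' "$n"
}
```

For example, speak_lines myvoice ./script.txt ./clips writes ./clips/line-001.wav, ./clips/line-002.wav, and so on. The profile must have been created for the model variant in use.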

Voice file requirements

  • Format: WAV
  • Channels: Mono or stereo (converted internally)
  • Duration: 5-20 seconds of clear speech works well
  • Quality: Clean recording without background noise
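
The format requirement can be checked cheaply before handing a file to cbx. This sketch only inspects the RIFF/WAVE magic bytes at the start of the file; it does not measure duration or audio quality, and it makes no claim about which WAV encodings cbx accepts. The is_riff_wav name is hypothetical.

```shell
#!/bin/sh
# is_riff_wav FILE
# Succeed if FILE starts with a RIFF/WAVE header, fail otherwise.
# (Hypothetical pre-flight check; not part of cbx.)
is_riff_wav() {
    [ -f "$1" ] || return 1
    # Bytes 0-3 hold "RIFF" and bytes 8-11 hold "WAVE" in a standard WAV file.
    # Bytes 4-7 (the chunk size) are skipped on purpose.
    riff=$(dd if="$1" bs=1 count=4 2>/dev/null)
    wave=$(dd if="$1" bs=1 skip=8 count=4 2>/dev/null)
    [ "$riff" = "RIFF" ] && [ "$wave" = "WAVE" ]
}
```

For example: is_riff_wav ./voice.wav || echo "not a WAV file".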

Performance

CPU is usually fastest

cbx includes platform-specific acceleration (CoreML on macOS, DirectML on Windows, CUDA on Linux). For this particular model, however, CPU execution is often faster, because partitioning the model graph for the neural accelerators adds overhead that outweighs the speedup.

The default is --ep cpu, which should work well on most systems.

Thread tuning

cbx automatically uses all available CPU cores. Override with:

cbx --intra-threads 4 speak --text "Hello" --out-wav out.wav

On Apple Silicon, 4 threads often outperform higher counts due to contention. Experiment to find what works best for your hardware.

Benchmarks (M1 MacBook Pro)

| Configuration | Average Time |
| --- | --- |
| CPU, 4 threads | 22.7s |
| CPU, 2 threads | 29.0s |
| CPU, 8 threads | 33.1s |
| CoreML, 4 threads | 49.7s |

Text: "The quick brown fox jumps over the lazy dog. This is a benchmark run for cbx."
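
To repeat this kind of comparison on your own hardware, a rough timing loop is enough. The sketch below is not the script used for the numbers above: it times a single run per thread count at whole-second resolution instead of averaging, and it relies on the default voice being installed because no voice is passed. bench_threads and CBX_BIN are hypothetical names.

```shell
#!/bin/sh
# bench_threads TEXT N...
# Time one `cbx speak` run for each thread count N and print the results.
# CBX_BIN is a hypothetical override so the loop can be tried with a stub binary.
bench_threads() {
    text=$1
    shift
    dir=$(mktemp -d) || return 1
    for t in "$@"; do
        start=$(date +%s)
        # --intra-threads is a global flag, so it goes before the subcommand
        if ! "${CBX_BIN:-cbx}" --intra-threads "$t" speak --text "$text" --out-wav "$dir/bench.wav" >/dev/null; then
            rm -rf "$dir"
            return 1
        fi
        end=$(date +%s)
        printf '%s threads: %ss\n' "$t" "$((end - start))"
    done
    rm -rf "$dir"
}
```

For example: bench_threads "The quick brown fox jumps over the lazy dog." 2 4 8. The first run may include model download or warm-up time, so consider a throwaway run before measuring.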


Cache Management

View cache status

cbx models        # Show cached model files
cbx models --long # Show detailed info including commit hashes
cbx downloads     # Show all downloaded variants
cbx voice list    # Show cached voice profiles

Clean up

Preview what would be deleted:

cbx clean --dry-run

Delete specific variants:

cbx clean --dtype fp32 --yes

Delete everything (models and voices):

cbx clean --all --voices --yes
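
The preview and delete steps can be chained so that nothing is removed until you have seen the list. This is a sketch under one assumption that the text above does not confirm: that --dry-run can be combined with the same selection flags (--dtype, --all, --voices) later passed with --yes. confirm_clean and CBX_BIN are hypothetical names.

```shell
#!/bin/sh
# confirm_clean FLAGS...
# Show what `cbx clean FLAGS` would delete, then delete only after a "y" answer.
# Assumes --dry-run accepts the same selection flags as --yes (not confirmed).
# CBX_BIN is a hypothetical override so the flow can be tried with a stub binary.
confirm_clean() {
    "${CBX_BIN:-cbx}" clean "$@" --dry-run || return 1
    printf 'Delete the files listed above? [y/N] '
    read -r answer
    case $answer in
        y|Y) "${CBX_BIN:-cbx}" clean "$@" --yes ;;
        *)   echo "Aborted."; return 1 ;;
    esac
}
```

For example: confirm_clean --dtype fp32.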

Uninstall

macOS / Linux:

curl -fsSL https://raw.githubusercontent.com/srv1n/cbx/main/packaging/scripts/uninstall.sh | bash

Windows:

irm https://raw.githubusercontent.com/srv1n/cbx/main/packaging/scripts/uninstall.ps1 | iex

This removes the binary only. Because cbx clean needs the binary, run the following before uninstalling if you also want to remove cached models and voices:

cbx clean --all --voices --yes

Homebrew

A Homebrew formula template is available at packaging/homebrew/cbx.rb.template. Once the formula is published in a tap, install with:

brew tap srv1n/tap
brew install cbx

Building from Source

cargo build --release
./target/release/cbx --help

Enable platform-specific acceleration:

cargo build --release --features coreml    # macOS
cargo build --release --features directml  # Windows
cargo build --release --features cuda      # Linux with NVIDIA GPU
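
If you build on several machines, the feature can be picked from the OS name. The sketch below maps a uname -s value to the feature names listed above; the pick_feature helper is hypothetical, and the Linux branch assumes an NVIDIA GPU is present (without one, use the plain cargo build --release).

```shell
#!/bin/sh
# pick_feature OS_NAME
# Map a `uname -s` value to the matching Cargo feature from this README.
# Fails for platforms the README does not list.
# (Hypothetical helper; the Linux branch assumes an NVIDIA GPU.)
pick_feature() {
    case $1 in
        Darwin)               echo coreml ;;
        Linux)                echo cuda ;;
        MINGW*|MSYS*|CYGWIN*) echo directml ;;
        *)                    return 1 ;;
    esac
}
```

For example: cargo build --release --features "$(pick_feature "$(uname -s)")". Given the benchmark results above, the plain CPU build is a reasonable default.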

Acknowledgments

This project would not exist without:


License

Dual-licensed under Apache License 2.0 or MIT, at your option.

See LICENSE-APACHE and LICENSE-MIT.
