Merged
77 changes: 33 additions & 44 deletions README.md
@@ -10,12 +10,12 @@ It lets you transcribe audio in R **without caring which backend actually perfor

### ✅ What it *is*

* A thin R wrapper around **OpenAI-style STT endpoints**
* A unified interface for speech-to-text in R
* A way to switch easily between:

* OpenAI `/v1/audio/transcriptions`
* Local OpenAI-compatible servers (LM Studio, OpenWebUI, AnythingLLM, Whisper containers)
* Local `{audio.whisper}` *if available*
* `{whisper}` (native R torch, local GPU/CPU)
* OpenAI `/v1/audio/transcriptions` (cloud or local servers)
* `{audio.whisper}` (whisper.cpp)
* Designed for scripting, Shiny apps, containers, and reproducible pipelines

### ❌ What it is *not*
@@ -24,17 +24,13 @@ It lets you transcribe audio in R **without caring which backend actually perfor
* Not a model manager
* Not a GPU / CUDA helper
* Not an audio preprocessing toolkit
* Not a replacement for `{audio.whisper}`
* Not a replacement for `{whisper}` or `{audio.whisper}`

---

## Installation

```r
# From CRAN (once available)
install.packages("stt.api")

# Development version
remotes::install_github("cornball-ai/stt.api")
```

@@ -45,59 +41,52 @@ Required dependencies are minimal:

Optional backends:

* `{audio.whisper}` (local transcription)
* `{whisper}` (recommended, on CRAN)
* `{audio.whisper}` (whisper.cpp alternative)
* `{processx}` (Docker helpers)

---

## Quick start

### 1. Use an OpenAI-compatible API (local or cloud)

```r
library(stt.api)
install.packages("whisper")
remotes::install_github("cornball-ai/stt.api")

set_stt_base("http://localhost:4123")
# Optional, for hosted services like OpenAI
set_stt_key(Sys.getenv("OPENAI_API_KEY"))
library(stt.api)

res <- stt("speech.wav")
res$text
```

This works with:

* OpenAI
* Chatterbox / Whisper containers
* LM Studio
* OpenWebUI
* AnythingLLM
* Any server implementing `/v1/audio/transcriptions`
That's it. With `{whisper}` installed, `stt()` transcribes locally on GPU or CPU with no configuration needed.

---

### 2. Use local `{audio.whisper}` (if installed)
## Other backends

stt.api also supports OpenAI-compatible APIs for cloud or container-based transcription:

```r
res <- stt("speech.wav", backend = "audio.whisper")
res$text
set_stt_base("http://localhost:4123")
# Optional, for hosted services like OpenAI
set_stt_key(Sys.getenv("OPENAI_API_KEY"))

res <- stt("speech.wav", backend = "openai")
```

If `{audio.whisper}` is not installed and you request it explicitly, `stt.api` will error with clear instructions.
This works with OpenAI, Whisper containers, LM Studio, OpenWebUI, AnythingLLM, or any server implementing `/v1/audio/transcriptions`.

---

### 3. Automatic backend selection (default)

```r
res <- stt("speech.wav")
```
## Automatic backend selection

Backend priority:
When you call `stt()` without specifying a backend, it picks the first available:

1. OpenAI-compatible API (if `stt.api.api_base` is set)
2. `{audio.whisper}` (if installed)
3. Error with guidance
1. `{whisper}` (native R torch, if installed)
2. `{audio.whisper}` (whisper.cpp, if installed)
3. OpenAI-compatible API (if `stt.api_base` is set)
4. Error with guidance
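
The updated priority order (`{whisper}`, then `{audio.whisper}`, then an OpenAI-compatible API) can be sketched roughly as below. This is an illustration only, not the package's actual implementation: `pick_backend()` and its `has_pkg` / `api_base` arguments are hypothetical names, and the sketch assumes that auto-selection resolves to the same strings accepted by the `backend` argument.

```r
# Hypothetical helper, not an stt.api function: mirrors the documented
# priority order (whisper, audio.whisper, OpenAI-compatible API).
pick_backend <- function(
  has_pkg = function(pkg) requireNamespace(pkg, quietly = TRUE),
  api_base = getOption("stt.api_base")
) {
  if (has_pkg("whisper")) return("whisper")
  if (has_pkg("audio.whisper")) return("audio.whisper")
  if (!is.null(api_base)) return("openai")
  # Nothing usable: fail with the guidance shown in the README's error example
  stop(
    "No transcription backend available. ",
    "Install whisper, install audio.whisper, or set stt.api_base.",
    call. = FALSE
  )
}

# Simulate a machine with no local packages but a configured API base
pick_backend(
  has_pkg = function(pkg) FALSE,
  api_base = "http://localhost:4123"
)
```

Passing `has_pkg` and `api_base` as arguments keeps the sketch testable without installing or removing any backend.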

---

@@ -110,7 +99,7 @@ list(
text = "Transcribed text",
segments = NULL | data.frame(...),
language = "en",
backend = "api" | "audio.whisper",
backend = "api" | "whisper" | "audio.whisper",
raw = <raw backend response>
)
```
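
Because `segments` may be either `NULL` or a data frame, calling code probably wants to check before using it. The sketch below uses a mock result with the documented fields rather than a real `stt()` call; `n_segments()` is a hypothetical helper, and no assumption is made about the columns of `segments`.

```r
# Mock result with the documented fields; a real one would come from stt("speech.wav")
res <- list(
  text = "Transcribed text",
  segments = NULL,
  language = "en",
  backend = "whisper",
  raw = NULL
)

# Hypothetical helper: number of segments, or 0 when the backend returned none
n_segments <- function(res) {
  if (is.data.frame(res$segments)) nrow(res$segments) else 0L
}

n_segments(res)
```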
@@ -144,7 +133,8 @@ Useful for Shiny apps and deployment checks.
Explicit backend choice:

```r
stt("speech.wav", backend = "api")
stt("speech.wav", backend = "openai")
stt("speech.wav", backend = "whisper")
stt("speech.wav", backend = "audio.whisper")
```

@@ -192,10 +182,9 @@ Docker helpers are **explicit and opt-in**.

```r
options(
stt.api.api_base = NULL,
stt.api.api_key = NULL,
stt.api.timeout = 60,
stt.api.backend = "auto"
stt.api_base = NULL,
stt.api_key = NULL,
stt.timeout = 60
)
```
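
As a minimal sketch using only base R, the renamed options (`stt.api_base`, `stt.timeout`) can be set for part of a script and then restored. The timeout value of 120 is purely illustrative, and the sketch assumes the package reads these options at call time.

```r
# Set options for one part of a script, keeping the previous values
old <- options(stt.api_base = "http://localhost:4123", stt.timeout = 120)

getOption("stt.api_base")
getOption("stt.timeout")

# Restore whatever was set before (options that were unset become unset again)
options(old)
```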

@@ -219,7 +208,7 @@ Example:
```
Error in stt():
No transcription backend available.
Set stt.api.api_base or install audio.whisper.
Install whisper, install audio.whisper, or set stt.api_base.
```

---