3 changes: 2 additions & 1 deletion docs.json
@@ -193,7 +193,8 @@
"pages": [
"server/services/s2s/aws",
"server/services/s2s/gemini",
"server/services/s2s/openai"
"server/services/s2s/openai",
"server/services/s2s/pinch"
]
},
{
138 changes: 138 additions & 0 deletions server/services/s2s/pinch.mdx
@@ -0,0 +1,138 @@
---
title: "Pinch"
description: "Real-time translation service implementation using Pinch's speech-to-speech API"
---

## Overview

`PinchAudioService` provides real-time speech translation with synchronized audio output and transcription capabilities. The service translates spoken audio from one language to another while maintaining natural conversation flow through streaming audio processing.

The service provides:
- **Real-time Translation**: Stream audio input and receive translated audio output with minimal latency
- **Dual Transcription**: Both source language transcription and translated text output
- **Voice Synthesis**: Natural-sounding translated speech with customizable voice parameters
- **Streaming Architecture**: Optimized for low-latency conversational applications

## Installation

To use `PinchAudioService`, install the required dependencies:

```bash
pip install "pipecat-ai[pinch]"
```

You'll also need to set up your Pinch API token as an environment variable: `PINCH_API_TOKEN`.

<Tip>
Get your API token by creating an account at [Pinch](https://www.startpinch.com/).
</Tip>
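
Once the token is set, you can read it at application startup. A minimal sketch (the only assumption is the `PINCH_API_TOKEN` variable named above):

```python
import os

# Fail fast if the Pinch API token is missing from the environment
pinch_api_token = os.environ["PINCH_API_TOKEN"]
```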

## Frames

### Input

<ParamField path="InputAudioRawFrame" type="Frame">
Raw PCM audio data for speech input (16-bit, 16kHz, mono)
</ParamField>

### Output

<ParamField path="TranscriptionFrame" type="Frame">
Final transcription of the source language speech
</ParamField>

<ParamField path="InterimTranscriptionFrame" type="Frame">
Real-time partial transcription updates during speech
</ParamField>

<ParamField path="LLMTextFrame" type="Frame">
Translated text output in the target language
</ParamField>

<ParamField path="TTSTextFrame" type="Frame">
Text being synthesized to speech in the target language
</ParamField>

<ParamField path="SpeechOutputAudioRawFrame" type="Frame">
Translated audio stream chunks (16-bit PCM)
</ParamField>
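
As a rough illustration of how these frames can be consumed downstream, here is a minimal logging processor. Treat it as a sketch rather than part of the Pinch integration: it assumes the standard Pipecat `FrameProcessor` API and that `TranscriptionFrame` and `LLMTextFrame` are importable from `pipecat.frames.frames`.

```python
from pipecat.frames.frames import Frame, LLMTextFrame, TranscriptionFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor


class TranslationLogger(FrameProcessor):
    """Logs source transcriptions and translated text as they stream through."""

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        await super().process_frame(frame, direction)

        if isinstance(frame, TranscriptionFrame):
            print(f"[source] {frame.text}")
        elif isinstance(frame, LLMTextFrame):
            print(f"[translated] {frame.text}")

        # Always pass frames along so audio and other frames keep flowing
        await self.push_frame(frame, direction)
```

Placed after the Pinch service in a pipeline, a processor like this would print both the source transcript and the translated text as they arrive.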

## Configuration

### Constructor Parameters

<ParamField path="api_token" type="str" required>
Pinch API authentication token
</ParamField>

<ParamField path="session" type="aiohttp.ClientSession" required>
HTTP client session for WebSocket connections and API requests
</ParamField>

<ParamField path="session_request" type="PinchSessionRequest">
Session configuration object. Defaults to English → Spanish translation with a female voice
</ParamField>

### Session Configuration

The `PinchSessionRequest` object configures the translation session:

<ParamField path="source_language" type="str">
Input language code (e.g., "en" for English). See supported languages below
</ParamField>

<ParamField path="target_language" type="str">
Output language code (e.g., "es" for Spanish). See supported languages below
</ParamField>

<ParamField path="voice_type" type="str">
Voice characteristic for synthesized speech. Options: "female", "male"
</ParamField>

<ParamField path="enable_audio_output" type="bool">
Whether to generate translated audio output. Default: `True`
</ParamField>

<ParamField path="enable_transcription" type="bool">
Whether to output transcription frames. Default: `True`
</ParamField>

<ParamField path="sample_rate" type="int">
Audio sample rate in Hz. Default: `16000`
</ParamField>
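
Putting these together, a fully specified session request might look like the following. The field names follow the parameters listed above and are assumed to be accepted as keyword arguments; the values shown are the documented defaults made explicit.

```python
from pipecat.transports.pinch.api import PinchSessionRequest

# English speech in, Spanish speech out, with transcription frames
# enabled and 16 kHz audio.
session_request = PinchSessionRequest(
    source_language="en",
    target_language="es",
    voice_type="female",
    enable_audio_output=True,
    enable_transcription=True,
    sample_rate=16000,
)
```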

## Language Support

Pinch supports real-time translation across a growing set of language pairs, with new languages and quality improvements added regularly.
For a complete list of supported languages and available translation pairs, see the [Pinch documentation](https://www.startpinch.com/).

## Usage Example

```python
import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.pinch import PinchAudioService
from pipecat.transports.pinch.api import PinchSessionRequest

pinch_api_token = os.getenv("PINCH_API_TOKEN")

# Configure the translation session
session_request = PinchSessionRequest(
    source_language="en",
    target_language="es",
    voice_type="female",
    enable_audio_output=True,
)

# Create the Pinch audio streaming service
# (`session` is an aiohttp.ClientSession created by your application)
pinch_service = PinchAudioService(
    api_token=pinch_api_token,
    session=session,
    session_request=session_request,
)

# Create the pipeline
pipeline = Pipeline([
    transport.input(),   # Audio input from the transport
    pinch_service,       # Translation service
    transport.output(),  # Translated audio output
])
```
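
The `session` and `transport` objects above come from your application setup. The sketch below shows one way to manage the `aiohttp.ClientSession` lifetime around the pipeline; the `run_translation` function and the runner comment are illustrative, not part of the Pinch API.

```python
import os

import aiohttp

from pipecat.pipeline.pipeline import Pipeline
from pipecat.services.pinch import PinchAudioService
from pipecat.transports.pinch.api import PinchSessionRequest


async def run_translation(transport):
    # The service needs a live aiohttp session for its connection to Pinch;
    # keep it open for the lifetime of the pipeline.
    async with aiohttp.ClientSession() as session:
        pinch_service = PinchAudioService(
            api_token=os.getenv("PINCH_API_TOKEN"),
            session=session,
            session_request=PinchSessionRequest(
                source_language="en",
                target_language="es",
            ),
        )

        pipeline = Pipeline([transport.input(), pinch_service, transport.output()])
        # ... create a task for the pipeline and hand it to your pipeline runner
```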
1 change: 1 addition & 0 deletions server/services/supported-services.mdx
@@ -112,6 +112,7 @@ Speech-to-Speech services are multi-modal LLM services that take in audio, video
| [AWS Nova Sonic](/server/services/s2s/aws) | `pip install "pipecat-ai[aws-nova-sonic]"` |
| [Gemini Multimodal Live](/server/services/s2s/gemini) | `pip install "pipecat-ai[google]"` |
| [OpenAI Realtime](/server/services/s2s/openai) | `pip install "pipecat-ai[openai]"` |
| [Pinch](/server/services/s2s/pinch) | `pip install "pipecat-ai[pinch]"` |

## Image Generation
