Using Text To Speech(TTS)

Japanese

Currently there are two ways to use TTS.

Pregenerated: Generates speech at build time, flashes it with the firmware, and plays it standalone. Suitable for predefined statements.
Remote: Sends each statement to a TTS server at run time, then streams the generated voice.

Local (on-demand) TTS such as AquesTalk is not available for now. Pull requests are welcome!

Prerequisites

Whichever way you choose, you need to prepare an external TTS engine first.

The following engines have been tested:

Google Cloud Text-to-Speech API
Coqui AI TTS
VoiceVox
ElevenLabs

See also the official documentation for each engine.

Google Cloud TTS

Follow the Google Cloud authentication guide and generate key.json
Save key.json under the scripts directory

Coqui AI TTS

Install coqui-ai/TTS
Launch the server

$ tts-server --port 8080 --model_name tts_models/ja/kokoro/tacotron2-DDC

Save the server configuration under config.tts.host and config.tts.port in stackchan/manifest_local.json

{
    "config": {
        "tts": {
            "host": "your.tts.host.local",
            "port": 8080
        }
    }
}
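The host and port saved above are what a client needs to reach the server. As a rough illustration only, the sketch below builds the URL for one sentence. It assumes the Coqui tts-server answers GET requests on /api/tts with a text query parameter; buildCoquiUrl is a hypothetical helper and not part of Stack-chan.

```javascript
// Hypothetical helper (not part of Stack-chan): builds the URL a client
// would request from tts-server for one sentence.
// Assumption: the server exposes GET /api/tts?text=... and returns audio.
function buildCoquiUrl(tts, text) {
  if (!tts || !tts.host || !tts.port) {
    throw new Error('config.tts.host and config.tts.port are required')
  }
  // encodeURIComponent escapes spaces and non-ASCII characters (e.g. Japanese text)
  return `http://${tts.host}:${tts.port}/api/tts?text=${encodeURIComponent(text)}`
}

const url = buildCoquiUrl({ host: 'your.tts.host.local', port: 8080 }, 'Hello World.')
// url === 'http://your.tts.host.local:8080/api/tts?text=Hello%20World.'
```

Opening such a URL in a browser is a quick way to confirm that the server is reachable before flashing anything.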

ElevenLabs TTS

Follow the ElevenLabs API key guide and obtain an API key.
Set the API key in config.tts of stackchan/manifest_local.json.

{
    "config": {
        "tts": {
            "type": "elevenlabs",
            "token": "YOUR_API_KEY"
        }
    }
}
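To show where the token ends up, here is a minimal sketch of the HTTP request for one sentence. It assumes the public ElevenLabs REST endpoint POST /v1/text-to-speech/{voice_id} with the key sent in the xi-api-key header. buildElevenLabsRequest and the voice ID value are hypothetical placeholders, and how Stack-chan builds the request internally is not described here.

```javascript
// Hypothetical helper (not part of Stack-chan): describes the HTTP request
// for one sentence without sending it.
// Assumption: ElevenLabs accepts POST /v1/text-to-speech/{voice_id}
// with the API key in the `xi-api-key` header and a JSON body holding `text`.
function buildElevenLabsRequest(tts, voiceId, text) {
  if (!tts || tts.type !== 'elevenlabs' || !tts.token) {
    throw new Error('config.tts.type must be "elevenlabs" and config.tts.token must be set')
  }
  return {
    method: 'POST',
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    headers: {
      'xi-api-key': tts.token,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ text }),
  }
}

// 'YOUR_VOICE_ID' is a placeholder; pick a real voice ID from your account.
const request = buildElevenLabsRequest(
  { type: 'elevenlabs', token: 'YOUR_API_KEY' },
  'YOUR_VOICE_ID',
  'Hello World.'
)
```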

Usage(Pregenerated)

Write down the sentences to be spoken in the format below (see mods/monologue/speeches_monologue.js and other examples)

// speeches.js
export const speeches = {
  niceToMeetYou: 'Hello. I am Stack-chan. Nice to meet you.',
  hello: 'Hello World.',
  konnichiwa: 'Konnichiwa.',
  nihao: 'Nee hao.',
}

Run npm run generate-speech-[google|coqui|voicevox]
- This script gets the voice data from the server and saves wave files under stackchan/assets/sounds
Flash the firmware with the assets
Call Robot#say(sentence: string) with the key of the sentence defined in speeches.js.

import { speeches } from 'speeches'
const keys = Object.keys(speeches)

export async function onRobotCreated(robot) {
  await robot.say('hello')
  await robot.say(keys[0] /* 'niceToMeetYou' */)
}
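Building on the example above, a quick way to check that every pregenerated clip was flashed is to play them all in order. This is a minimal sketch: sayAll is a hypothetical helper, and it assumes that the promise returned by robot.say settles once playback of that clip has finished.

```javascript
// Hypothetical helper (not part of Stack-chan): plays every pregenerated
// speech once, in the order the keys appear in speeches.js.
// Assumption: robot.say(key) returns a promise that resolves after playback.
async function sayAll(robot, speeches) {
  const spoken = []
  for (const key of Object.keys(speeches)) {
    // Wait for each clip before starting the next so they do not overlap
    await robot.say(key)
    spoken.push(key)
  }
  // Return the keys that were played, which is handy for logging
  return spoken
}
```

Inside onRobotCreated this would be called as `await sayAll(robot, speeches)`.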

Usage(Remote)

Set config.tts.type according to your TTS server in manifest_local.json

{
    "config": {
        "tts": {
            "type": "remote",
            "host": "your.tts.host.local",
            "port": 8080
        }
    }
}

Call Robot#say(sentence: string)

// ...
export async function onRobotCreated(robot) {
  await robot.say('Now I can speak any sentence you want.')
}
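Because remote mode depends entirely on the values in manifest_local.json, it can help to sanity-check them on the host machine before flashing. The following is a minimal sketch for the "remote" type shown in the first step; checkRemoteTtsConfig is a hypothetical helper, and the rules it applies are assumptions based only on the example manifest.

```javascript
// Hypothetical helper (not part of Stack-chan): returns a list of problems
// found in config.tts for the "remote" type. An empty list means the
// fields used in the example manifest are all present and plausible.
function checkRemoteTtsConfig(tts) {
  if (tts === null || typeof tts !== 'object') {
    return ['config.tts is missing']
  }
  const problems = []
  if (tts.type !== 'remote') {
    problems.push('config.tts.type should be "remote" for this check')
  }
  if (typeof tts.host !== 'string' || tts.host.length === 0) {
    problems.push('config.tts.host must be a non-empty string')
  }
  if (!Number.isInteger(tts.port) || tts.port < 1 || tts.port > 65535) {
    problems.push('config.tts.port must be an integer between 1 and 65535')
  }
  return problems
}

const problems = checkRemoteTtsConfig({
  type: 'remote',
  host: 'your.tts.host.local',
  port: 8080,
})
// problems is an empty array for the example manifest
```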
