Adding TTS options
kkacsh321 committed Oct 12, 2024
1 parent 3948c1b commit 3ead49b
Showing 11 changed files with 306 additions and 14 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -5,3 +5,4 @@
.DS_Store
./models
models/*
temp_audio.wav
108 changes: 106 additions & 2 deletions README.md
@@ -12,9 +12,15 @@ Interact with a hosted version of this app live at [<https://robotf.ai/Halloween
- [Features](#features-️🕯️)
- [Getting Started](#getting-started-🧹)
- [Docker Compose with LocalAI](#option-1-local-ai-with-docker-compose-🖤)
- [Docker from DockerHub](#option-2-docker-hub-container-👻)
- [Direct Python Development](#option-3-local-development-👨‍💻)
- [Docker from DockerHub](#option-2-docker-hub-container-👻)
- [Direct Python Development](#option-3-local-development-👨‍💻)
- [Running the App](#running-the-app)
- [OpenAI](#openai)
- [Docker Compose](#docker-compose)
- [Using LocalAI/LMStudio/Ollama/etc locally](#using-localailmstudioollamaetc-locally)
- [Using a custom endpoint URL](#using-a-custom-endpoint-url)
- [Development Setup](#development-setup)
- [Text to Speech](#text-to-speech)
- [Contact](#contact)
- [Contributing](#contributing-👥)
- [License](#license-📜)
@@ -24,17 +24,30 @@ Interact with a hosted version of this app live at [<https://robotf.ai/Halloween

Welcome to the eerie realm of the Spooky Streamlit Storyteller! This is no ordinary codebase; it's a haunted mansion of horror stories, where AI and LLMs (Large Language Models) come together to weave chilling tales that will send shivers down your spine. If you're brave enough to conjure up a streaming app with Streamlit that generates spooky Halloween stories, you've just unlocked the creaky front door.

In reality, this is just an example of how to integrate LLMs with Streamlit using Python, LangChain, requests, and even LocalAI (if you don't want to spend money on OpenAI credits). It's a demo of what's possible.
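
As a rough, minimal sketch of that integration (modeled on the app code further down in this commit, with placeholder endpoint, key, and model values rather than the app's real settings), the core streaming pattern looks something like this:

```python
# Minimal sketch: stream a chat completion from an OpenAI-compatible endpoint into Streamlit.
# Run with `streamlit run sketch.py`. The endpoint, key, and model below are placeholders.
import streamlit as st
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:8080/v1",  # LocalAI / LM Studio / Ollama-style endpoint
    openai_api_key="1234",                # most local servers accept any non-empty key
    model="gpt-4",                        # whatever model name your server exposes
    streaming=True,
)

prompt = [{"role": "user", "content": "Tell me a short spooky Halloween story."}]

# st.write_stream consumes the chunk generator and returns the accumulated text.
story = st.write_stream(llm.stream(prompt))
```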

## About the Project 👻

This project is a digital ouija board, channeling the supernatural power of AI to craft horror stories that are as dynamic as they are dreadful. With Streamlit's enchanting capabilities, we've bewitched an app that streams terror with the grace of a ghost gliding through the night.

## Features 🕯️

AI-Powered Storytelling: Summon the spirits of AI to generate tales of terror on the fly.

Bring your own AI/LLM with LocalAI (or another custom OpenAI-compatible API), or use OpenAI.

Interactive UI: Choose your own adventure by selecting story elements that shape your frightening fable.

Real-time Streaming: Experience the horror unfold in real-time as the story mutates before your terrified eyes.

Text to Speech: Don't just read the story, hear it told to you using TTS on OpenAI or LocalAI.

Halloween Humor: Because even in the darkest depths, a chuckle can be the most terrifying sound.

![application](images/app.png)

## Getting Started 🧹

Choose your path to horror story glory with one of these three enchanting options:
@@ -129,6 +146,84 @@ OR

Set your key for OpenAI, or a custom address for your OpenAI-compatible LLM API endpoint.

## Running the App

This depends on which API provider you are going to use:

Set your provider-specific settings (see the basic guides below)

![settings](images/settings.png)

Hit the `Generate Story` button

![story](images/story.png)

If you want to hear the story spoken to you, hit the `Speak it to Me` button

![text-to-speech](images/text-to-speech.png)

### OpenAI

Set your OpenAI API Key at the top left

Select your LLM model (gpt-4)

Select your TTS model (tts-1)

Select your voice (you can change this later to try multiple voices)

Hit the `Generate Story` button and watch it go.

Once the story is done generating, you can hit the `Speak it to Me` button to generate and play the text-to-speech audio.
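
For reference, the `Speak it to Me` button boils down to an OpenAI speech request roughly like the sketch below (the key, voice, input text, and output filename are placeholders, not the app's exact values):

```python
# Sketch of the OpenAI text-to-speech call made when you hit `Speak it to Me`.
# Assumes your key is in the OPENAI_API_KEY environment variable.
import os
import requests

response = requests.post(
    "https://api.openai.com/v1/audio/speech",
    headers={
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "tts-1",
        "voice": "onyx",  # any of the voices in the sidebar dropdown
        "input": "A chill wind rattled the server racks...",
        "response_format": "wav",
    },
    timeout=60,
)
response.raise_for_status()

# The response body is the audio itself; the app writes it to temp_audio.wav.
with open("story.wav", "wb") as f:
    f.write(response.content)
```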

### Docker Compose

Leave the OpenAI API Key at the top left blank

Select the `http://localai:8080/v1` endpoint (internal docker networking)

Select your LLM model (example: gpt-4)

Select your TTS model (example: tts-1)

Select your voice (you can change this later to try multiple voices)

Hit the `Generate Story` button and watch it go.

Once the story is done generating, you can hit the `Speak it to Me` button to generate and play the text-to-speech audio.
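
If you want to confirm the LocalAI container is reachable before generating, a quick check (assuming your compose file publishes LocalAI on host port 8080) is to query the same `/v1/models` endpoint the app uses to populate its model dropdown:

```python
# Sanity check: list the models LocalAI exposes, mirroring the app's get_llm_models() helper.
# Adjust the URL if your docker-compose setup publishes a different host port.
import requests

resp = requests.get(
    "http://localhost:8080/v1/models",
    headers={"Authorization": "Bearer 1234"},  # any value works unless your server enforces a key
    timeout=10,
)
resp.raise_for_status()
print([model["id"] for model in resp.json().get("data", [])])
```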

### Using LocalAI/LMStudio/Ollama/etc locally

Leave the OpenAI API Key at the top left blank

Select the `http://localhost:8080/v1` endpoint, or enter your local server's URL in the `Or Use Another API` field

Select your LLM model (example: gpt-4)

Select your TTS model (example: tts-1)

Select your voice (you can change this later to try multiple voices)

Hit the `Generate Story` button and watch it go.

Once the story is done generating, you can hit the `Speak it to Me` button to generate and play the text-to-speech audio.

### Using a custom endpoint URL

Leave the OpenAI API Key at the top left blank (or set it if your endpoint requires a key)

Enter your custom endpoint URL in the `Or Use Another API` field in the sidebar

Select your LLM model (example: gpt-4)

Select your TTS model (example: tts-1)

Select your voice (you can change this later to try multiple voices)

Hit the `Generate Story` button and watch it go.

Once the story is done generating, you can hit the `Speak it to Me` button to generate and play the text-to-speech audio.

## Development Setup

This repo uses tools such as pre-commit, Task, and Homebrew (for Mac)
@@ -169,6 +264,15 @@ with just plain streamlit
streamlit run RoboTF_Halloween_Stories.py
```

## Text to Speech

For OpenAI, select the `tts-1` model.

For LocalAI, if you want extra voices, copy `voice_models_localai/tts-1.yaml` into the `models/` directory and start up (or restart) the LocalAI container.
This uses Piper under the hood with LocalAI.

Then you should be able to use the full selection of voices in the menu.
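
For reference, the LocalAI path in this commit sends a Piper-style request to the `/tts` endpoint; a standalone sketch (host, port, and voice name are assumptions, so use whichever voice you enabled via `tts-1.yaml`) looks like:

```python
# Sketch of the LocalAI TTS call (Piper backend), mirroring text_to_speech() in this commit.
import requests

resp = requests.post(
    "http://localhost:8080/tts",          # the app swaps "/v1" for "/tts" on non-OpenAI endpoints
    headers={"Content-Type": "application/json"},
    json={
        "backend": "piper",
        "model": "en-us-ryan-high.onnx",  # the app appends ".onnx" to the selected voice
        "input": "Welcome to the haunted data center.",
    },
    timeout=120,
)
resp.raise_for_status()

with open("temp_audio.wav", "wb") as f:  # same temp file the app writes and plays back
    f.write(resp.content)
```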

## Contact

<[email protected]>
165 changes: 155 additions & 10 deletions RoboTF_Halloween_Stories.py
@@ -1,10 +1,15 @@
import emoji
import io
import re
import requests
import streamlit as st
from langchain_openai import ChatOpenAI
from requests.exceptions import RequestException
import requests

accumulated_story = ""

# Function to generate story stream
def generate_story_stream(api_key, endpoint, model, prompt):
def generate_story_stream(api_key, endpoint, model, prompt, on_complete_callback=None):
llm = ChatOpenAI(
base_url=endpoint,
openai_api_key=api_key,
@@ -13,9 +18,98 @@ def generate_story_stream(api_key, endpoint, model, prompt):
)

formatted_input = [{"role": "user", "content": prompt}]

if on_complete_callback:
on_complete_callback(accumulated_story)
return llm.stream(formatted_input)

# Function to handle the completion of the story generation
def on_story_complete(story):
# This function will be called once the story streaming is complete
# You can now use the accumulated_story variable as needed
global accumulated_story
accumulated_story = story
print("Story complete:", accumulated_story)

# Function to remove emojis and special characters for better TTS
def remove_emojis(text):
# Remove emoji using the emoji library
text_without_emojis = emoji.replace_emoji(text, replace='')

# Remove asterisks using regex
text_without_asterisks = re.sub(r"\*", '', text_without_emojis)

# Remove quotes (single and double)
text_without_quotes = re.sub(r"[\"']", '', text_without_asterisks)

# Remove line breaks and extra spaces
text_without_linebreaks = text_without_quotes.replace("\n", " ").replace("\r", " ").strip()

# Remove special characters (except for alphanumeric and spaces)
clean_text = re.sub(r"[^a-zA-Z0-9\s]", '', text_without_linebreaks)

# Replace multiple spaces with a single space
final_cleaned_text = re.sub(r"\s+", ' ', clean_text)

return final_cleaned_text

# Function to get the TTS wav file
def text_to_speech(text, endpoint, api_key, tts_model, voice_selection):
"""
Convert text to speech using the provided TTS endpoint and model.
"""

headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json"
}

if 'api.openai.com' in endpoint:
# If it does, append '/audio/speech' to the endpoint
tts_endpoint = endpoint + '/audio/speech'
payload = {
"input": text,
"model": tts_model,
"voice": voice_selection,
"response_format": "wav"
}
else:
# If it does not, replace '/v1' with '/tts' for LocalAI
tts_endpoint = endpoint.replace("/v1", "/tts")
payload = {
"model": voice_selection+".onnx",
"backend": "piper",
"input": text
}

print(f"tts_endpoint: {tts_endpoint}")
print(f"tts_model: {tts_model}")
print(f"voice: {voice_selection}")
print(f"Payload: {payload}")

response = requests.post(tts_endpoint, headers=headers, json=payload)
print(f"request: {response}")
if response.status_code == 200:
audio_content = response.content
print(response)
print(response.status_code)

audio_content = response.content

with open('temp_audio.wav', 'wb') as f:
f.write(audio_content)

return io.BytesIO(audio_content)

else:
raise RequestException(f"TTS request failed with status code {response.status_code}")

# Function to play the audio
def play_audio(audio_bytes):
"""
Play audio directly within the Streamlit app.
"""
st.audio(audio_bytes, format='audio/wav', autoplay=True)

# Function to query models from LLM URL
def get_llm_models(llm_url, api_key):
headers = {
@@ -24,8 +118,8 @@ def get_llm_models(llm_url, api_key):

try:
response = requests.get(f"{llm_url}/models", headers=headers)
response.raise_for_status() # Raise an exception for HTTP errors
if response.status_code == 200:
response.raise_for_status() # Raise an exception for HTTP errors
if response.status_code == 200:
return [model['id'] for model in response.json().get('data', [])]
else:
st.error("Failed to fetch models. Status code: {response.status_code}")
@@ -39,14 +133,14 @@ def get_llm_models(llm_url, api_key):
st.error(f"An unexpected error occurred: {e}")
return []

def main():
def main():
# Streamlit app
st.title("RoboTF Halloween Story Generator")

st.image("images/robotf_halloween.jpg")
st.sidebar.title("Settings")
api_key = st.sidebar.text_input("OpenAI API Key (Leave Blank for LocalAI)", type="password", value="1234")
default_endpoint = st.sidebar.selectbox("Default Endpoint", ["https://api.openai.com/v1", "http://localai:8080/v1"], index=1)
api_key = st.sidebar.text_input("OpenAI API Key (Leave Blank for LocalAI unless API Key set on Server)", type="password", value="1234")
default_endpoint = st.sidebar.selectbox("Default Endpoint", ["https://api.openai.com/v1", "http://localai:8080/v1", "http://localhost:8080/v1"], index=1)

st.sidebar.write("Or Use Another API")

@@ -60,11 +154,42 @@ def main():
# Sidebar to select the LLM model
model = st.sidebar.selectbox("Select LLM Model", models)

tts_model = st.sidebar.selectbox("Select the TTS Model", models)

# Check if the endpoint contains 'api.openai.com'
if 'api.openai.com' in endpoint:
# If it does, append '/audio/speech' to the endpoint
voice_list = [
"alloy",
"echo",
"fable",
"onyx",
"nova",
"shimmer"
]
else:
# If it does not, replace '/v1' with '/tts'
voice_list = [
"en-us-amy-low",
"en-gb-alan-low",
"en-gb-southern_english_female-low",
"en-us-danny-low",
"en-us-kathleen-low",
"en-us-lessac-low",
"en-us-lessac-medium",
"en-us-libritts-high",
"en-us-ryan-high",
"en-us-ryan-low",
"en-us-ryan-medium",
]

voice_selection = st.sidebar.selectbox("Select the Voice", voice_list)

# Show default prompt and allow changes
st.write(':green[User Prompt]')
user_prompt = """Create a spooky Halloween tale where cutting-edge AI and powerful
hardware like GPUs and CPUs come to life. In this story, large language models (LLMs)
play a central role, but something goes wrong during testing, inference, power
play a central role, but something goes wrong during testing, inference, power
consumption or anything else that is AI related. Perhaps the models start
predicting strange, eerie outcomes, or the hardware begins to malfunction in ways no
one expected. The tale should blend technological horror with classic Halloween
@@ -93,8 +218,28 @@ def main():
if api_key and endpoint and model:
# Clear the story placeholder before generating a new story
story_placeholder.empty()
# Update the global accumulated_story variable
global accumulated_story
# Generate and stream the story into the placeholder
story_placeholder.write_stream(generate_story_stream(api_key, endpoint, model, prompt))
# Pass the on_story_complete function as a callback
accumulated_story = story_placeholder.write_stream(generate_story_stream(api_key, endpoint, model, prompt, on_complete_callback=on_story_complete))
# The on_story_complete function will be called with the full story content
print(accumulated_story)
st.session_state['accumulated_story'] = accumulated_story


if st.button("Speak It To Me"):
print("Speak it to Me Button Clicked")
# Retrieve the generated story
print(f"Full Story: {st.session_state['accumulated_story']}")
story_text = st.session_state['accumulated_story']
clean_text = remove_emojis(story_text)
print(f"Clean Text: {clean_text}")
st.text_area(':green[Generated Story:]', story_text, key="story_text", height=400)
# Convert the story to speech
audio_bytes = text_to_speech(clean_text, endpoint, api_key, tts_model, voice_selection)
# Play the audio
play_audio(audio_bytes)

if __name__ == "__main__":
main()
2 changes: 1 addition & 1 deletion Taskfile.yml
@@ -1,7 +1,7 @@
version: "3"

vars:
IMAGE_VERSION: "v0.0.3"
IMAGE_VERSION: "v0.0.4"
IMAGE_NAME: "robotf/robotf-halloween-stories"

tasks:
Binary file added images/app.png
Binary file added images/settings.png
Binary file added images/story.png
Binary file added images/text-to-speech.png
2 changes: 1 addition & 1 deletion package.json
@@ -1,4 +1,4 @@
{
"name": "robotf-halloween-stories",
"version": "v0.0.3"
"version": "v0.0.4"
}
