frank1003A/Google-stt-websocket-server

WebSocket Speech Recognition Server

This repository contains a WebSocket server that uses Google Cloud Speech-to-Text to transcribe, in real time, audio streamed from connected WebSocket clients. The server is built with Express and WebSocket.

Overview

This WebSocket server listens for audio data streams from clients, processes the audio using Google Cloud Speech-to-Text, and sends back real-time transcriptions to the client. The server supports multiple language codes and dynamically adjusts to each client's selected language.
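
As a minimal sketch of how a per-client language might be read from the connection URL: the query parameter name `lang`, the `en-US` fallback, and the helper `parseLanguageCode` are all assumptions for illustration, not taken from this repository's code.

```javascript
// Hypothetical helper: the "lang" parameter name and the "en-US" fallback
// are assumptions, not this repository's actual contract.
const DEFAULT_LANGUAGE = "en-US";

function parseLanguageCode(requestUrl) {
  // On a WebSocket upgrade the request URL is only a path ("/?lang=fr-FR"),
  // so the WHATWG URL parser needs a dummy base to resolve it against.
  const { searchParams } = new URL(requestUrl ?? "/", "ws://localhost");
  const lang = searchParams.get("lang");
  // Accept simple BCP-47 shapes such as "en", "en-US" or "zh-Hant-TW";
  // anything else falls back to the default.
  const looksValid = lang !== null && /^[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*$/.test(lang);
  return looksValid ? lang : DEFAULT_LANGUAGE;
}

console.log(parseLanguageCode("/?lang=fr-FR")); // fr-FR
console.log(parseLanguageCode("/"));            // en-US
```

Validating the value before passing it on keeps malformed input from reaching the recognition request.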

Features

  • Real-time Transcription: Transcribes audio from WebSocket clients as it arrives.
  • Language Support: Specify language codes in the WebSocket connection URL.
  • Configurable Server: Easily configurable for local development and production.
  • Auto-reconnect Handling: Automatically restarts the recognition stream when connections close or errors occur.
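
The auto-restart behaviour could be sketched roughly as follows. This is an illustration under assumptions, not the server's actual implementation: `keepStreamAlive` is a hypothetical helper, and `createStream` stands for any factory that returns an object with `on(event, handler)` and `write(chunk)`, such as a wrapper around a streaming recognition call.

```javascript
// Hypothetical sketch: createStream is an assumed factory returning an object
// with .on(event, handler) and .write(chunk).
function keepStreamAlive(createStream, onTranscript) {
  let current = null;
  let stopped = false;

  function start() {
    const stream = createStream();
    current = stream;
    // Restart at most once per stream, even if "error" is followed by "close".
    let restarted = false;
    const restart = () => {
      if (stopped || restarted) return;
      restarted = true;
      start();
    };
    stream.on("data", onTranscript);
    stream.on("error", restart);
    stream.on("close", restart);
  }

  start();
  return {
    // Audio chunks always go to the most recently created stream.
    write: (chunk) => current.write(chunk),
    // Call when the client disconnects so no further streams are created.
    stop: () => { stopped = true; },
  };
}
```

This sketch restarts immediately. A production server would probably add a delay or retry limit so that a persistent failure does not cause a tight restart loop.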

Setup and Installation

Prerequisites

Local Development

  1. Clone the Repository:

    git clone https://github.com/frank1003A/Google-stt-websocket-server.git
    cd Google-stt-websocket-server
  2. Install Dependencies:

    npm install
  3. Configure Environment Variables: Create a .env file in the root of the project with the necessary environment variables (see Environment Variables below).

  4. Run the Server:

    npm start
  5. The server will start on http://localhost:8080 by default, or on the port specified in your .env file.

Production Deployment

To run this server in production, deploy it to any Node.js-compatible hosting service, such as Render or Heroku.

General Deployment Steps

  1. Set up your environment variables in your hosting provider's dashboard.
  2. Deploy the repository and set the start command to npm start.

Environment Variables

The server requires the following environment variables:

  • PORT: (Optional) The port on which the server will run. Defaults to 8080.
  • GOOGLE_APPLICATION_CREDENTIALS: JSON string of your Google Cloud Speech-to-Text service account credentials.

Example .env file:

PORT=8080
GOOGLE_APPLICATION_CREDENTIALS='{"type": "service_account", "project_id": "your-project-id", ... }'
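
A minimal sketch of how these two variables might be read at startup. The helper name `loadConfig` is hypothetical. Google's client libraries normally treat GOOGLE_APPLICATION_CREDENTIALS as a file path, so a server that takes a JSON string here presumably parses it itself; that is an assumption about this project, not something confirmed from its source.

```javascript
// Sketch only: loadConfig is a hypothetical helper that reads the two
// variables described above from an env-like object.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.PORT ?? "", 10);

  let credentials;
  try {
    // Assumes the variable holds the service account JSON itself, not a path.
    credentials = JSON.parse(env.GOOGLE_APPLICATION_CREDENTIALS ?? "");
  } catch {
    throw new Error("GOOGLE_APPLICATION_CREDENTIALS must be a JSON string");
  }

  return {
    // Fall back to 8080 when PORT is missing or not a positive integer.
    port: Number.isInteger(port) && port > 0 ? port : 8080,
    credentials,
  };
}

const config = loadConfig({
  PORT: "3000",
  GOOGLE_APPLICATION_CREDENTIALS: '{"type": "service_account", "project_id": "your-project-id"}',
});
console.log(config.port);                   // 3000
console.log(config.credentials.project_id); // your-project-id
```

Failing at startup on malformed credentials surfaces a configuration mistake before any client connects.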
