WebSocket Speech Recognition Server

This repository contains a WebSocket server that uses Google Cloud Speech-to-Text to transcribe, in real time, audio streamed from connected WebSocket clients. The server is built with Express and WebSocket.

Overview

This WebSocket server listens for audio data streams from clients, processes the audio using Google Cloud Speech-to-Text, and sends back real-time transcriptions to the client. The server supports multiple language codes and dynamically adjusts to each client's selected language.
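
As an illustration of per-client language selection, the sketch below reads a language code from the connection URL. This README does not say where in the URL the code goes, so the query parameter name `lang`, the `en-US` fallback, and the helper name `languageFromUrl` are all assumptions made for this example, not the repository's actual code.

```javascript
// Hypothetical helper: the "lang" query parameter and the "en-US" fallback
// are assumptions for illustration, not taken from this repository.
const DEFAULT_LANGUAGE = 'en-US';

function languageFromUrl(requestUrl, fallback = DEFAULT_LANGUAGE) {
  // On an upgrade request, req.url is a path such as "/?lang=fr-FR",
  // so the WHATWG URL parser needs a dummy base to resolve it.
  const url = new URL(requestUrl || '/', 'ws://localhost');
  const lang = url.searchParams.get('lang');
  // Accept simple BCP-47 style tags such as "en", "en-US", or "zh-Hant-TW".
  const looksValid = lang !== null && /^[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*$/.test(lang);
  return looksValid ? lang : fallback;
}

console.log(languageFromUrl('/?lang=fr-FR')); // fr-FR
console.log(languageFromUrl('/'));            // en-US
```

Under these assumptions, a client would connect to something like ws://localhost:8080/?lang=fr-FR, and the returned value would be used as the language code for that client's recognition request.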

Features

Real-time Transcription: Transcribes audio data from WebSocket clients in real-time.
Language Support: Specify language codes in the WebSocket connection URL.
Configurable Server: Easily configurable for local development and production.
Auto-reconnect Handling: Automatically restarts the recognition stream when connections close or errors occur.
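
The restart behaviour described in the last feature could be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: `keepStreamAlive`, `createStream`, and the `maxRestarts` cap are hypothetical names introduced here, and the helper only assumes that the stream exposes an EventEmitter-style `on(event, handler)` method.

```javascript
// Hypothetical helper: createStream is any function that returns an object
// with an EventEmitter-style on(event, handler) method.
function keepStreamAlive(createStream, { maxRestarts = 5 } = {}) {
  let restarts = 0;
  let current = null;
  let stopped = false;

  function start() {
    const stream = createStream();
    current = stream;
    // 'error' is often followed by 'close', so restart only once per stream.
    let handled = false;

    const restart = () => {
      if (handled || stopped) return;
      handled = true;
      if (restarts >= maxRestarts) return; // give up instead of looping forever
      restarts += 1;
      start();
    };

    stream.on('error', restart);
    stream.on('close', restart);
  }

  start();

  return {
    get stream() { return current; },     // the stream currently in use
    get restarts() { return restarts; },  // how many restarts have happened
    stop() { stopped = true; },           // call when the client disconnects
  };
}
```

In a server like this one, `createStream` would presumably wrap a call to the Speech-to-Text client's `streamingRecognize` method, and `stop()` would be called when the WebSocket client disconnects so that no further streams are opened. The restart cap is a design choice for the sketch: it keeps a persistent failure (for example, invalid credentials) from causing an endless restart loop.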

Setup and Installation

Prerequisites

Node.js (version 14+)
Google Cloud account with Speech-to-Text enabled
Google Cloud service account credentials JSON

Local Development

Clone the Repository:

git clone https://github.com/your-username/hng-websocket-server.git
cd hng-websocket-server

Install Dependencies:
```
npm install
```
Configure Environment Variables: Create a .env file in the root of the project with the necessary environment variables (see Environment Variables below).
Run the Server:
```
npm start
```
The server will start on http://localhost:8080 by default, or on the port specified in your .env file.

Production Deployment

To deploy this server to a production environment, you can use services like Render, Heroku, or any other Node.js-compatible hosting service.

General Deployment Steps

Set up your environment variables in your hosting provider's dashboard.
Deploy the repository and set the start command to npm start.

Environment Variables

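The server uses the following environment variables:

PORT: (Optional) The port on which the server will run. Defaults to 8080.
GOOGLE_APPLICATION_CREDENTIALS: (Required) Your Google Cloud Speech-to-Text service account credentials as a JSON string, that is, the contents of the credentials file rather than a path to it. This differs from the usual Google Cloud convention, where this variable holds a file path.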

Example .env file:

PORT=8080
GOOGLE_APPLICATION_CREDENTIALS='{"type": "service_account", "project_id": "your-project-id", ... }'
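
To make the handling of these two variables concrete, here is a minimal sketch of how they might be read and validated at startup. `loadConfig` is a hypothetical helper written for this README, not code from the repository; it assumes only what is stated above, namely that PORT defaults to 8080 and that GOOGLE_APPLICATION_CREDENTIALS holds a JSON string.

```javascript
// Sketch only: loadConfig is a hypothetical helper, not code from this repository.
function loadConfig(env = process.env) {
  // PORT is optional and defaults to 8080, as documented above.
  const port = Number(env.PORT || 8080);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  if (!env.GOOGLE_APPLICATION_CREDENTIALS) {
    throw new Error('GOOGLE_APPLICATION_CREDENTIALS is not set');
  }

  // The variable holds the credentials JSON itself, not a file path.
  let credentials;
  try {
    credentials = JSON.parse(env.GOOGLE_APPLICATION_CREDENTIALS);
  } catch (err) {
    throw new Error('GOOGLE_APPLICATION_CREDENTIALS is not valid JSON');
  }

  return { port, credentials };
}
```

Failing fast here gives a clear error at startup rather than on the first transcription request. The parsed `credentials` object is the kind of value that can be passed to the Speech-to-Text client through its `credentials` option; whether this server does so is not stated here.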
