Speech to Text with OpenAI API

This script transcribes an audio file using OpenAI's Whisper model and optionally post-processes the transcription with GPT-4o for corrections. The transcription and corrected text are saved to text files.

Features

- Transcribes audio files to text using OpenAI's Whisper model.
- Optionally post-processes the transcription with GPT-4o to correct spelling and punctuation.
- Shows a progress bar while the audio file uploads.
- Saves the transcription and the corrected text to separate text files.

Supported Audio Formats

- MP3 (.mp3)
- MP4 (.mp4)
- MPEG (.mpeg)
- MPGA (.mpga)
- M4A (.m4a)
- WAV (.wav)
- WEBM (.webm)
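
A file can be checked against this list before any upload is attempted. The following is a minimal sketch, not part of the script itself; the helper name `isSupportedAudio` and the constant `SUPPORTED_EXTENSIONS` are assumptions introduced for illustration.

```javascript
const path = require('path');

// Extensions copied from the supported formats listed above.
const SUPPORTED_EXTENSIONS = new Set([
  '.mp3', '.mp4', '.mpeg', '.mpga', '.m4a', '.wav', '.webm',
]);

// Hypothetical helper: returns true when the file's extension
// (compared case-insensitively) is one of the supported formats.
function isSupportedAudio(filePath) {
  return SUPPORTED_EXTENSIONS.has(path.extname(filePath).toLowerCase());
}
```

Calling `isSupportedAudio('interview.m4a')` before transcription would let the script fail early with a clear message instead of waiting for the API to reject the file.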

Prerequisites

- Node.js (v14 or later)
- npm (Node package manager)

Installation

Clone the repository:

git clone https://github.com/o-Oby/speech-to-text.git
cd speech-to-text

Install dependencies:

npm install form-data axios readline-sync openai progress chalk

The fs and path modules are built into Node.js and do not need to be installed.

Setup

Update API Key:

Open the transcribe_and_postprocess.js file and replace the placeholder API key with your actual OpenAI API key.
```
const configuration = new Configuration({
  apiKey: 'your-api-key-here', // Replace with your actual API key
});
```
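
As an alternative to editing the key into the source file, the key could be read from an environment variable. This is a sketch under assumptions: the variable name `OPENAI_API_KEY` and the helper `getApiKey` are illustrative and are not taken from the script.

```javascript
// Hypothetical helper: read the API key from the environment
// (assumed variable name: OPENAI_API_KEY) and fail early if it is missing.
function getApiKey(env = process.env) {
  const key = env.OPENAI_API_KEY;
  if (!key || key.trim() === '') {
    throw new Error('OPENAI_API_KEY is not set');
  }
  return key.trim();
}
```

With this in place, the configuration shown above could pass `apiKey: getApiKey()` instead of a literal string, which keeps the key out of version control.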

Update File Path:

In the transcribeFile function, set filePath to the location of your audio file.

const filePath = path.resolve('path/to/your/audio/file.m4a'); // Replace with your actual file path
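
To avoid editing the script for every new recording, the path could instead come from the command line. The sketch below assumes a hypothetical helper, `resolveAudioPath`, that uses the first command-line argument and falls back to the placeholder path; the script itself does not necessarily work this way.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: take the audio path from the first CLI argument,
// fall back to a default, and fail early if the file does not exist.
function resolveAudioPath(argv = process.argv, fallback = 'path/to/your/audio/file.m4a') {
  const candidate = argv[2] || fallback;
  const resolved = path.resolve(candidate);
  if (!fs.existsSync(resolved)) {
    throw new Error(`Audio file not found: ${resolved}`);
  }
  return resolved;
}
```

Under this assumption, `const filePath = resolveAudioPath();` would let the script be run as `node transcribe_and_postprocess.js recording.m4a`.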

Usage

Run the script:
```
node transcribe_and_postprocess.js
```
Follow the prompts:
- The script will ask if you want to post-process the transcription with GPT-4o.
- Respond with yes or no.
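
The yes/no answer can be normalized so that variations such as "Yes" or "y" are treated the same way. This is an illustrative sketch only; the helper name `wantsPostProcessing` is an assumption, and the answer is assumed to arrive as a string (for example from `readlineSync.question`).

```javascript
// Hypothetical helper: interpret the answer typed at the prompt.
// Only "yes" or "y" (case-insensitive, surrounding whitespace ignored)
// count as agreement; everything else is treated as "no".
function wantsPostProcessing(answer) {
  const normalized = String(answer).trim().toLowerCase();
  return normalized === 'yes' || normalized === 'y';
}
```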

Files

- transcription.txt: Contains the initial transcription of the audio file.
- corrected_transcription.txt: Contains the corrected transcription (if post-processed with GPT-4o).

Notes

File uploads are currently limited to 25 MB. Ensure your audio file size does not exceed this limit.
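
The size limit can be checked locally before uploading. The sketch below is not part of the script; `fitsUploadLimit` and `MAX_UPLOAD_BYTES` are hypothetical names, and treating 25 MB as 25 × 1024 × 1024 bytes is an assumption.

```javascript
const fs = require('fs');

// Assumption: 25 MB is interpreted as 25 * 1024 * 1024 bytes.
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024;

// Hypothetical helper: returns true when the file is small enough to upload.
function fitsUploadLimit(filePath, maxBytes = MAX_UPLOAD_BYTES) {
  const { size } = fs.statSync(filePath);
  return size <= maxBytes;
}
```

If the check fails, the audio would need to be compressed or split into smaller segments before transcription.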

License

This project is licensed under the MIT License.
