
docs: add documentation about transcription uploads options #13

Open
lkiesow wants to merge 5 commits into namastexlabs:main from lkiesow:parameter-docs


@lkiesow lkiesow commented Feb 25, 2026

Contributor

Most options for the /v1/transcript endpoint to start a new transcription are undocumented. This patch adds documentation for all of the current request parameters.
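To make the scope concrete, here is a minimal sketch of a request that sets a few of the parameters named later in this thread (`temperature`, `no_speech_threshold`, `vad_onset`, `vad_offset`) with the defaults quoted there. The host, the JSON body encoding, and the `audio_url` input field are assumptions for illustration and are not confirmed by the PR; the README remains the authority on the request format.

```python
import json
import urllib.request

# Defaults quoted in this review thread; other parameters are omitted.
DOCUMENTED_DEFAULTS = {
    "temperature": 0.0,
    "no_speech_threshold": 0.6,
    "vad_onset": 0.5,
    "vad_offset": 0.363,
}


def build_transcript_request(base_url, input_params, overrides=None):
    """Build (but do not send) a POST request for /v1/transcript.

    Assumption: the service accepts a JSON body. It may expect
    multipart form data instead; check the README before relying on this.
    """
    payload = {**DOCUMENTED_DEFAULTS, **input_params, **(overrides or {})}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/transcript",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and input field name, for illustration only.
request = build_transcript_request(
    "http://localhost:8000",
    {"audio_url": "https://example.com/talk.mp3"},
    overrides={"temperature": 0.2},
)
```

The request is only constructed here, not sent, so the sketch can be inspected without a running service.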

Summary by CodeRabbit

  • Documentation
    • Added comprehensive "Parameters" section to the README documenting all API inputs and their descriptions across multiple categories: Input, Language & Task, Speaker Diarization, Decoding, Timestamps & Alignment, VAD, and Subtitle Formatting, with detailed parameter tables showing type, default values, and descriptions.

@gemini-code-assist


Summary of Changes

Hello @lkiesow, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request significantly enhances the API documentation by detailing the various parameters available for the /v1/transcript endpoint. The changes provide clarity on how to utilize the transcription service's advanced features, covering everything from input methods and language settings to speaker diarization, decoding options, and webhook configurations. This update aims to improve usability and reduce ambiguity for developers integrating with the API.

Highlights

  • API Documentation: Added comprehensive documentation for all request parameters of the /v1/transcript endpoint, which were previously undocumented.
  • Parameter Categorization: Organized the new documentation into logical categories such as Input, Language & Task, Speaker Diarization, Decoding, Timestamps & Alignment, VAD, Subtitle Formatting, and Webhook parameters.
Changelog
  • README.md
    • Added detailed tables documenting all request parameters for the /v1/transcript endpoint.
    • Categorized parameters for better readability and understanding.
Activity
  • No review comments or activity have been recorded on this pull request yet.

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request adds comprehensive documentation for the transcription endpoint parameters, which is a great improvement for usability. The new documentation is well-structured and aligns with the implementation. I've added a few suggestions to enhance the clarity of some of the more technical decoding parameters.

Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
@lkiesow lkiesow changed the title Add documentation about transcription uploads options docs: add documentation about transcription uploads options Feb 25, 2026
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@coderabbitai

coderabbitai Bot commented Mar 1, 2026


Warning

Rate limit exceeded

@lkiesow has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 24 minutes and 20 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between f6da439 and f001a4c.

📒 Files selected for processing (1)
  • README.md

📝 Walkthrough

The PR adds a comprehensive "Parameters" section to README.md documenting API inputs across eight categories: Input, Language & Task, Speaker Diarization, Decoding, Timestamps & Alignment, VAD, Subtitle Formatting, and Webhook. Each category includes parameter tables with type, default value, and description details.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation: `README.md` | Added 79 lines introducing a "Parameters" section with detailed tables documenting API inputs, their types, defaults, and descriptions across eight functional categories. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 A Parameters guide so neat,
Eight categories, documentation sweet,
API inputs now crystal clear,
Tables structured, far and near,
Users rejoice, no more to fear! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: adding documentation for transcription upload options. It directly aligns with the PR's primary objective of documenting /v1/transcript endpoint parameters. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
README.md (2)

248-254: Add a webhook security note near these parameters.

Consider adding a brief note to prefer HTTPS endpoints and treat webhook_auth_header as a secret. This helps prevent accidental insecure deployments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 248 - 254, Add a short security note next to the
webhook parameters (`webhook_url`, `webhook_auth_header`) advising to use HTTPS
endpoints and to treat `webhook_auth_header` as a secret (do not commit/store in
plaintext), and suggest rotating credentials and validating payloads if
possible; insert this note immediately after the Webhook table so readers see
security guidance alongside the parameter definitions.
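The advice above can be sketched as a small client-side check. This is an illustration of the reviewer's recommendations (HTTPS endpoint, secret kept out of source), not behaviour the API is known to enforce; the function name and the `WEBHOOK_AUTH_HEADER` environment variable are hypothetical.

```python
import os
from urllib.parse import urlparse


def check_webhook_config(webhook_url, webhook_auth_header):
    """Return a list of warnings for a webhook configuration.

    The checks mirror the reviewer's advice; they are not rules
    enforced by the API itself.
    """
    warnings = []
    parsed = urlparse(webhook_url)
    if parsed.scheme != "https":
        warnings.append("webhook_url should use HTTPS")
    if not parsed.netloc:
        warnings.append("webhook_url has no host")
    if not webhook_auth_header:
        warnings.append("webhook_auth_header is empty")
    return warnings


# Read the secret from the environment rather than hard-coding it.
# WEBHOOK_AUTH_HEADER is a hypothetical variable name.
auth_header = os.environ.get("WEBHOOK_AUTH_HEADER", "")
problems = check_webhook_config("http://example.com/hook", auth_header)
```

Returning warnings instead of raising lets a caller decide whether a plain-HTTP webhook is acceptable, for example on a local test network.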

185-239: Document value ranges and key constraints for decoding/VAD parameters.

The tables are comprehensive, but for several numeric fields users still won't know valid bounds (for example, 0–1 thresholds) or common constraints. Adding ranges/constraints here would reduce invalid requests and support churn.

📝 Suggested doc patch
 **Decoding:**
@@
 | `temperature` | float | `0.0` | Sampling temperature (`0` = greedy decoding) |
@@
 | `no_speech_threshold` | float | `0.6` | If the no-speech probability is higher than this value, the segment is considered silent |
@@
+Notes:
+- `temperature`, `temperature_increment_on_fallback` should be non-negative.
+- `no_speech_threshold` is typically in `[0.0, 1.0]` (higher = more aggressive silence filtering).
+- `logprob_threshold` and `compression_ratio_threshold` are advanced filters; defaults are recommended unless tuning for a specific dataset.
+
 **VAD (Voice Activity Detection):**
@@
 | `vad_onset` | float | `0.5` | VAD speech onset threshold |
 | `vad_offset` | float | `0.363` | VAD speech offset threshold |
@@
+Notes:
+- `vad_onset` / `vad_offset` are probability thresholds, typically in `[0.0, 1.0]`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 185 - 239, Add explicit valid ranges and key
constraints for numeric and enumerated parameters shown (e.g., temperature,
temperature_increment_on_fallback, beam_size, best_of, patience, length_penalty,
logprob_threshold, compression_ratio_threshold, no_speech_threshold, vad_onset,
vad_offset, chunk_size, speakers_expected/min_speakers/max_speakers, and boolean
flags like word_timestamps/return_char_alignments); for each parameter include
allowed type, min/max or accepted values (e.g., temperature 0.0–1.0, beam_size
>=1, best_of >=1, patience >=0, compression_ratio_threshold >=1.0,
logprob_threshold typically <=0, vad_onset/vad_offset in 0.0–1.0, chunk_size >0
and recommended max like 30s, segment_resolution ∈ {sentence, chunk}), note any
inter-parameter constraints (e.g., max_speakers >= min_speakers,
speakers_expected optional when diarization disabled), and mark defaults and
recommended typical ranges to prevent invalid requests.
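
As a rough illustration of the constraints listed above, the following sketch validates a parameter dictionary before a request is made. The bounds are the reviewer's suggestions from this thread, not limits confirmed against the service. Where the thread gives both "non-negative" and "0.0–1.0" for `temperature`, the looser bound is used. The function name is hypothetical.

```python
def validate_params(params):
    """Return a list of constraint violations for a parameter dict.

    Bounds follow the reviewer's suggestions in this thread; the
    service may accept a wider or narrower range.
    """
    errors = []

    def check(name, ok, message):
        # Only parameters that are actually present are checked.
        if name in params and not ok(params[name]):
            errors.append(f"{name} {message}")

    for name in ("temperature", "temperature_increment_on_fallback", "patience"):
        check(name, lambda v: v >= 0, "must be non-negative")
    for name in ("no_speech_threshold", "vad_onset", "vad_offset"):
        check(name, lambda v: 0.0 <= v <= 1.0, "must be in [0.0, 1.0]")
    for name in ("beam_size", "best_of"):
        check(name, lambda v: v >= 1, "must be >= 1")
    check("compression_ratio_threshold", lambda v: v >= 1.0, "must be >= 1.0")
    check("chunk_size", lambda v: v > 0, "must be > 0")

    # Inter-parameter constraint: the speaker range must not be inverted.
    lo, hi = params.get("min_speakers"), params.get("max_speakers")
    if lo is not None and hi is not None and hi < lo:
        errors.append("max_speakers must be >= min_speakers")
    return errors


errors = validate_params(
    {"temperature": 0.0, "vad_onset": 1.5, "min_speakers": 3, "max_speakers": 2}
)
```

Collecting every violation in one pass, rather than stopping at the first, gives callers a complete list to fix before resubmitting.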

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

πŸ“₯ Commits

Reviewing files that changed from the base of the PR and between 907e7a2 and f6da439.

πŸ“’ Files selected for processing (1)
  • README.md

lkiesow and others added 3 commits March 2, 2026 00:11
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

2 participants