giriraj-singh-couchbase

This pull request introduces a new tutorial for using Couchbase Capella's AI Services auto-vectorization feature with LangChain, focusing on unstructured data workflows, especially data stored in S3 buckets. The changes add documentation and a runnable Jupyter notebook that walks users through deploying models, configuring workflows, importing unstructured data, and performing semantic vector search with LangChain.

The most important changes are:

Documentation and Tutorial Content:

  • Added a detailed README.md explaining prerequisites, installation steps, and a quick start guide for the auto-vectorization tutorial.
  • Added frontmatter.md to provide metadata and summary information for the tutorial, including title, description, tags, and estimated duration.

Jupyter Notebook Tutorial:

  • Introduced autovec_unstructured.ipynb, a step-by-step notebook covering:
    • Capella cluster and model deployment.
    • Setting up access control and API keys for AI models.
    • Creating and configuring auto-vectorization workflows to import and process unstructured data from S3.
    • Performing semantic search using vector embeddings and LangChain integration, with code samples and explanations.
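
For reviewers unfamiliar with the last step, here is a minimal, library-free sketch of what semantic search over vector embeddings amounts to: rank stored vectors by cosine similarity to a query vector. The toy three-dimensional vectors, the sample texts, and the `top_k` helper are hypothetical and for illustration only; the notebook relies on embeddings generated by Capella and on LangChain's Couchbase integration, neither of which is reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=2):
    """Return the k (score, text) pairs most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical documents with made-up embeddings (real ones have far more dimensions).
docs = [
    ("invoice for cloud storage", [0.9, 0.1, 0.0]),
    ("recipe for lentil soup", [0.0, 0.2, 0.9]),
    ("S3 bucket billing summary", [0.8, 0.3, 0.1]),
]

results = top_k([1.0, 0.2, 0.0], docs, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

In the tutorial itself, a vector store performs this ranking server-side; the sketch only shows the idea behind the similarity score.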

github-actions bot commented Sep 26, 2025

Caution

Notebooks or Frontmatter Files Have Been Modified

  • Please ensure that a frontmatter.md file accompanies the notebook file, and that the frontmatter is up to date.
  • These changes will be published to the developer portal tutorials only if frontmatter.md is included.
  • Proofread all changes before merging, as changes to notebook and frontmatter content will update the developer tutorial.

1 Notebook Files Modified:

Notebook File Frontmatter Included?
autovec_unstructured/autovec_unstructured.ipynb

1 Frontmatter Files Modified:

Frontmatter File
autovec_unstructured/frontmatter.md
Note: frontmatter will be checked and tested in the Test Frontmatter workflow.
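
The pairing rule described above can be checked locally before pushing. The following is only a sketch under stated assumptions: the helper name `notebooks_missing_frontmatter` is hypothetical, it looks only at a list of changed paths rather than at the repository contents, and it is not the logic of the actual Test Frontmatter workflow.

```python
import posixpath


def notebooks_missing_frontmatter(changed_paths):
    """Return notebooks whose directory has no frontmatter.md in changed_paths.

    Assumption: each notebook is expected to sit next to a file named
    frontmatter.md in the same directory.
    """
    paths = set(changed_paths)
    missing = []
    for path in sorted(paths):
        if path.endswith(".ipynb"):
            expected = posixpath.join(posixpath.dirname(path), "frontmatter.md")
            if expected not in paths:
                missing.append(path)
    return missing


# The two files reported as modified in this pull request.
changed = [
    "autovec_unstructured/autovec_unstructured.ipynb",
    "autovec_unstructured/frontmatter.md",
]
print(notebooks_missing_frontmatter(changed))
```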

Summary of Changes

Hello @giriraj-singh-couchbase, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a new, comprehensive tutorial designed to educate users on leveraging Couchbase Capella's AI Services for auto-vectorization of unstructured data sourced from S3 buckets. The tutorial provides a practical, step-by-step guide, including a Jupyter notebook, to set up and execute workflows that transform raw data into vector embeddings, enabling advanced semantic search capabilities through LangChain integration. The primary goal is to empower users to efficiently manage and query unstructured data within the Capella ecosystem.

Highlights

  • New Tutorial Introduction: This pull request introduces a new tutorial demonstrating Couchbase Capella's AI Services auto-vectorization feature for unstructured data, specifically focusing on data stored in S3 buckets.
  • LangChain Integration: The tutorial showcases how to integrate with LangChain for performing semantic vector search on the auto-vectorized data.
  • Comprehensive Documentation: A detailed README.md and frontmatter.md have been added to provide prerequisites, installation steps, and metadata for the tutorial.
  • Interactive Jupyter Notebook: A runnable Jupyter notebook (autovec_unstructured.ipynb) is included, guiding users through Capella cluster and model deployment, access control setup, configuring auto-vectorization workflows for S3 data, and executing semantic search.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist bot left a comment

Code Review

This pull request introduces a new tutorial for using Couchbase Capella's AI Services auto-vectorization feature. While the tutorial is comprehensive, there are several areas that need improvement. Critically, the Jupyter notebook references images that are not included in the pull request, which will prevent users from following the visual steps. There are also significant structural issues, such as incorrect section numbering and confusing instructions that reference incorrect data sources. Additionally, there are opportunities to improve code quality by removing unused imports, using environment variables for credentials to promote security best practices, and fixing minor typos and grammatical errors. Addressing these points will greatly improve the quality and usability of the tutorial.
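
To illustrate the suggestion about credentials, here is a minimal sketch of reading connection settings from environment variables instead of hardcoding them in notebook cells. The variable names (`CB_CONNECTION_STRING`, `CB_USERNAME`, `CB_PASSWORD`) and the helper `load_capella_settings` are assumptions made for this example, not names taken from the notebook.

```python
import os


def load_capella_settings(env=os.environ):
    """Collect connection settings from the environment.

    The variable names below are hypothetical; adapt them to whatever
    the notebook actually needs.
    """
    required = ("CB_CONNECTION_STRING", "CB_USERNAME", "CB_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early with a clear message rather than at connection time.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

A notebook cell could then call `load_capella_settings()` once and pass the returned values to the cluster connection, so that no secret is stored in the committed .ipynb file.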

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>