.Net: New Feature: Support for configuring dimensions in Google AI embeddings generation #10488

Open
ArieSLV opened this issue Feb 11, 2025 · 0 comments
Labels: .NET (Issue or Pull requests regarding .NET code), triage

Description

The current implementation of Google AI embeddings in Semantic Kernel doesn't support configuring the output dimensionality of embeddings, although this is a supported feature in the Google AI API. Adding this capability would give users more control over the embedding generation process and allow for optimization of the embedding size based on specific use cases.

Current Behavior

When using GoogleAITextEmbeddingGenerationService, there is no option to specify the desired number of dimensions for the embeddings. The service currently relies on the default dimensionality provided by the model.

Proposed Solution

Introduce an optional dimensions parameter in the Google AI embedding-related classes and methods. This can be implemented by:

  1. Service Constructor Changes:
    • Add an optional dimensions parameter to the GoogleAITextEmbeddingGenerationService constructor.
  2. Builder and Extension Methods:
    • Extend the builder methods (e.g., AddGoogleAIEmbeddingGeneration) to accept the dimensions parameter.
    • Propagate the dimensions value through to the Google AI API calls.
  3. Request and Metadata Enhancements:
    • Add an optional Dimensions property (serialized as output_dimensionality) in the GoogleAIEmbeddingRequest class.
    • Update the metadata and attributes system to include the dimensions value when provided.
  4. Unit Testing:
    • Add tests to verify that when a dimensions value is provided, it is correctly serialized in the JSON request.
    • Ensure that if the parameter is not provided, the default behavior remains unchanged.
  5. Backward Compatibility:
    • Ensure that existing implementations remain unaffected since the new parameter is optional.
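The request change in step 3 and the checks in step 4 can be sketched as follows. This is only an illustration under stated assumptions: EmbeddingRequestSketch is a hypothetical stand-in, not the actual GoogleAIEmbeddingRequest class, its model and text fields are simplified placeholders, and the model name is an example. The only point it shows is that a nullable property marked to be skipped when null keeps the serialized request unchanged for callers who do not set dimensions.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical stand-in for GoogleAIEmbeddingRequest; NOT the actual Semantic Kernel type.
public sealed class EmbeddingRequestSketch
{
    [JsonPropertyName("model")]
    public string Model { get; set; } = string.Empty;

    // Simplified placeholder for the request content.
    [JsonPropertyName("text")]
    public string Text { get; set; } = string.Empty;

    // Left out of the JSON entirely when null, so existing requests do not change.
    [JsonPropertyName("output_dimensionality")]
    [JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)]
    public int? Dimensions { get; set; }

    // Builds the request and returns its JSON form.
    public static string ToJson(string model, string text, int? dimensions = null)
    {
        if (dimensions.HasValue && dimensions.Value <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(dimensions), "Dimensions must be positive.");
        }

        var request = new EmbeddingRequestSketch { Model = model, Text = text, Dimensions = dimensions };
        return JsonSerializer.Serialize(request);
    }
}

public static class Program
{
    public static void Main()
    {
        // With dimensions: the JSON contains "output_dimensionality":512.
        Console.WriteLine(EmbeddingRequestSketch.ToJson("text-embedding-004", "hello", 512));

        // Without dimensions: the property is absent from the JSON.
        Console.WriteLine(EmbeddingRequestSketch.ToJson("text-embedding-004", "hello"));
    }
}
```

A unit test for step 4 could assert on exactly these two cases: the key is present with the given value when dimensions is set, and absent otherwise.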

Benefits

  • Customization: Users can fine-tune the embedding dimensionality to better fit their specific requirements.
  • Performance & Resource Optimization: Adjusting the embedding size may lead to improvements in memory usage and processing performance.
  • API Feature Alignment: Leverages the full capabilities of the Google AI API.
  • Flexibility: Provides additional configuration options without breaking existing functionality.

Example Usage

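// Using the service directly.
// Note: the Google AI API accepts a reduced output dimensionality only on newer
// embedding models, not on models/embedding-001, so a newer model is used here.
var embeddingService = new GoogleAITextEmbeddingGenerationService(
    modelId: "models/text-embedding-004",
    apiKey: "your-api-key",
    dimensions: 512);

// Using the builder pattern
var builder = Kernel.CreateBuilder();
builder.AddGoogleAIEmbeddingGeneration(
    modelId: "models/text-embedding-004",
    apiKey: "your-api-key",
    dimensions: 512);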

Additional Context

This feature would be particularly beneficial in scenarios where:

  • Users need to optimize storage or bandwidth.
  • Specific downstream tasks require embeddings with a particular number of dimensions.
  • Performance optimizations are needed for resource-constrained environments.
  • There is a need to integrate with existing systems that expect embeddings of a specific size.

Related Documentation

markwallace-microsoft added the .NET and triage labels Feb 11, 2025
github-actions bot changed the title from "New Feature: Support for configuring dimensions in Google AI embeddings generation" to ".Net: New Feature: Support for configuring dimensions in Google AI embeddings generation" Feb 11, 2025