Add Inference embed and rerank #436

Merged
jhamon merged 7 commits into release-candidate/2025-01 from jhamon/merge-inference2
Jan 29, 2025

Conversation

@jhamon
Collaborator

@jhamon jhamon commented Jan 28, 2025

Problem

We want to migrate the content of the inference plugin into the core SDK. This should improve UX by allowing code completion to function properly.

Solution

  • Begin by copying and modifying code from the plugin
  • Lazily instantiate the class from the Pinecone parent class when someone attempts to use inference capabilities. This saves a bit of startup overhead for users who never call these functions.
  • Migrate tests from the plugin and add coverage for some additional models
  • Remove the dependency on the pinecone-plugin-inference package, since the core SDK now contains all of that plugin's functionality.
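The lazy instantiation mentioned above could be sketched roughly like this. This is a minimal illustration, not the SDK's actual internals; the class bodies and attribute names here are assumptions:

```python
class Inference:
    """Stand-in for the real inference client (illustrative only)."""

    def embed(self, model, inputs, parameters=None):
        # A real implementation would call the Pinecone inference API.
        return {"model": model, "count": len(inputs)}


class Pinecone:
    def __init__(self, api_key):
        self.api_key = api_key
        self._inference = None  # deferred; nothing built at startup

    @property
    def inference(self):
        # Build the Inference client on first access, so users who never
        # touch inference features pay no startup cost for it.
        if self._inference is None:
            self._inference = Inference()
        return self._inference


pc = Pinecone(api_key="key")
result = pc.inference.embed("multilingual-e5-large", ["hello", "world"])
```

The property caches the instance, so repeated accesses of `pc.inference` return the same object.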

In addition to copy & modify:

  • Implemented Enum objects, RerankModel and EmbedModel, to serve mainly as documentation and support for code suggestions. You can still pass in any string you like, which keeps this forward compatible with any future models.
  • The Inference class now scans for plugins extending its own capabilities during initialization. This lets us add more experimental things into the inference namespace via future plugins if desired. I should really define a base class that performs this scan for all classes by default, but I haven't gotten around to it yet.
  • Added warning messages if the user attempts to interact with the Inference class directly instead of going through the parent object. This follows a pattern we started for the Index client based on customer success team feedback.
  • Removed old integration tests that verified the old plugin was being correctly installed.
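The direct-use warning described above could look something like this sketch. The warning text and the `_internal` flag are assumptions for illustration, not the SDK's exact implementation:

```python
import warnings


class Inference:
    """Illustrative sketch; the real class's internals differ."""

    def __init__(self, _internal=False):
        # Nudge users toward pc.inference instead of constructing the
        # class themselves, mirroring the Index client pattern.
        if not _internal:
            warnings.warn(
                "You should access inference via Pinecone(...).inference "
                "rather than instantiating Inference directly."
            )
```

The parent object would pass the internal flag when it builds the instance, so only direct construction triggers the warning.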

Usage: Embed

```python
from pinecone import Pinecone

pc = Pinecone(api_key='key')

# You can pass the model name as a string if you know the correct value
embeddings = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

Or, if you'd like, you can import an enum of available options:

```python
from pinecone import Pinecone, EmbedModel

pc = Pinecone(api_key='key')

# You can get the model name from an enum
embeddings = pc.inference.embed(
    model=EmbedModel.Multilingual_E5_Large,
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

Or, for the very clever/lazy, the enum is also attached to the inference namespace:

```python
from pinecone import Pinecone

pc = Pinecone(api_key='key')

embeddings = pc.inference.embed(
    model=pc.inference.EmbedModel.Multilingual_E5_Large,
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

Usage: Rerank

The same enum mechanic is now available for rerank options:

```python
from pinecone import Pinecone, RerankModel

pc = Pinecone(api_key='key')

model = RerankModel.Bge_Reranker_V2_M3
result = pc.inference.rerank(
    model=model,
    query="i love dogs",
    documents=["dogs are pretty cool", "everyone loves dogs", "I'm a cat person"],
    return_documents=False,
)
```

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Infrastructure change (CI configs, etc)
  • Non-code change (docs, etc)
  • None of the above: (explain here)

@jhamon jhamon changed the base branch from main to release-candidate/2025-01 January 28, 2025 17:46
@ssmith-pc
Contributor

Haven't fully reviewed yet, but one top-level question that comes to mind is what happens if someone still has the inference plugin as a dependency in their project?

@austin-denoble
Contributor

> Haven't fully reviewed yet, but one top-level question that comes to mind is what happens if someone still has the inference plugin as a dependency in their project?

Haven't reviewed quite yet either, but my expectation is that at runtime the plugin code would try to overwrite the existing methods and fail silently. I say fail silently because I think we just log if something goes wrong during the plugin loading process. I'd need to double-check, though; it's something we should definitely test around.

@jhamon
Collaborator Author

jhamon commented Jan 28, 2025

Rather than have anxiety about possible race conditions, interactions, etc., I think it's best to just fail loud and clear with instructions. Thoughts? Leaving this to fail silently seems like it has a lot of potential to be a landmine causing weird problems down the line.

[Screenshot attached: 2025-01-28 at 2:02 PM]

@austin-denoble
Contributor

> Rather than have anxiety about possible race conditions, interactions, etc., I think it's best to just fail loud and clear with instructions. Thoughts? Leaving this to fail silently seems like it has a lot of potential to be a landmine causing weird problems down the line.
>
> [Screenshot attached: 2025-01-28 at 2:02 PM]

If we can give them an error specifying that they need to uninstall the plugin before using the SDK that seems like it should be fine.
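A loud failure like the one discussed could be sketched as below. The top-level module name probed here and the error message are assumptions for illustration, not the SDK's actual check:

```python
import importlib.util


def check_for_legacy_plugin():
    # Probe for the superseded plugin package and fail loudly with
    # instructions, rather than letting it silently overwrite the
    # core SDK's methods at load time.
    if importlib.util.find_spec("pinecone_plugin_inference") is not None:
        raise RuntimeError(
            "The pinecone-plugin-inference package is installed, but its "
            "functionality now ships with the core SDK. Please uninstall "
            "the plugin before continuing."
        )
```

`importlib.util.find_spec` returns `None` when a top-level module is absent, so the check is a no-op in a clean environment.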

Contributor

@ssmith-pc ssmith-pc left a comment


this looks great

Comment on lines +12 to +20
```python
class EmbedModel(Enum):
    Multilingual_E5_Large = "multilingual-e5-large"
    Pinecone_Sparse_English_V0 = "pinecone-sparse-english-v0"


class RerankModel(Enum):
    Bge_Reranker_V2_M3 = "bge-reranker-v2-m3"
    Cohere_Rerank_3_5 = "cohere-rerank-3.5"
    Pinecone_Rerank_v0 = "pinecone-rerank-v0"
```
Contributor


the model enums are nice, user-friendly additions

Wonder if it makes sense to do something similar for the model-specific parameters supported by each model (separate PR)

Collaborator Author


Thanks, yeah, I think it's quite nice and I want to bring a similar strategy to other places in the SDK. The fields are loosely typed as Union[RerankModel, str] and Union[EmbedModel, str], so you can still pass whatever you want, but it's nice to have some structure indicating what values are available. I hate when I'm trying to do something and need to go look up what string values are valid. E.g., what cloud regions actually exist for serverless?
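For illustration, that loose typing might look like this sketch; `resolve_model` is a hypothetical helper, not actual SDK code:

```python
from enum import Enum
from typing import Union


class EmbedModel(Enum):
    Multilingual_E5_Large = "multilingual-e5-large"


def resolve_model(model: Union[EmbedModel, str]) -> str:
    # Accept either the enum member or any raw string, so models added
    # after this SDK release still work before the enum catches up.
    return model.value if isinstance(model, EmbedModel) else model
```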

Contributor

@austin-denoble austin-denoble left a comment


Looks great, thanks for pulling the internals of these plugins into the client.

@jhamon jhamon marked this pull request as ready for review January 29, 2025 05:46
@jhamon jhamon merged commit 1ed5932 into release-candidate/2025-01 Jan 29, 2025
71 checks passed
@jhamon jhamon deleted the jhamon/merge-inference2 branch January 29, 2025 05:47
