Add Inference embed and rerank #436

Merged
jhamon merged 7 commits into release-candidate/2025-01 from jhamon/merge-inference2
Jan 29, 2025

Conversation

@jhamon
Collaborator

@jhamon jhamon commented Jan 28, 2025

Problem

We want to migrate the content of the inference plugin into the core SDK. This should improve UX by allowing code completion to function properly.

Solution

  • Begin by copying and modifying code from the plugin
  • Lazily instantiate the class from the Pinecone parent class when someone attempts to use inference capabilities. This saves a bit of startup overhead for users who never call these functions.
  • Migrate tests from the plugin and add coverage for some additional models
  • Remove the dependency on the pinecone-plugin-inference package, since the core SDK now contains all of that plugin's functionality.
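The lazy instantiation mentioned above could be sketched roughly like this. This is a minimal illustration, not the SDK's actual internals; the class bodies and attribute names here are assumptions:

```python
class Inference:
    """Stand-in for the real inference client (illustrative only)."""

    def embed(self, model, inputs, parameters=None):
        # A real implementation would call the Pinecone inference API.
        return {"model": model, "count": len(inputs)}


class Pinecone:
    def __init__(self, api_key):
        self.api_key = api_key
        self._inference = None  # deferred; nothing built at startup

    @property
    def inference(self):
        # Build the Inference client on first access, so users who never
        # touch inference features pay no startup cost for it.
        if self._inference is None:
            self._inference = Inference()
        return self._inference


pc = Pinecone(api_key="key")
result = pc.inference.embed("multilingual-e5-large", ["hello", "world"])
```

The property caches the instance, so repeated accesses of `pc.inference` return the same object.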

In addition to copy & modify:

  • Implemented Enum objects, RerankModel and EmbedModel, to serve mainly as documentation and support for code suggestions. You can still pass in any string you like, which keeps this forward compatible with any future models.
  • The Inference class now scans for plugins extending its own capabilities during initialization. This lets us add more experimental things into the inference namespace via future plugins if desired. I should really define a base class that performs this scan for all classes by default, but I haven't gotten around to it yet.
  • Added warning messages if the user attempts to interact with the Inference class directly instead of going through the parent object. This follows a pattern we started for the Index client based on customer success team feedback.
  • Removed old integration tests that verified the old plugin was being correctly installed.
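The direct-use warning described above could look something like this sketch. The warning text and the `_internal` flag are assumptions for illustration, not the SDK's exact implementation:

```python
import warnings


class Inference:
    """Illustrative sketch; the real class's internals differ."""

    def __init__(self, _internal=False):
        # Nudge users toward pc.inference instead of constructing the
        # class themselves, mirroring the Index client pattern.
        if not _internal:
            warnings.warn(
                "You should access inference via Pinecone(...).inference "
                "rather than instantiating Inference directly."
            )
```

The parent object would pass the internal flag when it builds the instance, so only direct construction triggers the warning.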

Usage: Embed

```python
from pinecone import Pinecone

pc = Pinecone(api_key='key')

# You can pass the model name as a string if you know the correct value
embeddings = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

Or, if you'd like, you can import an enum of available options:

```python
from pinecone import Pinecone, EmbedModel

pc = Pinecone(api_key='key')

# You can get the model name from an enum
embeddings = pc.inference.embed(
    model=EmbedModel.Multilingual_E5_Large,
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

Or, for the very clever/lazy, the enum is also attached to the inference namespace:

```python
from pinecone import Pinecone

pc = Pinecone(api_key='key')

embeddings = pc.inference.embed(
    model=pc.inference.EmbedModel.Multilingual_E5_Large,
    inputs=["The quick brown fox jumps over the lazy dog.", "lorem ipsum"],
    parameters={"input_type": "query", "truncate": "END"},
)
```

Usage: Rerank

The same enum mechanic is now available for rerank options:

```python
from pinecone import Pinecone, RerankModel

pc = Pinecone(api_key='key')

model = RerankModel.Bge_Reranker_V2_M3
result = pc.inference.rerank(
    model=model,
    query="i love dogs",
    documents=["dogs are pretty cool", "everyone loves dogs", "I'm a cat person"],
    return_documents=False,
)
```

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Infrastructure change (CI configs, etc)
  • Non-code change (docs, etc)
  • None of the above: (explain here)

@jhamon jhamon changed the base branch from main to release-candidate/2025-01 January 28, 2025 17:46
@ssmith-pc
Contributor

Haven't fully reviewed yet, but one top-level question that comes to mind is what happens if someone still has the inference plugin as a dependency in their project?

@austin-denoble
Contributor

> Haven't fully reviewed yet, but one top-level question that comes to mind is what happens if someone still has the inference plugin as a dependency in their project?

Haven't reviewed quite yet either, but my expectation is that at runtime the plugin code would try to overwrite the existing methods and fail silently. I say fail silently because I think we just log if something goes wrong during the plugin loading process. I'd need to double-check, though; it's something we should definitely test around.

@jhamon
Collaborator Author

jhamon commented Jan 28, 2025

Rather than have anxiety about possible race conditions, interactions, etc., I think it's best to just fail loud and clear with instructions. Thoughts? Leaving this to fail silently seems like it has a lot of potential to be a landmine causing weird problems down the line.

[Screenshot attached: 2025-01-28 at 2:02 PM]

@austin-denoble
Contributor

> Rather than have anxiety about possible race conditions, interactions, etc., I think it's best to just fail loud and clear with instructions. Thoughts? Leaving this to fail silently seems like it has a lot of potential to be a landmine causing weird problems down the line.
>
> [Screenshot attached: 2025-01-28 at 2:02 PM]

If we can give them an error specifying that they need to uninstall the plugin before using the SDK that seems like it should be fine.
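A loud failure like the one discussed could be sketched as below. The top-level module name probed here and the error message are assumptions for illustration, not the SDK's actual check:

```python
import importlib.util


def check_for_legacy_plugin():
    # Probe for the superseded plugin package and fail loudly with
    # instructions, rather than letting it silently overwrite the
    # core SDK's methods at load time.
    if importlib.util.find_spec("pinecone_plugin_inference") is not None:
        raise RuntimeError(
            "The pinecone-plugin-inference package is installed, but its "
            "functionality now ships with the core SDK. Please uninstall "
            "the plugin before continuing."
        )
```

`importlib.util.find_spec` returns `None` when a top-level module is absent, so the check is a no-op in a clean environment.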

Contributor

@ssmith-pc ssmith-pc left a comment


this looks great

Comment on lines +12 to +20
```python
class EmbedModel(Enum):
    Multilingual_E5_Large = "multilingual-e5-large"
    Pinecone_Sparse_English_V0 = "pinecone-sparse-english-v0"


class RerankModel(Enum):
    Bge_Reranker_V2_M3 = "bge-reranker-v2-m3"
    Cohere_Rerank_3_5 = "cohere-rerank-3.5"
    Pinecone_Rerank_v0 = "pinecone-rerank-v0"
```
Contributor


the model enums are nice, user-friendly additions

Wonder if it makes sense to do something similar for the model-specific parameters supported by each model (separate PR)

Collaborator Author


Thanks, yeah, I think it's quite nice and I want to bring a similar strategy to other places in the SDK. The fields are loosely typed as Union[RerankModel, str] and Union[EmbedModel, str], so you can still pass whatever you want, but it's nice to have some structure indicating what values are available. I hate when I'm trying to do something and need to go look up what string values are valid. E.g., what cloud regions actually exist for serverless?
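For illustration, that loose typing might look like this sketch; `resolve_model` is a hypothetical helper, not actual SDK code:

```python
from enum import Enum
from typing import Union


class EmbedModel(Enum):
    Multilingual_E5_Large = "multilingual-e5-large"


def resolve_model(model: Union[EmbedModel, str]) -> str:
    # Accept either the enum member or any raw string, so models added
    # after this SDK release still work before the enum catches up.
    return model.value if isinstance(model, EmbedModel) else model
```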

Contributor

@austin-denoble austin-denoble left a comment


Looks great, thanks for pulling the internals of these plugins into the client.

@jhamon jhamon marked this pull request as ready for review January 29, 2025 05:46
@jhamon jhamon merged commit 1ed5932 into release-candidate/2025-01 Jan 29, 2025
71 checks passed
@jhamon jhamon deleted the jhamon/merge-inference2 branch January 29, 2025 05:47
