
RuntimeError: Input is too long for context length 77. No truncation passed #468

Open
@hessaAlawwad

Description

Hello,
I am trying to embed text using CLIP and I get an error saying my text is too long for the context length. From the Hugging Face documentation it looks like there is a variable I could change:

max_position_embeddings (int, optional, defaults to 77) — The maximum sequence length that this model might ever be used with. Typically set this to something large just in case (e.g., 512 or 1024 or 2048).

Any ideas on how I can do this?
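
To be concrete, the variable I quoted belongs to the Hugging Face transformers implementation (CLIPTextConfig), not to this clip package. A minimal sketch of where it would be set, assuming transformers is installed (and I realize that building a model from a fresh config gives randomly initialized weights rather than the pretrained ViT-B/32 ones):

from transformers import CLIPTextConfig, CLIPTextModel

# A config with a longer context; the released checkpoints were trained with
# 77 positions, so this model is randomly initialized, not the pretrained one.
config = CLIPTextConfig(max_position_embeddings=512)
text_model = CLIPTextModel(config)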

My code is as follows:

import clip
import torch
from PIL import Image
from openai import OpenAI

client = OpenAI(api_key="")  # not used in the snippet below

# Load the model on the device you run inference/training on (CPU or GPU).
device = "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def clip_get_features_from_single_image(image_path):
    # preprocess() already returns a tensor, so only the batch dimension is added
    image = preprocess(Image.open(image_path).convert("RGB"))
    image_input = image.unsqueeze(0).to(device)
    with torch.no_grad():
        image_features = model.encode_image(image_input).float()
    return image_features

def clip_get_single_text_embedding(text):
    # inputs = clip.tokenize(text, context_length=77, truncate=True)
    inputs = clip.tokenize(text).to(device)  # raises RuntimeError when the text exceeds 77 tokens
    with torch.no_grad():
        text_features = model.encode_text(inputs)
    return text_features
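
For reference, the commented-out line above uses the tokenizer's truncate flag, which simply cuts the text off at the 77-token limit instead of raising the error. A minimal sketch of that workaround (the helper name is just illustrative; I would rather raise the limit than lose text, which is why I am asking about max_position_embeddings):

def clip_get_single_text_embedding_truncated(text):
    # truncate=True drops any tokens beyond the fixed 77-token context window
    inputs = clip.tokenize(text, context_length=77, truncate=True).to(device)
    with torch.no_grad():
        text_features = model.encode_text(inputs)
    return text_features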

Also, do I need to adjust the image embedding size too in order to calculate the similarity with the text?
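
For context, this is roughly how I compute the similarity at the moment, a minimal sketch built on the two helpers above (my understanding is that for ViT-B/32 both encoders already return 512-dimensional features, so the shapes match; the helper name is just illustrative):

def clip_cosine_similarity(image_path, text):
    image_features = clip_get_features_from_single_image(image_path)
    text_features = clip_get_single_text_embedding(text).float()
    # L2-normalize both vectors so the dot product is the cosine similarity
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    return (image_features @ text_features.T).item()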

Thanks in advance
