
Commit a889c42

asmigosw, shubhagr-quic, and abukhoy
authored and committed
Enabled Infer CLI for VLM (quic#287)
Added support for enabling VLMs via the CLI. Sample command:

```bash
python -m QEfficient.cloud.infer --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --batch_size 1 --prompt_len 32 --ctx_len 512 --num_cores 16 --device_group [0] --prompt "Describe the image?" --mos 1 --allocator_dealloc_delay 1 --image_url https://i.etsystatic.com/8155076/r/il/0825c2/1594869823/il_fullxfull.1594869823_5x0w.jpg
```

Signed-off-by: Shubham Agrawal <[email protected]>
Signed-off-by: Asmita Goswami <[email protected]>
Signed-off-by: Abukhoyer Shaik <[email protected]>
Co-authored-by: shubhagr-quic <[email protected]>
Co-authored-by: Abukhoyer Shaik <[email protected]>
1 parent 0789be1 commit a889c42


2 files changed: +12 -1 lines changed


QEfficient/base/common.py

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@
 from transformers import AutoConfig
 
 from QEfficient.base.modeling_qeff import QEFFBaseModel
-from QEfficient.transformers.models.modeling_auto import QEFFAutoModelForCausalLM
+from QEfficient.transformers.modeling_utils import MODEL_CLASS_MAPPING
 from QEfficient.utils import login_and_download_hf_lm
 
 
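The import swap above suggests that `QEfficient/base/common.py` now picks the QEFF wrapper class from `MODEL_CLASS_MAPPING` instead of hard-coding `QEFFAutoModelForCausalLM`. The diff does not show the lookup itself, so the following is only a minimal sketch of how such a dispatch could work; the helper name `load_qeff_model` and the `getattr`-based resolution against the top-level `QEfficient` package are assumptions for illustration, not the actual implementation.

```python
# Hedged sketch (not the actual common.py): shows how an architecture-to-class
# mapping like MODEL_CLASS_MAPPING can drive model-class dispatch.
from transformers import AutoConfig

import QEfficient  # assumed to export the QEFFAutoModel* classes at top level
from QEfficient.transformers.modeling_utils import MODEL_CLASS_MAPPING


def load_qeff_model(pretrained_model_name_or_path: str, **kwargs):
    # Read the HF config to find the model's architecture string,
    # e.g. "LlamaForCausalLM" or "MllamaForConditionalGeneration".
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path)
    architectures = getattr(config, "architectures", None) or []
    architecture = architectures[0] if architectures else None

    # Map the architecture to a QEFF wrapper class name and resolve it.
    class_name = MODEL_CLASS_MAPPING.get(architecture)
    if class_name is None:
        raise NotImplementedError(f"No QEFF class registered for architecture {architecture!r}")
    qeff_cls = getattr(QEfficient, class_name)

    return qeff_cls.from_pretrained(pretrained_model_name_or_path, **kwargs)
```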

QEfficient/transformers/modeling_utils.py

Lines changed: 11 additions & 0 deletions
@@ -10,6 +10,8 @@
 
 import torch
 import torch.nn as nn
+import transformers.models.auto.modeling_auto as mapping
+from transformers import AutoModelForCausalLM
 from transformers.models.codegen.modeling_codegen import (
     CodeGenAttention,
     CodeGenBlock,
@@ -278,6 +280,15 @@
 }
 
 
+MODEL_CLASS_MAPPING = {
+    **{architecture: "QEFFAutoModelForCausalLM" for architecture in mapping.MODEL_FOR_CAUSAL_LM_MAPPING_NAMES.values()},
+    **{
+        architecture: "QEFFAutoModelForImageTextToText"
+        for architecture in mapping.MODEL_FOR_IMAGE_TEXT_TO_TEXT_MAPPING_NAMES.values()
+    },
+}
+
+
 def _prepare_cross_attention_mask(
     cross_attention_mask: torch.Tensor,
     num_vision_tokens: int,
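Because the two dict comprehensions pull architecture names from transformers' auto-model mapping tables, the exact keys in `MODEL_CLASS_MAPPING` depend on the installed transformers version. A quick sanity check might look like the sketch below; the example architecture names are assumptions based on typical transformers releases, not guaranteed entries.

```python
# Hedged sanity check of MODEL_CLASS_MAPPING; the example architectures assume a
# transformers release that lists them in its auto-model mapping tables.
from QEfficient.transformers.modeling_utils import MODEL_CLASS_MAPPING

# Text-only causal LM architectures should resolve to the causal-LM wrapper.
print(MODEL_CLASS_MAPPING.get("LlamaForCausalLM"))  # expected: "QEFFAutoModelForCausalLM"

# Vision-language architectures (e.g. Llama-3.2 Vision) should resolve to the
# image-text-to-text wrapper.
print(MODEL_CLASS_MAPPING.get("MllamaForConditionalGeneration"))  # expected: "QEFFAutoModelForImageTextToText"
```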
