Skip to content

camp2 lmdeploy llava运行时输入高分辨率图片会返回空字符串 #620

Open
@NagatoYuki0943

Description

@NagatoYuki0943

camp2 lmdeploy llava运行时输入高分辨率图片会返回空字符串

项目地址 https://github.com/InternLM/Tutorial/blob/camp2/lmdeploy/README.md#61-%E4%BD%BF%E7%94%A8lmdeploy%E8%BF%90%E8%A1%8C%E8%A7%86%E8%A7%89%E5%A4%9A%E6%A8%A1%E6%80%81%E5%A4%A7%E6%A8%A1%E5%9E%8Bllava

当输入一张分辨率为 1920*1080 分辨率的图片时不会返回文字
llava4
打印response时显示text为空
llava5

解决方法为手动降低分辨率

import gradio as gr
from lmdeploy import pipeline


# pipe = pipeline('liuhaotian/llava-v1.6-vicuna-7b') 非开发机运行此命令
pipe = pipeline('/share/new_models/liuhaotian/llava-v1.6-vicuna-7b')

def model(image, text):
    if image is None:
        return [(text, "请上传一张图片。")]
    else:
        width, height = image.size
        print(f"width = {width}, height = {height}")

        # 调整图片最长宽/高为256
        if max(width, height) > 256:
            ratio = max(width, height) / 256
            n_width = int(width / ratio)
            n_height = int(height / ratio)
            print(f"new width = {n_width}, new height = {n_height}")
            image = image.resize((n_width, n_height))

        response = pipe((text, image)).text
        print(f"response: {response}")
        return [(text, response)]

demo = gr.Interface(fn=model, inputs=[gr.Image(type="pil"), gr.Textbox()], outputs=gr.Chatbot())
demo.launch()   

更改后效果可以正常返回text
llava6
llava7

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions