Bug: llama-server + LLava 1.6 hallucinates #8001

Closed

farnazj opened this issue Jun 19, 2024 · 3 comments
Labels: bug-unconfirmed, medium severity (used to report medium severity bugs in llama.cpp, e.g. malfunctioning features but still usable), stale

Comments

farnazj commented Jun 19, 2024

What happened?

When using ./llama-llava-cli, I get perfectly fine descriptions of images. But when hosting LLava with ./llama-server, LLava hallucinates big time.

Here's how I'm running LLava with the CLI:
./llama-llava-cli -m models/llava-v1.6-vicuna-7b.Q5_K_S.gguf --mmproj models/mmproj-model-f16.gguf --image images/sth.jpeg -c 4096

Here's how I'm starting the server:
./llama-server -m models/llava-v1.6-vicuna-7b.Q5_K_S.gguf --mmproj models/mmproj-model-f16.gguf -c 2048 --host 127.0.0.1 --port 8000

Here's the python code to send the request:

import base64
import requests

def encode_image(image_path):
    # Read the image file and return its contents as a base64-encoded string
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

base64_image = encode_image("./images/sth.png")

headers = {
    "Content-Type": "application/json",
}

json_data = {
    # The id here must match the [img-10] placeholder in the prompt
    "image_data": [{
        "data": base64_image,
        "id": 10,
    }],
    "prompt": "USER:[img-10]Describe the image.\nASSISTANT:",
    "temperature": 0.1,
}

response = requests.post("http://127.0.0.1:8000/completion", headers=headers, json=json_data)
print(response.json()["content"])

Name and Version

./llama-cli --version
version: 3173 (a94e6ff8)
built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.5.0

What operating system are you seeing the problem on?

Mac

Relevant log output

No response

farnazj added bug-unconfirmed and medium severity labels Jun 19, 2024
ngxson (Collaborator) commented Jun 19, 2024

Multimodal is currently not supported on the server. The model will generate the response without looking at the image (so it hallucinates).

Related to #8010
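
For anyone who wants to verify this locally, here is a minimal sketch (not from this thread; it assumes the server from the original report is running on 127.0.0.1:8000 and reuses the same image path). It sends the same prompt to /completion once with image_data and once without; if the server ignores the image, both responses read like generic guesses rather than a description of the actual picture.

import base64
import requests

URL = "http://127.0.0.1:8000/completion"  # server from the original report
PROMPT = "USER:[img-10]Describe the image.\nASSISTANT:"

with open("./images/sth.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Request 1: prompt with the image attached via image_data
with_image = requests.post(URL, json={
    "prompt": PROMPT,
    "image_data": [{"data": image_b64, "id": 10}],
    "temperature": 0.1,
}).json()["content"]

# Request 2: identical prompt, but no image_data at all
without_image = requests.post(URL, json={
    "prompt": PROMPT,
    "temperature": 0.1,
}).json()["content"]

print("with image:\n", with_image)
print("\nwithout image:\n", without_image)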

farnazj (Author) commented Jun 19, 2024

Oh no :( What is the latest stable release/commit that still supports multimodal?

github-actions bot added the stale label Jul 20, 2024
github-actions bot commented Aug 3, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions bot closed this as completed Aug 3, 2024