Output of emojis seems to be somewhat broken #608

Open
@thomasfowler

Expected Behavior

I am seeing odd character output with certain emojis. Not all emojis are affected, but some are not output correctly.

Current Behavior

When I ask a Llama 2 model to respond using mostly emojis, some are output as follows:

🏋️\u200d♀️💪🏼👟🥊🚀🎉🤩👍

The literal `\u200d` followed by the female symbol looks like some kind of encoding issue.
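To help narrow this down, here is a minimal sketch that lists the code points in the first emoji of the output above. It uses only the standard `unicodedata` module; the `sequence` and `names` variables are my own names, and the string is my transcription of the reported output using explicit escapes, so treat it as an assumption rather than the exact bytes the model produced.

```python
import unicodedata

# The first emoji from the reported output, written with explicit escapes so
# the invisible characters are unambiguous (a transcription, not model output).
sequence = "\U0001F3CB\uFE0F\u200D\u2640\uFE0F"

# Pair each code point with its official Unicode character name.
names = [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in sequence]

for code_point, name in names:
    print(code_point, name)
```

U+200D is the zero width joiner, which Unicode uses to combine several emoji into a single glyph. So the characters themselves may be a valid sequence, and the visible `\u200d` may come from how the string is displayed rather than from the model.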

Environment and Context

llama-cpp-python Version: 0.1.77 (latest at time of writing)
Python Version: 3.11.4
Platform: Apple M1 MacBook, 16 GB
Llama2 Model: llama-2-7b-chat.ggmlv3.q4_1.bin via https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML

Steps to Reproduce

Here is a basic example, using the model file noted above, that reliably reproduces the issue:

from llama_cpp import Llama

prompt = """
[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.
<</SYS>>

Show me a list of fitness related emojis.[/INST]
"""

llm = Llama(
    model_path="models/llama-2-7b-chat.ggmlv3.q4_1.bin",
    n_gpu_layers=1,
    n_ctx=2048,
    n_batch=512,
    f16_kv=True,
    verbose=True
)

llm(prompt=prompt)

Produces the following output:

{'id': 'cmpl-61a22458-6094-4ebc-9498-eac52c29570d',
'object': 'text_completion',
'created': 1691949830,
'model': '/Users/thomas/Hobbies/fff-pledge/llama-2-7b-chat.ggmlv3.q4_1.bin',
'choices': [{'text': '🏋️\u200d♀️💪🏼👟🥊🚀🎉🤩👍',
'index': 0,
'logprobs': None,
'finish_reason': 'stop'}],
'usage': {'prompt_tokens': 58, 'completion_tokens': 40, 'total_tokens': 98}}
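One possible explanation, offered as a hypothesis and not a confirmed diagnosis: the dict above is shown through Python's `repr`, which escapes characters that `str.isprintable()` reports as non-printable, and U+200D is one of them. The sketch below uses a hand-built stand-in for the completion dict (the `completion` value is hypothetical, no model is loaded) to compare the `repr` of the text with the text itself.

```python
# Hypothetical stand-in for the completion dict; only the shape matches the
# reported output, and the text is a shortened transcription of it.
completion = {
    "choices": [{"text": "\U0001F3CB\uFE0F\u200D\u2640\uFE0F\U0001F4AA"}]
}

text = completion["choices"][0]["text"]

# repr() escapes non-printable characters, so the joiner shows up as \u200d.
as_repr = repr(text)
print(as_repr)

# Printing the string itself writes the real characters; whether the result
# renders as a single glyph depends on the terminal and font.
print(text)

# The joiner is a single real character in the string, not six literal ones.
joiner_is_printable = "\u200d".isprintable()
```

If this matches what happens with the real model output, calling `print()` on `choices[0]["text"]` instead of inspecting the whole dict should show the emoji without the escape.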

I have tried multiple versions of the Llama 2 model from TheBloke. Despite their overall performance differences, the issue persists in all of them.

Metadata

Labels: bug (Something isn't working), quality (Quality of model output)
