
Error message for custom embedding model: 'NoneType' object has no attribute 'tokenize' #3369

Open
Ccccx opened this issue Aug 24, 2024 · 0 comments
Labels
bug Something isn't working unconfirmed

Comments


Ccccx commented Aug 24, 2024

LocalAI version:
localai/localai:v2.19.4-cublas-cuda12

Environment, CPU architecture, OS, and Version:

  • Ubuntu 22.04
  • Cuda compilation tools, release 12.4, V12.4.131
  • Memory: 64 GB; GPU: NVIDIA A10

Describe the bug

I set up embeddings with a custom model. The service starts fine, but requests to the embeddings endpoint return the following error:
{
  "error": {
    "code": 500,
    "message": "rpc error: code = Unknown desc = Exception calling application: 'NoneType' object has no attribute 'tokenize'",
    "type": ""
  }
}
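For context, this is the error Python raises when attribute access hits a value that is still `None`, e.g. when a backend's model object was never assigned because loading failed silently. A minimal sketch (not LocalAI's actual code) of the failure mode:

```python
# Sketch of the failure mode: if the embedding model fails to load, the
# variable that should hold it stays None, and the first attribute access
# (here, .tokenize) raises exactly this AttributeError.
model = None  # stands in for a load step that silently returned nothing

try:
    model.tokenize("你好啊")
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'tokenize'
```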

To Reproduce
My model defines the configuration file:

name: text2vec-base-chinese
backend: sentencetransformers
embeddings: true
parameters:
  models: shibing624/text2vec-base-chinese
  model_name_or_path: /build/models/text2vec-base-chinese
  local_files_only: True
usage: |
    You can test this model with curl like this:

    curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d '{
      "input": "你好啊(默认情况下,超过 256 个单词段的输入文本将被截断。)",
      "model": "text2vec-base-chinese"
    }'
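One possible culprit, assuming the standard LocalAI model-config schema, is the `parameters` block: LocalAI's documented key for the model is `model` (singular), so the `models:` key above may be ignored entirely, leaving nothing loaded. A hypothetical corrected fragment (unverified; check against the sentencetransformers backend docs):

```yaml
# Hedged sketch: assumes LocalAI's documented `parameters.model` key.
name: text2vec-base-chinese
backend: sentencetransformers
embeddings: true
parameters:
  model: shibing624/text2vec-base-chinese
```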

Request content:
curl --location --request POST 'http://127.0.0.1:8080/embeddings' \
  --header 'User-Agent: Apifox/1.0.0 (https://apifox.com)' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "input": "你好啊",
    "model": "text2vec-base-chinese"
  }'

Expected behavior
A valid embedding vector should be returned. I am not sure whether this is a compatibility issue or a bug.

Logs
api_1 | 8:30AM DBG Request received: {"model":"text2vec-base-chinese","language":"","translate":false,"n":0,"top_p":null,"top_k":null,"temperature":null,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"repeat_last_n":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","size":"","prompt":null,"instruction":"","input":"hello world","stop":null,"messages":null,"functions":null,"function_call":null,"stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"backend":"","model_base_name":""}
api_1 | 8:30AM DBG guessDefaultsFromFile: not a GGUF file
api_1 | 8:30AM DBG Parameter Config: &{PredictionOptions:{Model: Language: Translate:false N:0 TopP:0xc000e4d988 TopK:0xc000e4d990 Temperature:0xc000e4d998 Maxtokens:0xc000e4d9c8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc000e4d9c0 TypicalP:0xc000e4d9b8 Seed:0xc000e4d9e0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:text2vec-base-chinese F16:0xc000e4d980 Threads:0xc000e4d978 Debug:0xc0004bc850 Roles:map[] Embeddings:0xc000e4d96d Backend:sentencetransformers TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[hello world] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc000e4d9b0 MirostatTAU:0xc000e4d9a8 Mirostat:0xc000e4d9a0 NGPULayers:0xc000e4d9d0 MMap:0xc000e4d9d8 MMlock:0xc000e4d9d9 LowVRAM:0xc000e4d9d9 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc000e4d970 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 
YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:You can test this model with curl like this:
api_1 |
api_1 | curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d '{
api_1 | "input": "你好啊(默认情况下,超过 256 个单词段的输入文本将被截断。)",
api_1 | "model": "text2vec-base-chinese"
api_1 | }'
api_1 | }
api_1 | 8:30AM INF Loading model with backend sentencetransformers
api_1 | 8:30AM DBG Model already loaded in memory:
api_1 | 8:30AM DBG GRPC(-127.0.0.1:35523): stderr Calculated embeddings for: hello world
api_1 | 8:30AM ERR Server error error="rpc error: code = Unknown desc = Exception calling application: 'NoneType' object has no attribute 'tokenize'" ip=123.161.203.27 latency=2.440391ms method=POST status=500 url=/embeddings

@Ccccx Ccccx added bug Something isn't working unconfirmed labels Aug 24, 2024
@Ccccx Ccccx changed the title to Error message for custom embedding model: 'NoneType' object has no attribute 'tokenize' Aug 24, 2024