Hi, good afternoon. I deployed the deepseek-ai/deepseek-coder-7b-instruct model on SageMaker with the same config as your demo on Hugging Face: top_p 0.9, top_k 50, and do_sample set to false. I assume the temperature is 0.6; if it is not, please tell me the value you use. The endpoint runs fine, but the same prompt that gives a correct and accurate result on your demo gives a noticeably less accurate result on my deployment. Is there any tweak you applied to the demo that you can share with me? I need your help, thanks. @chester please respond to this.

Also, could it be that there is both a "deepseek-ai/deepseek-coder-7b-instruct" and a "deepseek-ai/deepseek-coder-7b-chat"?

And what is the stop token? Even if I use "stop": ["<|EOT|>"], the model keeps generating until max_new_tokens is exhausted.

Here is how I am deploying to SageMaker:
```python
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'deepseek-ai/deepseek-coder-6.7b-instruct',
    'SM_NUM_GPUS': json.dumps(1)
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="1.1.0"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

# send request
predictor.predict({
    "inputs": "My name is Julien and I like to",
    "parameters": {
        "do_sample": False,
        "top_p": 0.90,
        "top_k": 50,
        "temperature": 0.35,
        "max_new_tokens": 1024,
        "repetition_penalty": 1.0,
        "stop": ["<|EOT|>"]
    }
})
```
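While the stop-token question is open, one possible client-side stopgap is to trim the returned text at the first `<|EOT|>` marker. This is only a sketch: `truncate_at_stop` is a hypothetical helper of my own, not part of the SageMaker SDK or the container, and the sample string is made up. It does not stop generation early, so the endpoint still spends the full max_new_tokens.

```python
def truncate_at_stop(text: str, stop_sequences: list[str]) -> str:
    """Return `text` cut at the earliest occurrence of any stop sequence.

    Hypothetical helper: it only post-processes a string that has already
    been generated; it cannot make the model stop sooner.
    """
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            # An empty stop string would match at index 0 and erase everything.
            continue
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Made-up example of text that ran past the end-of-turn marker.
sample = "def add(a, b):\n    return a + b\n<|EOT|>\nunrelated continuation"
print(truncate_at_stop(sample, ["<|EOT|>"]))
```

If the marker never appears as literal text in the output, the helper returns the string unchanged, which would itself be a hint that the model is not emitting `<|EOT|>` as plain text.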