serverless/pages/ml-nlp-auto-scale.asciidoc (+3 -3)
@@ -40,14 +40,14 @@ If you set the minimum number of allocations to 1, you will be charged even if t
You can enable adaptive allocations by using:
-* the create inference endpoint API for https://www.elastic.co/guide/en/elasticsearch/reference/master/infer-service-elser.html[ELSER], https://www.elastic.co/guide/en/elasticsearch/reference/master/infer-service-elasticsearch.html[E5 and models uploaded through Eland] that are used as inference services.
-* the https://www.elastic.co/guide/en/elasticsearch/reference/master/start-trained-model-deployment.html[start trained model deployment] or https://www.elastic.co/guide/en/elasticsearch/reference/master/update-trained-model-deployment.html[update trained model deployment] APIs for trained models that are deployed on machine learning nodes.
+* the create inference endpoint API for {ref}/infer-service-elser.html[ELSER], {ref}/infer-service-elasticsearch.html[E5 and models uploaded through Eland] that are used as inference services.
+* the {ref}/start-trained-model-deployment.html[start trained model deployment] or {ref}/update-trained-model-deployment.html[update trained model deployment] APIs for trained models that are deployed on machine learning nodes.
If the new allocations fit on the current machine learning nodes, they are immediately started.
If more resource capacity is needed for creating new model allocations, then your machine learning node will be scaled up if machine learning autoscaling is enabled to provide enough resources for the new allocation.
The number of model allocations can be scaled down to 0.
They cannot be scaled up to more than 32 allocations, unless you explicitly set the maximum number of allocations to more.
-Adaptive allocations must be set up independently for each deployment and https://www.elastic.co/guide/en/elasticsearch/reference/master/put-inference-api.html[inference endpoint].
+Adaptive allocations must be set up independently for each deployment and {ref}/put-inference-api.html[inference endpoint].
When you create inference endpoints on Serverless using Kibana, adaptive allocations are automatically turned on, and there is no option to disable them.
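
Reviewer note: for readers checking the linked API pages, here is a minimal sketch of the kind of request body the changed paragraphs describe. It is not taken from this diff. The field names (`adaptive_allocations`, `enabled`, `min_number_of_allocations`, `max_number_of_allocations`, `num_threads`) follow the Elasticsearch API reference as best recalled and should be verified against the `{ref}` targets; the helper names are hypothetical. The default cap of 32 comes from the page text above.

```python
import json

# Cap stated in the page when no maximum is set explicitly.
DEFAULT_MAX_ALLOCATIONS = 32


def adaptive_allocations_settings(min_allocations=0, max_allocations=None):
    """Build an adaptive_allocations object (field names are assumptions)."""
    if min_allocations < 0:
        raise ValueError("min_allocations must be >= 0")
    settings = {"enabled": True, "min_number_of_allocations": min_allocations}
    if max_allocations is not None:
        if max_allocations < max(min_allocations, 1):
            raise ValueError("max_allocations must be >= 1 and >= min_allocations")
        settings["max_number_of_allocations"] = max_allocations
    return settings


def effective_max(settings):
    """Upper bound on allocations: the explicit maximum, else the documented 32."""
    return settings.get("max_number_of_allocations", DEFAULT_MAX_ALLOCATIONS)


def elser_endpoint_body(min_allocations=0, max_allocations=None):
    """Body for a create inference endpoint request using the ELSER service.

    Assumed shape; num_threads=1 is an illustrative value.
    """
    return {
        "service": "elser",
        "service_settings": {
            "adaptive_allocations": adaptive_allocations_settings(
                min_allocations, max_allocations
            ),
            "num_threads": 1,
        },
    }


if __name__ == "__main__":
    # Would be sent as the body of a create inference endpoint request.
    print(json.dumps(elser_endpoint_body(min_allocations=0), indent=2))
```

Because adaptive allocations are configured per deployment and per inference endpoint, a body like this would have to be supplied separately for each one.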