Add datanode resource limits #69
Conversation
As you can see, I already had this code in there, just commented out. I found that when you set a memory limit and the pod exceeds it, the pod gets killed. I think the underlying container needs some work to cap the Java heap so the pod doesn't exceed its limit. In my testing, the data nodes would exceed the memory limit pretty regularly, causing the cluster to restart constantly. Here's the docs:
Do you have any thoughts? Does your testing show the same?
In my example, there was an environment variable.
Oh yeah, it's hardcoded for the data nodes.
We should add something so you can set that too.
Would you mind coding it so that the limits are optional? Maybe if you don't supply a value for the limit, it doesn't get set?
I'll change it.
@stevesloka This actually brings up an initial observation I had while looking at the project. What are your thoughts on making:
One additional thing to note here: when you add a "Limit" to a container, the accounted memory also includes the filesystem cache, and the cgroup will cap it. The implication is that you will probably want a lower JVM heap / limit ratio than you would expect (e.g., maybe 50% of the limit goes to the heap) to give ES ample headroom for off-heap storage via the filesystem cache. We ran into a similar issue with our home-grown ES deployment recently, so I figured I'd throw that out here.
It's pretty easy to write an entrypoint.sh which calculates the Java heap size based on the pod's memory requests/limits. I've found that making the JVM heap size 50% of the memory limit/request tends to work fairly well. The pod's entrypoint can get this information as environment variables via the downward API.
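For illustration, here's a minimal sketch of how the memory limit could be surfaced to such an entrypoint via the downward API. The container name, the env var name, and the 50% heap calculation mentioned in the comment are assumptions for the example, not something defined in this PR:

```yaml
# Sketch only: expose the container's own memory limit (in bytes) to its
# entrypoint as an environment variable via the downward API. The names
# es-data and ES_MEMORY_LIMIT are hypothetical; the entrypoint could then
# set -Xms/-Xmx to roughly 50% of the value it receives.
containers:
  - name: es-data
    env:
      - name: ES_MEMORY_LIMIT
        valueFrom:
          resourceFieldRef:
            containerName: es-data
            resource: limits.memory
```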
Thanks for the feedback; however, not setting a limit can negatively impact other pods running in the cluster. We found that without a limit, there can be thrashing among multiple pods on the same node, which leads to non-deterministic performance.
I've opened pires/docker-elasticsearch-kubernetes#51, which should help with some of this. Once it's merged, I'll open a PR to this repo that passes everything necessary for the containers to autodetect their memory limits/requests and set their JVM heap size appropriately. I think this PR is still needed, however, so that you can control memory limits on a per-component basis.
I opened #120, which is part of auto-setting the JVM heap size. It still requires the image to have the entrypoint script I've opened a PR for, however.
It seems like the limits were removed so the data nodes wouldn't be constrained too aggressively, so I've added a new resources section for the data nodes.
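As a rough illustration of what such an optional resources section might render to on the data node containers (the values below, and the 50% heap ratio from the discussion above, are assumptions rather than something taken from this PR):

```yaml
# Sketch only: a standard Kubernetes resources block as it might be applied
# to the data node containers when, and only when, the user supplies values.
# Per the discussion above, the JVM heap would be kept at roughly half of
# limits.memory to leave headroom for the filesystem cache.
resources:
  requests:
    cpu: "1"
    memory: 2Gi
  limits:
    cpu: "2"
    memory: 4Gi
```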