Add datanode resource limits #69
Conversation
As you can see, I already had this code in there, just commented out. I found that when you set a memory limit and the pod exceeds it, the pod gets killed. I think the underlying container needs some work to cap the Java heap so the pod doesn't exceed its limit. In my testing, the data nodes would exceed the memory limit pretty regularly, causing the cluster to restart constantly. Here's the docs:
Do you have any thoughts? Does your testing show the same?
In my example, there was an environment variable.
Oh yeah, it's hardcoded for the data nodes.
We should add something so you can set that too.
Would you mind coding it so that the limits are optional? Maybe if you don't supply a value for the limit, it doesn't get set?
I'll change it.
@stevesloka This actually brings up an initial observation I had while looking at the project. What are your thoughts on making:
One additional thing to note here: when you add a "Limit" to a container, the accounted memory also includes the filesystem cache, and the cgroup will cap it. The implication is that you will probably want a lower JVM heap / limit ratio than you would expect (e.g., maybe 50% of the limit goes to the heap) to give ES ample headroom for off-heap storage via the filesystem cache. We ran into a similar issue with our home-grown ES deployment recently, so I figured I'd throw that out here.
It's pretty easy to write an entrypoint.sh which calculates the Java heap size based on the pod's memory requests/limits. I've found that making the JVM heap size 50% of the memory limit/request tends to work fairly well. The pod's entrypoint can get this information as environment variables via the downward API.
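For illustration, here's a minimal sketch of how the memory limit could be surfaced to such an entrypoint via the downward API. The container name, the env var name, and the 50% heap calculation mentioned in the comment are assumptions for the example, not something defined in this PR:

```yaml
# Sketch only: expose the container's own memory limit (in bytes) to its
# entrypoint as an environment variable via the downward API. The names
# es-data and ES_MEMORY_LIMIT are hypothetical; the entrypoint could then
# set -Xms/-Xmx to roughly 50% of the value it receives.
containers:
  - name: es-data
    env:
      - name: ES_MEMORY_LIMIT
        valueFrom:
          resourceFieldRef:
            containerName: es-data
            resource: limits.memory
```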
Thanks for the feedback; however, not setting a limit can negatively impact other pods running in the cluster. We found that without a limit, there can be thrashing among multiple pods on the same node, which leads to non-deterministic performance.
I've opened pires/docker-elasticsearch-kubernetes#51, which should help with some of this. Once it's merged, I'll open a PR to this repo that passes everything necessary for the containers to autodetect their memory limits/requests and set their JVM heap size appropriately. I think this PR is still needed, however, so that you can control memory limits on a per-component basis.
I opened #120, which is part of auto-setting the JVM heap size. It still requires the image to have the entrypoint script I've opened a PR for, however.
It seems like the limits were removed so the data nodes wouldn't be constrained too aggressively, so I've added a new resources section for the data nodes.
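As a rough illustration of what such an optional resources section might render to on the data node containers (the values below, and the 50% heap ratio from the discussion above, are assumptions rather than something taken from this PR):

```yaml
# Sketch only: a standard Kubernetes resources block as it might be applied
# to the data node containers when, and only when, the user supplies values.
# Per the discussion above, the JVM heap would be kept at roughly half of
# limits.memory to leave headroom for the filesystem cache.
resources:
  requests:
    cpu: "1"
    memory: 2Gi
  limits:
    cpu: "2"
    memory: 4Gi
```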