
StatefulSet does not apply resource limits #223

Open
tmirks opened this issue Jul 3, 2018 · 11 comments


@tmirks

tmirks commented Jul 3, 2018

Description

Elasticsearch cluster StatefulSets are created without resource limits set. The code that sets the limit values appears to be commented out, for whatever reason.

This is a problem if you've defined a LimitRange resource in your cluster to provide a default set of limits when otherwise unspecified.

Example

Our LimitRange:

...
  kind: LimitRange
  spec:
    limits:
    - default:
        cpu: "1"
        memory: 512Mi
...

The StatefulSet is created with the following resources (notice no limits):

...
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
...

The result is an error on the StatefulSet, because the LimitRange applies its default limit (512Mi) and the request (1Gi) is higher than that limit:

Invalid value: "1Gi": must be less than or equal to memory limit

Type     Reason        Age                 From                    Message
----     ------        ----                ----                    -------
Warning  FailedCreate  16s (x15 over 55s)  statefulset-controller  create Pod xxxxx-0 in StatefulSet xxxxx failed error: Pod "xxxxx-0" is invalid: spec.containers[0].resources.requests: Invalid value: "1Gi": must be less than or equal to memory limit
@stevesloka
Member

We did have limits at one time, but ran into issues. There was a PR (#69) which addressed this, but then some other issues came up. We should check upstream and see if any work has already been done that we could incorporate.

@stevesloka
Member

There was this PR as well with some good comments: #120

@tmirks
Author

tmirks commented Jul 31, 2018

@stevesloka what's the recommended course of action for resource limits then? Remove cluster defaults?

@stevesloka
Member

@tmirks the tricky part with limits (at least with memory) is that when the pod exceeds those limits, the pod is restarted. Initially I had them set, but then took them out since during testing I found my data nodes restarting all the time. So I commented out that bit in the code.

I think the proper way might be to do some tuning of the limits in combination with Java options that tell the JVM how much memory it can use.
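
As a rough sketch of that idea (not the operator's current behavior - ES_JAVA_OPTS is Elasticsearch's standard variable for JVM flags, and the sizes here are placeholders), the heap could be pinned below a matching memory limit:

    env:
    - name: ES_JAVA_OPTS
      value: "-Xms1g -Xmx1g"   # pin min/max heap so the JVM never grows past it
    resources:
      requests:
        memory: 2Gi
      limits:
        memory: 2Gi            # limit = request; heap at ~50% leaves headroom for off-heap memory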

@nabadger

nabadger commented Sep 10, 2018

Hi,

I do think this should be an option - our clusters have resource limits enforced, so we have to set them. We would like to be able to specify the exact limits/requests for CPU/RAM (this seems to be the norm for many other operators I've used).

If pods are getting killed, it's more the cluster admin's role to ensure the cluster is sufficiently resourced, configured, and monitored, right?

@stevesloka
Member

Clusters which enforce limits are a valid reason to add them back. I'm not saying we shouldn't have them; it's just that in my experience with using the operator, managing the JVM with limits can be tricky. I'll add a way to make them optional, so you could still define them and others could leave them out.

@nabadger

Awesome, many thanks 👍

@jimmyjones2

Setting requests but not limits results in a Burstable QoS level. Given that Elasticsearch allocates pretty much all its memory on startup and it stays fixed, wouldn't it be better to set the limit and request to the same value (ideally 2x heap), giving Guaranteed QoS so the pod is very unlikely to be killed (Guaranteed < Burstable < BestEffort)?

The only situation where I can see Burstable being useful is setting the request to at least the Java heap size, allowing any unused memory on the node to be used for the page cache - however, I'm assuming that when node memory pressure occurs the page cache would be reclaimed, bringing the container down to its request, rather than the container being killed.
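
For concreteness, a Guaranteed QoS resources block along those lines would set requests equal to limits (values illustrative, assuming a 1g heap so memory is ~2x heap):

    resources:
      requests:
        cpu: "1"
        memory: 2Gi
      limits:
        cpu: "1"
        memory: 2Gi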

@jimmyjones2

I don't think my assumption about Burstable was correct - see kubernetes/kubernetes#43916:

I'd like to share some observations, though I can't say I have a good solution to offer yet, other than to set a memory limit equal to the memory request for any pod that makes use of the file cache.

@kaarolch
Contributor

kaarolch commented Oct 16, 2018

Anyway, in most environments teams have default requests and limits set, in order to constrain any deployment that doesn't specify its own. In this type of environment you have to recompile the operator binary and build your own container. Maybe a good solution would be to allow overriding the default limits from the CRD?
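
A sketch of what such an override might look like in the cluster manifest - the resources block and its placement here are hypothetical, not the operator's actual CRD schema:

    kind: ElasticsearchCluster
    spec:
      ...
      resources:        # hypothetical field overriding the compiled-in defaults
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          cpu: "1"
          memory: 2Gi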

@alwinmarkcf

Clusters which enforce limits are a valid reason to add them back. I'm not saying we shouldn't have them; it's just that in my experience with using the operator, managing the JVM with limits can be tricky. I'll add a way to make them optional, so you could still define them and others could leave them out.

So is there something in there already? Do we need an additional value? I've hit the same problem.
