# Setting a devfile value for `cpuLimit` should not override the default value for `cpuRequest` (#23245)
## Comments
@cgruver I might be missing something, but why don't you specify requests at the devfile level if you want them to be applied?
The current behaviour is expected, even though the limit = request strategy is a nuanced approach.
Yeah, I understand the approach, and I am also a HUGE advocate for developer autonomy. But in this particular case, I have a different opinion derived from hard experience. In an enterprise with a couple thousand developers writing devfiles, it is very common for them not to grasp the consequences of their configuration. I have seen too many underutilized but over-scaled OpenShift clusters because developers set their limits far higher than their actual usage.

If a platform engineer can set a reasonable default for workspace requests and limits, then more experienced developers can raise limit values as needed, without causing the Kubernetes scheduler to consider a node full when it is not.

IMO, it's a different paradigm from production workloads, which may need a certain guarantee. Developer workloads are much more dynamic and spend a lot more time idle.
Well, regarding CPU limits in general, I advocate not setting them at all: if you set them, your workloads are throttled by definition. CPU limits for soft-tenancy pods are probably not going to be helpful unless you're approaching very dense setups (more than ~10 pods per core); otherwise, you will waste more CPU on throttling than you save. CPU limits definitely increase tail latencies for most non-predictable workloads (almost all request-driven use cases) in a way that results in a worse overall application environment for most users most of the time (because of how limits are sliced). At lower pod densities, you are almost certainly trading a false sense of security for a worse quality of service.

CPU limits are most useful when dealing with bad actors on your own platform, and even then there are far more effective ways of handling bad actors, like detection and account blocking. I would advise against limits here and would instead put more effort into load monitoring, capacity planning, and reactive scheduling. Unless the limits are so high that they bypass all protection for normal use, you're doing the service's users harm.
Limits and requests for CPU are a very subtle, contended topic: with or without CPU limits, engineers need to understand exactly what the configuration they put in place is doing.
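As a sketch of the recommendation above, a container spec that sets a CPU request and a memory limit but deliberately omits a CPU limit might look like this (the pod name, image, and sizes are all hypothetical):

```yaml
# Hypothetical Kubernetes pod illustrating the "no CPU limit" approach:
# the CPU request reserves scheduling capacity, memory is capped,
# but CPU is allowed to burst instead of being throttled.
apiVersion: v1
kind: Pod
metadata:
  name: example-workspace                           # hypothetical name
spec:
  containers:
    - name: tools
      image: registry.example.com/dev-tooling:latest  # hypothetical image
      resources:
        requests:
          cpu: 500m
          memory: 2Gi
        limits:
          memory: 2Gi    # memory limit only; no cpu limit, so the container can burst
```

Without a CPU limit (but with both requests set), the pod falls into the Burstable QoS class: it is guaranteed its requested share and can consume idle CPU on the node when available.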
I totally agree with you with respect to CPU limits... except for developer CDE workloads. The one reason I would advocate for limits is the noisy-neighbor effect. My opinion on this matter changed because of experience with multi-threaded developer workloads that inadvertently saturated the CPU of a node, impacting the other workspaces on that node. Unlike tested application deployments, developer CDE workloads are very unpredictable. It's definitely a nuanced topic, but I've seen it happen enough to believe there's a reasonable approach to be considered. This new capability may be sufficient: #23176
### Describe the bug

As a developer, if I configure my devfile with `cpuLimit`, I expect that the default value for `cpuRequest` will be used, since I did not modify it. However, when my workspace is started, the `resources.requests.cpu` value for the container is set equal to `resources.limits.cpu`. This behavior can result in under-utilized cluster nodes on which no more Pods can be scheduled.

### Che version

7.90
### Steps to reproduce

1. Create a workspace from: https://github.com/cgruver/devspaces-tiny-workspace.git
2. Modify the `devfile.yaml` by adding `cpuLimit` & `memoryLimit`
3. Restart the workspace from the local `devfile.yaml`
4. Observe the Pod of the workspace
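For illustration, the modification in step 2 might look like the following devfile 2.x fragment (the component name, image, and sizes are hypothetical; `cpuLimit` and `memoryLimit` are the fields the reproduction adds):

```yaml
schemaVersion: 2.2.0
metadata:
  name: tiny-workspace                                # hypothetical name
components:
  - name: dev-tools
    container:
      image: registry.example.com/tiny-tools:latest   # hypothetical image
      cpuLimit: "2"        # added for the reproduction
      memoryLimit: 4Gi     # added for the reproduction
      # cpuRequest is intentionally NOT set, to observe the default behavior
```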
### Expected behavior

I expect that there is a default value for `resources.requests.cpu`.
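Concretely, assuming a hypothetical default request of 100m CPU and a devfile `cpuLimit` of 2, the expected container resources block would be:

```yaml
# Expected (hypothetical values): the default request is preserved
resources:
  requests:
    cpu: 100m    # hypothetical default, NOT copied from cpuLimit
  limits:
    cpu: "2"     # from the devfile cpuLimit
```

Instead, the observed behavior is that `requests.cpu` is set to the same value as `limits.cpu`.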
### Runtime

OpenShift

### Screenshots

No response

### Installation method

OperatorHub

### Environment

macOS

### Eclipse Che Logs

No response

### Additional context

No response