# Setting a devfile value for `cpuLimit` should not override the default value for `cpuRequest` (#23245)
## Comments
@cgruver I might be missing something, but why don't you specify requests at the devfile level if you want them to be applied?
The current behaviour is expected, even though the limit = request strategy is a nuanced approach.
Yeah, I understand the approach, and I am also a HUGE advocate for developer autonomy. But in this particular case, I have a different opinion derived from hard experience. In an enterprise with a couple thousand developers writing devfiles, it is very common for them not to grasp the consequences of their configuration. I have seen too many underutilized but over-scaled OpenShift clusters because developers set their limits far higher than their actual usage.

If a platform engineer can set a reasonable default for workspace requests and limits, then more experienced developers can raise limit values as needed, without causing the Kubernetes scheduler to consider a node full when it is not.

IMO, it's a different paradigm from production workloads, which may need a certain guarantee. Developer workloads are much more dynamic and spend a lot more time idle.
Well, regarding CPU limits in general, I advocate not setting them at all: if you set them, your workloads are throttled by definition. CPU limits for soft-tenancy pods are probably not going to be helpful unless you're approaching very dense setups (more than ~10 pods per core); otherwise, you will waste more CPU on throttling than you save. CPU limits definitely increase tail latencies for most non-predictable workloads (almost all request-driven use cases) in a way that results in a worse overall application environment for most users most of the time (because of how limits are sliced). At lower pod densities, you are almost certainly trading a false sense of security for a worse quality of service.

CPU limits are most useful when dealing with bad actors on your own platform, and even then there are far more effective ways of handling bad actors, like detection and account blocking. I would advise against limits here and would instead put more effort into load monitoring, capacity planning, and reactive scheduling. Unless the limits are so high that they bypass all protection for normal use, you're doing the service's users harm.
Limits and requests for CPU are a very subtle, contended topic: with or without CPU limits, engineers need to understand exactly what the configuration they put in place is doing.
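As a sketch of the recommendation above, a container spec that sets a CPU request and a memory limit but deliberately omits a CPU limit might look like this (the pod name, image, and sizes are all hypothetical):

```yaml
# Hypothetical Kubernetes pod illustrating the "no CPU limit" approach:
# the CPU request reserves scheduling capacity, memory is capped,
# but CPU is allowed to burst instead of being throttled.
apiVersion: v1
kind: Pod
metadata:
  name: example-workspace                           # hypothetical name
spec:
  containers:
    - name: tools
      image: registry.example.com/dev-tooling:latest  # hypothetical image
      resources:
        requests:
          cpu: 500m
          memory: 2Gi
        limits:
          memory: 2Gi    # memory limit only; no cpu limit, so the container can burst
```

Without a CPU limit (but with both requests set), the pod falls into the Burstable QoS class: it is guaranteed its requested share and can consume idle CPU on the node when available.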
I totally agree with you with respect to CPU limits... except for developer CDE workloads. The one reason I would advocate for limits is the noisy-neighbor effect. My opinion on this matter changed because of experience with multi-threaded developer workloads that inadvertently saturated the CPU of a node, impacting the other workspaces on that node. Unlike tested application deployments, developer CDE workloads are very unpredictable. It's definitely a nuanced topic, but I've seen it happen enough to believe there's a reasonable approach to be considered. This new capability may be sufficient: #23176
### Describe the bug

As a developer, if I configure my devfile with `cpuLimit`, I expect that the default value for `cpuRequest` will be used, since I did not modify it. However, when my workspace is started, the `resources.requests.cpu` value for the container is set equal to `resources.limits.cpu`. This behavior can result in under-utilized cluster nodes on which no more Pods can be scheduled.

### Che version

7.90
### Steps to reproduce

1. Create a workspace from: https://github.com/cgruver/devspaces-tiny-workspace.git
2. Modify the `devfile.yaml` by adding `cpuLimit` & `memoryLimit`
3. Restart the workspace from the local `devfile.yaml`
4. Observe the Pod of the workspace
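For illustration, the modification in step 2 might look like the following devfile 2.x fragment (the component name, image, and sizes are hypothetical; `cpuLimit` and `memoryLimit` are the fields the reproduction adds):

```yaml
schemaVersion: 2.2.0
metadata:
  name: tiny-workspace                                # hypothetical name
components:
  - name: dev-tools
    container:
      image: registry.example.com/tiny-tools:latest   # hypothetical image
      cpuLimit: "2"        # added for the reproduction
      memoryLimit: 4Gi     # added for the reproduction
      # cpuRequest is intentionally NOT set, to observe the default behavior
```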
### Expected behavior

I expect that there is a default value for `resources.requests.cpu`.
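Concretely, assuming a hypothetical default request of 100m CPU and a devfile `cpuLimit` of 2, the expected container resources block would be:

```yaml
# Expected (hypothetical values): the default request is preserved
resources:
  requests:
    cpu: 100m    # hypothetical default, NOT copied from cpuLimit
  limits:
    cpu: "2"     # from the devfile cpuLimit
```

Instead, the observed behavior is that `requests.cpu` is set to the same value as `limits.cpu`.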
### Runtime

OpenShift

### Screenshots

No response

### Installation method

OperatorHub

### Environment

macOS

### Eclipse Che Logs

No response

### Additional context

No response