
metrics and objects deployments generating tons of zombie processes and using up cluster node process limits #857

@gvoden

Description

What happened:
Deploying the metrics, metrics aggregator, and kube-objects components (all images tagged 1.2.1) appears to cause large numbers of zombie processes to accumulate on the cluster node hosting the deployment. The node eventually hits its process limit, is overwhelmed, and crashes (Amazon EKS 1.22).
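
For reference, the symptom can be confirmed on the affected worker node with a quick process listing; this is only an illustrative check, not something from the chart itself:

    # Count defunct (zombie) processes on the node; the number keeps growing
    # while the metrics/objects pods are running.
    ps -eo stat,ppid,pid,comm | awk '$1 ~ /^Z/' | wc -l

    # List the zombies together with their parent PIDs to see which container
    # processes are leaving them behind.
    ps -eo stat,ppid,pid,comm | awk '$1 ~ /^Z/'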

What you expected to happen:
Metrics and object collection should function normally.

How to reproduce it (as minimally and precisely as possible):
Deploy Splunk Connect for Kubernetes with the values YAML below (an example install command follows the YAML).

global:
  logLevel: info
  splunk:
    hec:
      host: http-inputs-hoopp.splunkcloud.com
      insecureSSL: false
      port: 443
      protocol: https
      token:
splunk-kubernetes-logging:
  enabled: true
  journalLogPath: /var/log/journal
  logs:
    isg-containers:
      logFormatType: cri
      from:
        container: isg-
        pod: '*'
      multiline:
        firstline: /^\d{4}-\d{2}-\d{2} \d{1,2}:\d{1,2}:\d{1,2}.\d{3}/
      sourcetype: kube:container
      timestampExtraction:
        format: '%Y-%m-%d %H:%M:%S.%NZ'
        regexp: time="(?\d{4}-\d{2}-\d{2}T[0-2]\d:[0-5]\d:[0-5]\d.\d{9}Z)"
  image:
    registry: docker.io
    name: splunk/fluentd-hec
    tag: 1.3.1
    pullPolicy: Always
  resources:
    limits:
      memory: 1.5Gi
  splunk:
    hec:
      indexName: eks_logs
splunk-kubernetes-metrics:
  image:
    registry: docker.io
    name: splunk/k8s-metrics
    tag: 1.2.1
    pullPolicy: Always
  imageAgg:
    registry: docker.io
    name: splunk/k8s-metrics-aggr
    tag: 1.2.1
    pullPolicy: Always
  rbac:
    create: true
  serviceAccount:
    create: true
    name: splunk-kubernetes-metrics
  splunk:
    hec:
      indexName: eks_metrics
splunk-kubernetes-objects:
  image:
    registry: docker.io
    name: splunk/kube-objects
    tag: 1.2.1
    pullPolicy: Always
  kubernetes:
    insecureSSL: true
  objects:
    apps:
      v1:
        - interval: 30s
          name: deployments
        - interval: 30s
          name: daemon_sets
        - interval: 30s
          name: replica_sets
        - interval: 30s
          name: stateful_sets
    core:
      v1:
        - interval: 30s
          name: pods
        - interval: 30s
          name: namespaces
        - interval: 30s
          name: nodes
        - interval: 30s
          name: services
        - interval: 30s
          name: config_maps
        - interval: 30s
          name: secrets
        - interval: 30s
          name: persistent_volumes
        - interval: 30s
          name: service_accounts
        - interval: 30s
          name: persistent_volume_claims
        - interval: 30s
          name: resource_quotas
        - interval: 30s
          name: component_statuses
        - mode: watch
          name: events
  rbac:
    create: true
  serviceAccount:
    create: true
    name: splunk-kubernetes-objects
  splunk:
    hec:
      indexName: eks_meta
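
A minimal sketch of how the values above were applied; the repo alias, release name, and namespace are placeholders rather than values taken from this report:

    # Hypothetical names: "splunk" repo alias, "sck" release, "splunk-connect" namespace.
    helm repo add splunk https://splunk.github.io/splunk-connect-for-kubernetes/
    helm repo update
    helm install sck splunk/splunk-connect-for-kubernetes \
      --version 1.4.3 \
      --namespace splunk-connect --create-namespace \
      --values values.yaml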
Anything else we need to know?:

Scaling the metrics and objects deployments down to 0 replicas makes the zombie processes disappear immediately (see the sketch below).
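
A sketch of that workaround, assuming deployment names generated by the chart defaults; the actual names depend on the Helm release:

    # Deployment names below are assumptions based on a hypothetical release named
    # "sck" in a "splunk-connect" namespace; run `kubectl get deployments` to find
    # the real names.
    kubectl -n splunk-connect scale deployment sck-splunk-kubernetes-metrics-aggregator --replicas=0
    kubectl -n splunk-connect scale deployment sck-splunk-kubernetes-objects --replicas=0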
Environment:

  • Kubernetes version (use kubectl version): EKS 1.22

  • Ruby version (use ruby --version):

  • OS (e.g: cat /etc/os-release):
    NAME="Amazon Linux"
    VERSION="2"
    ID="amzn"
    ID_LIKE="centos rhel fedora"
    VERSION_ID="2"
    PRETTY_NAME="Amazon Linux 2"
    ANSI_COLOR="0;33"
    CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
    HOME_URL="https://amazonlinux.com/"

  • Splunk version: see YAML above

  • Splunk Connect for Kubernetes helm chart version: 1.4.3

  • Others:
