es-313 Added configuration option for Scrape Interval in Target Allocator (#433)

* changelog

* added scrapeInterval default

* added brief note regarding configuring scrape interval in target allocator

* bump chart version

* version bumped to 91

* mdox fmt otel-integration/k8s-helm/README.md

* mdox fmt otel-integration/CHANGELOG.md

* changelog

* changelog

* updated dependencies

* update target allocator version

* version update
daidokoro authored Aug 7, 2024
1 parent 98db4da commit 4b533e8
Showing 4 changed files with 26 additions and 31 deletions.
8 changes: 5 additions & 3 deletions otel-integration/CHANGELOG.md
@@ -2,6 +2,10 @@

## OpenTelemtry-Integration

+ ### v0.0.94 / 2024-08-07
+ - [Feat] add support for configuring scrape interval for target allocator prometheus custom resource
+ - [CHORE] - Updated target allocator version to 0.105.0 in values.yaml

## v0.0.93 / 2024-08-06
- [Feat] Add more defaults for fleet management preset

@@ -233,9 +237,7 @@
- [FIX] Kubelet Stats use Node IP instead of Node name.

### v0.0.37 / 2023-11-27
- * [:warning: BREAKING CHANGE] [FEATURE] Add support for span metrics preset. This replaces the deprecated `spanmetricsprocessor`
- with `spanmetricsconnector`. The new connector is disabled by default, as opposed the replaces processor.
- To enable it, set `presets.spanMetrics.enabled` to `true`.
+ * [:warning: BREAKING CHANGE] [FEATURE] Add support for span metrics preset. This replaces the deprecated `spanmetricsprocessor` with `spanmetricsconnector`. The new connector is disabled by default, as opposed the replaces processor. To enable it, set `presets.spanMetrics.enabled` to `true`.

### v0.0.36 / 2023-11-15
* [FIX] Change statsd receiver port to 8125 instead of 8127
10 changes: 5 additions & 5 deletions otel-integration/k8s-helm/Chart.yaml
@@ -1,7 +1,7 @@
apiVersion: v2
name: otel-integration
description: OpenTelemetry Integration
- version: 0.0.93
+ version: 0.0.94
keywords:
- OpenTelemetry Collector
- OpenTelemetry Agent
@@ -11,22 +11,22 @@ keywords:
dependencies:
- name: opentelemetry-collector
alias: opentelemetry-agent
version: "0.88.5"
version: "0.88.6"
repository: https://cgx.jfrog.io/artifactory/coralogix-charts-virtual
condition: opentelemetry-agent.enabled
- name: opentelemetry-collector
alias: opentelemetry-agent-windows
version: "0.88.5"
version: "0.88.6"
repository: https://cgx.jfrog.io/artifactory/coralogix-charts-virtual
condition: opentelemetry-agent-windows.enabled
- name: opentelemetry-collector
alias: opentelemetry-cluster-collector
version: "0.88.5"
version: "0.88.6"
repository: https://cgx.jfrog.io/artifactory/coralogix-charts-virtual
condition: opentelemetry-cluster-collector.enabled
- name: opentelemetry-collector
alias: opentelemetry-gateway
version: "0.88.5"
version: "0.88.6"
repository: https://cgx.jfrog.io/artifactory/coralogix-charts-virtual
condition: opentelemetry-gateway.enabled
sources:
33 changes: 12 additions & 21 deletions otel-integration/k8s-helm/README.md
@@ -158,8 +158,7 @@ type: Opaque
# Installation
- > [!NOTE]
- > With some Helm version (< `v3.14.3`), users might experience multiple warning messages during the installation about following:
+ > [!NOTE] With some Helm version (< `v3.14.3`), users might experience multiple warning messages during the installation about following:
>
> ```
> index.go:366: skipping loading invalid entry for chart "otel-integration" \<version> from \<path>: validation: more than one dependency with name or alias "opentelemetry-collector"
@@ -223,8 +222,7 @@ helm upgrade --install otel-coralogix-integration coralogix-charts-virtual/otel-
--render-subchart-notes -f values-crd-override.yaml --set global.clusterName=<cluster_name> --set global.domain=<domain>
```

- > [!NOTE]
- > Users might experience multiple warning messages during the installation about following:
+ > [!NOTE] Users might experience multiple warning messages during the installation about following:
>
> ```
> Warning: missing the following rules for namespaces: [get,list,watch]
@@ -245,8 +243,7 @@ helm upgrade --install otel-coralogix-integration coralogix-charts-virtual/otel-

This change will configure otel-agent pods to send span data to coralogix-opentelemetry-gateway deployment using [loadbalancing exporter](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/loadbalancingexporter). Make sure to configure enough replicas and resource requests and limits to handle the load. Next, you will need to configure [tail sampling processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor) policies with your custom tail sampling policies.
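
As a loose illustration of the tail sampling policies mentioned above, here is a minimal sketch of a `tail_sampling` processor block. The policy names and thresholds are hypothetical examples, not defaults shipped with this chart, and where exactly the block lands in `values.yaml` depends on the tail sampling values file:

```yaml
processors:
  tail_sampling:
    decision_wait: 10s              # buffer spans this long before deciding per trace
    policies:
      - name: keep-errors           # hypothetical policy: always keep traces containing errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: sample-10-percent     # hypothetical policy: keep 10% of the remaining traces
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
```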

- When running in Openshift make sure to set `distribution: "openshift"` in your `values.yaml`.
- When running in Windows environments, please use `values-windows-tailsampling.yaml` values file.
+ When running in Openshift make sure to set `distribution: "openshift"` in your `values.yaml`. When running in Windows environments, please use `values-windows-tailsampling.yaml` values file.

#### Why am I getting ResourceExhausted errors when using Tail Sampling?

@@ -277,6 +274,8 @@ If you're leveraging the Prometheus Operator custom resources (`ServiceMonitor`

If enabled, the target allocator will be deployed as a separate deployment in the same namespace as the collector. It will be responsible for allocating targets for the agent collector on each node, to scrape targets that reside on the given node (a form of simple sharding). If needed, you can run multiple instances of the target allocator for high availability. This can be achieved by setting the `opentelemetry-agent.targetAllocator.replicas` value to a number greater than 1.

+ You can specify the preferred scrape interval for the Prometheus Custom Resource by setting `opentelemetry-agent.targetAllocator.prometheusCR.scrapeInterval`, the default is `30s`

For more details on Prometheus custom resources and target allocator see the documentation [here](https://github.com/open-telemetry/opentelemetry-operator/tree/main/cmd/otel-allocator#discovery-of-prometheus-custom-resources).
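
A minimal `values.yaml` sketch combining the scrape interval and replica settings described above — the `enabled` flag and the replica count are illustrative assumptions; only the `prometheusCR.scrapeInterval` key and its `30s` default come from this change:

```yaml
opentelemetry-agent:
  targetAllocator:
    enabled: true            # assumed flag; deploys the target allocator alongside the agent
    replicas: 2              # illustrative; >1 gives high availability
    allocationStrategy: "per-node"
    prometheusCR:
      enabled: true
      scrapeInterval: 60s    # overrides the 30s default introduced by this change
```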

### Installing the chart on clusters with mixed operating systems (Linux and Windows)
@@ -318,9 +317,7 @@ GKE Autopilot has limited access to host filesystems, host networking and host p
Notable important differences from regular `otel-integration` are:
- Host metrics receiver is not available, though you still get some metrics about the host through `kubeletstats` receiver.
- Kubernetes Dashboard does not work, due to missing Host Metrics.
- - Host networking and host ports are not available, users need to send tracing spans through
- Kubernetes Service. The Service uses `internalTrafficPolicy: Local`, to send traffic to locally
- running agents.
+ - Host networking and host ports are not available, users need to send tracing spans through Kubernetes Service. The Service uses `internalTrafficPolicy: Local`, to send traffic to locally running agents.
- Log Collection works, but does not store check points. Restarting the agent will collect logs from the beginning.
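
For context on the host-networking bullet above, a rough sketch of a Service using `internalTrafficPolicy: Local` — the name, selector and port are placeholders, since the chart defines the actual Service; only the traffic policy reflects the behaviour described:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: otel-agent                                # placeholder; the chart sets the real name
spec:
  internalTrafficPolicy: Local                    # deliver traffic only to agent pods on the same node
  selector:
    app.kubernetes.io/name: opentelemetry-agent   # placeholder selector
  ports:
    - name: otlp-grpc
      port: 4317
      targetPort: 4317
```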

To install otel-integration to GKE/Autopilot follow these steps:
@@ -369,8 +366,7 @@ Applications can send OTLP Metrics and Jaeger, Zipkin and OTLP traces to the loc

### Example Application environment configuration

- The following code creates a new environment variable (`NODE`) containing the node's IP address and then uses that IP in the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable.
- This ensures that each instrumented pod will send data to the local OTEL collector on the node it is currently running on.
+ The following code creates a new environment variable (`NODE`) containing the node's IP address and then uses that IP in the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable. This ensures that each instrumented pod will send data to the local OTEL collector on the node it is currently running on.

```yaml
env:
@@ -586,19 +582,16 @@ processors:

## Picking the right tracing SDK span processor

- OpenTelemetry tracing SDK supports two strategies to create an application traces, a “SimpleSpanProcessor” and a “BatchSpanProcessor.”
- While the SimpleSpanProcessor submits a span every time a span is finished, the BatchSpanProcessor processes spans in batches, and buffers them until a flush event occurs. Flush events can occur when the buffer is full or when a timeout is reached.
+ OpenTelemetry tracing SDK supports two strategies to create an application traces, a “SimpleSpanProcessor” and a “BatchSpanProcessor.” While the SimpleSpanProcessor submits a span every time a span is finished, the BatchSpanProcessor processes spans in batches, and buffers them until a flush event occurs. Flush events can occur when the buffer is full or when a timeout is reached.

- Picking the right tracing SDK span processor can have an impact on the performance of the collector.
- We switched our SDK span processor from SimpleSpanProcessor to BatchSpanProcessor and noticed a massive performance improvement in the collector:
+ Picking the right tracing SDK span processor can have an impact on the performance of the collector. We switched our SDK span processor from SimpleSpanProcessor to BatchSpanProcessor and noticed a massive performance improvement in the collector:

| Span Processor | Agent Memory Usage | Agent CPU Usage | Latency Samples |
|---------------------|--------------------|-----------------|-----------------|
| SimpleSpanProcessor | 3.7 GB | 0.5 | >1m40s |
| BatchSpanProcessor | 600 MB | 0.02 | >1s <10s |

- In addition, it improved the buffer performance of the collector, when we used the SimpleSpanProcessor, the buffer queues were getting full very quickly,
- and after switching to the BatchSpanProcessor, it stopped becoming full all the time, therefore stopped dropping data.
+ In addition, it improved the buffer performance of the collector, when we used the SimpleSpanProcessor, the buffer queues were getting full very quickly, and after switching to the BatchSpanProcessor, it stopped becoming full all the time, therefore stopped dropping data.

#### Example

@@ -693,15 +686,13 @@ Required settings:
- `mountPath`: specifies the path at which to mount the volume. This should correspond the mount path of your MySQL data volume. Provide this parameter without trailing slash.

Optional settings:
- - `logFilesPath`: specifies which directory to watch for log files. This will typically be the MySQL data directory,
- such as `/var/lib/mysql`. If not specified, the value of `mountPath` will be used.
+ - `logFilesPath`: specifies which directory to watch for log files. This will typically be the MySQL data directory, such as `/var/lib/mysql`. If not specified, the value of `mountPath` will be used.
- `logFilesExtension`: specifies which file extensions to watch for. Defaults to `.log`.

### Common issues

- Metrics collection is failing with error `"Error 1227 (42000): Access denied; you need (at least one of) the PROCESS privilege(s) for this operation"`
- - This error indicates that the database user you provided does not have the required privileges to collect metrics. Provide the `PROCESS` privilege to the user, e.g. by running query
- `GRANT PROCESS ON *.* TO 'user'@'%'`
+ - This error indicates that the database user you provided does not have the required privileges to collect metrics. Provide the `PROCESS` privilege to the user, e.g. by running query `GRANT PROCESS ON *.* TO 'user'@'%'`

### Example preset configuration for single instance

6 changes: 4 additions & 2 deletions otel-integration/k8s-helm/values.yaml
@@ -5,7 +5,7 @@ global:
defaultSubsystemName: "integration"
logLevel: "debug"
collectionInterval: "30s"
version: "0.0.93"
version: "0.0.94"

extensions:
kubernetesDashboard:
@@ -54,9 +54,11 @@ opentelemetry-agent:
allocationStrategy: "per-node"
prometheusCR:
enabled: true
+ # The interval at which the target allocator will scrape the Prometheus server
+ scrapeInterval: 30s
image:
repository: ghcr.io/open-telemetry/opentelemetry-operator/target-allocator
- tag: v0.101.0
+ tag: v0.105.0

# Temporary feature gates to prevent breaking changes. Please see changelog for version 0.0.85 for more information.
command:
