manual-instrumentation

An example program-under-observation instrumented with OpenTelemetry using manual instrumentation.

Overview

In some cases, you may use the OpenTelemetry Java agent to instrument your program because it's powerful and requires no changes to the program's source code (the agent is attached at JVM startup via the -javaagent flag). OpenTelemetry refers to this style of instrumentation as Automatic Instrumentation. It's especially useful for third-party programs where you don't have access to the source code. For your own software projects, though, you may want more precise control over the exact dependencies, configuration, and behavior of the OpenTelemetry instrumentation. In this project, we instrument an example program the manual way. Refer to the OpenTelemetry docs on Manual Instrumentation for Java.

In the same spirit of exercising more control, we'll also opt out of auto-configuration and instead configure the OpenTelemetry Java instrumentation directly. To take it a step further, we'll opt out of the OkHttp-based OpenTelemetry sender because we would prefer to use the HTTP client built into the JDK itself: java.net.http.HttpClient. We want to keep our dependencies to a minimum so that our software maintenance burden is low. OkHttp itself brings in a dependency on Okio and the Kotlin standard library and runtime. Read more about the dependencies involved in exporting telemetry data in the Dependencies section of the OpenTelemetry Java instrumentation docs.
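
To make this concrete, here's a minimal sketch of wiring the SDK by hand: an OTLP/HTTP metric exporter pushed through a periodic reader on a 10-second interval. The endpoint and class name are illustrative assumptions, not necessarily this project's exact code. Notably, there is no code to select the JDK-based sender: the exporter discovers a sender on the classpath, so depending on opentelemetry-exporter-sender-jdk (instead of the default opentelemetry-exporter-sender-okhttp) is all it takes. This matches the "Using HttpSender: ...JdkHttpSender" debug log shown later in the instructions.

    import io.opentelemetry.api.OpenTelemetry;
    import io.opentelemetry.api.common.AttributeKey;
    import io.opentelemetry.exporter.otlp.http.metrics.OtlpHttpMetricExporter;
    import io.opentelemetry.sdk.OpenTelemetrySdk;
    import io.opentelemetry.sdk.metrics.SdkMeterProvider;
    import io.opentelemetry.sdk.metrics.export.PeriodicMetricReader;
    import io.opentelemetry.sdk.resources.Resource;

    import java.time.Duration;

    // Hypothetical class name. Configures the OpenTelemetry SDK directly,
    // with no auto-configuration machinery involved.
    public class TelemetrySetup {

        static OpenTelemetry create() {
            // OTLP over HTTP (Protobuf payload). With opentelemetry-exporter-sender-jdk
            // on the classpath, this sends via java.net.http.HttpClient.
            var exporter = OtlpHttpMetricExporter.builder()
                    .setEndpoint("http://localhost:4318/v1/metrics") // illustrative endpoint
                    .build();

            // Export every 10 seconds instead of the 60-second default.
            var reader = PeriodicMetricReader.builder(exporter)
                    .setInterval(Duration.ofSeconds(10))
                    .build();

            // Identify the program, as seen in the exported resource attributes.
            var resource = Resource.getDefault().toBuilder()
                    .put(AttributeKey.stringKey("service.name"), "manual-instrumentation-server")
                    .put(AttributeKey.stringKey("service.version"), "0.1.0")
                    .build();

            var meterProvider = SdkMeterProvider.builder()
                    .setResource(resource)
                    .registerMetricReader(reader)
                    .build();

            return OpenTelemetrySdk.builder()
                    .setMeterProvider(meterProvider)
                    .build();
        }
    }

From there, the program can create its own instruments via openTelemetry.getMeter(...).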

The tech stack in this subproject:

  • A program-under-observation
    • This is a fictional "data processing" program written in Java. This program is instrumented manually with the OpenTelemetry Java instrumentation libraries.
  • An HTTP/Protobuf-OTLP metrics collector (OpenTelemetry Collector)
    • This runs as a Docker container and receives metrics data pushed from the OpenTelemetry instrumentation in the program-under-observation. The OpenTelemetry Collector forwards the metrics data to the Telegraf server using gRPC.
  • A gRPC/Protobuf-OTLP metrics collector and ILP converter/forwarder (Telegraf)
    • This runs as a Docker container and accepts OTLP metrics from the OpenTelemetry Collector via gRPC, and then re-formats the metrics into an acceptable format for the metrics database (Influx Line Protocol) and then writes the metrics into the metrics backend (InfluxDB).
  • A metrics database (InfluxDB)
    • InfluxDB is an open source time series database that's often used for metrics.

Note: The fact that we're using two metrics collectors is silly. We're working around a patchy matrix of technology support (gRPC/HTTP/OTLP/ILP) among a matrix of telemetry and metrics systems (Influx/OpenTelemetry). We want our program-under-observation to be constrained to Protobuf and HTTP. We don't want to pay for gRPC support in our program. Unfortunately, Telegraf's OpenTelemetry receiver only supports gRPC, so we have to use the OpenTelemetry Collector as an intermediary. Relatedly, in the spirit of "keep it simple", check out OpenTelemetry's support for JSON-encoded OTLP data, which is described in the JSON Protobuf Encoding section of the OTLP 1.0 spec. Could we remove the Protobuf dependency from our program-under-observation? We're usually using JSON already. I'd rather send gzipped JSON than pay for the software maintenance of a Protobuf dependency.
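
As a concrete illustration of that workaround, the Collector and Telegraf configs for this shape of pipeline look roughly like the following sketches. These are assumptions about the general shape, not this repo's exact files; endpoints and hostnames are illustrative. First, the OpenTelemetry Collector receives OTLP over HTTP from the program and exports OTLP over gRPC to Telegraf:

    receivers:
      otlp:
        protocols:
          http:
            endpoint: 0.0.0.0:4318  # the program pushes OTLP/HTTP here

    exporters:
      otlp:
        endpoint: telegraf:4317  # forward over gRPC to Telegraf
        tls:
          insecure: true

    service:
      pipelines:
        metrics:
          receivers: [otlp]
          exporters: [otlp]

And Telegraf accepts the gRPC OTLP data and writes it as Influx Line Protocol to InfluxDB (the playground database is the one we query later in the instructions):

    # Accept OTLP metrics over gRPC from the OpenTelemetry Collector
    [[inputs.opentelemetry]]
      service_address = "0.0.0.0:4317"

    # Write the metrics to InfluxDB using the Influx Line Protocol
    [[outputs.influxdb]]
      urls = ["http://influxdb:8086"]
      database = "playground"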

Instructions

Follow these instructions to build and run the example system.

  1. Prerequisites: Java and Docker
    • I used Java 21.
  2. Start infrastructure services
    • docker-compose up
    • This starts the OpenTelemetry Collector, Telegraf and InfluxDB.
    • Pay attention to the output of these containers as they run. It's a tricky system to set up, and you'll want to know if there are any errors, like if Telegraf is unable to connect to InfluxDB.
  3. Build the program distribution
    • ./gradlew installDist
  4. Run the program
    • ./build/install/manual-instrumentation/bin/manual-instrumentation
    • The program will run indefinitely and continuously submit OTLP-based metrics data to the OpenTelemetry Collector, and it will log metrics to the console. The program output should look something like the following.
    • 17:00:25 [main] INFO dgroomes.manual_instrumentation.Runner - Let's simulate some fictional data processing...
      17:00:25 [main] DEBUG io.opentelemetry.exporter.internal.http.HttpExporterBuilder - Using HttpSender: io.opentelemetry.exporter.sender.jdk.internal.JdkHttpSender
      17:00:25 [main] DEBUG io.opentelemetry.sdk.internal.JavaVersionSpecific - Using the APIs optimized for: Java 9+
      17:00:35 [PeriodicMetricReader-1] INFO io.opentelemetry.exporter.logging.LoggingMetricExporter - Received a collection of 12 metrics for export.
      17:00:35 [PeriodicMetricReader-1] INFO io.opentelemetry.exporter.logging.LoggingMetricExporter - metric: ImmutableMetricData{resource=Resource{schemaUrl=null, attributes={service.name="manual-instrumentation-server", service.version="0.1.0", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.36.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=io.opentelemetry.runtime-telemetry-java8, version=2.2.0-alpha, schemaUrl=null, attributes={}}, name=jvm.cpu.time, description=CPU time used by the process as reported by the JVM., unit=s, type=DOUBLE_SUM, data=ImmutableSumData{points=[ImmutableDoublePointData{startEpochNanos=1711922425858898000, epochNanos=1711922435868239000, attributes={}, value=0.669478, exemplars=[]}], monotonic=true, aggregationTemporality=CUMULATIVE}}
      ... other metrics omitted ...
      
  5. Inspect the metrics in InfluxDB directly
    • Start an influx session inside the InfluxDB container with the following command.
    • docker exec -it manual-instrumentation-influxdb-1 influx -precision rfc3339
    • The influx session may remind you of a SQL session. In it, you can run commands like show databases and show measurements to explore the data. We named our database playground, so connect to it by issuing a use playground command. Then execute a show measurements command. Hopefully it shows the following metrics, which have flowed from our program through the OpenTelemetry Collector, then through Telegraf, and finally into the Influx database (these are the JVM runtime metrics registered by the program; see the sketch after these instructions). It should look something like the following.
    • $ docker exec -it manual-instrumentation-influxdb-1 influx
      Connected to http://localhost:8086 version 1.8.10
      InfluxDB shell version: 1.8.10
      > use playground
      Using database playground
      > show measurements
      name: measurements
      name
      ----
      jvm.class.count
      jvm.class.loaded
      jvm.class.unloaded
      jvm.cpu.count
      jvm.cpu.recent_utilization
      jvm.cpu.time
      jvm.gc.duration
      jvm.memory.committed
      jvm.memory.limit
      jvm.memory.used
      jvm.memory.used_after_last_gc
      jvm.thread.count
      
      
    • Let's inspect the memory usage over time for our "data processing" program. This is captured in the jvm.memory.used metric. Look at the snippet below for an example. The output shows the heap usage in MiB over time, and it shows a typical sawtooth pattern: usage climbs as objects accumulate and drops when garbage collection runs.
    • > SELECT SUM(gauge) / 1024 / 1024 AS "MiB" FROM "jvm.memory.used" WHERE "jvm.memory.type" = 'heap' GROUP BY time(10s)
      name: jvm.memory.used
      time                 MiB
      ----                 ---
      2024-03-30T19:29:00Z 15.698493957519531
      2024-03-30T19:29:10Z 25.57617950439453
      2024-03-30T19:29:20Z 35.45075225830078
      2024-03-30T19:29:30Z 11.872085571289062
      2024-03-30T19:29:40Z 21.716766357421875
      2024-03-30T19:29:50Z 30.989227294921875
      2024-03-30T19:30:00Z 41.054840087890625
      
  6. Stop the Java program
    • Press Ctrl+C to stop the program from the same terminal window where you ran the program.
  7. Stop the infrastructure services
    • docker-compose down
    • I think it's important to do a proper down command so that the network is cleaned up. Otherwise, you might experience some weirdness if you change the Docker Compose file and then try to bring the services back up. I'm not really sure, though.
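
For reference, the JVM metrics listed in step 5 (jvm.memory.used, jvm.gc.duration, and so on) come from OpenTelemetry's runtime-telemetry-java8 instrumentation library, the 2.x instrumentation mentioned in the wish list below. Here's a minimal sketch of registering those observers, assuming that library's registerObservers entry points; the class name is hypothetical.

    import io.opentelemetry.api.OpenTelemetry;
    import io.opentelemetry.instrumentation.runtimemetrics.java8.Classes;
    import io.opentelemetry.instrumentation.runtimemetrics.java8.Cpu;
    import io.opentelemetry.instrumentation.runtimemetrics.java8.GarbageCollector;
    import io.opentelemetry.instrumentation.runtimemetrics.java8.MemoryPools;
    import io.opentelemetry.instrumentation.runtimemetrics.java8.Threads;

    // Hypothetical class name. Registers observers for the JVM runtime metrics:
    // class counts, CPU time, GC duration, memory pools, and thread counts.
    public class JvmMetrics {

        static void register(OpenTelemetry openTelemetry) {
            // Each call registers asynchronous instruments that the periodic
            // metric reader collects on its export interval.
            Classes.registerObservers(openTelemetry);
            Cpu.registerObservers(openTelemetry);
            GarbageCollector.registerObservers(openTelemetry);
            MemoryPools.registerObservers(openTelemetry);
            Threads.registerObservers(openTelemetry);
        }
    }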

Wish List

General clean-ups, TODOs and things I wish to implement for this project:

  • DONE Scaffold the project by copy/pasting from the agent project, but configure it with the logging exporter because I need to walk before I can run.
  • DONE Export OTLP to Telegraf
    • DONE Darn, the Telegraf OTLP receiver doesn't support the HTTP endpoint for OTLP data, only the gRPC endpoint. I'm going to explore the OpenTelemetry Collector instead.
  • DONE Configure the metrics export interval to 10 seconds instead of the default 60 seconds.
  • DONE Remove the auto-conf dependencies
  • DONE Do we need the semantic conventions dependency declaration? Isn't it already pulled in transitively?
  • DONE (done but there's only one lonesome log?) Get JUL-to-SLF4J working. It's nice to be able to debug the OpenTelemetry instrumentation and it's also nice to use SLF4J because we like it.
  • DONE (Upgraded to 2.x instrumentation) Are we using the legacy metric conventions? We want the 1.0 semantic conventions and I think you actually need to opt in to that.
  • Where is the Protobuf Java implementation shaded? Which of the OpenTelemetry dependencies brings it in?
  • DONE I need more predictable memory usage for the sake of the demo. Consider setting JVM min/max heap, and setting the Garbage collector, explicitly using 64-bit oops, etc.
  • SKIP (No I'll keep it, I just removed the calculations) Consider dropping the data processor stuff. It's a bit distracting. It's enough to report on memory usage because it varies with the natural work done by the OpenTelemetry Java machinery and other work happening in the JVM.
  • DONE Print metrics to the console using the logging exporter. This is convenient for demo purposes. Specifically, I really want to compare OpenTelemetry's toString representation to the Influx Line Protocol representation.

Reference