Telemetry: update enablement (experimental source instead of app context switch) and docs improvements #187

Open
wants to merge 3 commits into base: main
75 changes: 44 additions & 31 deletions docs/observability.md
@@ -10,48 +10,61 @@ OpenAI .NET instrumentation follows [OpenTelemetry Semantic Conventions for Gene

### How to enable

The instrumentation is **experimental** - volume and semantics of the telemetry items may change.
> [!NOTE]
> The instrumentation is **experimental**: the semantics of telemetry items
> (such as metric or attribute names, types, value ranges, or other properties) may change.

To enable the instrumentation:
The following code snippet shows how to enable OpenAI traces and metrics:

1. Set instrumentation feature-flag using one of the following options:
```csharp
builder.Services.AddOpenTelemetry()
.WithTracing(b =>
{
b.AddSource("Experimental.OpenAI.*", "OpenAI.*")
...
.AddOtlpExporter();
})
.WithMetrics(b =>
{
b.AddMeter("Experimental.OpenAI.*", "OpenAI.*")
...
.AddOtlpExporter();
});
```

- set the `OPENAI_EXPERIMENTAL_ENABLE_OPEN_TELEMETRY` environment variable to `"true"`
- set the `OpenAI.Experimental.EnableOpenTelemetry` context switch to true in your application code when application
is starting and before initializing any OpenAI clients. For example:
Distributed tracing is enabled with `AddSource("Experimental.OpenAI.*", "OpenAI.*")`, which tells OpenTelemetry to listen to all [ActivitySources](https://learn.microsoft.com/dotnet/api/system.diagnostics.activitysource) whose names start with `Experimental.OpenAI.` (experimental ones) or `OpenAI.` (stable ones).

```csharp
AppContext.SetSwitch("OpenAI.Experimental.EnableOpenTelemetry", true);
```
Similarly, metrics are configured with `AddMeter("Experimental.OpenAI.*", "OpenAI.*")`, which enables all OpenAI-related [Meters](https://learn.microsoft.com/dotnet/api/system.diagnostics.metrics.meter).

2. Enable OpenAI telemetry:
Once experimental telemetry stabilizes, experimental sources and meters will be renamed (for example, `Experimental.OpenAI.ChatClient` will become `OpenAI.ChatClient`), so it's
recommended to enable both to avoid changing the source code later.

```csharp
builder.Services.AddOpenTelemetry()
.WithTracing(b =>
{
b.AddSource("OpenAI.*")
...
.AddOtlpExporter();
})
.WithMetrics(b =>
{
b.AddMeter("OpenAI.*")
...
.AddOtlpExporter();
});
```
Consider enabling [HTTP client instrumentation](https://www.nuget.org/packages/OpenTelemetry.Instrumentation.Http) to see all HTTP client
calls made by your application, including those made by the OpenAI SDK; a minimal configuration sketch follows below.
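
Assuming the `OpenTelemetry.Instrumentation.Http` package is referenced alongside `OpenTelemetry.Extensions.Hosting`, the sketch might look like this (mirroring the snippet above rather than prescribing the only setup):

```csharp
// Sketch: register OpenAI sources/meters together with HTTP client instrumentation.
builder.Services.AddOpenTelemetry()
    .WithTracing(b =>
    {
        b.AddSource("Experimental.OpenAI.*", "OpenAI.*")
            .AddHttpClientInstrumentation() // spans for outgoing HTTP calls, including OpenAI requests
            .AddOtlpExporter();
    })
    .WithMetrics(b =>
    {
        b.AddMeter("Experimental.OpenAI.*", "OpenAI.*")
            .AddHttpClientInstrumentation() // http.client.* request metrics
            .AddOtlpExporter();
    });
```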

Distributed tracing is enabled with `AddSource("OpenAI.*")` which tells OpenTelemetry to listen to all [ActivitySources](https://learn.microsoft.com/dotnet/api/system.diagnostics.activitysource) with names starting with `OpenAI.*`.
Check out [full example](../examples/OpenTelemetryExamples.cs) and [OpenTelemetry documentation](https://opentelemetry.io/docs/languages/net/getting-started/) for more details.

Similarly, metrics are configured with `AddMeter("OpenAI.*")` which enables all OpenAI-related [Meters](https://learn.microsoft.com/dotnet/api/system.diagnostics.metrics.meter).
### How to view telemetry

Consider enabling [HTTP client instrumentation](https://www.nuget.org/packages/OpenTelemetry.Instrumentation.Http) to see all HTTP client
calls made by your application including those done by the OpenAI SDK.
Check out [OpenTelemetry documentation](https://opentelemetry.io/docs/languages/net/getting-started/) for more details.
You can view traces and metrics in any [telemetry system](https://opentelemetry.io/ecosystem/vendors/) compatible with OpenTelemetry.

You may use [Aspire dashboard](https://learn.microsoft.com/dotnet/aspire/fundamentals/dashboard/standalone) to test things out locally.
You can run it as a docker container with the following command:

```bash
docker run --rm -it \
-p 18888:18888 \
-p 4317:18889 -d \
--name aspire-dashboard \
mcr.microsoft.com/dotnet/aspire-dashboard:latest
```

Here's a trace produced by [OpenTelemetry sample](../examples/OpenTelemetryExamples.cs):

![Trace produced by the OpenTelemetry sample](./images/openai-tracing-with-opentelemetry.png)

### Available sources and meters

The following sources and meters are available:

- `OpenAI.ChatClient` - records traces and metrics for `ChatClient` operations (except streaming and protocol methods which are not instrumented yet)
- `Experimental.OpenAI.ChatClient` - records traces and metrics for `ChatClient` operations (except streaming and protocol methods which are not instrumented yet)
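
To quickly confirm that these sources emit spans without wiring up the full OpenTelemetry SDK, a plain `ActivityListener` can subscribe to them directly. This is a debugging sketch only, not a substitute for a real exporter:

```csharp
using System;
using System.Diagnostics;

// Debugging sketch: print OpenAI chat spans to the console.
// Register the listener before creating and using the ChatClient.
using ActivityListener listener = new()
{
    ShouldListenTo = source => source.Name == "Experimental.OpenAI.ChatClient"
                            || source.Name == "OpenAI.ChatClient",
    Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllDataAndRecorded,
    ActivityStopped = activity =>
        Console.WriteLine($"{activity.DisplayName}: {activity.Duration.TotalMilliseconds:F0} ms")
};
ActivitySource.AddActivityListener(listener);
```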
3 changes: 3 additions & 0 deletions examples/OpenAI.Examples.csproj
@@ -19,5 +19,8 @@
<PackageReference Include="NUnit3TestAdapter" Version="4.4.2" />
<PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.0.0" />
<PackageReference Include="Moq" Version="[4.18.2]" />
<PackageReference Include="OpenTelemetry.Exporter.Console" Version="1.9.0" />
<PackageReference Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.9.0" />
<PackageReference Include="OpenTelemetry.Instrumentation.Http" Version="1.9.0" />
</ItemGroup>
</Project>
49 changes: 49 additions & 0 deletions examples/OpenTelemetryExamples.cs
@@ -0,0 +1,49 @@
using NUnit.Framework;
using OpenAI.Chat;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;
using System;
using System.Threading.Tasks;

namespace OpenAI.Examples;

public partial class ChatExamples
{
[Test]
public async Task OpenTelemetryExamples()
{
// Let's configure OpenTelemetry to collect OpenAI and HTTP client traces and metrics
// and export them to console and also to the local OTLP endpoint.
//
// If you have some local OTLP listener (e.g. Aspire dashboard) running,
// you can explore traces and metrics produced by the test there.
//
// Check out https://opentelemetry.io/docs/languages/net/getting-started/ for more details and
// examples on how to set up OpenTelemetry with ASP.NET Core.

ResourceBuilder resourceBuilder = ResourceBuilder.CreateDefault().AddService("test");
using TracerProvider tracerProvider = OpenTelemetry.Sdk.CreateTracerProviderBuilder()
.SetResourceBuilder(resourceBuilder)
.AddSource("Experimental.OpenAI.*", "OpenAI.*")
.AddHttpClientInstrumentation()
.AddConsoleExporter()
.AddOtlpExporter()
.Build();

using MeterProvider meterProvider = OpenTelemetry.Sdk.CreateMeterProviderBuilder()
.SetResourceBuilder(resourceBuilder)
.AddView("gen_ai.client.operation.duration", new ExplicitBucketHistogramConfiguration { Boundaries = [0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24, 20.48, 40.96, 81.92] })

Reviewer comment: https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry/CHANGELOG.md#1100-beta1 added support for Hints, so Views may be replaced with the hint API itself (assuming this can take a dependency on the preview DiagnosticSource/OTel NuGets).
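
For context, a rough sketch of what the hint-based alternative might look like once System.Diagnostics.DiagnosticSource 9.0 is available; the `InstrumentAdvice` usage below is assumed from the preview API and is not part of this PR:

```csharp
using System.Diagnostics.Metrics;

// Assumed sketch (System.Diagnostics.DiagnosticSource 9.0+): the instrumentation itself
// suggests histogram bucket boundaries, so consumers no longer need an AddView call.
Meter meter = new("Experimental.OpenAI.ChatClient");
Histogram<double> duration = meter.CreateHistogram<double>(
    "gen_ai.client.operation.duration",
    unit: "s",
    description: "Measures GenAI operation duration.",
    tags: null,
    advice: new InstrumentAdvice<double>
    {
        HistogramBucketBoundaries = [0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24, 20.48, 40.96, 81.92]
    });
```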

.AddMeter("Experimental.OpenAI.*", "OpenAI.*")
.AddHttpClientInstrumentation()
.AddConsoleExporter()
.AddOtlpExporter()
.Build();

ChatClient client = new("gpt-4o-mini", Environment.GetEnvironmentVariable("OPENAI_API_KEY"));

ChatCompletion completion = await client.CompleteChatAsync("Say 'this is a test.'");

Console.WriteLine($"{completion}");
}
}
33 changes: 0 additions & 33 deletions src/Utility/AppContextSwitchHelper.cs

This file was deleted.

4 changes: 2 additions & 2 deletions src/Utility/Telemetry/OpenTelemetryScope.cs
@@ -10,8 +10,8 @@ namespace OpenAI.Telemetry;

internal class OpenTelemetryScope : IDisposable
{
private static readonly ActivitySource s_chatSource = new ActivitySource("OpenAI.ChatClient");
private static readonly Meter s_chatMeter = new Meter("OpenAI.ChatClient");
private static readonly ActivitySource s_chatSource = new ActivitySource("Experimental.OpenAI.ChatClient");
private static readonly Meter s_chatMeter = new Meter("Experimental.OpenAI.ChatClient");

// TODO: add explicit histogram buckets once System.Diagnostics.DiagnosticSource 9.0 is used
private static readonly Histogram<double> s_duration = s_chatMeter.CreateHistogram<double>(GenAiClientOperationDurationMetricName, "s", "Measures GenAI operation duration.");
8 changes: 1 addition & 7 deletions src/Utility/Telemetry/OpenTelemetrySource.cs
@@ -6,9 +6,6 @@ namespace OpenAI.Telemetry;
internal class OpenTelemetrySource
{
private const string ChatOperationName = "chat";
private readonly bool IsOTelEnabled = AppContextSwitchHelper
.GetConfigValue("OpenAI.Experimental.EnableOpenTelemetry", "OPENAI_EXPERIMENTAL_ENABLE_OPEN_TELEMETRY");

private readonly string _serverAddress;
private readonly int _serverPort;
private readonly string _model;
@@ -22,9 +19,6 @@ public OpenTelemetrySource(string model, Uri endpoint)

public OpenTelemetryScope StartChatScope(ChatCompletionOptions completionsOptions)
{
return IsOTelEnabled
? OpenTelemetryScope.StartChat(_model, ChatOperationName, _serverAddress, _serverPort, completionsOptions)
: null;
return OpenTelemetryScope.StartChat(_model, ChatOperationName, _serverAddress, _serverPort, completionsOptions);
}

}
5 changes: 2 additions & 3 deletions tests/Chat/ChatTests.cs
@@ -754,9 +754,8 @@ void HandleUpdate(StreamingChatCompletionUpdate update)
[NonParallelizable]
public async Task HelloWorldChatWithTracingAndMetrics()
{
using var _ = TestAppContextSwitchHelper.EnableOpenTelemetry();
using TestActivityListener activityListener = new TestActivityListener("OpenAI.ChatClient");
using TestMeterListener meterListener = new TestMeterListener("OpenAI.ChatClient");
using TestActivityListener activityListener = new TestActivityListener("Experimental.OpenAI.ChatClient");
using TestMeterListener meterListener = new TestMeterListener("Experimental.OpenAI.ChatClient");

ChatClient client = GetTestClient<ChatClient>(TestScenario.Chat);
IEnumerable<ChatMessage> messages = [new UserChatMessage("Hello, world!")];
1 change: 0 additions & 1 deletion tests/OpenAI.Tests.csproj
@@ -24,6 +24,5 @@

<ItemGroup>
<Compile Include="..\src\Utility\Telemetry\*.cs" LinkBase="Telemetry\Shared" />
<Compile Include="..\src\Utility\AppContextSwitchHelper.cs" LinkBase="Telemetry\Shared" />
</ItemGroup>
</Project>
34 changes: 7 additions & 27 deletions tests/Telemetry/ChatTelemetryTests.cs
@@ -41,24 +41,12 @@ public void AllTelemetryOff()
Assert.IsNull(Activity.Current);
}

[Test]
public void SwitchOffAllTelemetryOn()
{
using var activityListener = new TestActivityListener("OpenAI.ChatClient");
using var meterListener = new TestMeterListener("OpenAI.ChatClient");
var telemetry = new OpenTelemetrySource(RequestModel, new Uri(Endpoint));
Assert.IsNull(telemetry.StartChatScope(new ChatCompletionOptions()));
Assert.IsNull(Activity.Current);
}

[Test]
public void MetricsOnTracingOff()
{
using var _ = TestAppContextSwitchHelper.EnableOpenTelemetry();

var telemetry = new OpenTelemetrySource(RequestModel, new Uri(Endpoint));

using var meterListener = new TestMeterListener("OpenAI.ChatClient");
using var meterListener = new TestMeterListener("Experimental.OpenAI.ChatClient");

var elapsedMax = Stopwatch.StartNew();
using var scope = telemetry.StartChatScope(new ChatCompletionOptions());
@@ -83,10 +71,8 @@ public void MetricsOnTracingOff()
[Test]
public void MetricsOnTracingOffException()
{
using var _ = TestAppContextSwitchHelper.EnableOpenTelemetry();

var telemetry = new OpenTelemetrySource(RequestModel, new Uri(Endpoint));
using var meterListener = new TestMeterListener("OpenAI.ChatClient");
using var meterListener = new TestMeterListener("Experimental.OpenAI.ChatClient");

using (var scope = telemetry.StartChatScope(new ChatCompletionOptions()))
{
@@ -100,10 +86,8 @@ public void MetricsOnTracingOffException()
[Test]
public void TracingOnMetricsOff()
{
using var _ = TestAppContextSwitchHelper.EnableOpenTelemetry();

var telemetry = new OpenTelemetrySource(RequestModel, new Uri(Endpoint));
using var listener = new TestActivityListener("OpenAI.ChatClient");
using var listener = new TestActivityListener("Experimental.OpenAI.ChatClient");

var chatCompletion = CreateChatCompletion();

@@ -129,9 +113,8 @@ public void TracingOnMetricsOff()
[Test]
public void ChatTracingAllAttributes()
{
using var _ = TestAppContextSwitchHelper.EnableOpenTelemetry();
var telemetry = new OpenTelemetrySource(RequestModel, new Uri(Endpoint));
using var listener = new TestActivityListener("OpenAI.ChatClient");
using var listener = new TestActivityListener("Experimental.OpenAI.ChatClient");
var options = new ChatCompletionOptions()
{
Temperature = 0.42f,
@@ -157,10 +140,8 @@ public void ChatTracingAllAttributes()
[Test]
public void ChatTracingException()
{
using var _ = TestAppContextSwitchHelper.EnableOpenTelemetry();

var telemetry = new OpenTelemetrySource(RequestModel, new Uri(Endpoint));
using var listener = new TestActivityListener("OpenAI.ChatClient");
using var listener = new TestActivityListener("Experimental.OpenAI.ChatClient");

var error = new SocketException(42, "test error");
using (var scope = telemetry.StartChatScope(new ChatCompletionOptions()))
@@ -176,11 +157,10 @@ public void ChatTracingException()
[Test]
public async Task ChatTracingAndMetricsMultiple()
{
using var _ = TestAppContextSwitchHelper.EnableOpenTelemetry();
var source = new OpenTelemetrySource(RequestModel, new Uri(Endpoint));

using var activityListener = new TestActivityListener("OpenAI.ChatClient");
using var meterListener = new TestMeterListener("OpenAI.ChatClient");
using var activityListener = new TestActivityListener("Experimental.OpenAI.ChatClient");

Reviewer: I wish everyone used OTel's InMemoryExporter rather than writing their own listeners, but it is up to the owners of this project!

Contributor (author): Can you explain why?

Reviewer: Listeners are not extensively documented, given they are generally only used by a small set of people, such as those authoring the OpenTelemetry SDK. OTOH, InMemoryExporter is documented and is intended to be used by end users to validate their own instrumentation. Changing from InMemoryExporter to anything else like Console/OTLP is often a one-line change. End users who are familiar with the OTLP exporter find it easy to use InMemoryExporter with OTel for testing, as opposed to learning ActivityListener/MeterListener.

Just a suggestion only.
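
For illustration, a rough sketch of how such a test could look with the InMemoryExporter. It assumes the `OpenTelemetry.Exporter.InMemory` package is referenced and that the method sits inside the existing `ChatTelemetryTests` class (so `RequestModel` and `Endpoint` are available); it is not part of this PR:

```csharp
// Sketch only. Requires: using OpenTelemetry; using OpenTelemetry.Trace;
// plus the OpenTelemetry.Exporter.InMemory package.
[Test]
public void ChatTracingWithInMemoryExporter()
{
    List<Activity> exportedActivities = new();

    using TracerProvider tracerProvider = Sdk.CreateTracerProviderBuilder()
        .AddSource("Experimental.OpenAI.ChatClient")
        .AddInMemoryExporter(exportedActivities)
        .Build();

    var telemetry = new OpenTelemetrySource(RequestModel, new Uri(Endpoint));
    using (telemetry.StartChatScope(new ChatCompletionOptions()))
    {
        // exercise the scope here as the existing listener-based tests do
    }

    tracerProvider.ForceFlush();

    // One span is expected for the chat scope once the provider is flushed.
    Assert.AreEqual(1, exportedActivities.Count);
}
```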

using var meterListener = new TestMeterListener("Experimental.OpenAI.ChatClient");

var options = new ChatCompletionOptions();

25 changes: 0 additions & 25 deletions tests/Telemetry/TestAppContextSwitchHelper.cs

This file was deleted.

13 changes: 7 additions & 6 deletions tests/Telemetry/TestMeterListener.cs
@@ -4,14 +4,15 @@
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Linq;

namespace OpenAI.Tests.Telemetry;

internal class TestMeterListener : IDisposable
{
public record TestMeasurement(object value, Dictionary<string, object> tags);

private readonly ConcurrentDictionary<string, List<TestMeasurement>> _measurements = new();
private readonly ConcurrentDictionary<string, ConcurrentQueue<TestMeasurement>> _measurements = new();
private readonly ConcurrentDictionary<string, Instrument> _instruments = new();
private readonly MeterListener _listener;
public TestMeterListener(string meterName)
@@ -31,8 +32,8 @@ public TestMeterListener(string meterName)

public List<TestMeasurement> GetMeasurements(string instrumentName)
{
_measurements.TryGetValue(instrumentName, out var list);
return list;
_measurements.TryGetValue(instrumentName, out var queue);
return queue?.ToList();
}

public Instrument GetInstrument(string instrumentName)
@@ -46,11 +47,11 @@ private void OnMeasurementRecorded<T>(Instrument instrument, T measurement, Read
_instruments.TryAdd(instrument.Name, instrument);

var testMeasurement = new TestMeasurement(measurement, new Dictionary<string, object>(tags.ToArray()));
_measurements.AddOrUpdate(instrument.Name,
k => new() { testMeasurement },
_measurements.AddOrUpdate(instrument.Name,
k => new ConcurrentQueue<TestMeasurement>([ testMeasurement ]),
(k, l) =>
{
l.Add(testMeasurement);
l.Enqueue(testMeasurement);
return l;
});
}