Exporters shutdown takes longer then a minute when failing to send metrics/traces

**Environment:** Mac OS, RHEL 8 (doesn't matter)

Configure traces and/or metrics GRPC exporter with invalid collector URL 

```
  # init tracing
  trace_provider = TracerProvider(resource=resource)
  trace_exporter = create_trace_exporter(exporter, otlp_config, jaeger_config)
  trace_processor = BatchSpanProcessor(trace_exporter)
  trace_provider.add_span_processor(trace_processor)
  telemetry_trace.set_tracer_provider(trace_provider)

  # init metrics
  global _telemetry_meter  # pylint: disable=global-statement
  metrics_exporter = create_metrics_exporter(exporter, otlp_config)
  metric_reader = PeriodicExportingMetricReader(
      metrics_exporter, export_interval_millis=export_interval, export_timeout_millis=3000
  )
  metrics_provider = MeterProvider(resource=resource, metric_readers=[metric_reader])
  telemetry_metrics.set_meter_provider(metrics_provider)
  _telemetry_meter = telemetry_metrics.get_meter_provider().get_meter(service_name, str(service_version))
```

**What is the expected behavior?**
When executing:

```
meter_provider = telemetry_metrics.get_meter_provider()
if meter_provider and (shutdown_meter := getattr(meter_provider, 'shutdown', None)):
    shutdown_meter(timeout_millis=3000)
```

it should shutdown in about 3 seconds

**What is the actual behavior?**
It takes ~60 seconds to shutdown with logs like 
```
WARNING - opentelemetry.exporter.otlp.proto.grpc.exporter::_export:363 | Transient error StatusCode.UNAVAILABLE encountered while exporting metrics, retrying in 8s.
```

**Additional context**
From what it looks like both metrics and traces exporters use [this base/mixin](https://github.com/open-telemetry/opentelemetry-python/blob/main/exporter/opentelemetry-exporter-otlp-proto-grpc/src/opentelemetry/exporter/otlp/proto/grpc/exporter.py#L369-L376) which has a `timeout_millis`

Now going level up where `OTLPMetricExporter.shudown` calls it correctly
```
    def shutdown(self, timeout_millis: float = 30_000, **kwargs) -> None:
        OTLPExporterMixin.shutdown(self, timeout_millis=timeout_millis)
```

However, `PeriodicExportingMetricReader.shutdown` calls `OTLPMetricExporter.shutdown` with a different kwarg name which seems to be completely ignored
```
    def shutdown(self, timeout_millis: float = 30_000, **kwargs) -> None:
        deadline_ns = time_ns() + timeout_millis * 10**6

        def _shutdown():
            self._shutdown = True

        did_set = self._shutdown_once.do_once(_shutdown)
        if not did_set:
            _logger.warning("Can't shutdown multiple times")
            return

        self._shutdown_event.set()
        if self._daemon_thread:
            self._daemon_thread.join(
                timeout=(deadline_ns - time_ns()) / 10**9
            )
        self._exporter.shutdown(timeout=(deadline_ns - time_ns()) / 10**6)  # <--- timeout vs timeout_millis
```
+ given the use of `time_ns()` if correct kwarg name it would lead to the error of negative timeout value being supplied


As for traces, [exporter](https://github.com/open-telemetry/opentelemetry-python/blob/main/exporter/opentelemetry-exporter-otlp-proto-grpc/src/opentelemetry/exporter/otlp/proto/grpc/trace_exporter/__init__.py#L143-L144) calls ` OTLPExporterMixin.shutdown` without propagating any timeouts at all.


This leads to some bad behaviour when combined with k8s and async application since timeoutless thread lock blocks event loop and also leads to hanging containers in k8s cluster until they are killed.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Exporters shutdown takes longer then a minute when failing to send metrics/traces #3309

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Exporters shutdown takes longer then a minute when failing to send metrics/traces #3309

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions