Webstream relay crashes with decodebin typefind failure on transient upstream stalls

## What happened

After deploying v1.39.17 (2026-05-10 ~07:19 UTC), the webstream relay for the **Grim Leftover's** mount (`f58c7e4a-151a-4d69-8587-32a1c73f1210`, station `0e4edda8-fb53-44b5-83db-5792c512789d`) crashed once with:

```
WRN webstream pipeline crashed, attempting reconnection mount=f58c7e4a-… webstream="Grim Leftover's"
WRN gstreamer pipeline exited with error error="exit status 1" mount=f58c7e4a-…
    stderr="ERROR: from element /GstPipeline:pipeline0/GstDecodeBin:decodebin0/GstTypeFindElement:typefind:
            Could not determine type of stream."
```

The auto-reconnect succeeded on attempt 1 (≈8 s gap), so listener impact was a brief audio dropout, not an outage. Frequency over the last 2 h: 1 occurrence.

## Pipeline involved

```
souphttpsrc location="https://rlmradio.xyz/live/grim-leftovers" is-live=true do-timestamp=true iradio-mode=true
  ! queue max-size-time=5000000000
  ! watchdog timeout=15000
  ! decodebin
  ! audioconvert ! audioresample ! …
```

The upstream URL is itself a Grimnir-served mount on the same host, so Grimnir is relaying its own output. The failure chain looks like this: a momentary stall on the source mount leaves `souphttpsrc` with too few bytes (or a non-audio response, such as an HTML page with the wrong content type) for `typefind` to classify; `decodebin` gives up; the whole pipeline exits; and the relay has to tear down and reconnect.
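To tell these cases apart in the logs, the relay could record what the source mount actually returned before the pipeline starts. A minimal sketch in Go, assuming the relay has the HTTP status, the `Content-Type` header and a count of bytes read during a short probe; `probeVerdict` and `minProbeBytes` are hypothetical names (not existing Grimnir code) and the 4096-byte threshold is an arbitrary placeholder:

```go
package main

import (
	"mime"
	"net/http"
	"strings"
)

// minProbeBytes is an assumed lower bound on the bytes we want to see before
// handing the stream to a decoder. It is a placeholder, not a measured value.
const minProbeBytes = 4096

// probeVerdict returns a short reason why an upstream response looks
// unsuitable for decoding, or "" if nothing suspicious was found.
func probeVerdict(status int, contentType string, bytesRead int) string {
	if status != http.StatusOK {
		return "unexpected HTTP status"
	}
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "missing or malformed Content-Type"
	}
	// Accept audio/* plus application/ogg, which Ogg mounts commonly report.
	if !strings.HasPrefix(mediaType, "audio/") && mediaType != "application/ogg" {
		return "non-audio Content-Type: " + mediaType
	}
	if bytesRead < minProbeBytes {
		return "too few bytes before stall"
	}
	return ""
}
```

Logging the verdict next to the reconnect attempt would show whether a given typefind miss came from a short read or from a non-audio response.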

## Suggested mitigation

Two complementary options, in order of cost:

1. **Treat early `typefind` failures as a soft retry** in the webstream relay's reconnect logic — they are recoverable and normal for live HTTP sources, and shouldn't surface as `WRN webstream pipeline crashed`. Today the reconnect already kicks in, but every typefind miss costs an audible gap and a noisy WARN.
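2. **Set `retries=N timeout=…` on the existing `souphttpsrc` and/or add buffering (e.g. a `multiqueue`) ahead of `decodebin`** so brief upstream stalls are absorbed before `typefind` sees them. `hlsdemux` would only help if the upstream were HLS, which this mount does not appear to be. Note that `souphttpsrc`'s `timeout` is in seconds while `watchdog`'s `timeout` is in milliseconds, so any new value should be compared against the `watchdog timeout=15000` (15 s) already in place.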

## Severity

P3. Self-recovering, low frequency, but worth fixing because:
- Each event = a real listener-audible gap.
- The same pattern likely affects other relayed-internal mounts on stall.
- Now that the per-track signal-interrupt log noise is gone (v1.39.17), this WARN is one of the few real signals left in the playout log, so addressing it keeps the signal-to-noise ratio high.

## Repro / further data

- Container start: 2026-05-10 07:18 UTC
- Crash log: 07:19:13 UTC
- Reconnect succeeded: 07:19:21 UTC
- Source mount: `https://rlmradio.xyz/live/grim-leftovers`
- Webstream record: `e4fd2190-f62e-444c-9fc4-f0654e03c699`

While reading the log around this incident I also noticed a **separate** SQLSTATE 22P02 from `internal/webstream/icy_metadata.go:141` on the same mount — same class as the v1.39.16 fix (empty `media_id` on a webstream PlayHistory `Save`). Worth filing as its own issue if not already known; happy to do so.
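
For reference, SQLSTATE 22P02 is PostgreSQL's `invalid_text_representation`, which is what an empty string bound to a UUID-typed column produces. If the cause here matches the v1.39.16 one, a guard along these lines might apply. This is a sketch only: `nullableID` is a hypothetical helper, and it assumes `media_id` is a nullable column, which I have not verified.

```go
package main

import "strings"

// nullableID returns nil for an empty or whitespace-only ID so the database
// layer can store NULL instead of "", which a UUID column rejects with 22P02.
// Hypothetical helper; assumes the target column is nullable.
func nullableID(id string) *string {
	trimmed := strings.TrimSpace(id)
	if trimmed == "" {
		return nil
	}
	return &trimmed
}
```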
