Deterministic Hash for RTP Header Field SSRC #119

@Abraxas3d

Specify and implement deterministic SSRC derivation from station ID

Summary

The Opulent Voice Protocol specifies that the RTP SSRC field is derived from a hash of the station identifier. The current implementation uses Python's built-in hash() function, which is randomized per-process by default (via PYTHONHASHSEED). This means a station's SSRC changes every time its transmitter process restarts, breaking the "stable per-station identity" property that the protocol design implies and that several useful features depend on.

This issue proposes specifying a deterministic hash algorithm in the protocol and updating the reference implementation to match.

Background

Per the Opulent Voice Protocol Specification v1.1, 3.1:

"RTP provides sequence numbering, timestamps at 48 kHz sample rate, and the SSRC field takes a hash of the station identification value."

The current Interlocutor implementation (radio_protocol.py:793-795):

ssrc = hash(str(station_identifier)) % (2**32)
if ssrc == 0:
    ssrc = 1

Python's hash() is randomized per-process by default (Python 3.3+, controlled by PYTHONHASHSEED). Same input, different process, different output.

$ python3 -c 'print(hash("KB5MU") % 2**32)'
2823479876
$ python3 -c 'print(hash("KB5MU") % 2**32)'
1495273014

Two consecutive runs on the same machine produce different SSRCs for the same callsign. Across machines, languages, or implementations the situation is worse: there is no guarantee of any particular relationship between the values. We want that relationship. The same station ID should yield the same SSRC everywhere.
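The per-process randomization can be reproduced in a controlled way by pinning PYTHONHASHSEED to different values in child interpreters. The sketch below is illustrative only: the helper name run_in_fresh_interpreter is hypothetical, and it hashes the ASCII callsign rather than the Base-40 bytes, purely to contrast hash() with zlib.crc32().

```python
import os
import subprocess
import sys
import zlib


def run_in_fresh_interpreter(expression: str, hash_seed: str) -> int:
    """Evaluate `expression` in a new Python process with PYTHONHASHSEED set."""
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", f"import zlib; print({expression})"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


BUILTIN_HASH = 'hash("KB5MU") % 2**32'   # what the current code does
CRC32_HASH = 'zlib.crc32(b"KB5MU")'      # deterministic alternative

seeds = ("1", "2", "3")  # each seed stands in for one process restart
builtin_values = {s: run_in_fresh_interpreter(BUILTIN_HASH, s) for s in seeds}
crc_values = {s: run_in_fresh_interpreter(CRC32_HASH, s) for s in seeds}

print("hash():     ", builtin_values)  # values depend on the seed
print("zlib.crc32:", crc_values)       # identical for every seed
```

The CRC-32 column does not depend on the seed, the process, or the machine, which is the property the SSRC derivation needs.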

What breaks because of this

Within a single transmitter process the SSRC is stable, so a single live receiver session works correctly. The problem appears when stability across sessions or across implementations is expected or needed, as in the cases below.

Per-station statistics aggregation. A receiver tracking long-term metrics by SSRC (jitter history, packet loss rate, total packets received from this station) would expect "all packets from KB5MU" to share an SSRC. With the current implementation, every Dialogus restart begins a new statistical bucket. After a week of typical usage, KB5MU's statistics are fragmented across dozens of distinct SSRCs.
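The fragmentation can be shown with a small simulation. This is a sketch under stated assumptions: randomized_ssrc is a hypothetical stand-in for the per-process hash() behavior (a seeded random generator, so the run is repeatable), and the ASCII callsign is used in place of the 6-byte Base-40 form.

```python
import random
import zlib
from collections import Counter

CALLSIGN = b"KB5MU"          # illustration only, not the Base-40 bytes
SESSIONS = 5                 # number of transmitter restarts simulated
PACKETS_PER_SESSION = 100


def randomized_ssrc(rng: random.Random) -> int:
    """Stand-in for hash(): a fresh 32-bit value on every process start."""
    return rng.getrandbits(32) or 1


def deterministic_ssrc(station_id: bytes) -> int:
    """CRC-32 of the station ID, with zero mapped to 1 as proposed."""
    return zlib.crc32(station_id) or 1


rng = random.Random(0)       # seeded so the simulation is repeatable
fragmented = Counter()       # receiver buckets with per-process SSRCs
stable = Counter()           # receiver buckets with CRC-32 SSRCs

for _ in range(SESSIONS):
    session_ssrc = randomized_ssrc(rng)   # changes at each restart
    for _ in range(PACKETS_PER_SESSION):
        fragmented[session_ssrc] += 1
        stable[deterministic_ssrc(CALLSIGN)] += 1

print("buckets with randomized SSRC:   ", len(fragmented))
print("buckets with deterministic SSRC:", len(stable))
```

With the deterministic derivation every packet lands in one bucket, so long-term jitter and loss figures accumulate in a single place.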

Cross-implementation interoperability. A future Dialogus written in Rust, or a hardware implementation in HDL, cannot produce the same SSRC for "KB5MU" that Interlocutor produces. Each implementation derives its own per-process random value. Receivers cannot pre-compute "what SSRC should KB5MU's traffic have" to do offline lookups, identification, or deny-listing.
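Once the derivation is deterministic, a receiver could build its lookup and deny-list tables offline. The sketch below assumes the CRC-32 derivation proposed in this issue and reuses two Base-40 values from the test-case table further down, read in the byte order they are written; the names derive_ssrc, ssrc_index, and is_denied are hypothetical.

```python
import zlib
from collections import defaultdict


def derive_ssrc(station_id_bytes: bytes) -> int:
    """Proposed derivation: CRC-32 of the Base-40 bytes, zero mapped to 1."""
    return zlib.crc32(station_id_bytes) or 1


# Base-40 encodings taken from the test-case table in this issue.
known_stations = {
    "W1AW": bytes.fromhex("0000001680b7"),
    "KB5MU-11": bytes.fromhex("0447b6864a5b"),
}

# SSRC -> list of callsigns. A list is used because a 32-bit value
# cannot be guaranteed unique across all possible station IDs.
ssrc_index = defaultdict(list)
for callsign, id_bytes in known_stations.items():
    ssrc_index[derive_ssrc(id_bytes)].append(callsign)

# Deny-list computed ahead of time, before any traffic is seen.
deny_list = {derive_ssrc(known_stations["KB5MU-11"])}


def is_denied(packet_ssrc: int) -> bool:
    """Check an incoming packet's SSRC against the precomputed deny-list."""
    return packet_ssrc in deny_list
```

None of this is possible with a per-process hash, because the table would be stale as soon as the transmitter restarted.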

Authentication binding. If an authentication system ever wants to bind a cryptographic credential to an SSRC (for example, certifying "this SSRC is authorized to transmit"), the binding becomes impossible because the SSRC isn't stable. The authentication framework described in Section 7 of the spec implicitly assumes some stable per-station identity in the transmitted frames.

Diagnostic and audit value. A packet capture from a couple of years ago cannot be straightforwardly correlated with a packet capture from today by SSRC alone. Without the deterministic mapping, an investigator must depend on the OPV-layer station ID field, which defeats one of the points of having SSRC-from-callsign in the first place.

Spec/implementation gap. The protocol spec implies a deterministic relationship that the implementation does not deliver. Anyone reading the spec and trying to implement OPV in another language will get visibly different SSRCs from Interlocutor for the same callsign, which is at minimum confusing and at worst leads to interop bugs.

Proposed solution

Specify the hash algorithm at the protocol level and update Interlocutor to match. Simple!

What hash? How about CRC-32 (the IEEE 802.3 polynomial 0x04C11DB7, written 0xEDB88320 in its reversed, bit-reflected form; also known as the PKZIP/zlib CRC).

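CRC-32 is the natural choice for a protocol that must be implementable across Python, C, Rust, and HDL: it is fully specified, it ships in common libraries such as zlib, and it is cheap to build in hardware as a shift register with XOR taps. Hash all the bits. Presto Bingo.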
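For implementers without a library CRC (HDL, bare-metal C), the algorithm is small enough to write out. The following is a minimal bit-at-a-time sketch of the zlib-compatible CRC-32: reflected polynomial 0xEDB88320, initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF. The function name crc32_bitwise is hypothetical; the check string "123456789" and its result 0xCBF43926 are the standard check value for this CRC.

```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """Bit-at-a-time CRC-32 (IEEE 802.3, reflected), matching zlib.crc32."""
    crc = 0xFFFFFFFF                      # initial register value
    for byte in data:
        crc ^= byte                       # fold the next byte into the low bits
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320   # shift and apply polynomial
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF               # final XOR


# Standard check value for this CRC variant.
assert crc32_bitwise(b"123456789") == 0xCBF43926
# Agrees with the library implementation Interlocutor would use.
assert crc32_bitwise(b"123456789") == zlib.crc32(b"123456789")
```

A table-driven version gives the same results faster; the bitwise form maps most directly onto a hardware shift register.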

Proposed text for spec 3.1 (the specification needs to be updated too!):

The SSRC field is the CRC-32 (IEEE 802.3 polynomial) of the 6-byte
Base-40 encoded Station Identifier. If the resulting value is zero,
SSRC is set to 1. This produces a stable identifier that maps each
station to the same SSRC value across sessions and implementations.
Because a 48-bit identifier is reduced to 32 bits, distinct stations
may occasionally share an SSRC.

Implementation change in Interlocutor (radio_protocol.py):

import zlib

# In RTPAudioFrameBuilder.__init__:
station_id_bytes = station_identifier.to_bytes()  # the 6-byte Base-40 form
ssrc = zlib.crc32(station_id_bytes)
if ssrc == 0:
    ssrc = 1
self.rtp_header = RTPHeader(payload_type=payload_type, ssrc=ssrc)
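Pulled out of the class, the derivation can be sketched as a standalone function that is easy to unit test. This is an assumption-laden sketch, not Interlocutor's code: the name derive_ssrc and the length check are hypothetical, and the input is assumed to be the 6-byte Base-40 form in the byte order the spec defines.

```python
import zlib

STATION_ID_LENGTH = 6  # bytes in the Base-40 encoded Station Identifier


def derive_ssrc(station_id_bytes: bytes) -> int:
    """Return the SSRC for a 6-byte Base-40 station ID (CRC-32, zero -> 1)."""
    if len(station_id_bytes) != STATION_ID_LENGTH:
        raise ValueError(
            f"expected {STATION_ID_LENGTH} bytes, got {len(station_id_bytes)}"
        )
    # zlib.crc32 returns an unsigned 32-bit value in Python 3; the mask
    # documents the intended width and is otherwise a no-op.
    ssrc = zlib.crc32(station_id_bytes) & 0xFFFFFFFF
    return ssrc if ssrc != 0 else 1


# Usage: the Base-40 bytes for W1AW from the spec's test cases,
# read in the order they are written (an assumption about byte order).
w1aw_ssrc = derive_ssrc(bytes.fromhex("0000001680b7"))
print(f"W1AW SSRC: 0x{w1aw_ssrc:08x}")
```

Rejecting inputs of the wrong length guards against accidentally hashing the callsign string instead of its Base-40 encoding, which would silently produce a different SSRC.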

Test cases

The protocol spec already includes test cases for Base-40 encoding (Appendix C of the spec, 2.3). Adding SSRC test cases alongside them gives implementers a place to confirm that their implementation produces the expected values.

Identifier   Base-40 (hex)    Expected SSRC
W1AW         0x0000001680b7   (CRC-32 of these 6 bytes)
KB5MU-11     0x0447b6864a5b   (CRC-32 of these 6 bytes)
W5NYV.NCS    0x71c06f55a697   (CRC-32 of these 6 bytes)
(etc., matching the existing Base-40 test cases)

These can be generated mechanically and added to the spec. Any future implementation can validate against this table.
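A short script along these lines could fill in the Expected SSRC column. It is a sketch that assumes the hex values are hashed in the byte order they are written (most significant byte first); the spec would need to confirm that order before the output is published as normative test vectors. The name ssrc_from_hex is hypothetical.

```python
import zlib

# (identifier, Base-40 hex) pairs copied from the table above.
BASE40_VECTORS = [
    ("W1AW", "0000001680b7"),
    ("KB5MU-11", "0447b6864a5b"),
    ("W5NYV.NCS", "71c06f55a697"),
]


def ssrc_from_hex(base40_hex: str) -> int:
    """CRC-32 of the 6 Base-40 bytes, with zero mapped to 1."""
    station_id_bytes = bytes.fromhex(base40_hex)  # assumed byte order: as written
    return zlib.crc32(station_id_bytes) or 1


rows = [(ident, hx, ssrc_from_hex(hx)) for ident, hx in BASE40_VECTORS]

print(f"{'Identifier':<12} {'Base-40 (hex)':<16} Expected SSRC")
for ident, hx, ssrc in rows:
    print(f"{ident:<12} 0x{hx:<14} 0x{ssrc:08x}")
```

Running the same inputs through a second, independent CRC-32 implementation before publishing would catch byte-order or parameter mistakes.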

Phasing

Let's do phases like in Arcanum.

Phase 1: Specification update and Discussion

Update the OPV protocol spec 3.1 to specify CRC-32 of the Base-40 encoding. Add the test case table to Appendix C. This is purely documentation: it captures the design decision, prevents new implementations from picking different hashes, and gives people the chance to say whether it's a bad idea.

Phase 2: Reference implementation update

Update Interlocutor's radio_protocol.py to use zlib.crc32 of the Base-40 bytes instead of Python's hash() of the string representation. Add unit tests verifying the test vectors match the spec.

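Impact is expected to be minimal. No backwards compatibility problems that I can think of: the SSRC already changes on every transmitter restart, so no existing receiver can depend on the current values.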

Out of scope!

These are deliberately not addressed by this issue:

  • Cryptographic authentication using SSRC. The deterministic SSRC enables this in principle but the actual authentication design is a separate question. A deterministic SSRC provides no security by itself: CRC-32 is not a cryptographic hash, and anyone can compute the SSRC for any station ID.
  • Other RTP fields. The sequence number and timestamp generation are correct as-is; this issue is only about the SSRC.
  • The RTP-aware playback refactor (future possible enhancement). That refactor uses SSRC for "is this a different station?" detection, which works correctly regardless of how SSRC is computed. The two issues are independent. Fixing one does not block the other.

Acceptance criteria

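This issue is closed when it works: spec 3.1 names CRC-32 of the Base-40 bytes, the SSRC test vectors are in Appendix C, Interlocutor computes the SSRC with zlib.crc32, and its unit tests reproduce those vectors. Update the spec, make the change, and it better do what it was told.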
