Specify and implement deterministic SSRC derivation from station ID
Summary
The Opulent Voice Protocol specifies that the RTP SSRC field is derived from a hash of the station identifier. The current implementation uses Python's built-in hash() function, which is randomized per-process by default (via PYTHONHASHSEED). This means a station's SSRC changes every time its transmitter process restarts, breaking the "stable per-station identity" property that the protocol design implies and that several useful features depend on.
This issue proposes specifying a deterministic hash algorithm in the protocol and updating the reference implementation to match.
Background
Per the Opulent Voice Protocol Specification v1.1, 3.1:
"RTP provides sequence numbering, timestamps at 48 kHz sample rate, and the SSRC field takes a hash of the station identification value."
The current Interlocutor implementation (radio_protocol.py:793-795):
ssrc = hash(str(station_identifier)) % (2**32)
if ssrc == 0:
    ssrc = 1
Python's hash() of str and bytes values is salted with a per-process random seed by default (Python 3.3+, controlled by PYTHONHASHSEED). Same input, different process, different output.
$ python3 -c 'print(hash("KB5MU") % 2**32)'
2823479876
$ python3 -c 'print(hash("KB5MU") % 2**32)'
1495273014
Two consecutive runs on the same machine produce different SSRCs for the same callsign. Across machines, languages, or implementations the situation is worse: nothing guarantees any relationship between the SSRCs derived for the same station. We want such a relationship, and it should be deterministic.
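The per-process variation can be reproduced on demand by pinning PYTHONHASHSEED to different values. The following is a minimal sketch, not part of Interlocutor; the helper name `hash_in_fresh_process` is made up for illustration, and each seed value stands in for a separate process launch.

```python
import os
import subprocess
import sys


def hash_in_fresh_process(text: str, seed: str) -> int:
    """Return hash(text) % 2**32 as computed by a new interpreter started with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    code = f"print(hash({text!r}) % 2**32)"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    # Two different seeds: the same callsign yields two unrelated values.
    print(hash_in_fresh_process("KB5MU", "1"))
    print(hash_in_fresh_process("KB5MU", "2"))
    # Repeating a seed reproduces its value, so the variation comes from the seed.
    print(hash_in_fresh_process("KB5MU", "1"))
```

Pinning PYTHONHASHSEED in production would not solve the problem, since the result would still be specific to CPython and could not be reproduced in C, Rust, or HDL.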
What breaks because of this
Within a single transmitter process the SSRC is stable, so a single live receiver session works correctly. The problems appear when stability is expected across sessions or across implementations:
Per-station statistics aggregation. A receiver tracking long-term metrics by SSRC (jitter history, packet loss rate, total packets received from this station) would expect "all packets from KB5MU" to share an SSRC. With the current implementation, every Dialogus restart begins a new statistical bucket. After a week of typical usage, KB5MU's statistics are fragmented across dozens of distinct SSRCs.
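The fragmentation can be illustrated with a small simulation. This is a sketch under stated assumptions: a seeded random generator stands in for the salted hash() across restarts, the restart count and packets-per-session figures are invented, and the station ID bytes are the KB5MU-11 Base-40 value from the test-case table in this issue, taken most significant byte first.

```python
import random
import zlib
from collections import Counter

# KB5MU-11 in Base-40 form, copied from the test-case table (assumed big-endian).
STATION_ID = bytes.fromhex("0447b6864a5b")


def ssrc_random_per_restart(rng: random.Random) -> int:
    """Stand-in for today's behavior: a new, unrelated SSRC on every process start."""
    return rng.getrandbits(32) or 1


def ssrc_deterministic(station_id: bytes) -> int:
    """Proposed behavior: CRC-32 of the station ID bytes, with 0 remapped to 1."""
    return zlib.crc32(station_id) or 1


def packets_per_ssrc(ssrcs, packets_per_session: int = 1000) -> Counter:
    """Aggregate packet counts the way a receiver keyed on SSRC would."""
    stats = Counter()
    for ssrc in ssrcs:
        stats[ssrc] += packets_per_session
    return stats


rng = random.Random(2024)  # fixed seed so the simulation is repeatable
restarts = 30
fragmented = packets_per_ssrc(ssrc_random_per_restart(rng) for _ in range(restarts))
unified = packets_per_ssrc(ssrc_deterministic(STATION_ID) for _ in range(restarts))

print(len(fragmented), "statistical buckets with per-restart SSRCs")
print(len(unified), "statistical bucket with a deterministic SSRC")
```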
Cross-implementation interoperability. A future Dialogus written in Rust, or a hardware implementation in HDL, cannot produce the same SSRC for "KB5MU" that Interlocutor produces, because Python's salted hash cannot be reproduced outside the process that computed it. Each implementation ends up with its own unrelated value. Receivers cannot pre-compute "what SSRC should KB5MU's traffic have" to do offline lookups, identification, or deny-listing.
Authentication binding. If an authentication system ever wants to bind a cryptographic credential to an SSRC (for example, certifying "this SSRC is authorized to transmit"), the binding becomes impossible because the SSRC isn't stable. The authentication framework described in Section 7 of the spec implicitly assumes some stable per-station identity in the transmitted frames.
Diagnostic and audit value. A packet capture from a couple of years ago cannot be straightforwardly correlated with a packet capture from today by SSRC alone. Without the deterministic mapping, an investigator must depend on the OPV-layer station ID field, which defeats one of the points of having SSRC-from-callsign in the first place.
Spec/implementation gap. The protocol spec implies a deterministic relationship that the implementation does not deliver. Anyone reading the spec and trying to implement OPV in another language will get visibly different SSRCs from Interlocutor for the same callsign, which is at minimum confusing and at worst leads to interop bugs.
Proposed solution
Specify the hash algorithm at the protocol level and update Interlocutor to match. Simple!
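What hash? How about CRC-32 with the IEEE 802.3 polynomial (0x04C11DB7, or 0xEDB88320 in bit-reflected form), also known as the PKZIP/zlib CRC.
CRC-32 is a natural choice for a protocol that must be implementable across Python, C, Rust, and HDL: it is standardized, cheap to compute in both software and hardware, and widely available in existing libraries. The CRC is computed over all 48 bits of the encoded station identifier.
Proposed text for spec 3.1: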
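For implementers without a CRC library at hand (a C microcontroller build or an HDL testbench, for example), the algorithm is short enough to write out. The following is a bit-at-a-time sketch of the reflected CRC-32, checked against zlib.crc32 and the standard check value for the ASCII string "123456789". The function name `crc32_bitwise` is illustrative only.

```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """CRC-32 (IEEE 802.3) computed one bit at a time with the reflected polynomial."""
    crc = 0xFFFFFFFF  # initial register value
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF  # final XOR


# Standard CRC-32 check value, and agreement with zlib on the same input.
assert crc32_bitwise(b"123456789") == 0xCBF43926
assert crc32_bitwise(b"123456789") == zlib.crc32(b"123456789")
```

A table-driven version is faster in software, but the bitwise loop maps most directly onto a shift-register description in hardware.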
The SSRC field is the CRC-32 (IEEE 802.3 polynomial) of the 6-byte
Base-40 encoded Station Identifier. If the resulting value is zero,
SSRC is set to 1. This produces a stable identifier that maps each
station to the same SSRC value across sessions and implementations.
Distinct stations may share an SSRC, because 48-bit identifiers are
reduced to 32 bits.
Implementation change in Interlocutor (radio_protocol.py):
import zlib

# In RTPAudioFrameBuilder.__init__:
station_id_bytes = station_identifier.to_bytes()  # the 6-byte Base-40 form
ssrc = zlib.crc32(station_id_bytes)  # unsigned 32-bit value on Python 3
if ssrc == 0:
    ssrc = 1  # per the proposed spec text, a zero CRC becomes 1
self.rtp_header = RTPHeader(payload_type=payload_type, ssrc=ssrc)
Test cases
The protocol spec already includes test cases for Base-40 encoding (Appendix C of the spec, 2.3). SSRC test cases belong alongside them, so implementers have a place to confirm that their SSRC derivation matches.
| Identifier | Base-40 (hex) | Expected SSRC |
| --- | --- | --- |
| W1AW | 0x0000001680b7 | (CRC-32 of these 6 bytes) |
| KB5MU-11 | 0x0447b6864a5b | (CRC-32 of these 6 bytes) |
| W5NYV.NCS | 0x71c06f55a697 | (CRC-32 of these 6 bytes) |
| (etc., matching the existing Base-40 test cases) | | |
These can be generated mechanically and added to the spec. Any future implementation can validate against this table.
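One way to generate the rows is sketched below. It assumes the Base-40 hex values in the table are written most significant byte first and that the CRC is taken over the bytes in that order; the spec text should state the byte order explicitly before these values are treated as normative. The names `BASE40_VECTORS`, `ssrc_for`, and `render_rows` are illustrative.

```python
import zlib

# Base-40 values copied from the table above, as 12 hex digits (6 bytes).
# Assumption: most significant byte first, matching how the hex is printed.
BASE40_VECTORS = {
    "W1AW": "0000001680b7",
    "KB5MU-11": "0447b6864a5b",
    "W5NYV.NCS": "71c06f55a697",
}


def ssrc_for(base40_hex: str) -> int:
    """CRC-32 of the 6-byte station ID, with a zero result remapped to 1."""
    station_id = bytes.fromhex(base40_hex)
    if len(station_id) != 6:
        raise ValueError("station ID must be exactly 6 bytes")
    return zlib.crc32(station_id) or 1


def render_rows(vectors: dict) -> list:
    """Format one table row per identifier, in the same column order as the table."""
    return [
        f"| {callsign} | 0x{hex_value} | 0x{ssrc_for(hex_value):08x} |"
        for callsign, hex_value in vectors.items()
    ]


if __name__ == "__main__":
    print("\n".join(render_rows(BASE40_VECTORS)))
```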
Phasing
Let's do phases like in Arcanum.
Phase 1: Specification update and Discussion
Update the OPV protocol spec 3.1 to specify CRC-32 of the Base-40 encoding. Add the test case table to Appendix C. This is purely documentation. It captures the design decision, prevents new implementations from picking different hashes, and gives people the chance to say whether it's a bad idea.
Phase 2: Reference implementation update
Update Interlocutor's radio_protocol.py to use zlib.crc32 of the Base-40 bytes instead of Python's hash() of the string representation. Add unit tests verifying the test vectors match the spec.
Impact is expected to be minimal. I can't think of any backwards compatibility problems: the SSRC already changes on every restart today, so no receiver can depend on the current values.
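The unit tests for Phase 2 could take roughly the following shape. This is a sketch in pytest style, not Interlocutor's actual test suite: `derive_ssrc` is a hypothetical stand-in for the logic in RTPAudioFrameBuilder, and the station ID is the KB5MU-11 Base-40 value from the test-case table. The cross-process test guards against a regression to any per-process hash.

```python
import os
import subprocess
import sys
import zlib


def derive_ssrc(station_id_bytes: bytes) -> int:
    """Hypothetical helper mirroring the proposed rule: CRC-32, with 0 remapped to 1."""
    return zlib.crc32(station_id_bytes) or 1


def derive_ssrc_in_fresh_process(station_id_hex: str, hash_seed: str) -> int:
    """Compute the same value in a new interpreter started with the given PYTHONHASHSEED."""
    code = (
        "import zlib; "
        f"print(zlib.crc32(bytes.fromhex({station_id_hex!r})) or 1)"
    )
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


def test_ssrc_is_stable_across_processes():
    station_id_hex = "0447b6864a5b"
    expected = derive_ssrc(bytes.fromhex(station_id_hex))
    for seed in ("1", "2", "random"):
        assert derive_ssrc_in_fresh_process(station_id_hex, seed) == expected


def test_zero_crc_is_remapped_to_one():
    # zlib.crc32(b"") is 0, which makes the empty input a convenient trigger for the zero case.
    assert zlib.crc32(b"") == 0
    assert derive_ssrc(b"") == 1


def test_ssrc_fits_in_32_bits():
    assert 1 <= derive_ssrc(bytes.fromhex("0000001680b7")) <= 0xFFFFFFFF
```

Once Appendix C lists concrete expected SSRC values, the tests should compare against those literals rather than recomputing the CRC.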
Out of scope!
These are deliberately not addressed by this issue:
- Cryptographic authentication using SSRC. The deterministic SSRC enables this in principle, but the actual authentication design is a separate question. CRC-32 is not a cryptographic hash, and this change provides no security by itself.
- Other RTP fields. The sequence number and timestamp generation are correct as-is; this issue is only about the SSRC.
- The RTP-aware playback refactor (future possible enhancement). That refactor uses SSRC for "is this a different station?" detection, which works correctly regardless of how SSRC is computed. The two issues are independent. Fixing one does not block the other.
References
- zlib.crc32: https://docs.python.org/3/library/zlib.html#zlib.crc32
Acceptance criteria
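This issue is closed when it works:
- Spec 3.1 specifies CRC-32 of the Base-40 encoded Station Identifier, and Appendix C includes the SSRC test vectors.
- Interlocutor derives the SSRC with zlib.crc32 of the Base-40 bytes, and its unit tests pass against those test vectors.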