
p2p design improvements and bug fixes#152

Merged
mateeullahmalik merged 1 commit into master from p2pDesignImporvements
Sep 4, 2025
Conversation

@mateeullahmalik
Collaborator

P2P: Make hot path fast & resilient — pooled TCP, dynamic deadlines, chunked BatchStore, smarter health/bans

Summary

This PR hardens and speeds up the P2P stack without changing the wire protocol. The hot path no longer wastes RPCs or times out on heavy payloads, connections are responsibly reused, and node health is managed by a sober background loop (not by request-time heuristics).

Headline changes

  • Robust connection pooling
    • Keyed by bech32@host:port for invariant identity.
    • connWrapper serializes whole-RPC I/O (prevents cross-talk on pooled sockets).
    • One safe redial on stale pooled sockets (EOF / reset / broken pipe).
    • Idle pruner (10m tick / 1h idle) + metrics: adds, hits, misses, evictions, open count.
    • TCP NoDelay (low latency) + server-side TCP keepalive (detect half-open).
  • Dynamic write deadlines
    • Write deadline scaled to encoded size with cushion; outer call timeout widened for heavy ops.
    • Server read timeout keeps the connection open on timeouts (no churn).
  • Chunked BatchStoreData
    • Per-node payload split into ~180 MB chunks (stays well under the 200 MB hard cap).
    • Skips empty batches (no-op RPCs) and uses the long-timeout lane for large chunks.
  • Timeout tuning (realistic, payload-aware)
    • BatchStoreData: 90s, BatchGetValues: 90s, Find*/StoreData: 5–15s.
    • Client write deadlines derived from message size; read bounded by operation timeout.
  • Health & banlist behavior
    • Background checkNodeActivity loop (2m, bounded concurrency) with 3s pings.
    • Only demote to inactive when failures exceed threshold=3 (was 1).
    • Unban immediately on successful ping; routing table updated via health, not hot-path pings.
  • Bootstrap strategy (full network view)
    • Periodic chain sync (10m) updates replication_info and seeds routing table.
    • No bootstrap-time pings; the hot path relies on the full-view in-memory routing table (sized for ≤1000 validators).
  • Message safety
    • Hard 200 MB cap enforced in encode/decode; responders compress batch GETs.
    • Server continues on read timeout; client retries once on stale pooled socket.

Key Changes (by area)

Networking / I/O

  • Client (Network.Call)
    • Whole-RPC lock on pooled connections; one redial on stale socket.
    • Dynamic writeDeadline = base + size/throughputFloor + cushion.
    • Deadlines cleared after every RPC (reusable pooled conns).
  • Server
    • Per-message read deadline 90s, but on timeout we continue (don’t close).
    • Write deadline per response (prevents hung writers).
    • TCP keepalive enabled on accepted sockets.

Batch store

  • Per-node chunking (~180 MB) to respect the 200 MB envelope after gob overhead.
  • Skips empty batches (no more useless RPCs).
  • Heavy chunks run with long-timeout profile; smaller ones complete faster.
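A minimal sketch of the per-node chunker described above; `splitBatch` is a hypothetical name, and the greedy packing shown is one straightforward way to stay under the target, not necessarily the PR's exact algorithm.

```go
package main

import "fmt"

const chunkTarget = 180 << 20 // ~180 MB raw payload, headroom under the 200 MB cap

// splitBatch greedily packs records into chunks whose raw size stays
// under chunkTarget; empty input yields no chunks (no no-op RPCs).
func splitBatch(records [][]byte) [][][]byte {
	var chunks [][][]byte
	var cur [][]byte
	size := 0
	for _, r := range records {
		if size+len(r) > chunkTarget && len(cur) > 0 {
			chunks = append(chunks, cur) // flush the full chunk
			cur, size = nil, 0
		}
		cur = append(cur, r)
		size += len(r)
	}
	if len(cur) > 0 {
		chunks = append(chunks, cur)
	}
	return chunks
}

func main() {
	// three 80 MB records: the first two fit one chunk, the third starts another
	recs := [][]byte{make([]byte, 80<<20), make([]byte, 80<<20), make([]byte, 80<<20)}
	fmt.Println(len(splitBatch(recs))) // prints 2
}
```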

Health & bans

  • Ban threshold 3; health loop (2m) handles ban/unban and Active flip.
  • Hot paths no longer make ban decisions; fewer false negatives and less churn.
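The threshold rule above can be sketched as a tiny state machine. `nodeHealth` and `observePing` are illustrative names; the PR's actual bookkeeping lives in the routing/ban layer.

```go
package main

import "fmt"

const banThreshold = 3 // was 1 before this PR

// nodeHealth tracks consecutive ping failures for one peer.
type nodeHealth struct {
	failures int
	banned   bool
}

// observePing applies the PR's rules: a successful ping unbans
// immediately; demotion happens only when failures exceed the threshold.
func (n *nodeHealth) observePing(ok bool) {
	if ok {
		n.failures = 0
		n.banned = false
		return
	}
	n.failures++
	if n.failures > banThreshold {
		n.banned = true
	}
}

func main() {
	n := &nodeHealth{}
	for i := 0; i < 4; i++ {
		n.observePing(false)
	}
	fmt.Println(n.banned) // prints true: banned only after the 4th consecutive failure
}
```

Because only the 2-minute health loop calls `observePing`, a single slow request on the hot path can no longer ban a healthy peer.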

Bootstrap

  • One-time + periodic (10m) chain sync → upsert replication_info and seed routing.
  • No eager pings; the routing table maintains a full network view for ≤1000 validators.

Performance & Reliability Impact

  • Latency: lower due to TCP_NODELAY, pooled conns, and no-reopen on server timeouts.
  • Throughput: better because heavy writes get appropriate deadlines; chunking avoids cap hits.
  • Stability: resilient to transient resets; background health avoids flapping & over-banning.
  • Resource use: limited concurrent batch sends; idle pruner keeps pool bounded.

Backward Compatibility

  • No protocol changes; existing peers interoperate.
  • Timeouts increased only for heavy ops; light RPCs unchanged or lowered.

Configuration Knobs

  • BatchStoreData timeout: 90s (execTimeouts map).
  • Chunk target: ~180 MB raw payload (headroom under 200 MB cap).
  • Pool pruner: 10m tick, 1h idle; pool capacity 256 (tunable).
  • Health loop: 2m cadence; per-ping timeout 3s; ban threshold 3.

Risks & Mitigations

  • Large WAN variance: dynamic write deadlines + longer outer timeouts for heavy ops.
  • Oversized messages: 200 MB hard cap; chunking at ~180 MB prevents rejects.
  • Banlist sensitivity: threshold bumped to 3; unban on successful ping.

Rollback: revert timeout map and chunker; server read timeout remains safe even when reduced.


Testing Performed

  • Unit: encode/decode size guard; stale-socket classification; chunker sizing.
  • Integration:
    • Batch store with mixed 1–50 KB records up to caps; verified chunks <200 MB.
    • Induced server-side close/reset; ensured single redial recovers.
    • Ban/unban via health loop; confirmed no hot-path bans.

Observability

  • Pool metrics: adds/replacements/hits/misses/evictions/open_current/capacity.
  • Logs: “Stale pooled connection on write/read; redialing” markers; batch chunk sizing & timing.
  • Health: banlist size; Active flips; last_seen updates.

Checklist

  • Connection pooling safe under concurrency
  • One redial on stale pooled socket
  • Dynamic write deadlines
  • Server read timeout keeps conn open
  • BatchStore chunking (~180 MB), no empty batches
  • 200 MB cap enforced
  • Health loop governs ban/unban (threshold=3)
  • Bootstrap refresher seeds routing; no hot-path find-node
  • Timeouts tuned (heavy ops 60–90s)

@mateeullahmalik mateeullahmalik merged commit 2aee34a into master Sep 4, 2025
7 checks passed
@mateeullahmalik mateeullahmalik deleted the p2pDesignImporvements branch September 5, 2025 11:53