Distributed State Configuration
Configure state synchronization for horizontally scaled deployments.
Overview
When running multiple RevenProx instances, distributed state ensures:
- Subscriptions are synchronized across all instances
- Events reach clients regardless of which proxy they're connected to
- Consistent view of active topics and connections
NNG Configuration
NNG (nanomsg-next-generation) handles inter-proxy communication.
listen_addresses
Addresses to listen for peer connections.
Formats:
- tcp://*:5555 - Listen on all interfaces
- tcp://192.168.1.10:5555 - Listen on specific interface
- ipc:///tmp/revenprox.sock - Unix socket (same host)
peer_addresses
Addresses of peer proxies to connect to.
Tip
Use DNS names or service discovery for dynamic peer lists.
Buffer Sizes
Configure message buffers:
[nng]
pub_buffer_size = 65536 # Publisher buffer (bytes)
sub_buffer_size = 131072 # Subscriber buffer (bytes)
Larger buffers handle burst traffic but use more memory.
reconnect_interval_ms
Time between reconnection attempts to failed peers.
heartbeat_interval_sec
Interval for peer health checks.
Message Batching
Batch messages for efficiency:
Distributed State Configuration
gossip_interval_sec
Interval for gossip protocol updates.
Lower values = faster sync, higher network overhead.
Bloom Filter Settings
Bloom filters enable efficient subscription lookups:
[distributed_state]
bloom_filter_capacity = 1000000 # Expected subscriptions
bloom_filter_fpr = 0.01 # False positive rate (1%)
Capacity planning: - Set to expected max subscriptions - Higher capacity = more memory - Memory usage: capacity * -ln(fpr) / (ln(2)^2) / 8 bytes
vector_clock_cleanup_interval_sec
Interval for cleaning up old vector clock entries.
max_subscription_events
Maximum subscription events to track for sync.
merkle_tree_rebuild_threshold
Trigger Merkle tree rebuild after this many changes.
peer_timeout_sec
Time before considering a peer dead.
Deployment Topologies
Two-Node Active-Active
┌─────────────┐ NNG ┌─────────────┐
│ Proxy A │◄────────────►│ Proxy B │
│ :8080/:5555 │ │ :8080/:5555 │
└─────────────┘ └─────────────┘
Proxy A config:
Proxy B config:
Multi-Node Mesh
┌─────────────┐
│ Proxy A │
└──────┬──────┘
│
┌──────────┼──────────┐
│ │ │
┌───▼───┐ ┌───▼───┐ ┌───▼───┐
│Proxy B│ │Proxy C│ │Proxy D│
└───────┘ └───────┘ └───────┘
Each proxy connects to all others:
[nng]
listen_addresses = ["tcp://*:5555"]
peer_addresses = [
"tcp://proxy-a:5555",
"tcp://proxy-b:5555",
"tcp://proxy-c:5555",
"tcp://proxy-d:5555"
]
Kubernetes Deployment
Use headless service for peer discovery:
apiVersion: v1
kind: Service
metadata:
name: revenprox-peers
spec:
clusterIP: None
selector:
app: revenprox
ports:
- port: 5555
name: nng
Config:
[nng]
listen_addresses = ["tcp://*:5555"]
# Use DNS SRV records or init container for peer discovery
Consistency Model
RevenProx uses eventual consistency:
- Local First: Operations apply locally immediately
- Async Sync: Changes propagate to peers asynchronously
- Conflict Resolution: Last-write-wins with vector clocks
Sync Guarantees
| Operation | Local | Remote |
|---|---|---|
| Subscribe | Immediate | < gossip_interval |
| Unsubscribe | Immediate | < gossip_interval |
| Event delivery | Immediate | < message_batch_timeout |
Monitoring
Key Metrics
| Metric | Description |
|---|---|
peers_connected |
Number of active peer connections |
sync_lag_ms |
Time since last successful sync |
bloom_filter_fpr |
Actual false positive rate |
gossip_messages_sent |
Gossip messages transmitted |
Health Checks
Verify peer connectivity:
Troubleshooting
Peers Not Connecting
- Check network connectivity between hosts
- Verify firewall allows NNG port (default 5555)
- Check peer addresses are correct
- Review logs for connection errors
Slow Synchronization
- Reduce
gossip_interval_sec - Increase
message_batch_size - Check network latency between peers
- Monitor
sync_lag_msmetric
Memory Usage High
- Reduce
bloom_filter_capacity - Lower
max_subscription_events - Decrease buffer sizes
- Check for subscription leaks
Example: Production Cluster
# Unique ID per instance
proxy_id = "proxy-prod-1"
[nng]
listen_addresses = ["tcp://*:5555"]
peer_addresses = [
"tcp://proxy-prod-2.internal:5555",
"tcp://proxy-prod-3.internal:5555"
]
pub_buffer_size = 131072
sub_buffer_size = 262144
reconnect_interval_ms = 2000
heartbeat_interval_sec = 15
[distributed_state]
gossip_interval_sec = 10
bloom_filter_capacity = 5000000
bloom_filter_fpr = 0.001
max_subscription_events = 500000
peer_timeout_sec = 60