Skip to content

RevenProx SSE Proxy

Distributed State

Distributed State Configuration

Configure state synchronization for horizontally scaled deployments.

Overview

When running multiple RevenProx instances, distributed state ensures:

Subscriptions are synchronized across all instances
Events reach clients regardless of which proxy they're connected to
Consistent view of active topics and connections

NNG Configuration

NNG (nanomsg-next-generation) handles inter-proxy communication.

listen_addresses

Addresses to listen for peer connections.

[nng]
listen_addresses = ["tcp://*:5555"]

Formats: - tcp://*:5555 - Listen on all interfaces - tcp://192.168.1.10:5555 - Listen on specific interface - ipc:///tmp/revenprox.sock - Unix socket (same host)

peer_addresses

Addresses of peer proxies to connect to.

[nng]
peer_addresses = [
  "tcp://proxy-2.internal:5555",
  "tcp://proxy-3.internal:5555"
]

Tip

Use DNS names or service discovery for dynamic peer lists.

Buffer Sizes

Configure message buffers:

[nng]
pub_buffer_size = 65536    # Publisher buffer (bytes)
sub_buffer_size = 131072   # Subscriber buffer (bytes)

Larger buffers handle burst traffic but use more memory.

reconnect_interval_ms

Time between reconnection attempts to failed peers.

[nng]
reconnect_interval_ms = 5000

heartbeat_interval_sec

Interval for peer health checks.

[nng]
heartbeat_interval_sec = 30

Message Batching

Batch messages for efficiency:

[nng]
message_batch_size = 100       # Messages per batch
message_batch_timeout_ms = 50  # Max wait time

Distributed State Configuration

gossip_interval_sec

Interval for gossip protocol updates.

[distributed_state]
gossip_interval_sec = 30

Lower values = faster sync, higher network overhead.

Bloom Filter Settings

Bloom filters enable efficient subscription lookups:

[distributed_state]
bloom_filter_capacity = 1000000   # Expected subscriptions
bloom_filter_fpr = 0.01           # False positive rate (1%)

Capacity planning: - Set to expected max subscriptions - Higher capacity = more memory - Memory usage: capacity * -ln(fpr) / (ln(2)^2) / 8 bytes

vector_clock_cleanup_interval_sec

Interval for cleaning up old vector clock entries.

[distributed_state]
vector_clock_cleanup_interval_sec = 3600

max_subscription_events

Maximum subscription events to track for sync.

[distributed_state]
max_subscription_events = 100000

merkle_tree_rebuild_threshold

Trigger Merkle tree rebuild after this many changes.

[distributed_state]
merkle_tree_rebuild_threshold = 1000

peer_timeout_sec

Time before considering a peer dead.

[distributed_state]
peer_timeout_sec = 180

Deployment Topologies

Two-Node Active-Active

┌─────────────┐     NNG      ┌─────────────┐
│   Proxy A   │◄────────────►│   Proxy B   │
│ :8080/:5555 │              │ :8080/:5555 │
└─────────────┘              └─────────────┘

Proxy A config:

[nng]
listen_addresses = ["tcp://*:5555"]
peer_addresses = ["tcp://proxy-b:5555"]

Proxy B config:

[nng]
listen_addresses = ["tcp://*:5555"]
peer_addresses = ["tcp://proxy-a:5555"]

Multi-Node Mesh

        ┌─────────────┐
        │   Proxy A   │
        └──────┬──────┘
               │
    ┌──────────┼──────────┐
    │          │          │
┌───▼───┐  ┌───▼───┐  ┌───▼───┐
│Proxy B│  │Proxy C│  │Proxy D│
└───────┘  └───────┘  └───────┘

Each proxy connects to all others:

[nng]
listen_addresses = ["tcp://*:5555"]
peer_addresses = [
  "tcp://proxy-a:5555",
  "tcp://proxy-b:5555",
  "tcp://proxy-c:5555",
  "tcp://proxy-d:5555"
]

Kubernetes Deployment

Use headless service for peer discovery:

apiVersion: v1
kind: Service
metadata:
  name: revenprox-peers
spec:
  clusterIP: None
  selector:
    app: revenprox
  ports:
    - port: 5555
      name: nng

Config:

[nng]
listen_addresses = ["tcp://*:5555"]
# Use DNS SRV records or init container for peer discovery

Consistency Model

RevenProx uses eventual consistency:

Local First: Operations apply locally immediately
Async Sync: Changes propagate to peers asynchronously
Conflict Resolution: Last-write-wins with vector clocks

Sync Guarantees

Operation	Local	Remote
Subscribe	Immediate	< gossip_interval
Unsubscribe	Immediate	< gossip_interval
Event delivery	Immediate	< message_batch_timeout

Monitoring

Key Metrics

Metric	Description
`peers_connected`	Number of active peer connections
`sync_lag_ms`	Time since last successful sync
`bloom_filter_fpr`	Actual false positive rate
`gossip_messages_sent`	Gossip messages transmitted

Health Checks

Verify peer connectivity:

# Check peer status
curl http://localhost:8080/health/peers

Troubleshooting

Peers Not Connecting

Check network connectivity between hosts
Verify firewall allows NNG port (default 5555)
Check peer addresses are correct
Review logs for connection errors

Slow Synchronization

Reduce gossip_interval_sec
Increase message_batch_size
Check network latency between peers
Monitor sync_lag_ms metric

Memory Usage High

Reduce bloom_filter_capacity
Lower max_subscription_events
Decrease buffer sizes
Check for subscription leaks

Example: Production Cluster

# Unique ID per instance
proxy_id = "proxy-prod-1"

[nng]
listen_addresses = ["tcp://*:5555"]
peer_addresses = [
  "tcp://proxy-prod-2.internal:5555",
  "tcp://proxy-prod-3.internal:5555"
]
pub_buffer_size = 131072
sub_buffer_size = 262144
reconnect_interval_ms = 2000
heartbeat_interval_sec = 15

[distributed_state]
gossip_interval_sec = 10
bloom_filter_capacity = 5000000
bloom_filter_fpr = 0.001
max_subscription_events = 500000
peer_timeout_sec = 60

Next Steps