⚠️ Notice: This project is a prototype/proof-of-concept and is not intended for production use. It is provided as-is for experimentation and learning purposes.
Distributed rate limiting service for Envoy Proxy written in Rust.
Hivemind is a high-performance, distributed rate limiting service that integrates with Envoy Proxy's global rate limiting API. It uses a peer-to-peer mesh architecture for state synchronization, avoiding the need for centralized storage or any other single point of failure.
Rate limiting architectures generally fall into two categories, each with distinct trade-offs:
### Centralized (single data store)

A centralized approach uses a single source of truth (typically Redis or a similar data store) for all rate limit counters.
Pros:
- Exact enforcement: All nodes see the same counter values with strong consistency
- Simple mental model: One counter per rate limit key, no synchronization complexity
- Immediate propagation: Counter updates are instantly visible to all nodes
Cons:
- Single point of failure: If the central store goes down, rate limiting fails (or must fail-open)
- Network latency: Every rate limit check requires a round-trip to the central store
- Scalability limits: Central store can become a bottleneck under high load
- Operational complexity: Requires provisioning, monitoring, and maintaining additional infrastructure
Best for: Scenarios requiring exact rate limit enforcement, such as financial/billing use cases.
### Distributed (gossip mesh)

Hivemind uses gossip-based eventual consistency where each node maintains its own counters and shares state with peers.
Pros:
- No single point of failure: Cluster continues operating if nodes fail
- Lower latency: Rate limit decisions are made locally without network round-trips
- Horizontal scalability: Adding nodes increases capacity without bottlenecks
- Simpler deployment: No external dependencies; runs as a sidecar alongside your application
Cons:
- Eventual consistency: Counter values may temporarily diverge across nodes
- Potential overshoot: During propagation delay, the cluster may allow slightly more requests than the configured limit
- Consistency window: The gossip interval (default 100ms) determines how quickly state converges
Best for: High-throughput APIs where approximate enforcement is acceptable, DDoS protection, preventing abuse, or environments where operational simplicity is valued over exact precision.
Hivemind also supports standalone mode (no mesh) for single-instance deployments or when running behind a load balancer that routes consistently to the same instance.
```mermaid
graph LR
    Client["Client"]

    subgraph pod1 ["Pod / Instance 1"]
        direction LR
        Envoy1["Envoy Proxy"] --> App1["Application"]
        App1 --> Envoy1
        Envoy1 -->|"ShouldRateLimit()"| HM1["Hivemind"]
    end

    subgraph pod2 ["Pod / Instance 2"]
        direction LR
        Envoy2["Envoy Proxy"] --> App2["Application"]
        App2 --> Envoy2
        Envoy2 -->|"ShouldRateLimit()"| HM2["Hivemind"]
    end

    subgraph pod3 ["Pod / Instance N"]
        direction LR
        Envoy3["Envoy Proxy"] --> App3["Application"]
        App3 --> Envoy3
        Envoy3 -->|"ShouldRateLimit()"| HM3["Hivemind"]
    end

    Client --> Envoy1
    Client --> Envoy2
    Client --> Envoy3

    classDef client fill:#f0f4ff,stroke:#4a6fa5,color:#222
    classDef app fill:#dff5e1,stroke:#3a9e5c,color:#222
    classDef envoy fill:#fff4d6,stroke:#c59a1a,color:#222
    classDef hivemind fill:#f5dff5,stroke:#9a3a9e,color:#222

    class Client client
    class App1,App2,App3 app
    class Envoy1,Envoy2,Envoy3 envoy
    class HM1,HM2,HM3 hivemind
```
Each pod runs three co-located components as sidecars:
- Envoy Proxy — the pod's entry point. It intercepts inbound traffic, calls Hivemind's gRPC API (`ShouldRateLimit()`) for a rate limit decision, and proxies allowed requests to the application.
- Application — your service handling business logic, communicating bidirectionally with Envoy.
- Hivemind — makes sub-millisecond rate limit decisions locally, then synchronizes counter state with peers over a gossip-based mesh (UDP). No centralized datastore is required.
Nodes discover each other through seed peers and form a full mesh. Counter state converges across the cluster via the gossip protocol (default interval: 100ms).
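On the Envoy side, wiring this up means pointing the HTTP rate limit filter at the local Hivemind sidecar. The sketch below follows Envoy's `envoy.extensions.filters.http.ratelimit.v3.RateLimit` schema; the `domain` and `cluster_name` values are illustrative, not taken from this project's configs.

```yaml
# Illustrative Envoy HTTP filter config targeting a co-located Hivemind sidecar.
# The domain and cluster name are examples; adapt them to your setup.
http_filters:
  - name: envoy.filters.http.ratelimit
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
      domain: my-app                 # must match a domain in the rule file
      failure_mode_deny: false       # fail open if Hivemind is unreachable
      rate_limit_service:
        grpc_service:
          envoy_grpc:
            cluster_name: hivemind   # a cluster pointing at 127.0.0.1:8081
        transport_api_version: V3
```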
- Rust Implementation: High performance and memory safety
- gRPC API: Compatible with Envoy Proxy's rate limit service v3
- Distributed Architecture: Peer mesh for state sharing without centralized storage
- Sidecar Deployment: Runs alongside your application and Envoy proxy
- Low Latency: Sub-millisecond rate limit decisions with lock-free counters
- Maximum requests per window: 4,294,967,295 (2³²-1). The counter uses a lock-free design that packs the window epoch and count into a single 64-bit atomic value, limiting each to 32 bits. This is sufficient for most rate limiting use cases. If you need higher limits, consider using longer time windows or distributing load across multiple rate limit keys.
- Rust 1.70 or later
- Docker (optional, for containerized deployment)
- Kubernetes (optional, for production deployment)
```bash
cargo build --release
cargo run
```

Edit `config.yaml` to configure the service. Key settings:

- `server.grpc_port`: Port for Envoy to connect to (default: 8081)
- `mesh.bootstrap_peers`: List of peer nodes to connect to
- `rate_limiting.config_path`: Path to rate limit rules configuration
See config/ratelimit.yaml for rate limit rule examples.
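For orientation, rule files for Envoy-compatible rate limit services typically use the descriptor format sketched below. The domain, keys, and limits here are illustrative; `config/ratelimit.yaml` in the repository is the authoritative example.

```yaml
# Illustrative rule file in the Envoy rate limit descriptor format.
# Domain, keys, and limits are examples only; see config/ratelimit.yaml.
domain: my-app
descriptors:
  - key: remote_address
    rate_limit:
      unit: second
      requests_per_unit: 5
  - key: path
    value: /login
    rate_limit:
      unit: minute
      requests_per_unit: 10
```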
```
hivemind [OPTIONS]

Options:
  -c, --config <PATH>      Path to the rate limit configuration file
  -a, --addr <ADDR>        gRPC server address [default: 127.0.0.1:8081]
      --mesh               Enable mesh networking for distributed rate limiting
      --node-id <ID>       Mesh node ID (auto-generated if not specified)
      --mesh-addr <ADDR>   Mesh bind address [default: 0.0.0.0:7946]
      --peers <ADDRS>      Bootstrap peer addresses (comma-separated)
  -h, --help               Print help
  -V, --version            Print version
```

By default, Hivemind runs in standalone mode with local rate limiting:

```bash
hivemind -c config/ratelimit.yaml -a 0.0.0.0:8081
```

For distributed rate limiting across multiple instances, enable mesh mode with the `--mesh` flag. Nodes use a gossip protocol (Chitchat) to synchronize rate limit counters.
First node (seed):
```bash
hivemind -c config/ratelimit.yaml -a 0.0.0.0:8081 \
  --mesh --node-id node-1 --mesh-addr 0.0.0.0:7946
```

Additional nodes:

```bash
hivemind -c config/ratelimit.yaml -a 0.0.0.0:8081 \
  --mesh --node-id node-2 --mesh-addr 0.0.0.0:7946 \
  --peers node-1:7946
```

Nodes automatically discover each other through gossip, so you only need to specify one seed peer to join the cluster.
### Building from Source
```bash
# Development build
cargo build

# Release build with optimizations
cargo build --release

# Run unit tests
cargo test

# Run with logging
RUST_LOG=info cargo run
```
The project includes integration tests that verify Hivemind works correctly with Envoy Proxy. The tests use Docker Compose to spin up a complete environment with Hivemind, Envoy, and a backend service.
Prerequisites:
- Docker and Docker Compose
Running integration tests:
```bash
cd test
make test-integration
```

This will:
- Build the Hivemind Docker image
- Start Hivemind, Envoy, and a backend service
- Run tests that verify rate limiting behavior
- Clean up all containers
Running distributed integration tests:
```bash
cd test
make test-distributed
```

This runs a 3-node Hivemind cluster to verify distributed rate limiting and gossip-based state synchronization.
Manual testing:
```bash
# Start the test environment
cd test
make test-integration-up

# Make requests (rate limit is 5/sec)
curl http://localhost:10000/status/200

# View logs
make test-integration-logs

# Stop the environment
make test-integration-down
```

See the specification document for detailed Kubernetes deployment examples.
```bash
# Build image
docker build -t hivemind:latest .

# Run
docker run -p 8081:8081 -v $(pwd)/config.yaml:/etc/hivemind/config.yaml hivemind:latest
```