Skip to content

SIDDHANTCOOKIE/QuantCraft

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

QuantCraft


1. Business Context — What Problem Does This Solve?

QuantCraft is a distributed hackathon judging platform built for the IICPC Summer Hackathon 2026. The problem it solves is non-trivial:

"Given that 10,000 contestants each submit a compiled trading matching-engine binary, how do you fairly, securely, and automatically benchmark all of them for correctness, latency, and throughput — and publish live rankings?"

A matching engine is the core of any financial exchange. It receives BUY/SELL orders and executes fills according to Price-Time Priority (FIFO). The platform:

  1. Accepts compiled Linux ELF binaries from contestants.
  2. Sandboxes them in isolated Docker/Kubernetes pods with no internet access.
  3. Bombards them with a deterministic sequence of 50,000 orders via WebSocket.
  4. Compares the fills they produce against a pre-computed reference matching engine.
  5. Scores them on a composite formula: 40% Latency + 30% Throughput + 30% Correctness.
  6. Broadcasts a live leaderboard to a Next.js/Vite dashboard via WebSocket.

2. High-Level Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                         SUBMISSION FLOW                               │
│                                                                        │
│  Contestant → API (Go :8081) → MinIO (binary store) → PostgreSQL     │
│                                     ↓                                 │
│                               Redis pub/sub: new-submissions          │
│                                     ↓                                 │
│                        Scheduler (Go daemon)                          │
│                         ↓          ↓                                  │
│                    Downloads    Spins up Docker container             │
│                    from MinIO   (Ubuntu 22.04 sandbox)                │
│                                     ↓ (health check passes)           │
│                             LOAD TESTING PIPELINE                     │
│                                                                        │
│   Phase A: 1 bot (sequential, correctness)                            │
│   Phase B: 10 bots (concurrent, stress)                               │
│          ↓          ↓                                                  │
│   Load Generator (Go) ←── trace:standard:chunk_* (Redis Lists)       │
│          │                                                             │
│          │ WebSocket orders → Contestant Sandbox                      │
│          │ WebSocket fills  ← Contestant Sandbox                      │
│          │                                                             │
│          └──→ Redis Stream: telemetry.raw                             │
│                    ↓                        ↓                         │
│          Aggregator (Go)            Validator (Go)                    │
│          p50/p90/p99 + TPS          Compare fills vs expected_fills   │
│          → TimescaleDB              → PostgreSQL (correctness_score)  │
│          → Redis leaderboard-updates         ↓                        │
│                                     Scoring Engine (Go)               │
│                                     final_score → PostgreSQL          │
│                                     ZADD leaderboard Redis sorted set │
│                                              ↓                        │
│                              Leaderboard WS Server (Go :8082)        │
│                              Subscribes Redis → broadcasts to clients │
│                                              ↓                        │
│                              Frontend (Vite/React :5173)             │
│                              Dark-themed live dashboard               │
└──────────────────────────────────────────────────────────────────────┘

3. Project Structure

QuantCraft/
├── agent.md                    # Master design doc (architecture + build checklist)
├── CHANGELOG.md                # Phase-by-phase build history
├── docs/
│   ├── architecture.md         # Mermaid diagram + bottleneck analysis
│   ├── scoring.md              # Formal scoring spec with formulas
│   ├── contestant-guide.md     # How to build a submission
│   ├── decisions/              # Architectural Decision Records (ADR 001-005)
│   └── load-profiles/
│       └── standard.json       # Pre-generated 50k order trace (fallback)
├── services/                   # 9 microservices, all independently deployable
│   ├── api/                    # Go: Submission API
│   ├── scheduler/              # Go: Sandbox orchestrator daemon
│   ├── trace-generator/        # Go: Deterministic order trace generator
│   ├── load-generator/         # Go: WebSocket bot fleet
│   ├── aggregator/             # Go: Redis Stream → TimescaleDB metrics
│   ├── validator/              # Go: Reference matching engine + correctness check
│   ├── scoring/                # Go: Composite score computation
│   ├── leaderboard/            # Go: WebSocket push server
│   └── frontend/               # Vite + React + TypeScript dashboard
├── infra/
│   ├── docker-compose.yml      # Local dev: Postgres+TimescaleDB, Redis, MinIO
│   ├── postgres-init.sql       # DB schema (3 tables + TimescaleDB hypertable)
│   ├── helm/                   # K8s Helm charts
│   ├── k8s/                    # Raw K8s manifests
│   └── terraform/              # IaC for cloud deployment
├── testing/
│   ├── dummy_engines/          # Echo server for testing the pipeline
│   └── chaos/                  # Chaos testing scripts
└── scripts/
    ├── setup.sh                # Initialize MinIO buckets + Redis streams
    └── generate-trace.sh       # Run trace-generator with seed=42, count=50000

4. Service-by-Service Breakdown

4.1 Submission API — services/api/main.go

Port: :8081 | Language: Go | Pattern: Standard net/http mux

Purpose: The front door for contestant submissions.

Startup Sequence:

  1. Parses CLI flags (MinIO, Redis, DB endpoints).
  2. Initializes slog.JSONHandler for structured logging.
  3. Opens connections to MinIO, Redis (with Ping validation), and PostgreSQL.
  4. Attaches a CORS middleware (allows all origins — for the frontend on a different port).
  5. Registers two HTTP handlers and starts the server with 15s read/write timeouts.
  6. Runs a signal handler goroutine for graceful shutdown (SIGINT/SIGTERM).

Endpoints:

Method Path Description
GET /health Liveness probe — returns {"status":"healthy"}
POST /api/submissions Upload a binary + contestant_id
GET /api/submissions/{id}/status Poll run status

POST /api/submissions — Detailed Flow:

1. Parse multipart form (max 50MB)
2. Extract contestant_id (required) + file
3. Read first 4 bytes → check ELF magic: 0x7f 'E' 'L' 'F'
   → Rejects non-ELF files (security: no scripts, no PE binaries)
4. Seek file back to 0
5. Generate UUIDv4 as run_id
6. PutObject to MinIO: submissions/{run_id}/binary
7. INSERT INTO submissions (id, contestant_id, status='queued', ...)
8. rdb.Publish("new-submissions", run_id)  ← non-fatal if Redis is down
9. Return 201 {"run_id": "..."}

Why ELF check? Contestants can only submit compiled Linux binaries. This prevents scripts, Windows executables, or arbitrary data from being deployed.

Why both MinIO + PostgreSQL? MinIO is the durable object store (binary is large, S3-compatible). PostgreSQL is the source of truth for the run's lifecycle state machine.


4.2 Sandbox Scheduler — services/scheduler/main.go

Language: Go | Pattern: Long-running daemon with dual trigger model

Purpose: Watches for new submissions and orchestrates the entire test lifecycle for each run.

Startup Sequence:

  1. Connects to Redis, MinIO, PostgreSQL.
  2. Subscribes to Redis pub/sub channel new-submissions.
  3. Starts a background goroutine that polls PostgreSQL for status='queued' submissions every 5 seconds (catches any pub/sub messages missed during downtime).
  4. Main loop: The main execution thread enters an infinite loop and blocks (waits without consuming CPU cycles) on the pubsub.Channel() Go channel. When a message arrives from Redis, the loop unblocks, processes the run_id by calling processQueuedSubmissions, and loops back to wait for the next message.

Dual-trigger design: Pub/sub for low-latency notification + polling as a safety net. This is a classic pattern for "at-least-once" processing without Kafka-level complexity.

runSubmissionSandbox — The Core Orchestration Function:

Step 1: Update status → 'deploying'
Step 2: Create temp dir on host: /workspace/tmp/submissions/{run_id}/
Step 3: Download binary from MinIO to temp dir
Step 4: docker run -d
          --name contestant-{run_id}
          --network infra_default
          -p 8080                        ← mapped to random host port
          -v /workspace/tmp/submissions/{run_id}:/app
          -w /app
          ubuntu:22.04
          sh -c "chmod +x /app/binary && /app/binary --port 8080"
Step 5: Health watchdog (up to 30 attempts × 1s interval)
          → docker port contestant-{run_id} 8080 → get mapped host port
          → GET http://localhost:{port}/health → expect 200
          → Note: The watchdog stops polling and proceeds to the next step immediately as soon as it receives a successful HTTP 200 response (via a `break` statement in the loop). It only polls for the full 30 seconds if the container is unresponsive.
Step 6: Update status → 'running'
Step 7: Run load-generator --bots 1 --phase correctness  (PHASE A)
Step 8: Run load-generator --bots 10 --phase stress      (PHASE B)
Step 9: time.Sleep(5s) — Wait for the aggregator to flush windows. (The aggregator flushes a window 3 seconds after the last seen event for a run. The scheduler explicitly sleeps for 5 seconds to provide a safe buffer for this 3s window to close and be flushed.)
Step 10: Update status → 'validating'
Step 11: Run validator-service --run-id {run_id} --trace standard
Step 12: defer → docker rm -f contestant-{run_id} + os.RemoveAll(tempDir)

updateRunStatus — Dual-write Pattern: Every status change writes to PostgreSQL AND publishes JSON to Redis leaderboard-updates pub/sub. This way, the leaderboard frontend gets live status updates (queued → deploying → running → validating → finished) without polling.


4.3 Trace Generator — services/trace-generator/main.go

Language: Go | Run mode: One-shot CLI tool

Purpose: Pre-generate a deterministic order sequence and store it in Redis + JSON.

Why pre-generate? Bot fleet never stalls waiting for order generation. Same seed = same trace = reproducible results. Expected fills can be pre-computed offline against the reference engine.

Generation Logic:

Seed: 42 (default), Count: 50,000, ChunkSize: 5,000

For each order:
  - Type distribution: 50% LIMIT, 35% MARKET, 15% CANCEL
  - CANCEL falls back to LIMIT if no active orders exist
  - Price: Gaussian N(100, 2) rounded to 2 decimal places
  - Quantity: log-uniform [1, 1000]
  - CANCEL picks a random active order ID to target

Redis Storage Layout:

trace:standard:metadata   → HSet { name, seed, count, chunks, chunk_size }
trace:standard:chunk_0    → List of 5,000 order JSONs
trace:standard:chunk_1    → List of 5,000 order JSONs
...
trace:standard:chunk_9    → List of ~5,000 order JSONs

Fallback: If Redis is unreachable, the trace is only written to docs/load-profiles/standard.json. The load-generator uses this JSON file as a fallback.


4.4 Load Generator — services/load-generator/main.go

Language: Go | Key libraries: gorilla/websocket, redis/go-redis

Purpose: The bot fleet that fires orders at the contestant's sandbox and records nanosecond latency.

Two-Phase Architecture:

Phase --bots --phase Goal
A (Correctness) 1 correctness Sequential, deterministic fill comparison
B (Stress) 10 stress Concurrent, high-throughput latency measurement

Startup Flow:

  1. Load the full trace from Redis (chunked lists) → in-memory []Order slice.
    • Fallback: load from docs/load-profiles/standard.json.
  2. Start a publishTelemetryBatch goroutine — reads TelemetryEvent structs from an internal in-memory Go channel (telemetryChan) populated by the bot workers, and batches writes to the Redis Stream telemetry.raw every 25ms or 500 events.
  3. Fill a shared ordersChan channel with all orders (closed after filling).
  4. Spawn N bot worker goroutines.
  5. Wait for all bots to finish (botsWG.Wait()).
  6. Close telemetry channel and wait for publisher to flush remaining events.

runBotWorker — Per-Bot WebSocket Loop:

Attempt WebSocket connection (5 retries with exponential backoff)
  
Spawn reader goroutine (concurrent with writer):
  → ReadMessage() → parse OrderResponse JSON
  → Look up matching sent order in sync.Map by order_id+suffix
  → Compute latency = recvTime - sendTime
  → Push TelemetryEvent to telemetryChan
  
Writer loop (reads from shared ordersChan):
  → Optional rate-limiting via time.Ticker
  → Record sendTime in sync.Map
  → conn.WriteMessage(websocket.TextMessage, orderJSON)
  → On write error: emit REJECTED telemetry, return
  
After all orders sent: wait up to 10s for pending responses (activeSends WaitGroup)

Latency Measurement: Uses sync.Map keyed by order_id+("_active" | "_cancel") to track the exact send timestamp of each in-flight order. When the matching response arrives, recvTime - sendTime gives per-order round-trip latency.

Telemetry Batching: Instead of calling XAdd once per event (would be 50,000 Redis calls), events are batched into Redis pipeline calls every 25ms. This is critical for not saturating Redis with single-message writes.


4.5 Aggregator — services/aggregator/main.go

Language: Go | Pattern: Redis Consumer Group → rolling time-window aggregation

Purpose: Consume raw telemetry events and compute per-second latency percentiles and TPS, writing to TimescaleDB.

Consumer Group Setup:

rdb.XGroupCreateMkStream(ctx, "telemetry.raw", "aggregator-group", "0")
// consumer: "aggregator-worker-1"

Using a Consumer Group means:

  • Messages are delivered once and only once to this consumer.
  • Messages are acknowledged after processing (XAck).
  • If the aggregator crashes, it can resume from the last unacked message.

Window Accumulation:

windows: map[run_id+"/"+phase] → map[int64_second] → WindowStats{
    Latencies: []float64 (microseconds),
    Total: int,
    Errors: int,
}

Each telemetry event is bucketed by timestamp_ns / 1e9 (integer second). This creates rolling 1-second windows.

flushOldWindows — When to Flush: A window at second S is flushed when:

  • The stream has advanced past S+1 (i.e., newer events have arrived), OR
  • No events for this run_id/phase have arrived for >3 seconds (stream ended).

On Flush:

1. Sort latency values
2. Compute p50 = latencies[n*0.50], p90 = latencies[n*0.90], p99 = latencies[n*0.99]
3. INSERT INTO run_metrics (time, run_id, phase, p50_latency_us, p90_latency_us, p99_latency_us, throughput_tps, error_rate)
4. SET metrics:{run_id}:{phase}:latest = JSON blob
5. PUBLISH leaderboard-updates = JSON blob  ← live metric updates to frontend

Why TimescaleDB? Plain PostgreSQL inserts into a time-series table cause index bloat. TimescaleDB's hypertable auto-partitions by timestamp — writes append to the latest chunk, keeping write performance stable even after millions of rows.


4.6 Validator — services/validator/

This service has two components:

4.6a Reference Matching Engine — orderbook.go

Language: Go | Package: validator

The OrderBook is a correct implementation of a Price-Time Priority (FIFO) limit order book. It's used during the pre-compute phase to generate expected_fills ahead of time.

Data Structures:

type OrderBook struct {
    Bids       []*PriceLevel   // Sorted DESCENDING (best bid first)
    Asks       []*PriceLevel   // Sorted ASCENDING (best ask first)
    OrderIndex map[string]*OrderLocation  // O(1) cancel lookup
}

type PriceLevel struct {
    Price  int64      // Fixed-point: price × 10000 (no floats!)
    Orders []*Order   // FIFO queue at this price level
}

Why int64 for prices? Floating-point comparison is non-deterministic. 100.01 × 10000 = 1000100 exactly as an integer. This matches the correctness validation requirement.

ProcessOrder — LIMIT order matching:

BUY order incoming:
  While Asks exist AND qty remaining AND incoming.Price >= bestAsk.Price:
    matchQty = min(incoming.Quantity, askOrder.Quantity)
    Fill(incoming.OrderID, askOrder.OrderID, askPrice, matchQty)
    Deduct from both
    Remove fully filled ask orders
  If qty remaining → addLimitOrder to bid side

SELL order → mirror logic against bid side

CancelOrder: Uses OrderIndex for O(1) location lookup, then splices the order out of its PriceLevel.Orders slice. IndexBook rebuilds the index after each operation.

4.6b Correctness Validator Service — cmd/validator-service/main.go

Purpose: Compare contestant fills vs pre-computed expected fills.

Execution Flow:

1. Fetch expected_fills from PostgreSQL WHERE trace_id='standard' ORDER BY order_index ASC
   → Build map: order_id → []MatchFill{MatchOrderID, Price(fixed), Qty}

2. XRange telemetry.raw "-" "+"  (read all stream entries)
   → Filter: run_id = this_run AND phase = 'correctness' AND status = 'FILLED'
   → Convert float price string → int64 fixed-point (price × 10000)
   → Build map: order_id → []MatchFill

3. For each order_id in expected_fills:
   For each fill position i:
     if expected[i].MatchOrderID == actual[i].MatchOrderID
     AND expected[i].Price == actual[i].Price
     AND expected[i].Qty == actual[i].Qty:
       matches++

4. correctness_score = (matches / total_expected) × 100

5. UPDATE submissions SET correctness_score=$1, status='validating_complete' WHERE id=$2

6. exec scoring binary → chain into scoring engine

Strict positional comparison: Fill position 0 must match fill position 0. This enforces Price-Time Priority — you can't reorder fills to game the score.


4.7 Scoring Engine — services/scoring/main.go

Language: Go | Run mode: One-shot CLI tool (invoked by validator)

Purpose: Compute the composite score from correctness + latency + throughput, write to PostgreSQL, update Redis sorted set.

Score Formula:

# Raw metrics (from TimescaleDB, phase='stress' only):
avgP99 = AVG(p99_latency_us) from run_metrics WHERE run_id=X AND phase='stress'
avgTPS = AVG(throughput_tps)

# Latency Score (logarithmic decay — rewards sub-millisecond engines):
p99_ms = max(1.0, avgP99 / 1000.0) 
  → (Why `max(1.0, ...)`? This caps the minimum value at 1.0. If p99_ms was < 1, log10 would be negative, yielding a score > 100. This cap ensures the maximum latency score never exceeds 100 points.)
latencyScore = max(0, 100 - 20 × log10(p99_ms))
  → p99 < 1ms  → score = 100
  → p99 = 10ms → score = 80
  → p99 = 300ms → score ≈ 50

# Throughput Score (exponential saturation curve):
throughputScore = 100 × (1 - e^(-0.000032188 × TPS))
  → This is NOT a linear cap at 50k TPS from the docs
  → It's a soft exponential — 50k TPS ≈ 80, never exactly 100

# Correctness Score: read from submissions.correctness_score

# Composite:
finalScore = 0.40 × latencyScore + 0.30 × throughputScore + 0.30 × correctnessScore

Outputs:

  1. UPSERT submissions SET status='finished', all scores, ...
  2. ZADD leaderboard finalScore contestantID (Redis sorted set — O(log N))
  3. PUBLISH leaderboard-updates with full ScoreResult JSON

Note: The scoring engine uses an exponential throughput formula in code (1 - e^(-λ×TPS)) rather than the linear formula in agent.md (TPS/50000 × 100). The code is the truth.


4.8 Leaderboard WebSocket Server — services/leaderboard/main.go

Port: :8082 | Language: Go | Libraries: gorilla/websocket

Purpose: Bridge between Redis pub/sub and browser WebSocket clients.

Hub Pattern:

type Hub struct {
    clients    map[*websocket.Conn]bool  // active connections
    clientsMu  sync.RWMutex             // reader-writer lock
}

Broadcast(msg) uses sync.RWMutex with a read lock (many concurrent readers) and spawns a goroutine per client to avoid one slow client blocking others.

Connection Lifecycle:

Client connects to ws://localhost:8082/ws
  → hub.Register(conn)
  → Spawn goroutine:
      1. fetchLeaderboard(db)  → CTE: best run per contestant, ORDER BY final_score DESC
      2. fetchRecentSubmissions(db) → last 15 runs of any status
      3. Send {"type":"init", "leaderboard":[...], "recent":[...]}
      4. Read loop (discards all client messages — server-push only)
      5. On read error → hub.Unregister(conn)

Redis Pub/Sub → Broadcast:

Subscribe to "leaderboard-updates"
On message:
  Parse JSON → detect type by field presence:
    throughput_tps → "metric_update"
    final_score    → "final_score"
    status         → "status_update"
  Wrap: {"type": "...", "data": {...}}
  hub.Broadcast(wrappedBytes)

Why detect type by field presence? All updates go through the same leaderboard-updates channel (from scheduler status updates, aggregator metric flushes, and scoring engine). The leaderboard server disambiguates them at broadcast time.


4.9 Frontend — services/frontend/

Stack: Vite + React + TypeScript | Theme: Dark, premium design system

Pages:

  • Leaderboard — Live rankings table with WebSocket updates, animated rank changes.
  • Submission Upload — Drag-and-drop binary upload form (POST /api/submissions).
  • Contestant Detail — Per-run metrics charts (latency over time, TPS).

WebSocket Connection: Connects to ws://localhost:8082/ws, handles init, status_update, metric_update, and final_score message types. Uses reconnect logic.


5. Database Design

Schema Overview

-- 3 tables total

-- 1. submissions — lifecycle state machine for each run
CREATE TABLE submissions (
    id                UUID PRIMARY KEY,
    contestant_id     TEXT NOT NULL,
    status            TEXT NOT NULL DEFAULT 'queued',
    -- Status values: queued → deploying → running → validating
    --                → validating_complete → finished | failed
    minio_key         TEXT,           -- path in MinIO
    endpoint_url      TEXT,           -- ws://contestant-{id}:8080/ws/orders
    correctness_score DOUBLE PRECISION DEFAULT 0.0,
    latency_score     DOUBLE PRECISION DEFAULT 0.0,
    throughput_score  DOUBLE PRECISION DEFAULT 0.0,
    raw_sustained_tps DOUBLE PRECISION DEFAULT 0.0,
    final_score       DOUBLE PRECISION DEFAULT 0.0,
    error_msg         TEXT,
    created_at        TIMESTAMPTZ DEFAULT now(),
    updated_at        TIMESTAMPTZ DEFAULT now()
);

-- Indexes: idx_submissions_status, idx_submissions_contestant

-- 2. expected_fills — pre-computed reference fills (truth table)
CREATE TABLE expected_fills (
    trace_id          TEXT NOT NULL,     -- e.g., "standard"
    order_index       INT NOT NULL,      -- positional index (FIFO order)
    order_id          TEXT NOT NULL,
    match_order_id    TEXT NOT NULL DEFAULT '',  -- resting order that was crossed
    expected_price    BIGINT NOT NULL,   -- fixed-point: price × 10000
    expected_qty      INT NOT NULL,
    order_sequence_no INT NOT NULL,
    PRIMARY KEY (trace_id, order_index)
);

-- 3. run_metrics — time-series aggregated per 1-second window
-- TimescaleDB hypertable (auto-partitioned by time)
CREATE TABLE run_metrics (
    time           TIMESTAMPTZ NOT NULL,  -- hypertable partition key
    run_id         UUID NOT NULL,
    phase          TEXT NOT NULL DEFAULT 'stress',  -- 'correctness' | 'stress'
    p50_latency_us DOUBLE PRECISION NOT NULL,
    p90_latency_us DOUBLE PRECISION NOT NULL,
    p99_latency_us DOUBLE PRECISION NOT NULL,
    throughput_tps INT NOT NULL,
    error_rate     DOUBLE PRECISION NOT NULL
);
SELECT create_hypertable('run_metrics', 'time');

Data Access Patterns

Service Table Operation Notes
API submissions INSERT On upload
Scheduler submissions SELECT (queued), UPDATE (status) Poll + event-driven
Validator expected_fills SELECT (read-only) Preloaded once
Validator submissions UPDATE (correctness_score) After comparison
Aggregator run_metrics INSERT (hypertable append) Per 1s window flush
Scoring submissions SELECT (correctness_score) Input
Scoring run_metrics SELECT AVG (phase=stress) Input
Scoring submissions UPSERT (all scores) Output
Leaderboard submissions SELECT (CTE + ORDER BY) Initial hydration

6. Redis Data Model

# Pub/Sub Channels
new-submissions         → Published by API, consumed by Scheduler
leaderboard-updates     → Published by Scheduler, Aggregator, Scoring; consumed by Leaderboard WS

# Streams
telemetry.raw           → Produced by Load Generator; consumed by Aggregator (Consumer Group)
                          Fields: run_id, order_id, match_order_id, phase, type, side, 
                                  status, price, quantity, latency_ns, timestamp_ns

# Lists (Trace Storage)
trace:standard:chunk_0  → 5000 order JSON strings (RPUSH/LRANGE)
trace:standard:chunk_1  → ...
trace:standard:chunk_N  → ...

# Hash (Trace Metadata)
trace:standard:metadata → { name, seed, count, chunks, chunk_size }

# Strings (Metric Cache)
metrics:{run_id}:{phase}:latest → JSON MetricUpdate

# Sorted Set (Live Leaderboard)
leaderboard             → ZADD score member=contestant_id
                          ZREVRANGE → ordered rankings

7. Complete End-to-End Execution Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                    END-TO-END EXECUTION TRACE                               │
│                                                                              │
│ 0. SETUP (one-time)                                                          │
│    trace-generator --seed 42 --count 50000 --name standard                  │
│    → 50k orders → Redis Lists + standard.json                                │
│    → Run against reference OrderBook → expected_fills → PostgreSQL           │
│                                                                              │
│ 1. SUBMISSION                                                                │
│    POST /api/submissions (multipart: contestant_id + binary file)            │
│    → ELF magic check                                                         │
│    → MinIO PutObject submissions/{run_id}/binary                             │
│    → INSERT submissions status='queued'                                      │
│    → PUBLISH new-submissions run_id                                          │
│    ← 201 {"run_id": "uuid"}                                                  │
│                                                                              │
│ 2. SCHEDULING                                                                │
│    Scheduler receives pub/sub OR polls every 5s                              │
│    → SELECT FROM submissions WHERE status='queued'                           │
│    → updateRunStatus: 'deploying'                                            │
│    → os.MkdirAll /workspace/tmp/submissions/{run_id}/                        │
│    → MinIO FGetObject → /workspace/tmp/submissions/{run_id}/binary           │
│                                                                              │
│ 3. SANDBOX                                                                   │
│    docker run -d --name contestant-{run_id}                                  │
│              --network infra_default                                          │
│              -p 8080 (random host port)                                      │
│              -v /workspace/.../binary:/app/binary                             │
│              ubuntu:22.04 "chmod +x /app/binary && /app/binary --port 8080"  │
│    → Health watchdog: GET localhost:{hostPort}/health → 200 OK               │
│    → updateRunStatus: 'running'                                              │
│                                                                              │
│ 4. PHASE A — CORRECTNESS                                                     │
│    load-generator --bots 1 --phase correctness --target ws://localhost:{port}│
│    → Load trace:standard:chunk_* from Redis                                  │
│    → 1 bot connects WebSocket                                                │
│    → Sequential: send order → wait for fill response → compute latency       │
│    → Batch publish to Redis Stream telemetry.raw {phase: "correctness"}      │
│    (Aggregator also consumes this, writes to run_metrics phase=correctness)  │
│                                                                              │
│ 5. PHASE B — STRESS                                                          │
│    load-generator --bots 10 --phase stress --target ws://localhost:{port}    │
│    → Same trace, 10 concurrent bot connections                               │
│    → Concurrent: all 10 bots draw from shared ordersChan                     │
│    → Batch publish to Redis Stream telemetry.raw {phase: "stress"}           │
│    (Aggregator writes to run_metrics phase=stress)                            │
│                                                                              │
│ 6. AGGREGATOR (running continuously in background)                           │
│    XReadGroup telemetry.raw aggregator-group (100 msgs, 2s block)            │
│    → Bucket by (run_id/phase, integer_second) → accumulate latencies         │
│    → flushOldWindows: p50/p90/p99/TPS/errorRate                              │
│    → INSERT run_metrics (TimescaleDB append)                                 │
│    → SET metrics:{run_id}:{phase}:latest                                     │
│    → PUBLISH leaderboard-updates (metric_update)                             │
│                                                                              │
│ 7. WAIT 5 SECONDS                                                            │
│    Scheduler: time.Sleep(5s) to let aggregator flush final windows           │
│                                                                              │
│ 8. VALIDATION                                                                │
│    updateRunStatus: 'validating'                                              │
│    exec validator-service --run-id {run_id} --trace standard                 │
│    → SELECT expected_fills WHERE trace_id='standard' ORDER BY order_index    │
│    → XRange telemetry.raw "-" "+" → filter phase=correctness + status=FILLED│
│    → Positional comparison (MatchOrderID + Price + Qty)                      │
│    → correctnessScore = matches/total × 100                                  │
│    → UPDATE submissions SET correctness_score=$1 status='validating_complete'│
│                                                                              │
│ 9. SCORING                                                                   │
│    exec scoring --run-id {run_id} (called by validator)                      │
│    → SELECT correctness_score FROM submissions WHERE id=$1                   │
│    → SELECT AVG(p99), AVG(tps) FROM run_metrics WHERE run_id=$1 phase=stress │
│    → Compute latencyScore, throughputScore                                   │
│    → finalScore = 0.4×L + 0.3×T + 0.3×C                                     │
│    → UPSERT submissions (all scores, status='finished')                      │
│    → ZADD leaderboard finalScore contestantID                                │
│    → PUBLISH leaderboard-updates (final_score event)                         │
│                                                                              │
│ 10. BROADCAST                                                                │
│     Leaderboard WS server receives PUBLISH on leaderboard-updates            │
│     → Detect type: "final_score"                                             │
│     → Wrap: {"type": "final_score", "data": {...}}                           │
│     → hub.Broadcast → all connected frontend clients                         │
│                                                                              │
│ 11. CLEANUP                                                                  │
│     Scheduler defer: docker rm -f contestant-{run_id}                        │
│                       os.RemoveAll /workspace/tmp/submissions/{run_id}/       │
└─────────────────────────────────────────────────────────────────────────────┘

8. Contestant API Contract

POST /order
  Request:  { "order_id", "type": "LIMIT|MARKET|CANCEL",
              "side": "BUY|SELL", "price", "quantity" }
  Response: { "order_id", "match_order_id",  ← REQUIRED for scoring
              "status": "FILLED|CANCELLED|REJECTED",
              "fill_price", "fill_qty", "timestamp_ns" }

GET  /orderbook    → Current L2 book state
GET  /health       → 200 {"status":"ok"} (required for watchdog)
WS   /ws/orders    → WebSocket alternative for streaming orders/responses

Resource Limits: 2 CPU, 4GB RAM, 100MB tmpfs, NO internet access

**Important Notes on Orders:**
- **Order IDs:** Yes, unique Order IDs are generated and maintained for all orders. The Trace Generator creates IDs like `ord_standard_000000` to `ord_standard_049999`. The contestant engine MUST return this exact ID in the execution response. The Validator relies on this ID to map actual fills to expected fills.
- **Order Sequencing & Dummy Engines:** The `perfect.go` dummy engine preserves exact order sequencing. Since the stress test uses 10 concurrent WebSocket bots, messages arrive over the network out of order. However, the dummy engine implements a `Sequencer` component that buffers incoming requests and processes them strictly by the `sequence_no` field, guaranteeing deterministic FIFO matching.

9. Error Handling Strategy

Layer Strategy
API — MinIO upload fails Attempt MinIO object deletion, return 500
API — DB insert fails Attempt MinIO cleanup (rollback), return 500
API — Redis publish fails Non-fatal — logged as Warn, scheduler polling catches it
Scheduler — Docker fails updateRunStatus 'failed' with error_msg
Scheduler — Health watchdog timeout 'failed': "process unresponsive"
Scheduler — Load gen fails 'failed': "Correctness/Stress load phase failed"
Scheduler — Validation fails 'failed': "Fills validation phase failed"
Load Generator — WS connect fails 5 retries with backoff; increments failedBots; exits 1 if >0
Load Generator — WS write fails Emits REJECTED telemetry; worker returns
Aggregator — TimescaleDB insert fails Logs Error, continues (no crash)
Scoring — No metrics in TimescaleDB Defaults: p99=1500µs, TPS=5000 (baseline)
Leaderboard — Client write fails Goroutine logs Warn, calls hub.Unregister

All services use Go slog.JSONHandler for structured JSON logs, making them compatible with log aggregators like Loki or CloudWatch.


10. Infrastructure

Local Development

# infra/docker-compose.yml
postgres:  timescale/timescaledb:latest-pg16  :5432
redis:     redis:7-alpine (--appendonly yes)   :6379
minio:     minio/minio:latest                  :9000 (API), :9001 (Console)

Services run natively on the host, connecting to containers via localhost:*.

Production (Kubernetes)

  • EKS Spot instances: c6i.2xlarge (8 vCPU, 16GB), auto-scaling 1–10 nodes.
  • Helm charts in infra/helm/ for each service.
  • ArgoCD for GitOps deployment.
  • NetworkPolicies: Contestant pods have zero egress — they cannot reach the internet or internal cluster services.
  • Init container pattern: In K8s production mode, an init container downloads the binary from MinIO before the main container starts — the main container never sees MinIO credentials.
  • Prometheus + Grafana: Metrics configurations in testing/ directory.

CI/CD

  • GitHub Actions for Docker image builds.
  • Terraform in infra/terraform/ for cloud resource provisioning.

11. Architecture Summary

┌────────────────────────────── DEPENDENCY MAP ─────────────────────────────┐
│                                                                             │
│   Frontend (React)                                                          │
│       │ WebSocket                                                           │
│       ▼                                                                     │
│   Leaderboard Server ──────────────────────────── Redis (pub/sub)          │
│                                                        ▲    ▲    ▲         │
│   API (Go) ──→ MinIO ──→ PostgreSQL ──→ Redis (pub/sub)│    │    │         │
│                                                     Scheduler│    │         │
│   Scheduler (Go) ◄── Redis (pub/sub) ◄──────────────────────┘    │         │
│       │                                                            │         │
│       ├──→ Load Generator ──→ Contestant Sandbox (Docker)          │         │
│       │          │                    ↕ WebSocket                  │         │
│       │          └──→ Redis Stream (telemetry.raw)                 │         │
│       │                    │                  │                    │         │
│       │               Aggregator         Validator                │         │
│       │                    │                  │                    │         │
│       │            TimescaleDB          PostgreSQL                │         │
│       │                    │                  │                    │         │
│       └──→ Validator ──→ Scoring ─────────────┴──→ Redis (leaderboard-updates)
│                                                                             │
│  Storage:  PostgreSQL (state) + TimescaleDB (metrics) + MinIO (binaries)   │
│  Messaging: Redis Streams (telemetry) + Redis Pub/Sub (events)             │
│  Compute:  Docker/K8s sandboxes per run                                    │
└─────────────────────────────────────────────────────────────────────────────┘

Key Design Decisions Worth Knowing

Decision Rationale
Pre-generate traces Bots never stall; expected fills computable offline; reproducible
Redis Streams over Kafka Fewer ops, Consumer Groups provide at-least-once semantics
TimescaleDB over plain PostgreSQL Hypertable prevents index bloat for time-series append workload
Dual trigger (pub/sub + poll) Pub/sub for fast path; polling catches missed events on restart
Positional fill comparison Enforces Price-Time Priority; can't reorder fills to game score
ELF magic check at upload Early rejection of non-Linux binaries; avoids wasted sandbox resources
Fixed-point int64 prices Eliminates float comparison drift across platforms
Two-phase load (1 bot then 10) Phase A isolates correctness with deterministic sequential delivery; Phase B measures real concurrency perf

12. How to Contribute — Practical Onboarding

Starting the platform locally:

# 1. Start infrastructure
cd infra && docker-compose up -d

# 2. Generate trace and load into Redis
cd services/trace-generator && go run main.go --redis localhost:6379

# 3. Pre-compute expected fills
cd services/validator && go run ./cmd/precompute --trace standard --redis localhost:6379 --db <DSN>

# 4. Start all services (separate terminals)
cd services/api       && go run main.go --redis localhost:6379 --db <DSN>
cd services/aggregator && go run main.go --redis localhost:6379 --db <DSN>
cd services/leaderboard && go run main.go --redis localhost:6379 --db <DSN>
cd services/scheduler   && go run main.go --redis localhost:6379 --db <DSN> --workspace /path/to/QuantCraft

# 5. Start frontend
cd services/frontend && npm install && npm run dev

# 6. Test with the dummy echo server
cd dummy_engines && # run the echo server binary
# Then POST a binary to localhost:8081/api/submissions

Adding a new service: Every service must have:

  1. A /health endpoint returning 200.
  2. Structured JSON logging via slog.
  3. context.Context propagation throughout.
  4. Errors wrapped with fmt.Errorf("...: %w", err).
  5. A README.md.

13. System Inventories

A. Port & Service Inventory

Port Service Role
5432 PostgreSQL / TimescaleDB Primary persistent database and time-series metrics store.
6379 Redis In-memory message broker (pub/sub, streams), trace cache, and leaderboard sorted set.
9000 MinIO API Object storage for uploaded contestant binaries.
9001 MinIO Console Web UI for MinIO bucket management.
8081 Submission API Go Gateway receiving binary uploads (services/api).
8082 Leaderboard Server Go WebSocket server pushing live rankings (services/leaderboard).
5173 Frontend Vite/React local dev server (services/frontend).
8080 Contestant Sandbox Port exposed by the sandboxed binary inside its isolated Docker container (mapped to a random dynamic port on the host).

B. Redis Data Structures Inventory

Structure Name/Key Pattern Purpose
Pub/Sub new-submissions Published by API on upload. Wakes up the Scheduler to start sandbox.
Pub/Sub leaderboard-updates Status changes (Scheduler), metrics updates (Aggregator), final scores (Scoring). Consumed by WS Server to push to browsers.
Stream telemetry.raw High-throughput raw event log (fills, latency) produced by Load Generator. Consumed by Aggregator and Validator via Consumer Groups.
Lists trace:{name}:chunk_{i} Deterministic JSON order lists generated by Trace Generator. Read by bots.
Hash trace:{name}:metadata Metadata (seed, total chunks, count) created by Trace Generator.
String metrics:{run_id}:{phase}:latest Cached rolling metric updates set by the Aggregator.
Sorted Set leaderboard Official rankings maintained by the Scoring Engine. Mapped score -> contestant_id.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors