Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/copilot-instructions.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,9 +52,9 @@ C++20 (`-std=c++20`): Follow standard conventions
- Do not include task numbers (e.g. T001, T010) in code comments. Comments should describe *what* or *why*, not reference planning artifacts.

## Recent Changes
- 021-architecture-remediation: Added C++20 (`-std=c++20`) + Boost (Asio, Serialization), OpenSSL (EVP SHA-256), nlohmann/json (vendored `src/json.hpp`)
- 020-address-test-quality: Added C++20 (`-std=c++20`) + Boost (Asio, Serialization), OpenSSL (EVP SHA-256), nlohmann/json (vendored `src/json.hpp`), Catch2 (test framework)
- 019-perf-dedup-cleanup: Added C++20 (`-std=c++20`) + Boost (Asio, Serialization), OpenSSL (EVP SHA-256), nlohmann/json (vendored `src/json.hpp`), Catch2 (test only)
- 018-audit-bug-security-fixes: Added C++20 (`-std=c++20`) + Boost (Asio, Serialization), OpenSSL (EVP SHA-256), nlohmann/json (vendored `src/json.hpp`)


<!-- MANUAL ADDITIONS START -->
Expand Down
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,7 @@ tests/p2p_sync_integration_tests
tests/chain_persistence_module_tests
tests/difficulty_engine_tests
tests/merkle_proof_tests
tests/chain_service_tests

# Test artifacts
*.trs
Expand Down
2 changes: 1 addition & 1 deletion .specify/feature.json
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
{
"feature_directory": "specs/020-address-test-quality"
"feature_directory": "specs/021-architecture-remediation"
}
69 changes: 25 additions & 44 deletions docs/AUDIT.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,15 +36,16 @@ All 4 bugs, both security issues, and 1 performance issue were resolved in
**018-audit-bug-security-fixes**. Three more performance issues, both
duplication clusters, and 1 test-quality issue were resolved in
**019-perf-dedup-cleanup**. The remaining 5 test-quality issues were resolved
in **020-address-test-quality**. Remaining items are tracked below.
in **020-address-test-quality**. All 3 architecture concerns and the remaining
performance issue were resolved in **021-architecture-remediation**.

| Category | Found | Resolved | Remaining | Highest Open Severity |
|-----------------------|------:|---------:|----------:|----------------------:|
| Bugs | 4 | 4 | 0 | — |
| Security issues | 2 | 2 | 0 | — |
| Performance issues | 5 | 4 | 1 | Low |
| Performance issues | 5 | 5 | 0 | |
| Code duplication | 2 | 2 | 0 | — |
| Architecture concerns | 3 | 0 | 3 | Medium |
| Architecture concerns | 3 | 3 | 0 | |
| Test quality issues | 6 | 6 | 0 | — |

---
Expand Down Expand Up @@ -118,13 +119,11 @@ Hot-path `logMessage()` calls in `PeerManager.cpp`, `BlockPropagation.cpp`,
`LOG_INFO`/`LOG_WARN`/`LOG_ERROR`/`LOG_DEBUG` macros that check `getLogLevel()`
before evaluating the message expression.

### 4.5 `replaceChain()` loads entire candidate into memory — LOW
### 4.5 ~~`replaceChain()` loads entire candidate into memory — LOW~~ ✅ RESOLVED (021)

[Blockchain.cpp](../src/Blockchain.cpp#L486-L536)

`replaceChain()` accepts `const std::vector<Block> &candidateBlocks` — the
full chain. For very long chains this means the entire history must fit in RAM
simultaneously. A streaming/chunked replacement would bound memory usage.
`replaceChainStreaming()` accepts a batch-fetcher callback and validates the
candidate chain in 100-block batches, bounding memory usage regardless of chain
length. The original `replaceChain()` remains for backward compatibility.

---

Expand All @@ -147,45 +146,27 @@ shared `TestHelpers::` namespace.

## 6. Architecture Concerns

### 6.1 `IBlockchain` interface is wide

[IBlockchain.hpp](../src/IBlockchain.hpp) exposes 28 methods including
persistence (`saveChunk`, `saveKeys`), mining (`publish`), querying
(`getStreamEntries`), and sync (`replaceChain`). Consumers that only need read
access (e.g. `RpcServer` for query endpoints) are coupled to the full
interface.

**Suggestion:** Split into `IChainReader` (query methods) and `IChainWriter`
(mutation methods). `RpcServer` depends only on `IChainReader` plus a small
`IChainWriter` for `publish` and `createStream`.

### 6.2 No separation between domain and network layers

`PeerClient`, `PeerServer`, and `BlockPropagation` directly call
`IBlockchain` methods. If the consensus rules or block format change, the
network layer must change too.
### 6.1 ~~`IBlockchain` interface is wide~~ ✅ RESOLVED (021)

**Suggestion:** Introduce a thin service layer (e.g. `ChainService`) that
mediates between the network and domain layers. The network layer would submit
blocks to the service, which validates and delegates to `Blockchain`.
`IBlockchain` now inherits from `IChainReader` (read-only query methods) and
`IChainWriter` (mutation methods). `RpcServer` holds narrow `IChainReader &`
and `IChainWriter &` references, decoupling it from the full interface.

Additionally, the sync protocol currently leaks storage details: `SyncResponse`
includes a `chunk_index` field, coupling the wire format to the internal chunk
storage scheme. The `ChainService` should own the batching strategy so that
the network layer exchanges blocks only, with no awareness of chunks.
### 6.2 ~~No separation between domain and network layers~~ ✅ RESOLVED (021)

### 6.3 Error handling is inconsistent
`ChainService` now mediates between network and domain layers. `PeerClient`
and `BlockPropagation` submit blocks through `ChainService::submitBlock()` and
`submitSyncBatch()` instead of calling `IBlockchain` directly. The wire-format
`SyncResponse::chunk_index` has been renamed to `start_index`, removing the
storage-detail leak.

| Pattern | Used by |
|------------------------|--------------------------------------------|
| Throw `std::runtime_error` | `publish()`, `createStream()`, `getStreamEntry()` |
| Return `bool` | `add_peer()`, `remove_peer()` |
| Log and continue | `loadKeys()`, `saveAllChunks()` |
| Silent no-op | `freeChunk()` on already-freed chunks |
### 6.3 ~~Error handling is inconsistent~~ ✅ RESOLVED (021)

This inconsistency makes it hard to reason about failure modes. For example,
`saveAllChunks()` logs chunk-save failures but continues saving indexes —
leaving a partially-saved state with no caller notification.
A domain-specific exception hierarchy (`ChainError` → `ValidationError`,
`PersistenceError`, `PeerError`) replaces ad-hoc `std::runtime_error` throws.
`saveAllChunks()` now throws `PersistenceError` on any save failure.
`add_peer()` returns void (always succeeds with eviction), and `remove_peer()`
throws `PeerError` when the peer is not found.

---

Expand Down Expand Up @@ -274,6 +255,6 @@ Ordered by impact and effort:
| 7 | Cache chunk during `recoverChain()` validation (§2.2, §4.3) | 3× faster startup | Low | ✅ Done (018) |
| 8 | Replace O(n) peer lookups with `unordered_map` (§4.1) | O(1) peer operations | Medium | ✅ Done (019) |
| 9 | Extract RPC dispatch table from `do_read()` (§4.2) | Maintainability, testability | Medium | ✅ Done (019) |
| 10 | Narrow `IBlockchain` into reader/writer interfaces (§6.1) | Reduces coupling | Medium | Open |
| 10 | Narrow `IBlockchain` into reader/writer interfaces (§6.1) | Reduces coupling | Medium | ✅ Done (021) |
| 11 | Remove local test helpers in favor of `TestHelpers.hpp` (§7.6) | Consistency | Low | ✅ Done (019) |
| 12 | Make integration tests deterministic (§7.4) | Reduces CI flakiness | Medium | ✅ Done (020) |
1 change: 1 addition & 0 deletions docs/ROADMAP.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ Last updated: 2026-04-13
| 018 | Audit Bug & Security Fixes | Fixed 4 bugs (sync response block append, RPC getBlockByIndex bounds check, getBlockByIndex resize chunk IDs, recovery triple-load) and 2 security issues (port range validation in parsePeerKey, seed node input validation in main); single-pass recovery optimization |
| 019 | Performance & Deduplication Cleanup | O(1) peer lookups via `unordered_map`, RPC dispatch table replacing 21-branch `if`/`else`, shared `serialize_packet<T>()` template, consolidated `TestHelpers`, lazy `LOG_*` macros |
| 020 | Address Test Quality | Replaced all trivial/vacuous assertions with behavioral checks, rewrote RPC expansion tests against real handlers, made integration tests deterministic, added 6 coverage-gap tests, split `IBlockchain` into `IChainReader`/`IChainWriter` |
| 021 | Architecture Remediation | Segregated `IBlockchain` into `IChainReader`/`IChainWriter` with narrow references in `RpcServer`, introduced `ChainService` mediator between network and domain layers, domain-specific exception hierarchy (`ChainError` → `ValidationError`/`PersistenceError`/`PeerError`), streaming `replaceChainStreaming()` with 100-block batch validation, wire-format cleanup (`chunk_index` → `start_index`) |

## Suggested Specs

Expand Down
36 changes: 36 additions & 0 deletions specs/021-architecture-remediation/checklists/requirements.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
# Specification Quality Checklist: Architecture Remediation

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-04-13
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- All items pass validation. Spec is ready for `/speckit.clarify` or `/speckit.plan`.
- The spec deliberately uses interface/component names from the existing codebase (e.g. `IChainReader`, `RpcServer`) as domain vocabulary rather than implementation prescriptions.
- Assumptions section documents key design decisions made based on codebase context (e.g. exception-based error handling, chunk-sized batches).
52 changes: 52 additions & 0 deletions specs/021-architecture-remediation/contracts/ChainError.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
# Contract: ChainError Exception Hierarchy

**Source**: `src/ChainError.hpp`
**Consumers**: All components that catch blockchain exceptions

## Hierarchy

```
std::runtime_error
└── ChainError — base for all blockchain domain exceptions
├── ValidationError — invalid blocks, streams, inputs, consensus violations
├── PersistenceError — chunk/key/stream save/load failures, file I/O errors
└── PeerError — peer management operation failures
```

## Construction

All exceptions carry a descriptive `what()` message (inherited from `std::runtime_error`).

```
ChainError("message")
ValidationError("message")
PersistenceError("message")
PeerError("message")
```

## Catch Patterns

Callers may catch at the granularity they need:

- `catch (const ValidationError&)` — handle just validation failures
- `catch (const PersistenceError&)` — handle just I/O failures
- `catch (const ChainError&)` — handle all blockchain domain errors
- `catch (const std::runtime_error&)` — handle everything (existing behavior preserved)

## Migration Rules

| Current Pattern | New Pattern |
|----------------|------------|
| `throw std::runtime_error("Publish: shutting down")` | `throw ValidationError("Publish: shutting down")` |
| `throw std::runtime_error("Stream already exists")` | `throw ValidationError("Stream already exists")` |
| `throw std::runtime_error("Entry not found")` | `throw ValidationError("Entry not found")` |
| `logMessage("ERROR", ...) + continue` in `saveAllChunks` | `throw PersistenceError("Failed to save chunk N")` |
| `return false` in `PeerManager::remove_peer` | `throw PeerError("Peer not found: host:port")` |
| `return bool` in `PeerManager::add_peer` (capacity full) | `throw PeerError("Peer capacity exceeded")` |
| Silent no-op in `freeChunk` on freed chunk | Unchanged — not an error condition requiring notification |

## Preserved Behaviors

- `std::invalid_argument` in `parsePeerKey()`, `Block` constructor: Kept as-is. These are input validation at system boundaries, not domain errors.
- `std::out_of_range` in `MerkleProofService`: Kept as-is. Standard library convention for index errors.
- `nlohmann::json` exceptions: External library, unchanged.
46 changes: 46 additions & 0 deletions specs/021-architecture-remediation/contracts/ChainService.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
# Contract: ChainService (Network-Domain Mediator)

**Source**: `src/ChainService.hpp` / `src/ChainService.cpp`
**Consumers**: `PeerClient`, `PeerServer`, `BlockPropagation`

## Methods

```
// Submit a single received block (validation + append + persist)
void submitBlock(const Block &block)

// Submit a batch of blocks from sync (overlap check + append new + persist)
void submitSyncBatch(const std::vector<Block> &blocks, size_t local_height)

// Read-through convenience (delegates to IChainReader)
size_t getChainHeight() const
Block getBlockAtTip() const
const ConsensusConfig& getConsensusConfig() const
```

## Error Contract

- `submitBlock()`: Throws `ValidationError` if block is invalid (fails `isValidNewBlock()`). Throws `PersistenceError` if chunk save or key save fails.
- `submitSyncBatch()`: Throws `ValidationError` if overlap hash mismatch detected (fork). Throws `PersistenceError` on save failure.
- Read-through methods: Same contract as `IChainReader`.

## Behavioral Contract

1. **submitBlock(block)**:
- Calls `isValidNewBlock(block, tipBlock, config)` statically
- Calls `bc.appendBlock(block)`
- Computes `chunk_idx = block.index / bc.chunkSize`
- Calls `bc.saveChunk(chunk_idx)` then `bc.saveKeys()`
- If any step throws, exception propagates to caller (network layer handles it)

2. **submitSyncBatch(blocks, local_height)**:
- For blocks with `index < local_height`: verify hash matches local via `bc.getBlockByIndex()`; throw `ValidationError` on mismatch
- For blocks with `index >= local_height`: validate and append
- Persist affected chunks and keys after all blocks appended
- Caller (PeerClient) does NOT need to know about chunk boundaries

## Guarantees

- No block is persisted without prior validation.
- Persistence always follows successful append (never skipped).
- Network components never call `appendBlock()`, `saveChunk()`, or `saveKeys()` directly.
39 changes: 39 additions & 0 deletions specs/021-architecture-remediation/contracts/IChainReader.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# Contract: IChainReader (Widened Read Interface)

**Source**: `src/IChainReader.hpp`
**Consumers**: `RpcServer`, `PeerServer`, `ChainService`, any future read-only components

## Methods

All methods are `const` (or logically read-only). No method modifies chain state.

```
// Existing methods (unchanged)
bool isShuttingDown() const
std::set<std::string> listStreams() const
std::vector<std::pair<size_t, StreamEntry>> getStreamEntries(stream, key?) const
std::pair<size_t, StreamEntry> getStreamEntry(stream, key) const
size_t getChainBlockCount() const
size_t getChainLength() const
size_t getChunkCount() const
uint32_t getCurrentDifficulty() const
const ConsensusConfig& getConfig() const

// Moved from IBlockchain (pure query, no side effects)
Block getBlockByIndex(size_t index) const
std::vector<Block> getBlocksByKeys(const std::vector<std::string> &keys) const
nlohmann::json getInclusionProof(size_t blockIndex, size_t entryIndex) const
nlohmann::json verifyInclusionProof(size_t blockIndex, const std::string &leafHash, const nlohmann::json &proofArray) const
```

## Error Contract

- `getBlockByIndex()`: Throws `ValidationError` if index >= chain length.
- `getStreamEntry()`: Throws `ValidationError` if stream or key not found.
- `getInclusionProof()`: Throws `std::out_of_range` if entry index out of range (existing behavior preserved).
- All other methods: Return empty collections or zero for absent data; never throw.

## Guarantees

- Thread safety: Same as current — callers must not interleave reads with writes without external synchronization.
- The interface does not expose `chunkSize`, `loadChunk()`, `freeChunk()`, or any persistence/lifecycle methods.
50 changes: 50 additions & 0 deletions specs/021-architecture-remediation/contracts/SyncMessages.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
# Contract: SyncMessages (Wire Format)

**Source**: `src/network/SyncMessages.hpp`
**Consumers**: `PeerClient`, `PeerServer` (Boost.Serialization binary archives over TLS)

## SyncQuery (Unchanged)

```
{
local_chain_height: uint64_t // Requester's current chain height
}
```

## SyncResponse (Modified)

```
{
total_chain_height: uint64_t // Responder's total chain height
start_index: uint64_t // First block index in this batch (NEW, replaces chunk_index)
blocks: vector<Block> // Blocks in this batch
}
```

### Changes from Previous Format

| Field | Old | New |
|-------|-----|-----|
| `chunk_index` | Present — storage-specific chunk number | **REMOVED** |
| `start_index` | Absent | **ADDED** — first block index in batch, storage-agnostic |

### Batching Strategy

- `PeerServer` sends blocks in batches of up to 100 (matching the internal chunk size for efficiency, but this is an implementation detail not exposed in the wire format).
- Batch boundaries are computed by block index range, not by chunk identity.
- `start_index` allows the receiver to validate that batches arrive in order.

### Backward Compatibility

- None required. All nodes update simultaneously per clarification decision.
- The Boost.Serialization archive format changes (field removed + field added), making the new format binary-incompatible with the old one.

## Serialization

Boost.Serialization binary archive format via `serialize()` template method. Fields serialized in declaration order:

```
ar & total_chain_height;
ar & start_index;
ar & blocks;
```
Loading
Loading