Architecture

AyhamAsfoor edited this page May 6, 2026 · 5 revisions

🏗️ Architecture: Pipeline, Cost Maps, and Container Format

This page traces a payload through the complete StegX v2.0 pipeline — from plaintext input to embedded LSB output — with mathematical definitions for each stage. All structural details are taken directly from the source code.


1. High-Level Pipeline

Secret File
    │
    ▼
┌──────────────────────┐
│ Multiplexed           │  Select optimal codec: zstd, brotli, lzma, zlib, bz2
│ Compression           │  M_opt = argmin_c |c(M)|
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ Multi-Factor KDF      │  Argon2id(password ∥ keyfile ∥ yubikey, salt)
│ + HKDF Sub-Keys       │  → K_aes, K_chacha, K_seed, K_sentinel
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ Container Assembly    │  Header(magic, version, flags, KDF params, salt,
│ (v2 or v3 format)     │         nonces, ct_length) + Ciphertext + Tag
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ AEAD Encryption       │  AES-256-GCM(K_aes, N1, container, AAD)
│ (+ optional dual)     │  [+ ChaCha20-Poly1305(K_chacha, N2, ...)]
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ Sentinel Prepend      │  HKDF-derived 4-byte magic marker
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ Cost Map Generation   │  Laplacian or HILL edge-detection filter
│ + Position Masking    │  Exclude flat/smooth pixel regions
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ PRNG Pixel Shuffle    │  Fisher-Yates permutation seeded by K_seed
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ LSB Embedding         │  LSB Matching, LSB Replacement, or
│                       │  Matrix Hamming (--extreme)
└──────────┬───────────┘
           │
           ▼
       Stego Image (PNG)

2. Multiplexed Compression

2.1 Codec Selection

StegX implements six compression backends in compression.py:

Codec         Algorithm                       Level        Notes
────────────  ──────────────────────────────  ───────────  ──────────────────────────────────────────────
zstd          Zstandard                       22 (max)     Dictionary matching + Finite State Entropy
zstd_dict_v1  Zstandard + trained dictionary  22           Custom dictionary in data/stegx_dict_v1.zstd
brotli        Brotli                          11 (max)     2nd-order context modeling + static dictionary
lzma          LZMA2                           9 + EXTREME  Highest ratio, slowest speed
bz2           BZip2                           9            Burrows-Wheeler transform
zlib          DEFLATE                         9            Fastest fallback, FIPS-safe

2.2 Selection Algorithm

In MODE_BEST, all available codecs are evaluated and the smallest output is selected:

$$M_{\text{opt}} = \arg\min_{c \in \mathcal{C}} |c(M)|$$

If no compressed output is smaller than the raw input, compression is skipped entirely (ALG_NONE).

In MODE_FAST, only zstd is attempted (or zlib in FIPS mode).
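
The MODE_BEST rule above can be sketched with the standard library alone. This is a minimal illustration, not the code in compression.py: the CODECS registry and the compress_best name are hypothetical, and zstd and brotli are left out because they need third-party packages.

```python
import bz2
import lzma
import zlib

# Hypothetical registry for illustration: stdlib codecs only.
CODECS = {
    "zlib": lambda data: zlib.compress(data, 9),
    "bz2": lambda data: bz2.compress(data, 9),
    "lzma": lambda data: lzma.compress(data, preset=9 | lzma.PRESET_EXTREME),
}


def compress_best(payload: bytes) -> tuple[str, bytes]:
    """Return (codec_name, output) for the smallest output.

    Falls back to ("none", payload) when no codec beats the raw input,
    which mirrors the ALG_NONE case described above.
    """
    best_name, best_out = "none", payload
    for name, compress in CODECS.items():
        candidate = compress(payload)
        if len(candidate) < len(best_out):
            best_name, best_out = name, candidate
    return best_name, best_out
```

Very short inputs usually come back as "none", because every codec adds framing overhead that outweighs any savings.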

2.3 Decompression Bomb Protection

To prevent malicious payloads from triggering excessive memory allocation, decompressed output is hard-capped at:

MAX_DECOMPRESS_SIZE = 256 MiB

Any decompression exceeding this limit raises a DecompressionBombError.
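
One way to enforce such a cap is to bound the output of a streaming decompressor rather than checking the size after the fact. The sketch below does this for zlib only; the safe_zlib_decompress helper is hypothetical, and the exception class is a stand-in for the one named above.

```python
import zlib

MAX_DECOMPRESS_SIZE = 256 * 1024 * 1024  # 256 MiB, as stated above


class DecompressionBombError(Exception):
    """Stand-in for the exception named in the text."""


def safe_zlib_decompress(blob: bytes, limit: int = MAX_DECOMPRESS_SIZE) -> bytes:
    """Decompress blob, refusing to produce more than limit bytes."""
    decompressor = zlib.decompressobj()
    # Ask for at most limit + 1 bytes: receiving that extra byte proves the
    # stream expands past the cap without ever allocating the full output.
    out = decompressor.decompress(blob, limit + 1)
    if len(out) > limit:
        raise DecompressionBombError(f"output exceeds {limit} bytes")
    return out
```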


3. Adaptive Cost Maps

StegX supports two cost-map algorithms that determine which pixels are safe for embedding.

3.1 Laplacian Edge-Detection (Default)

The cover image is first converted to grayscale with LSBs cleared (to prevent the cost map from being influenced by previously embedded data). A Laplacian edge-detection filter is then applied using Pillow's ImageFilter.FIND_EDGES.

From embedding.py → _laplacian_edge_map():

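from PIL import ImageFilter

def _laplacian_edge_map(image):
    # _lsb_cleared_gray() is the grayscale/LSB-clearing helper in embedding.py.
    edges = _lsb_cleared_gray(image).filter(ImageFilter.FIND_EDGES)
    return edges.point(lambda v: v & 0xFC)  # clear the two low bits of each edge value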

The output is a grayscale image where bright pixels indicate high-frequency regions (edges, texture, noise) and dark pixels indicate flat/smooth regions.

Pixel Selection: By default, min_cost_percentile = 0.40, meaning the bottom 40% of pixels (by edge intensity) are excluded. Only the top 60% most textured pixels are eligible for embedding.
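
The percentile cut can be illustrated in pure Python. The sketch assumes the edge map is available as a dictionary from (x, y) to intensity and that ties are broken by coordinate; the eligible_pixels name is hypothetical and the real thresholding in embedding.py may differ.

```python
def eligible_pixels(edge_values, min_cost_percentile=0.40):
    """Return the (x, y) positions that survive the percentile cut.

    edge_values maps (x, y) -> edge intensity. The lowest
    min_cost_percentile fraction of positions is dropped.
    """
    # Sort by intensity, then by coordinate so the result is deterministic.
    ranked = sorted(edge_values, key=lambda pos: (edge_values[pos], pos))
    cutoff = int(len(ranked) * min_cost_percentile)
    return set(ranked[cutoff:])
```

With ten pixels whose intensities run from 0 to 9, the four weakest are excluded and the six strongest remain eligible.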

3.2 HILL Cost Map (Advanced)

The HILL (High-pass, Low-pass, Low-pass) algorithm is a more sophisticated distortion function used in modern academic steganography research. From embedding.py → _hill_cost_map():

  1. The image is converted to grayscale with LSBs cleared.
  2. A residual filter (KB kernel) is applied: $$K_{\text{KB}} = \begin{bmatrix} -1 & 2 & -1 \\ 2 & -4 & 2 \\ -1 & 2 & -1 \end{bmatrix}$$
  3. The absolute residuals are computed and smoothed with a BoxBlur(1).
  4. The result is inverted: high-residual areas (edges) get LOW cost, flat areas get HIGH cost.
  5. A final BoxBlur(7) smoothing pass is applied.

HILL assigns a continuous cost value per pixel rather than a binary include/exclude decision, allowing finer-grained embedding optimization.
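
The five steps can be followed on a small grid in pure Python. This is a simplified sketch, not _hill_cost_map(): it works on nested lists instead of Pillow images, replicates border pixels, and assumes the inversion is 255 minus the clipped residual. All function names here are hypothetical.

```python
KB = [[-1, 2, -1], [2, -4, 2], [-1, 2, -1]]  # kernel from step 2


def _at(img, y, x):
    """Read img[y][x], replicating the nearest border pixel when out of range."""
    h, w = len(img), len(img[0])
    return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]


def kb_residual(img):
    """Absolute response of the KB high-pass kernel at every pixel."""
    h, w = len(img), len(img[0])
    return [[abs(sum(KB[j][i] * _at(img, y + j - 1, x + i - 1)
                     for j in range(3) for i in range(3)))
             for x in range(w)] for y in range(h)]


def box_blur(img, radius):
    """Mean over a (2 * radius + 1) square window."""
    h, w = len(img), len(img[0])
    span = range(-radius, radius + 1)
    area = (2 * radius + 1) ** 2
    return [[sum(_at(img, y + dy, x + dx) for dy in span for dx in span) / area
             for x in range(w)] for y in range(h)]


def hill_like_cost(img):
    """Residual -> blur(1) -> invert -> blur(7); higher value means costlier."""
    smoothed = box_blur(kb_residual(img), 1)
    inverted = [[255 - min(v, 255) for v in row] for row in smoothed]  # assumed inversion
    return box_blur(inverted, 7)
```

Because the kernel coefficients sum to zero, a perfectly flat image produces zero residual everywhere and therefore the maximum cost, while a checkerboard produces large residuals and a lower cost.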

3.3 Position Masking

build_adaptive_position_mask() returns a set of (x, y) coordinates that pass the cost threshold. During embedding, the PRNG-shuffled position sequence is filtered through this mask via filter_positions_by_mask(), ensuring that only textured pixels are used.
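
The filtering step amounts to an order-preserving membership test. The sketch below assumes positions are (x, y, channel) tuples and the mask is a set of (x, y) pairs; keep_masked_positions is a hypothetical name, not the signature of filter_positions_by_mask().

```python
def keep_masked_positions(positions, mask):
    """Keep positions whose (x, y) is in the mask, preserving shuffle order."""
    return [pos for pos in positions if (pos[0], pos[1]) in mask]
```

Preserving the order matters: extraction can only find the same bits if it walks the surviving positions in the same sequence as embedding did.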


4. Binary Container Format

4.1 Version 2 Header (Legacy)

The v2 header is a fixed 56-byte structure:

Offset  Size  Field
──────  ────  ─────────────────────────────
0       1     Magic (0x58 = 'X')
1       1     Version (0x02)
2       1     KDF ID (0x01=PBKDF2, 0x02=Argon2id)
3       1     Flags (bitmask)
4       8     KDF Parameters (packed)
12      16    Salt
28      12    AES-GCM Nonce
40      12    ChaCha20 Nonce (zeros if not dual-cipher)
52      4     Inner Ciphertext Length (uint32, big-endian)
──────  ────  ─────────────────────────────
Total:  56 bytes
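
The layout maps directly onto a big-endian struct format string, which also confirms the 56-byte total. This is a sketch of the table only; the pack_v2_header helper and its argument names are hypothetical, not the API of header.py.

```python
import struct

# magic, version, kdf_id, flags, kdf_params, salt, aes_nonce, chacha_nonce, ct_length
V2_HEADER = struct.Struct(">BBBB8s16s12s12sI")


def pack_v2_header(kdf_id, flags, kdf_params, salt, aes_nonce, ct_length,
                   chacha_nonce=bytes(12)):
    """Serialize the fields from the table into 56 bytes."""
    sized = (("kdf_params", kdf_params, 8), ("salt", salt, 16),
             ("aes_nonce", aes_nonce, 12), ("chacha_nonce", chacha_nonce, 12))
    for label, value, size in sized:
        # struct's "s" format silently pads or truncates, so check explicitly.
        if len(value) != size:
            raise ValueError(f"{label} must be {size} bytes")
    return V2_HEADER.pack(0x58, 0x02, kdf_id, flags, kdf_params, salt,
                          aes_nonce, chacha_nonce, ct_length)
```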

4.2 Version 3 Header (Current)

The v3 header extends v2 with additional fields for header salt, YubiKey nonce, and KMS key wrapping:

Offset  Size  Field
──────  ────  ─────────────────────────────
0       1     Magic (0x58)
1       1     Version (0x03)
2       1     KDF ID
3       1     Flags
4       8     KDF Parameters
12      16    Salt
28      12    AES-GCM Nonce
40      12    ChaCha20 Nonce
52      4     Inner Ciphertext Length
56      16    Header Salt (pre-extraction for factor mixing)
72      16    YubiKey Challenge Nonce
88      2     KMS Wrap Length (uint16, big-endian)
90      var   KMS Wrapped Key Material (0–1024 bytes)
──────  ────  ─────────────────────────────
Base:   90 bytes (+ KMS wrap)
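
Parsing the v3 layout means reading the 90-byte fixed part and then the variable KMS tail. The sketch below follows the table; parse_v3_header and the dictionary keys are hypothetical and do not reflect the actual Header.unpack() interface.

```python
import struct

# Fixed part: the v2 fields, then header_salt, yubikey_nonce, kms_wrap_len.
V3_BASE = struct.Struct(">BBBB8s16s12s12sI16s16sH")
KMS_WRAP_MAX = 1024


def parse_v3_header(blob: bytes) -> dict:
    """Split a v3 header into its fields and report its total length."""
    if len(blob) < V3_BASE.size:
        raise ValueError("truncated header")
    (magic, version, kdf_id, flags, kdf_params, salt, aes_nonce, chacha_nonce,
     ct_length, header_salt, yubikey_nonce, wrap_len) = V3_BASE.unpack_from(blob)
    if magic != 0x58 or version != 0x03:
        raise ValueError("not a v3 header")
    if wrap_len > KMS_WRAP_MAX:
        raise ValueError("KMS wrap length out of range")
    end = V3_BASE.size + wrap_len
    if len(blob) < end:
        raise ValueError("truncated KMS wrap")
    return {
        "kdf_id": kdf_id, "flags": flags, "kdf_params": kdf_params,
        "salt": salt, "aes_nonce": aes_nonce, "chacha_nonce": chacha_nonce,
        "ct_length": ct_length, "header_salt": header_salt,
        "yubikey_nonce": yubikey_nonce,
        "kms_wrap": blob[V3_BASE.size:end], "header_len": end,
    }
```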

4.3 Flags Bitmask

From header.py:

Bit  Constant          Hex   Meaning
───  ────────────────  ────  ─────────────────────────────────────
0    FLAG_COMPRESSED   0x01  Payload is compressed
1    FLAG_DUAL_CIPHER  0x02  ChaCha20-Poly1305 outer layer active
2    FLAG_KEYFILE      0x04  Keyfile factor was used
3    FLAG_ADAPTIVE     0x08  Adaptive cost-map embedding
4    FLAG_MATRIX       0x10  Matrix Hamming embedding
5    FLAG_YUBIKEY      0x20  YubiKey factor was used
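
A flags byte can be decoded by testing each bit in turn. The sketch below restates the table as an IntFlag; the HeaderFlags class and describe_flags helper are hypothetical, since header.py defines plain FLAG_* constants.

```python
from enum import IntFlag


class HeaderFlags(IntFlag):
    """Bit values copied from the table above."""
    COMPRESSED = 0x01
    DUAL_CIPHER = 0x02
    KEYFILE = 0x04
    ADAPTIVE = 0x08
    MATRIX = 0x10
    YUBIKEY = 0x20


def describe_flags(flags_byte: int) -> list[str]:
    """List the names of the bits set in flags_byte, lowest bit first."""
    return [flag.name for flag in HeaderFlags if flags_byte & flag]
```

For example, 0x23 has bits 0, 1, and 5 set, so it decodes to COMPRESSED, DUAL_CIPHER, and YUBIKEY.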

4.4 KDF Parameter Packing

For Argon2id (kdf_id = 0x02):

Byte 0:     time_cost (uint8)
Bytes 1-4:  memory_cost_kib (uint32, big-endian)
Byte 5:     parallelism (uint8)
Bytes 6-7:  reserved (zero)

For PBKDF2 (kdf_id = 0x01):

Bytes 0-3:  iterations (uint32, big-endian)
Bytes 4-7:  reserved (zero)
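
Both layouts fit the 8-byte KDF Parameters field and can be expressed as struct format strings, with x marking the reserved zero bytes. The function names below are hypothetical.

```python
import struct


def pack_argon2id_params(time_cost: int, memory_cost_kib: int, parallelism: int) -> bytes:
    """uint8 time_cost, uint32 memory_cost_kib, uint8 parallelism, 2 reserved bytes."""
    return struct.pack(">BIB2x", time_cost, memory_cost_kib, parallelism)


def pack_pbkdf2_params(iterations: int) -> bytes:
    """uint32 iterations followed by 4 reserved bytes."""
    return struct.pack(">I4x", iterations)


def unpack_argon2id_params(packed: bytes) -> tuple[int, int, int]:
    """Inverse of pack_argon2id_params; the reserved bytes are skipped."""
    return struct.unpack(">BIB2x", packed)
```

Packing time_cost=3, memory_cost_kib=65536, parallelism=4 yields the bytes 03 00 01 00 00 04 00 00.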

4.5 Header Validation and DoS Protection

Header.unpack() performs strict bounds checking on all KDF parameters:

Parameter             Minimum                 Maximum
────────────────────  ──────────────────────  ──────────────────────
Argon2id time_cost    ARGON2_MIN_TIME_COST    ARGON2_MAX_TIME_COST
Argon2id memory_cost  ARGON2_MIN_MEMORY_KIB   ARGON2_MAX_MEMORY_KIB
Argon2id parallelism  ARGON2_MIN_PARALLELISM  ARGON2_MAX_PARALLELISM
PBKDF2 iterations     PBKDF2_MIN_ITERATIONS   PBKDF2_MAX_ITERATIONS
KMS wrap length       0                       KMS_WRAP_MAX (1024)

Any value outside these bounds raises HeaderParameterOutOfRange, preventing an attacker from crafting a malicious header that forces absurdly expensive KDF computations (Denial-of-Service via algorithmic complexity).
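
The check itself is a range comparison performed before any key derivation runs. In the sketch below the numeric limits are placeholders chosen for illustration, because the page does not give the values of the ARGON2_* and PBKDF2_* constants; only the KMS wrap maximum of 1024 comes from the table. The check_bounds helper is hypothetical and the exception class is a stand-in.

```python
class HeaderParameterOutOfRange(ValueError):
    """Stand-in for the exception named in the text."""


# Placeholder limits for illustration only; the real values live in the StegX source.
EXAMPLE_LIMITS = {
    "time_cost": (1, 16),
    "memory_cost_kib": (8 * 1024, 4 * 1024 * 1024),
    "parallelism": (1, 16),
    "kms_wrap_len": (0, 1024),  # KMS_WRAP_MAX from the table above
}


def check_bounds(params: dict, limits: dict = EXAMPLE_LIMITS) -> None:
    """Raise before any expensive KDF work if a parameter is out of range."""
    for name, value in params.items():
        low, high = limits[name]
        if not low <= value <= high:
            raise HeaderParameterOutOfRange(f"{name}={value} outside [{low}, {high}]")
```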


5. Decoy Region Splitting

When panic/decoy mode is active, the cover image's pixel positions are deterministically partitioned into two disjoint halves.

5.1 Algorithm

From decoy.py → split_regions():

  1. Compute the cover image fingerprint: $F = \text{SHA-256}(\text{pixel data})$
  2. Derive a deterministic seed: $S = \text{SHA-256}(\text{"stegx/v2/decoy-split"} \,\|\, F)$
  3. Initialize a PRNG with $S_{[0:16]}$ and perform a Fisher-Yates shuffle on all position indices.
  4. The first half of the shuffled indices → decoy region (sacrificial payload).
  5. The second half → real region (actual secret).
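
The steps above can be sketched with hashlib and random. This is an illustration of the procedure, not split_regions() itself: the page does not say which PRNG StegX uses, so Python's random.Random (whose shuffle is a Fisher-Yates shuffle) is an assumption, as is interpreting the 16 seed bytes as a big-endian integer. The split_positions name is hypothetical.

```python
import hashlib
import random


def split_positions(pixel_bytes: bytes, n_positions: int) -> tuple[list[int], list[int]]:
    """Return (decoy_indices, real_indices) for a cover image."""
    fingerprint = hashlib.sha256(pixel_bytes).digest()                         # step 1
    seed = hashlib.sha256(b"stegx/v2/decoy-split" + fingerprint).digest()      # step 2
    rng = random.Random(int.from_bytes(seed[:16], "big"))                      # step 3
    indices = list(range(n_positions))
    rng.shuffle(indices)
    half = n_positions // 2
    return indices[:half], indices[half:]                                      # steps 4-5
```

Calling the function twice with the same pixel data yields the same two lists, and the lists never share an index.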

5.2 Properties

  • Deterministic: The same cover image always produces the same split, enabling the panic system to locate the real region without storing any metadata.
  • Uniform: Each pixel has exactly 50% probability of being in either region, preventing spatial clustering artifacts.
  • Independent of password: The split depends only on the image content, not the user's secret. This allows the panic mechanism to operate even without the real password.

6. Complete Embedding Sequence

Putting it all together, the full embedding sequence in steganography.py is:

  1. Load the cover image and compute its fingerprint.
  2. Build the cost map (Laplacian or HILL) and generate the adaptive position mask.
  3. Enumerate all pixel positions (x, y, channel) in raster order.
  4. If decoy mode: split positions into decoy and real halves.
  5. Filter positions through the adaptive mask.
  6. Shuffle filtered positions using the PRNG seeded by $K_{\text{seed}}$.
  7. Serialize the payload: sentinel + header + ciphertext.
  8. Convert to a bit string.
  9. Call embed_bits() with the selected method (LSB Matching, LSB Replacement, or Matrix Hamming).
  10. Save the stego image as PNG with metadata stripped.

The extraction process mirrors this exactly: regenerate the same PRNG sequence, extract LSBs, locate the sentinel, parse the header, decrypt, decompress, and write the output file.
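
Steps 6 through 9 and their mirror image on extraction can be shown as a round trip over a flat list of channel values. This is a minimal sketch of LSB Replacement only. It assumes MSB-first bit order and uses Python's random.Random as the seeded PRNG, neither of which is confirmed by this page, and every function name is hypothetical rather than taken from steganography.py.

```python
import random


def bytes_to_bits(data: bytes) -> list[int]:
    """Expand bytes into bits, most significant bit first (assumed order)."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits: list[int]) -> bytes:
    """Pack groups of eight bits back into bytes, ignoring any remainder."""
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def shuffled_positions(count: int, seed: bytes) -> list[int]:
    """Deterministic permutation of range(count) for a given seed."""
    order = list(range(count))
    random.Random(seed).shuffle(order)
    return order


def embed_lsb_replace(samples: list[int], payload: bytes, seed: bytes) -> list[int]:
    """Write payload bits into the LSBs of samples in shuffled order."""
    bits = bytes_to_bits(payload)
    order = shuffled_positions(len(samples), seed)
    if len(bits) > len(order):
        raise ValueError("payload does not fit in the cover")
    stego = list(samples)
    for bit, pos in zip(bits, order):
        stego[pos] = (stego[pos] & ~1) | bit
    return stego


def extract_lsb(samples: list[int], n_bytes: int, seed: bytes) -> bytes:
    """Regenerate the same order and read n_bytes back from the LSBs."""
    order = shuffled_positions(len(samples), seed)
    return bits_to_bytes([samples[pos] & 1 for pos in order[:n_bytes * 8]])
```

Each sample changes by at most 1, and extraction recovers the payload only when it is given the same seed.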