Skip to content

Advanced Features

AyhamAsfoor edited this page May 6, 2026 · 5 revisions

🛠️ Advanced Features: Mathematical & Cryptographic Deep Dive

StegX v2.0 moves beyond heuristic steganography into mathematically provable data concealment and distribution. This page provides the formal academic foundation for each enterprise feature, with all claims traceable to the source code.


1. Matrix Embedding (F5 Algorithm & Hamming Codes)

1.1 The Problem with Classical LSB

Classical Least Significant Bit (LSB) substitution replaces the LSB of each cover pixel with a message bit. When the message bit already matches the cover LSB, no change occurs; otherwise, a forced replacement is made. On average, this yields a modification rate:

$$R_m^{\text{classical}} \approx 0.5$$

meaning roughly 50% of touched pixels are altered. This creates massive first-order statistical anomalies detectable by Chi-Square analysis.

StegX also implements LSB Matching (±1 perturbation), which avoids the Pairs-of-Values artifact by randomly incrementing or decrementing the pixel value instead of forcing the LSB. However, even with matching, the modification rate remains $\approx 0.5$.

1.2 Hamming Code Formulation

StegX implements Matrix Embedding using systematic Hamming codes $\mathcal{H}(n, k)$ where $n = 2^k - 1$. The implementation uses $k = 3$, giving block size $n = 7$.

Let $\mathbf{x} = (x_1, x_2, \dots, x_7)$ be the LSBs of a block of 7 cover pixels. We wish to embed $k = 3$ message bits $\mathbf{m} = (m_1, m_2, m_3)$.

Step 1 — Compute the syndrome: $$s = \bigoplus_{i=1}^{7} i \cdot x_i$$

where $\oplus$ is XOR and $i \cdot x_i$ means "include index $i$ in the XOR if $x_i = 1$".

Step 2 — Compute the flip index: $$\text{flip} = s \oplus \text{int}(\mathbf{m})$$

Step 3 — Apply the modification:

  • If $\text{flip} = 0$: the block already encodes $\mathbf{m}$. Zero modifications needed.
  • If $\text{flip} \neq 0$: flip exactly one bit at position $\text{flip}$ in $\mathbf{x}$.

This is implemented in embedding.py → _embed_matrix_hamming(), which iterates over cover positions in blocks of 7, computes the syndrome via syndrome ^= i for each set LSB, and flips the single bit at syndrome ^ message_int.

1.3 Modification Rate Analysis

The expected number of modifications per block is: $$E[\text{flips}] = \frac{n}{n+1} \cdot 1 = \frac{7}{8}$$

Since each block embeds $k = 3$ bits, the modification rate per embedded bit is: $$R_m = \frac{7/8}{3} \approx 0.292 \text{ modifications per message bit}$$

Compared to classical LSB ($R_m = 0.5$), this is a 41.6% reduction in modifications. For larger block sizes ($k=4, n=15$), the rate drops further: $$R_m(k=4) = \frac{15/16}{4} \approx 0.234$$

1.4 Adaptive Fallback

When adaptive mode is active (Laplacian or HILL cost maps), StegX forces the Matrix Embedding layer to use LSB Replacement instead of ±1 perturbation for the single-bit flip. This is critical because the cost map was computed on the pre-embedding image; a ±1 perturbation could shift a pixel value in a direction that breaks the cost-map's assumptions about which neighbor values are "safe." The use_replacement_for_matrix parameter in embed_bits() controls this behavior.


2. Shamir's Secret Sharing (Quorum Protocol)

2.1 Motivation

A single steganographic image represents a single point of compromise. If an adversary obtains both the image and the password, the secret is fully exposed. Shamir's Secret Sharing (SSS) distributes the payload across $N$ images such that any $K$ are required for reconstruction, while $K-1$ images reveal absolutely zero information about the secret.

2.2 Galois Field Arithmetic: GF(2⁸)

StegX operates over the finite field $GF(2^8)$ using the irreducible polynomial: $$P(x) = x^8 + x^4 + x^3 + x^2 + 1$$

This corresponds to the hex constant 0x11D used in the source code.

All arithmetic (addition, multiplication, division) is performed in this field using precomputed logarithm and exponential lookup tables. This is implemented in shamir.py:

_LOG = [0] * 256
_EXP = [0] * 512
_a = 1
for _i in range(255):
    _EXP[_i] = _a
    _LOG[_a] = _i
    _a <<= 1
    if _a & 0x100:
        _a ^= 0x11D  # reduction polynomial
  • Addition in GF(2⁸) is XOR: $a + b = a \oplus b$
  • Multiplication: $a \cdot b = \text{EXP}[\text{LOG}[a] + \text{LOG}[b]]$
  • Division: $a / b = \text{EXP}[(\text{LOG}[a] - \text{LOG}[b] + 255) \bmod 255]$

2.3 Splitting Algorithm

For each byte $S$ of the secret payload, StegX constructs a random polynomial of degree $K-1$: $$f(x) = S + a_1 x + a_2 x^2 + \cdots + a_{K-1} x^{K-1}$$

where coefficients $a_1, \dots, a_{K-1}$ are drawn from os.urandom() (cryptographically secure). The constant term $f(0) = S$ is the secret byte.

For each share $i \in {1, 2, \dots, N}$, the share value is $y_i = f(i)$ evaluated in GF(2⁸). Each share is serialized as [x_coordinate, threshold_k, y_1, y_2, ...] and embedded into a separate cover image.

2.4 Reconstruction via Lagrange Interpolation

Given $K$ shares $(x_1, y_1), \dots, (x_K, y_K)$, the secret byte is recovered by evaluating the Lagrange interpolant at $x = 0$:

$$S = f(0) = \sum_{j=1}^{K} y_j \prod_{\substack{i=1 \ i \neq j}}^{K} \frac{x_i}{x_i \oplus x_j}$$

where division and multiplication are in GF(2⁸). This is implemented in shamir.py → _lagrange_at_zero().

2.5 Information-Theoretic Security Proof

This scheme provides perfect secrecy in the information-theoretic sense. An adversary holding $K-1$ shares can construct $2^8 = 256$ equally valid polynomials (one for each possible value of $S$), each consistent with the observed shares. Therefore: $$H(S \mid K-1 \text{ shares}) = H(S) = 8 \text{ bits}$$

No computational advantage is gained. This holds regardless of the adversary's computing power.


3. Hardware-Bound 2FA (YubiKey HMAC-SHA1)

3.1 Protocol

StegX binds steganographic payloads to a physical YubiKey token using the HMAC-SHA1 Challenge-Response protocol on Slot 2.

Encoding flow (implemented in yubikey.py):

  1. Generate a 16-byte random nonce $N$ via os.urandom(16).
  2. Compute the challenge: $C = \text{SHA-256}(\text{"stegx/v3/yk-challenge"} | N | F)$ where $F$ is the cover image fingerprint.
  3. Send $C$ to the YubiKey hardware over USB/NFC.
  4. The YubiKey computes $R = \text{HMAC-SHA1}(K_y, C)$ using its hardware-burned secret $K_y$. This produces a 20-byte response.
  5. The response $R$ is mixed with the user's password $P$ via the multi-factor framing protocol: mixed = frame(PWD0, P) || frame(KFL0, "") || frame(YKR0, R)
  6. The mixed material is fed into Argon2id to produce the master key.

3.2 Security Properties

  • Physical binding: Even if an attacker keyloggers the password $P$, they cannot compute $R$ without the physical YubiKey.
  • Challenge freshness: The challenge incorporates the cover image fingerprint, preventing replay attacks across different images.
  • Non-extractability: The YubiKey's HMAC secret $K_y$ resides in tamper-resistant silicon and cannot be read via software.

3.3 Keyfile Factor

StegX also supports a file-based second factor (--keyfile). The keyfile bytes are framed under tag KFL0 and mixed into the Argon2id input alongside the password. This provides a software-based 2FA alternative when hardware tokens are unavailable.


4. Panic Mode: Cryptographic Destruction and Decoy Architecture

4.1 Threat Model

In hostile environments, an adversary may compel the operator to reveal the password (rubber-hose cryptanalysis). StegX implements two complementary defenses:

  1. Silent Panic: Overwrites all LSBs with cryptographically random noise, destroying both the real payload and any structural signature.
  2. Decoy Panic: The cover image's pixel positions are deterministically split into two halves: a real region (carrying the actual payload) and a decoy region (carrying a sacrificial decoy file). The panic password destroys only the real region while leaving the decoy intact and extractable.

4.2 Region Splitting Algorithm

The split is deterministic and based on the cover image's SHA-256 fingerprint (computed from pixel data), ensuring the same split is reproducible during both encoding and panic. From decoy.py:

def split_regions(all_positions, cover_fingerprint):
    seed = SHA256("stegx/v2/decoy-split\x00" + fingerprint)
    rng = Random(seed[:16])
    rng.shuffle(indices)
    half = len(indices) // 2
    return indices[:half], indices[half:]  # decoy, real

4.3 Cryptographic Overwrite

The destruction in panic.py → _overwrite_lsbs_randomly() uses os.urandom() to generate the replacement bits:

bits = os.urandom((len(positions) + 7) // 8)
for idx, pos in enumerate(positions):
    bit = (bits[idx // 8] >> (idx & 7)) & 1

This is superior to zeroing because:

  • Zeroed LSBs create an obvious statistical anomaly ($H \approx 0$).
  • Random LSBs are indistinguishable from either encrypted data or natural sensor noise ($H \approx 1.0$ per bit).

4.4 Secure File Replacement

After overwriting, StegX performs an atomic file replacement sequence:

  1. Save the modified image to a temporary file in the same directory.
  2. Call shred -u -n 3 on the original file (Linux only) for secure deletion.
  3. Atomically rename the temporary file to the original path via os.replace().
  4. A file-level lock (*.stegx-lock) prevents concurrent panic operations on the same file.

5. FIPS 140 Compliance Mode

StegX supports a --fips flag that restricts the entire cryptographic pipeline to NIST-validated algorithms only:

Component Normal Mode FIPS Mode
KDF Argon2id PBKDF2-SHA256 (600,000 iterations)
Cipher AES-256-GCM or ChaCha20-Poly1305 AES-256-GCM only
Compression All codecs zlib only
YubiKey Allowed Banned
Dual-cipher Allowed Banned (ChaCha20 not FIPS-validated)

The --fips flag calls fips.py → assert_fips_runtime(), which verifies that the underlying OpenSSL library is actually running in FIPS mode. Any attempt to use a banned primitive raises a FipsPolicyViolation exception.

This is critical for government and defense deployments where FIPS 140-2/3 compliance is mandatory.