Security and Cryptography

AyhamAsfoor edited this page May 6, 2026 · 5 revisions

🔐 Security and Cryptography: Full Pipeline Analysis

StegX v2.0 implements a layered cryptographic architecture where each layer is mathematically independent. This page provides a complete formal analysis of every cryptographic primitive in the pipeline, with parameters taken directly from the source code.


1. Multi-Factor Input Mixing

Before any key derivation occurs, StegX combines all authentication factors into a single deterministic byte stream using a tagged-length-value (TLV) framing protocol. This is implemented in kdf.py → _mix_factors():

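mixed = (
    frame(b"PWD0", password_bytes)            # password factor
    + frame(b"KFL0", keyfile_bytes or b"")    # optional keyfile (empty if absent)
    + frame(b"YKR0", yubikey_response or b"") # optional YubiKey response (empty if absent)
)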

Each frame is: [4-byte tag][4-byte big-endian length][data]. The maximum factor size is 16 MiB (_MAX_FACTOR_LEN = 16 * 1024 * 1024).
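
The framing described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in kdf.py: the function names `frame` and `mix_factors`, and the treatment of the tags as 4-byte ASCII literals, are assumptions made for the example.

```python
import struct
from typing import Optional

_MAX_FACTOR_LEN = 16 * 1024 * 1024  # 16 MiB limit stated above


def frame(tag: bytes, data: bytes) -> bytes:
    """Return [4-byte tag][4-byte big-endian length][data]."""
    if len(tag) != 4:
        raise ValueError("tag must be exactly 4 bytes")
    if len(data) > _MAX_FACTOR_LEN:
        raise ValueError("factor exceeds the 16 MiB limit")
    return tag + struct.pack(">I", len(data)) + data


def mix_factors(
    password: bytes,
    keyfile: Optional[bytes] = None,
    yubikey_response: Optional[bytes] = None,
) -> bytes:
    """Concatenate the three framed factors; absent factors become empty frames."""
    return (
        frame(b"PWD0", password)
        + frame(b"KFL0", keyfile or b"")
        + frame(b"YKR0", yubikey_response or b"")
    )
```

Because every factor carries its own length prefix, shifting bytes between factors changes the result: `mix_factors(b"ab", b"c")` and `mix_factors(b"a", b"bc")` produce different byte streams, which a plain concatenation would not.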

The mixed material undergoes an additional HMAC-SHA256 extraction before entering Argon2id:

mixed' = HMAC-SHA256(header_salt, mixed)

This pre-extraction ensures that even if two different images share the same password, their derived master keys are completely unrelated (due to different header salts).
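
A short sketch of the pre-extraction step, using only the Python standard library. The helper name `pre_extract` and the 16-byte salt length are assumptions for illustration; only the HMAC-SHA256(header_salt, mixed) construction comes from the text above.

```python
import hashlib
import hmac


def pre_extract(header_salt: bytes, mixed: bytes) -> bytes:
    """HMAC-SHA256 keyed with the header salt, applied to the mixed factors."""
    return hmac.new(header_salt, mixed, hashlib.sha256).digest()


mixed = b"example mixed factor bytes"   # stand-in for the TLV-framed factors
salt_a = bytes(16)                      # assumed 16-byte salt of image A
salt_b = b"\x01" * 16                   # assumed 16-byte salt of image B

out_a = pre_extract(salt_a, mixed)
out_b = pre_extract(salt_b, mixed)
print(len(out_a), out_a != out_b)       # same factors, different salts
```

The same factors under two different salts give two unrelated 32-byte values, so the input to Argon2id already differs per image.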


2. Key Derivation: Argon2id

2.1 Algorithm Selection

Argon2id is a hybrid variant of Argon2 that combines:

  • Argon2d (data-dependent memory access — resistant to GPU/ASIC attacks)
  • Argon2i (data-independent memory access — resistant to side-channel attacks)

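During the first half of the first pass, Argon2id uses data-independent addressing as in Argon2i (safe against cache-timing); for the rest of the first pass and all subsequent passes it uses data-dependent addressing as in Argon2d (safe against TMTO attacks).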

2.2 Hardcoded Parameters

From kdf.py:

| Parameter | Symbol | Value | Purpose |
|---|---|---|---|
| Time cost | $t$ | 3 | Number of passes over memory |
| Memory cost | $m$ | 65,536 KiB (64 MB) | Minimum RAM required per hash |
| Parallelism | $p$ | 4 | Concurrent threads |
| Output length | | 32 bytes (256 bits) | Master key size |

2.3 Time-Memory Trade-Off (TMTO) Analysis

GPU Attack Scenario:

An NVIDIA RTX 4090 has 24 GB of VRAM. Each Argon2id evaluation requires $m = 64$ MB of dedicated memory. The maximum number of parallel evaluations is:

$$\text{threads}_{\max} = \left\lfloor \frac{24{,}576 \text{ MB}}{64 \text{ MB}} \right\rfloor = 384$$

With $t = 3$ passes and the inherent serialization of data-dependent memory accesses, each evaluation takes $\tau \approx 112$ ms on consumer hardware. Assuming a GPU completes a single evaluation no faster than that, the throughput of a single RTX 4090 is bounded by:

$$\text{H/s}_{\text{GPU}} \leq \frac{384}{\tau} \approx 3{,}429 \text{ H/s}$$

Comparison with legacy tools:

| Tool | KDF | Cracking Rate |
|---|---|---|
| Steghide | MD5 (unsalted) | $> 2 \times 10^7$ H/s |
| OpenStego | PBKDF2 (1,000 iter) | $> 5 \times 10^5$ H/s |
| StegX | Argon2id (64 MB, t=3) | ≤ 3,429 H/s (GPU) |
| StegX | Argon2id (64 MB, t=3) | ≈ 9 H/s (single CPU) |

For a password with 40 bits of entropy ($2^{40} \approx 10^{12}$ candidates), exhausting the full candidate space against StegX at the GPU rate above would take:

$$T = \frac{10^{12}}{3{,}429} \approx 2.9 \times 10^8 \text{ seconds} \approx 9.2 \text{ years (single GPU)}$$
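
The arithmetic in this section can be reproduced with a few lines of Python. The constants are the figures quoted above; the variable names and the 365.25-day year are choices made for this sketch.

```python
VRAM_MB = 24 * 1024            # RTX 4090: 24 GB expressed in MB
MEM_PER_HASH_MB = 64           # Argon2id memory cost m
TAU_SECONDS = 0.112            # assumed time per evaluation
CANDIDATES = 10**12            # roughly 40 bits of password entropy
SECONDS_PER_YEAR = 365.25 * 24 * 3600

parallel = VRAM_MB // MEM_PER_HASH_MB      # evaluations that fit in VRAM
gpu_rate = parallel / TAU_SECONDS          # upper bound in H/s for one GPU
cpu_rate = 1 / TAU_SECONDS                 # single sequential evaluation stream
seconds = CANDIDATES / gpu_rate            # time to try every candidate
years = seconds / SECONDS_PER_YEAR

print(f"parallel evaluations: {parallel}")
print(f"GPU bound: {gpu_rate:.0f} H/s, single CPU: {cpu_rate:.0f} H/s")
print(f"exhaustive search: {seconds:.2e} s = {years:.1f} years")
```

Changing `MEM_PER_HASH_MB` or `TAU_SECONDS` shows how directly the bound depends on the Argon2id parameters: doubling the memory cost halves the number of evaluations that fit in VRAM.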

2.4 Adaptive Calibration

StegX provides kdf.py → calibrate_argon2_for_target_ms(), which tests progressively larger memory costs (32 MB → 64 MB → 128 MB → 256 MB) and selects the first configuration whose latency exceeds the target (default 500 ms). This lets administrators raise the per-guess cost to match their specific hardware.
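
The calibration loop can be sketched as follows. This is not the body of calibrate_argon2_for_target_ms(): the function name `calibrate_memory_cost`, the `run_kdf` callback (one KDF evaluation at a given memory cost), the injectable `clock`, and the fallback to the largest candidate are all assumptions made to keep the example self-contained.

```python
import time
from typing import Callable, Sequence

# Candidate memory costs in KiB: 32, 64, 128 and 256 MB, as listed above.
CANDIDATES_KIB = (32 * 1024, 64 * 1024, 128 * 1024, 256 * 1024)


def calibrate_memory_cost(
    run_kdf: Callable[[int], None],
    target_ms: float = 500.0,
    candidates_kib: Sequence[int] = CANDIDATES_KIB,
    clock: Callable[[], float] = time.perf_counter,
) -> int:
    """Return the first memory cost whose single-run latency exceeds target_ms."""
    for memory_kib in candidates_kib:
        start = clock()
        run_kdf(memory_kib)  # hypothetical callback: one KDF run at this cost
        elapsed_ms = (clock() - start) * 1000.0
        if elapsed_ms > target_ms:
            return memory_kib
    # Assumption: if no candidate is slow enough, use the largest one.
    return candidates_kib[-1]
```

In practice `run_kdf` would wrap a real Argon2id call with fixed $t$ and $p$; passing the clock as a parameter only serves to make the loop testable without running the KDF.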


3. Key Expansion: HKDF Sub-Key Independence

3.1 The Separation Principle

Using the same key for encryption, authentication, and PRNG seeding violates the cryptographic separation principle. A vulnerability in one operation (e.g., a nonce reuse in GCM) could leak information about the key used in another.

StegX derives independent sub-keys using HKDF-Expand (RFC 5869) with SHA-256:

$$K_{\text{sub}} = \text{HKDF-Expand}(\text{MasterKey}, \text{info}, 32)$$

3.2 Sub-Key Derivation Table

From kdf.py, the following info labels are used:

| Sub-Key | HKDF Info Label | Length | Purpose |
|---|---|---|---|
| $K_{\text{aes}}$ | stegx/v2/aes-256-gcm | 32 B | AES-256-GCM encryption key |
| $K_{\text{chacha}}$ | stegx/v2/chacha20-poly1305 | 32 B | ChaCha20-Poly1305 key (dual-cipher only) |
| $K_{\text{seed}}$ | stegx/v2/pixel-shuffle-seed | 32 B | PRNG seed for pixel permutation |
| $K_{\text{sentinel}}$ | stegx/v2/sentinel | varies | Magic sentinel derivation |
| $K_{\text{decoy}}$ | stegx/v2/decoy-shuffle-seed | 32 B | Decoy region PRNG seed |
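
For reference, HKDF-Expand with SHA-256 is small enough to write out with the standard library. The sketch below follows RFC 5869; the helper name `hkdf_expand` and the placeholder master key are assumptions, while the info labels are the ones from the table.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand (RFC 5869) instantiated with HMAC-SHA256."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


master_key = bytes(range(32))  # placeholder; the real key comes from Argon2id
k_aes = hkdf_expand(master_key, b"stegx/v2/aes-256-gcm")
k_seed = hkdf_expand(master_key, b"stegx/v2/pixel-shuffle-seed")
print(k_aes != k_seed)
```

For a 32-byte output the loop runs exactly once, so each sub-key is a single HMAC over the label followed by the byte 0x01.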

3.3 Independence Guarantee

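HKDF-Expand is a PRF (pseudorandom function) under the assumption that HMAC-SHA256 is a PRF. For a 32-byte output it reduces to a single HMAC call over the info label followed by the counter byte 0x01. For two distinct info labels $I_1 \neq I_2$, the resulting sub-keys are therefore computationally indistinguishable from independent uniform 256-bit strings, and they collide only with negligible probability:

$$\Pr[K_1 = K_2] = \Pr[\text{HMAC}(MK, I_1 \,\|\, \texttt{0x01}) = \text{HMAC}(MK, I_2 \,\|\, \texttt{0x01})] \approx 2^{-256}$$

Under this assumption the sub-keys are cryptographically independent: compromising $K_{\text{aes}}$ reveals nothing useful about $K_{\text{seed}}$ or $K_{\text{sentinel}}$.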


4. AEAD Encryption: Dual-Cipher Strategy

4.1 Primary Cipher: AES-256-GCM

AES-256-GCM (Galois/Counter Mode) is always used as the primary encryption layer. The key $K_{\text{aes}}$ is 256 bits and the nonce is 96 bits (12 bytes), generated via os.urandom(12).

GCM provides:

  • Confidentiality via CTR-mode AES encryption
  • Authenticity via GHASH polynomial evaluation over GF(2¹²⁸), producing a 128-bit authentication tag

4.2 Secondary Cipher: ChaCha20-Poly1305

When --dual-cipher is enabled, the AES-GCM ciphertext is further encrypted with ChaCha20-Poly1305 using an independent key $K_{\text{chacha}}$ and a fresh 96-bit nonce. This creates a nested AEAD construction:

$$C_{\text{final}} = \text{ChaCha20-Poly1305}(K_{\text{chacha}}, N_2, \text{AES-GCM}(K_{\text{aes}}, N_1, P, \text{AAD}), \text{AAD})$$

4.3 Additional Authenticated Data (AAD)

Both ciphers bind the header as AAD. The AAD is computed by header.py → Header.as_aad(), which serializes the full header with inner_ct_length set to zero. This ensures that any modification to the header (KDF parameters, flags, salt, nonces) will cause authentication to fail, even if the ciphertext itself is untouched.
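
The idea behind as_aad() can be illustrated with a toy header. The field set, field sizes and packing format below are hypothetical and much smaller than a real header; only the rule "serialize the header with inner_ct_length set to zero" is taken from the text.

```python
import struct
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Header:
    # Hypothetical layout for illustration; the real header.py may differ.
    flags: int            # 1 byte
    salt: bytes           # 16 bytes (assumed size)
    nonce: bytes          # 12 bytes
    inner_ct_length: int  # 4 bytes, big-endian

    def serialize(self) -> bytes:
        return struct.pack(
            ">B16s12sI", self.flags, self.salt, self.nonce, self.inner_ct_length
        )

    def as_aad(self) -> bytes:
        # Zero the length field so the AAD does not depend on the ciphertext size.
        return replace(self, inner_ct_length=0).serialize()


header = Header(flags=1, salt=bytes(16), nonce=bytes(12), inner_ct_length=100)
same_aad = replace(header, inner_ct_length=999).as_aad() == header.as_aad()
flag_changes_aad = replace(header, flags=2).as_aad() != header.as_aad()
print(same_aad, flag_changes_aad)
```

Any change to a bound field (here `flags`) changes the AAD and makes tag verification fail, while the length field, which is presumably not yet known when the AAD is built, is excluded.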

4.4 Authentication Tag Forgery Bound

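Each AEAD layer produces a 128-bit tag. For short messages, the probability that a single forgery attempt yields a valid tag without the key is approximately bounded by:

$$P_{\text{forge}} \leq 2^{-128} \approx 2.9 \times 10^{-39}$$

With dual-cipher, a forged message is accepted only if both tags verify under their independent keys, so the per-attempt bound becomes: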

$$P_{\text{forge}}^{\text{dual}} \leq 2^{-256}$$


5. Non-Linear PRNG Pixel Shuffling

5.1 Permutation Generation

The PRNG seed $K_{\text{seed}}$ is converted to a 64-bit integer via seed_int_from_subkey():

def seed_int_from_subkey(subkey: bytes) -> int:
    return int.from_bytes(subkey[:8], "big")

This integer seeds Python's Mersenne Twister PRNG (random.Random(seed_int)), which generates a Fisher-Yates shuffle of all embeddable pixel positions. The result is a pseudo-random permutation $\pi$ of the pixel coordinate space.
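
Putting the two steps together gives the following sketch. `seed_int_from_subkey` is copied from above; `permuted_positions` and the integer position indices are assumptions for illustration (the real code shuffles embeddable pixel coordinates).

```python
import random


def seed_int_from_subkey(subkey: bytes) -> int:
    # First 8 bytes of the sub-key, interpreted as a big-endian integer.
    return int.from_bytes(subkey[:8], "big")


def permuted_positions(k_seed: bytes, num_positions: int) -> list:
    """Return a seed-determined permutation of range(num_positions)."""
    rng = random.Random(seed_int_from_subkey(k_seed))
    positions = list(range(num_positions))
    rng.shuffle(positions)  # CPython implements this as a Fisher-Yates shuffle
    return positions


k_seed = bytes(range(32))  # placeholder for the HKDF-derived sub-key
order = permuted_positions(k_seed, 100)
print(order[:5])
```

Because the permutation depends only on the seed and the number of positions, the extractor can rebuild the same order from the same $K_{\text{seed}}$ without any extra data stored in the image.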

5.2 Spatial Deniability

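Let $C$ be the total number of embeddable positions (after cost-map filtering) and $L$ be the number of bits to embed. An adversary who does not know $K_{\text{seed}}$ and tries to guess the embedding order directly faces every ordered selection of $L$ positions out of $C$:

$$|\text{search space}| = \frac{C!}{(C - L)!}$$

For a typical 1920×1080 RGB image with 60% of its roughly $6.2 \times 10^6$ channel values kept after adaptive filtering ($C \approx 3.7 \times 10^6$) and a 10 KiB payload ($L = 81{,}920$ bits), this is astronomically large, far beyond any brute-force enumeration. Because the permutation is generated from a 64-bit seed, at most $2^{64}$ of these orderings are actually reachable, so an attacker's effective work is bounded by the smaller of the two figures.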
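
To get a feel for the size of $C!/(C-L)!$ without building huge integers, its base-2 logarithm can be computed with the log-gamma function. The helper name is an assumption; the inputs are the example figures from the text.

```python
import math


def log2_ordered_selections(num_positions: int, num_bits: int) -> float:
    """Return log2(C! / (C - L)!) using lgamma(n + 1) = ln(n!)."""
    ln_value = math.lgamma(num_positions + 1) - math.lgamma(num_positions - num_bits + 1)
    return ln_value / math.log(2)


bits = log2_ordered_selections(3_700_000, 81_920)
print(f"search space is about 2^{bits:,.0f}")
```

For these inputs the exponent is on the order of 1.8 million, that is, roughly $L \cdot \log_2 C$.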

5.3 Decoy Region Reordering

When panic mode is active, the pixel positions are first split into two halves (decoy and real) using a cover-fingerprint-derived PRNG (see Architecture § Decoy Region Splitting). Each half is then independently shuffled using its own HKDF-derived seed, ensuring that the real and decoy PRNG sequences are cryptographically unrelated.
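
A rough sketch of the split-then-shuffle idea follows. Everything specific here is an assumption: the function name, the use of plain integer seeds, and which half is treated as the decoy. The actual splitting logic is described on the Architecture page and may differ.

```python
import random
from typing import Iterable, List, Tuple


def split_and_shuffle(
    positions: Iterable[int],
    fingerprint_seed: int,  # assumed: derived from the cover fingerprint
    real_seed: int,         # assumed: derived from the real shuffle sub-key
    decoy_seed: int,        # assumed: derived from the decoy shuffle sub-key
) -> Tuple[List[int], List[int]]:
    """Split positions into two disjoint halves, then shuffle each independently."""
    pool = list(positions)
    random.Random(fingerprint_seed).shuffle(pool)  # split depends only on the cover
    half = len(pool) // 2
    decoy, real = pool[:half], pool[half:]         # assumption: first half is decoy
    random.Random(decoy_seed).shuffle(decoy)
    random.Random(real_seed).shuffle(real)
    return real, decoy


real, decoy = split_and_shuffle(range(10), fingerprint_seed=1, real_seed=2, decoy_seed=3)
print(sorted(real + decoy))
```

The two halves never overlap, and the order inside each half depends only on that half's own seed, so knowing the decoy seed does not reveal the real embedding order.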


6. Secure Memory Management

6.1 The Threat: Cold Boot and Swap Attacks

Cryptographic keys stored in userspace memory can be recovered via:

  • Cold boot attacks: Freezing RAM modules and reading residual data after power-off
  • Swap file leaks: The OS paging key material to disk

6.2 StegX Mitigations

StegX implements secure_memory.py → SecureBuffer, which provides:

  1. OS-Level Memory Locking:

    • Linux/macOS: mlock() via libc.so.6 / libc.dylib
    • Windows: VirtualLock() via kernel32.dll

    This prevents the OS from swapping the buffer to disk.

  2. Deterministic Zeroization: After use, all key material is overwritten with zeros using ctypes.memset():

    ctypes.memset(
        (ctypes.c_char * len(buf)).from_buffer(buf),
        0, len(buf)
    )

    This overwrites the key bytes in place instead of relying on Python's garbage collector, which does not guarantee timely deallocation.

  3. Automatic Cleanup: SecureBuffer implements __enter__/__exit__ (context manager) and __del__ (destructor), ensuring keys are zeroized even if an exception occurs.
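
A stripped-down sketch of such a buffer is shown below. It covers only zeroization and automatic cleanup; memory locking via mlock()/VirtualLock() is deliberately left out, and the class body is an illustration under those assumptions rather than the code in secure_memory.py.

```python
import ctypes


class SecureBuffer:
    """Illustrative key buffer that is overwritten with zeros on close()."""

    def __init__(self, data: bytes):
        self._closed = False
        # Mutable copy; the caller's immutable bytes object is NOT wiped.
        self._buf = bytearray(data)

    @property
    def data(self) -> bytearray:
        return self._buf

    def close(self) -> None:
        if getattr(self, "_closed", True):
            return  # already closed, or __init__ never completed
        size = len(self._buf)
        if size:
            view = (ctypes.c_char * size).from_buffer(self._buf)
            ctypes.memset(view, 0, size)  # overwrite the bytes in place
        self._closed = True

    def __enter__(self) -> "SecureBuffer":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()
        return False  # never swallow the caller's exception

    def __del__(self):
        self.close()  # fallback only; the context manager is the primary path


with SecureBuffer(b"\x11" * 32) as key:
    raw = key.data  # keep a reference to inspect the bytes after exit
print(bytes(raw) == bytes(32))
```

Using the buffer in a `with` block gives the same guarantee as an explicit try/finally: the bytes are wiped when the block exits, whether normally or through an exception.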

6.3 Key Lifecycle

Password entered
    → _mix_factors() → mixed bytes (bytearray)
    → Argon2id → master_key (SecureBuffer)
        → HKDF → K_aes (SecureBuffer)
        → HKDF → K_chacha (SecureBuffer, if dual-cipher)
        → HKDF → K_seed (used, then discarded)
    → Encryption/Decryption
    → master_key.close()  → zeroize + munlock
    → K_aes.close()       → zeroize + munlock
    → K_chacha.close()    → zeroize + munlock

Every SecureBuffer is wrapped in a try/finally block in crypto.py, guaranteeing zeroization regardless of success or failure.