RFC 9112 — HTTP/1.1

Summary

RFC 9112 defines the HTTP/1.1 message syntax: the request line, status line, header block, chunked transfer coding, and the rules by which a recipient parses a stream of bytes into a sequence of requests or responses. In ExpressGateway it maps almost entirely to the lb-h1 crate, which provides a hand-written frame-by-frame parser (parse_request_line, parse_status_line, parse_headers, parse_trailers) and a chunked codec (ChunkedDecoder, ChunkedEncoder). Smuggling-specific rules from RFC 9112 Section 6.1 are enforced by lb-security::SmuggleDetector. The proxy is a direct intermediary in the sense of Section 9, so it is obligated to reject anything ambiguous rather than forward it. Huffman-free, allocation-bounded, panic-free parsing is a deliberate choice: the #![deny(...)] block in crates/lb-h1/src/lib.rs:2 forbids unwrap, expect, panics, and indexing/slicing.
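The panic-free style described above can be illustrated with a minimal sketch of a request-line parser. This is not the lb-h1 source: the names `SketchError`, `RequestLine`, and `parse_request_line_sketch`, the return shape, and the list of recognised methods are all assumptions made for illustration. The point is only that every slice access goes through `.get(...)` or an iterator, so malformed input yields an error instead of a panic.

```rust
/// Error type for this sketch only; the real lb-h1 error enum may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchError {
    InvalidRequestLine,
}

/// Borrowed view of a parsed request line (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct RequestLine<'a> {
    pub method: &'a str,
    pub target: &'a str,
    pub version: &'a str,
}

/// Parses `method SP target SP version CRLF` and returns the parsed line
/// together with the number of bytes consumed (including the CRLF).
/// No `unwrap`, `expect`, or `buf[i]` indexing is used.
pub fn parse_request_line_sketch(
    buf: &[u8],
) -> Result<(RequestLine<'_>, usize), SketchError> {
    // Locate the terminating CRLF; a bare LF is treated as a deviation.
    let end = buf
        .windows(2)
        .position(|w| w == b"\r\n")
        .ok_or(SketchError::InvalidRequestLine)?;
    let line = buf.get(..end).ok_or(SketchError::InvalidRequestLine)?;
    let line = std::str::from_utf8(line).map_err(|_| SketchError::InvalidRequestLine)?;

    // Exactly three non-empty, single-space-separated tokens.
    let mut parts = line.split(' ');
    let method = parts
        .next()
        .filter(|s| !s.is_empty())
        .ok_or(SketchError::InvalidRequestLine)?;
    let target = parts
        .next()
        .filter(|s| !s.is_empty())
        .ok_or(SketchError::InvalidRequestLine)?;
    let version = parts
        .next()
        .filter(|s| !s.is_empty())
        .ok_or(SketchError::InvalidRequestLine)?;
    if parts.next().is_some() {
        return Err(SketchError::InvalidRequestLine);
    }

    // Assumed method list; the set accepted by lb-h1 is not stated here.
    const METHODS: [&str; 9] = [
        "GET", "HEAD", "POST", "PUT", "DELETE", "CONNECT", "OPTIONS", "TRACE", "PATCH",
    ];
    if !METHODS.contains(&method) {
        return Err(SketchError::InvalidRequestLine);
    }
    // Only HTTP/1.0 and HTTP/1.1 are accepted.
    if version != "HTTP/1.0" && version != "HTTP/1.1" {
        return Err(SketchError::InvalidRequestLine);
    }

    Ok((RequestLine { method, target, version }, end.saturating_add(2)))
}
```

Given `b"GET /index.html HTTP/1.1\r\n..."`, the sketch returns the three tokens and a consumed count of 26; a doubled space, a bare LF, or an `HTTP/2.0` version token all produce `InvalidRequestLine`.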

Scope in ExpressGateway

  • crates/lb-h1/src/lib.rs — public surface of the HTTP/1.x codec.
  • crates/lb-h1/src/parse.rs — request/status line and header parsing (parse_request_line:34, parse_status_line:61, parse_headers:86, parse_trailers:131).
  • crates/lb-h1/src/chunked.rs — chunked transfer encoding decoder (ChunkedDecoder::feed:56, try_read_size:126, try_read_trailers:161) and encoder (ChunkedEncoder::encode:243, ChunkedEncoder::finish:263).
  • crates/lb-security/src/smuggle.rs — RFC 9112 §6.1 safety checks (SmuggleDetector::check_cl_te:48, check_te_cl:73, check_duplicate_cl:22, check_h2_downgrade:119).
  • tests/conformance_h1.rs — syntactic conformance.
  • tests/security_smuggling_cl_te.rs, tests/security_smuggling_te_cl.rs — smuggling regressions for CL-TE and TE-CL variants.

MUST clauses we implement

  • Request line syntax (§3): parse_request_line splits exactly three space-separated tokens and requires a trailing CRLF; InvalidRequestLine is returned on any deviation (missing token, non-UTF-8 byte, unknown method), matching the "400 Bad Request" disposition in Section 3.
  • HTTP-version grammar (§2.3) — Only HTTP/1.0 and HTTP/1.1 are accepted (parse_version in crates/lb-h1/src/parse.rs:136); any other token yields InvalidRequestLine, refusing to speak 0.9-era or imaginary versions as required by §2.3.
  • Header-field syntax (§5.1): parse_headers locates the colon, trims optional whitespace, requires CRLF termination, and rejects non-UTF-8 bytes. It does not accept obsolete line folding (no \r\n\s+ continuation) and treats obs-fold as an error. Because folded input is rejected rather than forwarded, the proxy also satisfies the §5.2 "MUST NOT" on generating obs-fold.
  • Transfer-Encoding chunked final coding (§6.1): SmuggleDetector::check_te_cl enforces that when Transfer-Encoding is present the final coding MUST be chunked; otherwise the message is rejected. This is the §6.1 "un-chunked final coding" rule, covered by the regression test tests/security_smuggling_te_cl.rs.
  • Prohibition of CL + TE (§6.1): When both Content-Length and Transfer-Encoding arrive on the same message, the recipient must either reject the message or remove Content-Length before forwarding it. We chose the stricter option (reject), implemented in check_cl_te and covered by the regression test tests/security_smuggling_cl_te.rs.
  • Duplicate Content-Length values (§6.3 / RFC 9110 §8.6) — Multiple Content-Length headers with differing values MUST be rejected; check_duplicate_cl compares every value after ASCII trim and returns SecurityError::SmuggleDuplicateCL on mismatch.
  • Chunk framing (§7.1) — The chunked decoder is a state machine with explicit states ReadingSize, ReadingData, ReadingDataCrlf, ReadingTrailers, Done (see DecoderState at crates/lb-h1/src/chunked.rs:9). A missing CRLF after chunk data returns InvalidChunkEncoding, which maps to §7.1's "MUST close the connection" posture.
  • Chunk extensions (§7.1.1) — Extensions after ; in the size line are parsed and discarded (try_read_size:139), never propagated upstream, complying with "a recipient MUST ignore unrecognized chunk extensions".
  • Trailer-section handling (§7.1.2): Trailers after the zero-size chunk are parsed with try_read_trailers:161; the block requires a terminating \r\n\r\n before Done is reported, preventing half-open trailer sections that smuggling tools exploit.
  • Connection management headers are hop-by-hop (§9.6 / RFC 9110 §7.6.1): PROMPT.md Section 10 defines the static hop-by-hop set (connection, keep-alive, te, trailers, transfer-encoding, upgrade, proxy-connection, proxy-authenticate, proxy-authorization), and the H1 pipeline strips these before forwarding, satisfying the §9.6 requirement that proxies MUST NOT forward connection-specific headers.
  • H2 downgrade hygiene (§6.1 cross-reference to RFC 9113 §8.2.2) — When a request originated from H2 and is re-serialised as H1 on the origin side, SmuggleDetector::check_h2_downgrade forbids connection, transfer-encoding, keep-alive, upgrade, proxy-connection, pseudo-headers starting with :, and TE with any value other than trailers.
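The three message-framing checks in the list above (duplicate Content-Length, Content-Length together with Transfer-Encoding, and a non-chunked final coding) can be sketched as one function over a header list. This is an illustrative sketch, not the SmuggleDetector source: `FramingError`, `check_framing`, the `(name, value)` header representation, and the order in which the checks run are assumptions.

```rust
/// Error variants for this sketch; the real SecurityError enum may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum FramingError {
    ConflictingContentLength,
    ContentLengthWithTransferEncoding,
    FinalCodingNotChunked,
}

/// Applies the three framing checks to a list of (name, value) headers.
/// Comma-separated lists inside a single Content-Length value are not handled.
pub fn check_framing(headers: &[(&str, &str)]) -> Result<(), FramingError> {
    let is_ows = |c: char| c == ' ' || c == '\t';
    let mut content_lengths: Vec<&str> = Vec::new();
    let mut te_present = false;
    let mut codings: Vec<String> = Vec::new();

    for (name, value) in headers {
        if name.eq_ignore_ascii_case("content-length") {
            // Compare values after trimming optional whitespace (SP / HTAB).
            content_lengths.push(value.trim_matches(is_ows));
        } else if name.eq_ignore_ascii_case("transfer-encoding") {
            te_present = true;
            codings.extend(
                value
                    .split(',')
                    .map(|c| c.trim_matches(is_ows).to_ascii_lowercase())
                    .filter(|c| !c.is_empty()),
            );
        }
    }

    // Duplicate Content-Length headers must all carry the same value.
    if let Some(first) = content_lengths.first() {
        if content_lengths.iter().any(|v| v != first) {
            return Err(FramingError::ConflictingContentLength);
        }
    }
    // Content-Length and Transfer-Encoding on the same message: reject.
    if te_present && !content_lengths.is_empty() {
        return Err(FramingError::ContentLengthWithTransferEncoding);
    }
    // When Transfer-Encoding is present, the final coding must be chunked.
    if te_present && codings.last().map(String::as_str) != Some("chunked") {
        return Err(FramingError::FinalCodingNotChunked);
    }
    Ok(())
}
```

With this sketch, `Transfer-Encoding: gzip, chunked` passes, `Transfer-Encoding: chunked, gzip` fails with `FinalCodingNotChunked`, and two Content-Length headers of `5` and ` 5 ` are treated as equal. Collecting all values before checking keeps the result independent of header order.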

Edge cases & security

  • CL-TE / TE-CL desync — Both canonical smuggling primitives are handled by single-purpose checks; see tests/security_smuggling_cl_te.rs and tests/security_smuggling_te_cl.rs for the regression matrix.
  • H2-to-H1 downgrade smuggling: tests/security_smuggling_h2_downgrade.rs exercises pseudo-header leakage and prohibited hop-by-hop headers.
  • Slowloris / slow-POST — Not strictly 9112 but governed by the H1 pipeline. lb-security::SlowlorisDetector and SlowPostDetector bound per-connection header/body deadlines; regression in tests/security_slowloris.rs, tests/security_slow_post.rs.
  • Request URI too long, headers too large, body too large — PROMPT.md Section 10 mandates the 414/431/413 response codes; enforcement is configurable at the pipeline level (the parsers themselves are length-agnostic and leave size policing to the caller).
  • Panic-free — Parsing uses .get(...)? indexing throughout (see parse.rs:34 onward); the crate-level #![deny(clippy::unwrap_used, clippy::expect_used, clippy::indexing_slicing)] pragma is load-bearing for the halting-gate rule that treats unwrap/panic! in non-test code as failure.
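Because the parsers are length-agnostic, size policing has to happen in the caller. A minimal sketch of such a pipeline-level check, mapping each exceeded limit to the 414/431/413 status codes named above, might look like the following. The `Limits` struct, its field names, and `check_limits` are hypothetical; the actual configuration surface is not described in this document.

```rust
/// Hypothetical size limits configured at the pipeline level.
#[derive(Debug, Clone, Copy)]
pub struct Limits {
    pub max_uri_len: usize,
    pub max_header_bytes: usize,
    pub max_body_bytes: u64,
}

/// Returns Ok(()) when the request fits, or the HTTP status code to send.
/// `declared_body_len` is the Content-Length value, if one was supplied.
pub fn check_limits(
    limits: &Limits,
    uri_len: usize,
    header_bytes: usize,
    declared_body_len: Option<u64>,
) -> Result<(), u16> {
    if uri_len > limits.max_uri_len {
        return Err(414); // URI Too Long
    }
    if header_bytes > limits.max_header_bytes {
        return Err(431); // Request Header Fields Too Large
    }
    if let Some(len) = declared_body_len {
        if len > limits.max_body_bytes {
            return Err(413); // Content Too Large
        }
    }
    Ok(())
}
```

A chunked body has no declared length, so a real pipeline would also need to count decoded bytes as they stream through; this sketch covers only the up-front check.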

Known deviations / TODO

  • The parser does not accept obs-fold (line folding with leading whitespace on a continuation line). RFC 9112 §5.2 permits recipients to reject or collapse; we reject. This is the stricter of the two permitted options and can break legacy clients that still emit folded values.
  • Chunk extensions are parsed only to the extent needed to find the ; separator — we do not expose them programmatically. Extension-aware trailers (e.g. Trailer: Content-MD5) are preserved verbatim in the trailer block but not validated.
  • 100-Continue is handled at the pipeline layer (PROMPT.md §10 item 4), not in lb-h1 itself; the parser does not mark the Expect header specially.
  • Pipelining support is described in PROMPT.md but the current lb-h1 exposes only frame-level primitives; a serial-ordering queue lives in the lb-l7 pipeline, not in the codec.
  • Version is fixed to HTTP/1.0 or HTTP/1.1. HTTP/0.9 is not accepted (RFC 9112 deprecates it; we match the modern posture).
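The chunk-extension behaviour noted above (parse only as far as the ; separator, then discard) can be illustrated with a minimal sketch of a chunk-size line reader. This is not the try_read_size source: `ChunkSizeError`, `read_chunk_size_sketch`, the `NeedMoreData` variant, and the tolerance for whitespace before the ; are assumptions made for the example.

```rust
/// Errors for this sketch; only InvalidChunkEncoding is named in the source.
#[derive(Debug, PartialEq, Eq)]
pub enum ChunkSizeError {
    NeedMoreData,
    InvalidChunkEncoding,
}

/// Reads one `chunk-size [;ext...] CRLF` line.
/// Returns (chunk size in bytes, bytes consumed including the CRLF).
pub fn read_chunk_size_sketch(buf: &[u8]) -> Result<(u64, usize), ChunkSizeError> {
    // Wait until the whole size line has arrived.
    let end = buf
        .windows(2)
        .position(|w| w == b"\r\n")
        .ok_or(ChunkSizeError::NeedMoreData)?;
    let line = buf.get(..end).ok_or(ChunkSizeError::InvalidChunkEncoding)?;

    // Everything from the first ';' onward is a chunk extension: it is
    // skipped here and never stored or forwarded.
    let size_part = line
        .split(|b| *b == b';')
        .next()
        .ok_or(ChunkSizeError::InvalidChunkEncoding)?;
    let size_str = std::str::from_utf8(size_part)
        .map_err(|_| ChunkSizeError::InvalidChunkEncoding)?
        .trim_matches(|c| c == ' ' || c == '\t');

    // Require at least one hex digit and nothing else; this also rules out
    // the leading '+' that from_str_radix would otherwise accept.
    if size_str.is_empty() || !size_str.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(ChunkSizeError::InvalidChunkEncoding);
    }
    // Overflowing sizes fail here instead of wrapping.
    let size = u64::from_str_radix(size_str, 16)
        .map_err(|_| ChunkSizeError::InvalidChunkEncoding)?;

    Ok((size, end.saturating_add(2)))
}
```

For the input `b"1a;name=value\r\n"` the sketch returns a size of 26 and 15 bytes consumed; the extension text is dropped without being inspected.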

Sources