BPF verifier rejects variable-length packet data passed to bpf_csum_diff in XDP (operand ordering / pkt_ptr tracking) #1562

@marcus-sa

Summary

When an XDP program compiled from Rust via aya-ebpf needs to pass variable-length packet data to bpf_csum_diff (helper #28), the BPF verifier rejects the program with:

invalid access to packet, off=34 size=65535, R3(id=0,off=34,r=35)
R3 offset is outside of the packet

The same code shape works when compiled from C via clang (e.g., Katran's ipv4_l4_csum). The root cause appears to be the Rust LLVM BPF backend emitting pointer arithmetic in scalar += pkt_ptr form instead of pkt_ptr += scalar, which causes the verifier to lose packet-pointer tracking on the register passed to the helper.

Motivation

XDP programs that perform L4 NAT/LB on veth interfaces must handle CHECKSUM_PARTIAL (the kernel's TX checksum offload default on veth). The standard approach is full L4 checksum recomputation from scratch using bpf_csum_diff(NULL, 0, l4_data_ptr, l4_len, pseudo_header_seed) — this is what Katran does in C.

The kernel's bpf_csum_diff_proto has .pkt_access = true, so passing XDP packet pointers is architecturally supported. But the Rust LLVM BPF backend's code generation prevents it from working.

Reproduction

Minimal shape that triggers the verifier rejection:

#![no_std]
#![no_main]

use aya_ebpf::{macros::xdp, programs::XdpContext, bindings::xdp_action};

// bpf_csum_diff — helper #28
#[inline(always)]
unsafe fn bpf_csum_diff(
    from: *mut u32, from_size: u32,
    to: *mut u32, to_size: u32,
    seed: u32,
) -> i64 {
    let fun: unsafe extern "C" fn(*mut u32, u32, *mut u32, u32, u32) -> i64 =
        core::mem::transmute(28usize);
    fun(from, from_size, to, to_size, seed)
}

#[xdp]
pub fn xdp_csum_repro(ctx: XdpContext) -> u32 {
    match try_csum(&ctx) {
        Ok(v) => v,
        Err(()) => xdp_action::XDP_PASS,
    }
}

#[inline(always)]
fn try_csum(ctx: &XdpContext) -> Result<u32, ()> {
    let start = ctx.data();
    let end = ctx.data_end();

    // Assume L4 starts at offset 34 (Eth 14 + IPv4 20)
    let l4_off: usize = 34;
    let l4_ptr = start + l4_off;
    if l4_ptr >= end {
        return Err(());
    }

    let l4_len = (end - l4_ptr) as u32;
    // Even with a mask to bound the value, verifier still rejects:
    let l4_len = l4_len & 0xffff;
    if l4_len == 0 {
        return Err(());
    }

    let csum = unsafe {
        bpf_csum_diff(
            core::ptr::null_mut(), 0,
            l4_ptr as *mut u32, l4_len,  // <-- verifier rejects this
            0,
        )
    };

    if csum < 0 { return Err(()); }
    Ok(xdp_action::XDP_PASS)
}

#[cfg(not(test))]
#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! { loop {} }

Expected: Verifier accepts, since l4_ptr + l4_len == end (the two are derived from the same bounds check).

Actual: invalid access to packet, off=34 size=65535, R3(id=0,off=34,r=35) — the verifier does not propagate the l4_len = end - l4_ptr relationship into R4's scalar bounds.

Kernel: 6.8.0-111-generic (Ubuntu 24.04). Same failure expected on 5.10–6.6.

Analysis

The verifier's check_helper_mem_access for ARG_PTR_TO_MEM with .pkt_access = true checks R3.offset + R4.umax_value <= R3.range. The issue:

  1. R3.range = 35 — established by the if l4_ptr >= end bounds check, proving only 1 byte at offset 34.
  2. R4.umax_value = 65535 — from the & 0xffff mask; the verifier doesn't know that l4_len is actually bounded by the packet length.
  3. 34 + 65535 > 35 → rejection.
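
The arithmetic in the three steps above can be modelled on the host. This is a minimal sketch, not kernel code: `PktPtr` and `helper_pkt_access_ok` are hypothetical names chosen for illustration, and the check is simplified to the single inequality quoted in this section.

```rust
/// Host-side model of a packet-pointer register as described above.
/// The struct and field names are illustrative, not kernel identifiers.
#[derive(Clone, Copy)]
struct PktPtr {
    /// Constant offset from the start of the packet (34 in the log).
    off: u64,
    /// Bytes from the packet start proven readable by a bounds check (35 in the log).
    range: u64,
}

/// Returns true when the largest possible size still fits inside the proven range,
/// i.e. off + size_umax <= range. Overflow of the addition counts as a rejection.
fn helper_pkt_access_ok(ptr: PktPtr, size_umax: u64) -> bool {
    ptr.off
        .checked_add(size_umax)
        .map_or(false, |end| end <= ptr.range)
}
```

With the values from the log, `helper_pkt_access_ok(PktPtr { off: 34, range: 35 }, 0xffff)` is false, while a size bound of 1 is accepted. Under this model the helper call only passes if the bounds check proves a range that covers the largest size the size register can hold.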

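Why it works in C (clang BPF backend): clang emits the pointer arithmetic as r3 = r_pkt; r3 += r_offset (packet register as the destination), which preserves the verifier's packet-pointer tracking through the addition. The Rust LLVM BPF backend appears to emit r3 = r_offset; r3 += r_pkt (scalar as the destination), which would convert the result to a scalar register and lose packet-pointer provenance. Note that the log above still reports R3 as a packet pointer (off=34, r=35), so the rejection shown is the range check against R4's upper bound, and the operand-ordering explanation remains a hypothesis.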

Cilium works around this in their XDP context helpers using inline assembly (asm volatile) to force the correct operand ordering (see bpf/include/bpf/ctx/xdp.h DEFINE_FUNC_CTX_POINTER macro).

Workarounds

Current workarounds for aya-ebpf users:

  1. Word-by-word bounded loop — read packet data one u16 at a time via per-access ptr_at-style bounds checks, accumulate checksum in a scalar register. Avoids passing packet pointers to helpers entirely. Risk: 750 iterations for MTU-sized packets may hit verifier instruction budget.

  2. Fixed-size chunk bpf_csum_diff — process 64 bytes at a time with compile-time constant size arguments (24 unrolled blocks). The constant size eliminates the variable-length verifier issue. This is the canonical C BPF community workaround per iovisor/bcc#2463.

  3. Per-CPU map buffer copy — copy packet data to a PerCpuArray value, then pass the map pointer (not a packet pointer) to bpf_csum_diff. Map pointers bypass the pkt_access check.

  4. Disable TX checksum offload: ethtool -K $iface tx off on XDP-attached veth interfaces. Avoids the need for full recomputation entirely by ensuring full checksums on the wire. This is what Cilium does (they use TC, not XDP, on veth).
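
As an illustration of workaround 1, here is a host-side sketch of the bounded loop in plain Rust. It is a model under stated assumptions rather than the eBPF implementation: `csum_words`, `fold_csum`, and `MAX_WORDS` are hypothetical names, the slice bounds check stands in for a ptr_at-style check against data_end, and words are read in network byte order.

```rust
/// Compile-time loop bound: 750 16-bit words cover a 1500-byte MTU (assumed limit).
/// Bytes beyond MAX_WORDS * 2 are not summed by this sketch.
const MAX_WORDS: usize = 750;

/// Sums the payload as big-endian 16-bit words, starting from `seed`.
/// Every read is preceded by its own bounds check, as a per-access
/// `ptr_at`-style check would do in the eBPF program.
fn csum_words(data: &[u8], seed: u32) -> u64 {
    let mut sum = seed as u64;
    for i in 0..MAX_WORDS {
        let off = i * 2;
        if off + 2 <= data.len() {
            sum += u16::from_be_bytes([data[off], data[off + 1]]) as u64;
        } else {
            if off < data.len() {
                // Odd trailing byte: treat it as the high byte of a zero-padded word.
                sum += (data[off] as u64) << 8;
            }
            break;
        }
    }
    sum
}

/// Folds the accumulator into 16 bits and returns the one's complement.
fn fold_csum(mut sum: u64) -> u16 {
    while (sum >> 16) != 0 {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    !(sum as u16)
}
```

For the bytes 00 01 f2 03 f4 f5 f6 f7, `csum_words` returns 0x2ddf0 and `fold_csum` turns that into 0x220d. In the real program the seed would be the pseudo-header sum, and the fixed loop bound is what keeps the iteration count visible to the verifier.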

Possible upstream fix

The root issue is in the LLVM BPF backend's code generation for pointer arithmetic involving packet pointers and runtime-computed offsets/sizes. A fix could:

  • Ensure that when one operand of an add instruction is a packet pointer, the LLVM backend emits the packet-pointer register as the destination/first operand (r_pkt += r_scalar), preserving the verifier's pointer-type tracking.
  • Or provide an aya-ebpf-level API (similar to Cilium's xdp_load_bytes inline asm wrapper) that wraps the bpf_csum_diff call with correct register ordering via core::arch::asm!.

Related issues

Environment

  • aya-ebpf 0.1.1, aya-ebpf-bindings 0.1.2
  • Target: bpfel-unknown-none
  • Kernel: 6.8.0-111-generic (reproduced), project matrix 5.10–6.8
  • Rust: nightly (required for BPF target)
