Skip to content

feat(capture v1): v1 analytics event implements SInk event serialization contract#53808

Open
eli-r-ph wants to merge 3 commits intoeli.r/capture-v1-sinks-5from
eli.r/capture-v1-sinks-5a
Open

feat(capture v1): v1 analytics event implements SInk event serialization contract#53808
eli-r-ph wants to merge 3 commits intoeli.r/capture-v1-sinks-5from
eli.r/capture-v1-sinks-5a

Conversation

@eli-r-ph
Copy link
Copy Markdown
Contributor

@eli-r-ph eli-r-ph commented Apr 9, 2026

Problem

Stacked PR, based on this parent PR (5th in Sinks series.) Implements the new v1::sinks::Event serialization contract for the v1 analytics event types (request's v1::Context, v1::analytics::WrappedEvent and friends) to produce the new v1::analytics::IngestionEvent which is in parity with legacy CapturedEvent shape for ingestion backend compatibility, as verified by the new test suite.

This step includes some DIY event.properties surgery:

At this stage, to avoid changing the downstream schema contract, we're injecting (with legacy prop renames!) the new v1::analytics::Event's event.option props, if present, into the legacy event.properties map as ingestion workers expect. This is a bit hacky but well tested, and shouldn't cause trouble at this step. The incoming v1 events are well validated by this step.

This PR includes a couple of other loose ends this unit of work surfaced:

  • Address some corner cases in partition key resolution on WrappedEvent
  • Expose a few more Kafka producer configs that will be handy to have; set prod defaults where practical to avoid mismatches

Changes

  • Implement SinkEvent serialization contract for v1 analytics events
  • Full validation test suite to match output parity with CapturedEvent
  • Handle a few partition key resolution corner cases
  • Expose a few missing Kafka produce config values
  • Small spot fixes for previous oversights, footguns

How did you test this code?

Locally and in CI. Integration test suite coming in a follow on after this series lands (too big to include here)

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Publish to changelog?

N/A

Docs update

N/A

@eli-r-ph eli-r-ph requested a review from a team April 9, 2026 00:03
@eli-r-ph eli-r-ph self-assigned this Apr 9, 2026
@eli-r-ph eli-r-ph changed the base branch from master to eli.r/capture-v1-sinks-5 April 9, 2026 00:05
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 9, 2026

Vulnerabilities

No security concerns identified. The IP redaction path (capture_internal"127.0.0.1") is applied consistently in both serialize_into and write_partition_key. The unsafe block in build_property_injections (String::as_mut_vec + serde_json::to_writer) is sound: Vec<u8>'s Write impl is infallible and serde_json only emits valid UTF-8.

Comments Outside Diff (2)

  1. rust/capture/src/v1/analytics/types.rs, line 405-428 (link)

    P1 Property splice assumes JSON object, but deserialization accepts any JSON

    If event.properties is a non-object value (e.g. an array — accepted by deserialization as shown by parse_event_properties_array_accepted_by_serde) and any injection is needed (session_id set, window_id set, or any Option field present), build_ingestion_data will splice the last byte off a non-}-terminated string, produce structurally invalid JSON (e.g. [1,2,3,"$session_id":"x"}), and RawValue::from_string will return an error. The event is then silently dropped as a FatalError even though it passed the deserialization layer fine.

    Either validate that properties is a JSON object before attempting the splice, or enforce it at deserialization time:

    if injection.is_empty() {
        self.event.properties.clone()
    } else {
        if raw.len() < 2 || !raw.starts_with('{') {
            return Err(format!(
                "properties must be a JSON object for injection, got: {raw:.32}"
            ));
        }
        // ... rest of splice
    }

    There is no test covering the case of non-object properties with a non-empty injection.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: rust/capture/src/v1/analytics/types.rs
    Line: 405-428
    
    Comment:
    **Property splice assumes JSON object, but deserialization accepts any JSON**
    
    If `event.properties` is a non-object value (e.g. an array — accepted by deserialization as shown by `parse_event_properties_array_accepted_by_serde`) and any injection is needed (session_id set, window_id set, or any Option field present), `build_ingestion_data` will splice the last byte off a non-`}`-terminated string, produce structurally invalid JSON (e.g. `[1,2,3,"$session_id":"x"}`), and `RawValue::from_string` will return an error. The event is then silently dropped as a `FatalError` even though it passed the deserialization layer fine.
    
    Either validate that `properties` is a JSON object before attempting the splice, or enforce it at deserialization time:
    
    ```rust
    if injection.is_empty() {
        self.event.properties.clone()
    } else {
        if raw.len() < 2 || !raw.starts_with('{') {
            return Err(format!(
                "properties must be a JSON object for injection, got: {raw:.32}"
            ));
        }
        // ... rest of splice
    }
    ```
    
    There is no test covering the case of non-object properties with a non-empty injection.
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. rust/capture/src/v1/analytics/types.rs, line 544-580 (link)

    P2 Prefer parameterised tests for variation-of-same-logic groups

    Multiple test functions test variations of the same logic as individual unit tests. Per the codebase convention ("we always prefer parameterised tests"), groups like:

    • should_publish_ok_and_non_drop / should_publish_false_when_dropped / should_publish_false_when_destination_drop / should_publish_false_when_limited
    • partition_key_force_disable_analytics_main / …_overflow / …_dlq / …_historical / …_custom / …_cookieless_dlq

    should be collapsed into a single parameterised test. For example:

    #[test]
    fn should_publish_cases() {
        let cases: &[(EventResult, Destination, bool)] = &[
            (EventResult::Ok, Destination::AnalyticsMain, true),
            (EventResult::Drop, Destination::AnalyticsMain, false),
            (EventResult::Ok, Destination::Drop, false),
            (EventResult::Limited, Destination::AnalyticsMain, false),
        ];
        for (result, dest, expected) in cases {
            let mut ev = ok_wrapped("$pageview", "user-1");
            ev.result = *result;
            ev.destination = dest.clone();
            assert_eq!(ev.should_publish(), *expected, "{result:?} + {dest:?}");
        }
    }

    The same pattern applies to the partition_key_force_disable_* family.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: rust/capture/src/v1/analytics/types.rs
    Line: 544-580
    
    Comment:
    **Prefer parameterised tests for variation-of-same-logic groups**
    
    Multiple test functions test variations of the same logic as individual unit tests. Per the codebase convention ("we always prefer parameterised tests"), groups like:
    
    - `should_publish_ok_and_non_drop` / `should_publish_false_when_dropped` / `should_publish_false_when_destination_drop` / `should_publish_false_when_limited`
    - `partition_key_force_disable_analytics_main` / `…_overflow` / `…_dlq` / `…_historical` / `…_custom` / `…_cookieless_dlq`
    
    should be collapsed into a single parameterised test. For example:
    
    ```rust
    #[test]
    fn should_publish_cases() {
        let cases: &[(EventResult, Destination, bool)] = &[
            (EventResult::Ok, Destination::AnalyticsMain, true),
            (EventResult::Drop, Destination::AnalyticsMain, false),
            (EventResult::Ok, Destination::Drop, false),
            (EventResult::Limited, Destination::AnalyticsMain, false),
        ];
        for (result, dest, expected) in cases {
            let mut ev = ok_wrapped("$pageview", "user-1");
            ev.result = *result;
            ev.destination = dest.clone();
            assert_eq!(ev.should_publish(), *expected, "{result:?} + {dest:?}");
        }
    }
    ```
    
    The same pattern applies to the `partition_key_force_disable_*` family.
    
    How can I resolve this? If you propose a fix, please make it concise.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix All With AI
This is a comment left during a code review.
Path: rust/capture/src/v1/analytics/types.rs
Line: 405-428

Comment:
**Property splice assumes JSON object, but deserialization accepts any JSON**

If `event.properties` is a non-object value (e.g. an array — accepted by deserialization as shown by `parse_event_properties_array_accepted_by_serde`) and any injection is needed (session_id set, window_id set, or any Option field present), `build_ingestion_data` will splice the last byte off a non-`}`-terminated string, produce structurally invalid JSON (e.g. `[1,2,3,"$session_id":"x"}`), and `RawValue::from_string` will return an error. The event is then silently dropped as a `FatalError` even though it passed the deserialization layer fine.

Either validate that `properties` is a JSON object before attempting the splice, or enforce it at deserialization time:

```rust
if injection.is_empty() {
    self.event.properties.clone()
} else {
    if raw.len() < 2 || !raw.starts_with('{') {
        return Err(format!(
            "properties must be a JSON object for injection, got: {raw:.32}"
        ));
    }
    // ... rest of splice
}
```

There is no test covering the case of non-object properties with a non-empty injection.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: rust/capture/src/v1/analytics/types.rs
Line: 544-580

Comment:
**Prefer parameterised tests for variation-of-same-logic groups**

Multiple test functions test variations of the same logic as individual unit tests. Per the codebase convention ("we always prefer parameterised tests"), groups like:

- `should_publish_ok_and_non_drop` / `should_publish_false_when_dropped` / `should_publish_false_when_destination_drop` / `should_publish_false_when_limited`
- `partition_key_force_disable_analytics_main` / `…_overflow` / `…_dlq` / `…_historical` / `…_custom` / `…_cookieless_dlq`

should be collapsed into a single parameterised test. For example:

```rust
#[test]
fn should_publish_cases() {
    let cases: &[(EventResult, Destination, bool)] = &[
        (EventResult::Ok, Destination::AnalyticsMain, true),
        (EventResult::Drop, Destination::AnalyticsMain, false),
        (EventResult::Ok, Destination::Drop, false),
        (EventResult::Limited, Destination::AnalyticsMain, false),
    ];
    for (result, dest, expected) in cases {
        let mut ev = ok_wrapped("$pageview", "user-1");
        ev.result = *result;
        ev.destination = dest.clone();
        assert_eq!(ev.should_publish(), *expected, "{result:?} + {dest:?}");
    }
}
```

The same pattern applies to the `partition_key_force_disable_*` family.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "address partition key handling corner ca..." | Re-trigger Greptile

// (e.g. `drop_partition_key: bool`) is needed for the
// `Limited + !preserve_locality` case that drops key WITHOUT
// setting the person processing header.
if self.force_disable_person_processing
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a TODO here but I think we have the plumbing to handle this already in v1 at this point, will follow up on this after this PR stack is landed

// remapping, IP redaction, property merging. Tackled separately.
unimplemented!("WrappedEvent::serialize_into")
fn serialize_into(&self, ctx: &Context, buf: &mut String) -> Result<(), String> {
let ingestion_data = self.build_ingestion_data()?;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got convinced by Claude early on to return String instead of a dedicated error type here but I hate it already, it's silly. Will update now or in next PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant