Skip to content

EXAMPLES/DEVICE/EP/CSRC: Removed extra params from ht functions.#1812

Open
rakhmets wants to merge 1 commit into
ai-dynamo:mainfrom
rakhmets:topic/device-ep-ht-rm-extra-param
Open

EXAMPLES/DEVICE/EP/CSRC: Removed extra params from ht functions.#1812
rakhmets wants to merge 1 commit into
ai-dynamo:mainfrom
rakhmets:topic/device-ep-ht-rm-extra-param

Conversation

@rakhmets

@rakhmets rakhmets commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

What?

Removed unused low_latency_mode parameter from the functions from ht namespace.
Removed type parameter from ht::combine, and moved the assert from ht::combine to Buffer::ht_combine.

Why?

Reduced hanges in #1793 to facilitate review.

Summary by CodeRabbit

Summary

  • Refactor

    • Simplified high-throughput notification/dispatch/combine interfaces by removing the low-latency mode option.
    • Reduced template branching so fewer execution variants are selected at launch.
  • Bug Fixes

    • Added stricter scalar-type validation during combine operations to ensure incompatible input types are rejected earlier.

@rakhmets rakhmets requested review from a team, ebarilanM and itayalroy as code owners June 23, 2026 14:27
@github-actions

Copy link
Copy Markdown

👋 Hi rakhmets! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This change removes low-latency-mode parameters from HT dispatch and combine kernel declarations, launch paths, and call sites, and adds a host-side BF16 validation before ht_combine calls combine.

Changes

HT dispatch and combine cleanup

Layer / File(s) Summary
Dispatch API and launch path
examples/device/ep/csrc/kernels/api.cuh, examples/device/ep/csrc/kernels/nixl_ep_ht.cu, examples/device/ep/csrc/nixl_ep.cpp
notify_dispatch and dispatch declarations and templates drop low_latency_mode; launch selection is reduced to cached/non-cached dispatch variants, and the ht_dispatch call sites stop passing the removed argument.
Combine API, launch path, and validation
examples/device/ep/csrc/kernels/api.cuh, examples/device/ep/csrc/kernels/nixl_ep_ht.cu, examples/device/ep/csrc/nixl_ep.cpp
cached_notify and combine declarations and templates drop low_latency_mode, combine drops its leading datatype parameter, the launch wiring is updated to the new signatures, and ht_combine adds a BF16 scalar-type assertion before calling combine.

Sequence Diagram(s)

sequenceDiagram
  participant Buffer
  participant notify_dispatch
  participant dispatch
  Buffer->>notify_dispatch: launch without low_latency_mode
  Buffer->>dispatch: launch without low_latency_mode
Loading
sequenceDiagram
  participant Buffer
  participant cached_notify
  participant combine
  Buffer->>cached_notify: enqueue barrier without low_latency_mode
  Buffer->>combine: call with BF16-validated input and no type argument
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I hopped through kernels, swift and bright,
With one less flag to dim the flight.
BF16 got its tiny test,
Then combine sailed on with cheerful zest.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 8.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly matches the main change: removing extra parameters from ht functions in the examples/device/ep/csrc area.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description includes the required What and Why sections and clearly matches the changes; the optional How section is omitted, which is acceptable.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@ColinNV ColinNV left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also hard-wiring scalar type?

ColinNV
ColinNV previously approved these changes Jun 23, 2026
@nv-nmailhot nv-nmailhot dismissed ColinNV’s stale review June 24, 2026 06:28

The merge-base changed after approval.

@nv-nmailhot nv-nmailhot requested review from a team, aranadive, brminich and ovidiusm as code owners June 24, 2026 06:28
Signed-off-by: Raul Akhmetshin <rakhmetshin@nvidia.com>
@rakhmets rakhmets force-pushed the topic/device-ep-ht-rm-extra-param branch from 02df4a8 to b5ee466 Compare June 24, 2026 12:31
@rakhmets rakhmets removed request for a team, aranadive, brminich and ovidiusm June 24, 2026 12:32
@rakhmets

Copy link
Copy Markdown
Contributor Author

/build

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants