[EP] Add UDP source port configuration for RoCEv2 ECMP via UCCL_UDP_SPORT_BASE #946
bkpathak wants to merge 1 commit into
Conversation
| UCCL_IB_TC | Traffic class in RDMA network | 104/0 (IB/EFA) |
| UCCL_EP_ENABLE_AGGRESSIVE_ATOMIC | Use relaxed atomics with manual `s_waitcnt vmcnt(0)` fences instead of acquire/release semantics. Required on AMD CDNA so the combine receiver actually sees the producer's tail-pointer updates over XGMI; without it the kernel deadlocks at scale. | 1 on AMD, 0 on CUDA |
| UCCL_RDMA_ADAPTIVE_SLEEP | Enable adaptive sleeping on proxy threads, by putting the proxy threads into a sleeping state if there have been no new work requests / RDMA completion events after 120s. | null |
| UCCL_UDP_SPORT_BASE | Base UDP source port for RoCEv2 QPs on mlx5 devices. Used to create entropy for ECMP routers, load balancers and 802.3ad link aggregation switches. Each QP gets a unique port starting from this value. Valid range: 1-65535. | 0 (driver decides) |
Can we remove "Used to create entropy for ECMP routers, load balancers and 802.3ad link aggregation switches. Each QP gets a unique port starting from this value."? It feels too lengthy.
Sure, I will rephrase it. I copied it from the man page https://man7.org/linux/man-pages/man3/mlx5dv_modify_qp_udp_sport.3.html
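For reference, the semantics in the table row ("each QP gets a unique port starting from this value", valid range 1-65535, 0 meaning the driver decides) could be sketched as below. This is only an illustration, not the code in this PR: the helper names `parse_sport_base` and `sport_for_qp` are hypothetical, and falling back to the driver default when `base + qp_index` leaves the valid range is an assumption, since the PR does not say what happens in that case.

```python
import os

UDP_SPORT_MIN = 1
UDP_SPORT_MAX = 65535


def parse_sport_base(env=None):
    """Return the configured base port, or None to let the driver decide.

    Hypothetical helper: unset, empty, non-numeric, 0 and out-of-range
    values are all treated as "not configured".
    """
    env = os.environ if env is None else env
    raw = env.get("UCCL_UDP_SPORT_BASE", "").strip()
    if not raw:
        return None
    try:
        base = int(raw)
    except ValueError:
        return None
    if not UDP_SPORT_MIN <= base <= UDP_SPORT_MAX:
        return None
    return base


def sport_for_qp(base, qp_index):
    """Return the source port for the qp_index-th QP.

    Assumption: if base + qp_index would exceed 65535, return None so the
    caller leaves the port to the driver instead of wrapping around.
    """
    if base is None or qp_index < 0:
        return None
    port = base + qp_index
    return port if port <= UDP_SPORT_MAX else None
```

With `UCCL_UDP_SPORT_BASE=49152`, the first three QPs would get ports 49152, 49153 and 49154 under this scheme.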
else:
    print("EFA not detected, building without EFA")
# MLX5 Detection
mlx5_home = os.getenv("MLX5_HOME", "/usr")
This does not seem to be mlx5's home folder. In addition, a server might have an mlx5 NIC as the frontend NIC and another NIC as the backend, so auto-detecting the mlx5 folder does not always work.
How about dropping auto-detection and loading the mlx5-specific function at runtime with dlsym instead? An example can be found in https://github.com/NVIDIA/nccl/blob/master/src/misc/mlx5dvsymbols.cc.
- The dlsym should only be triggered when UCCL_UDP_SPORT_BASE is set in the environment.
- It also needs to fail safe (rather than crash the whole program) if an error happens.
Oh, I will look into the example and send an update.
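The shape of the suggestion above (load lazily, only when UCCL_UDP_SPORT_BASE is set, and never crash on failure) can be sketched in Python with `ctypes`, which wraps dlopen/dlsym. This is an illustration of the pattern rather than the eventual C++ change: the function name `load_modify_qp_udp_sport` is hypothetical and the library names tried are an assumption. The `mlx5dv_modify_qp_udp_sport(struct ibv_qp *qp, uint16_t udp_sport)` signature comes from the man page linked earlier in this thread.

```python
import ctypes
import os
import sys


def load_modify_qp_udp_sport(env=None, lib_names=("libmlx5.so.1", "libmlx5.so")):
    """Return mlx5dv_modify_qp_udp_sport as a callable, or None.

    Nothing is loaded unless UCCL_UDP_SPORT_BASE is set. Any failure
    (library missing, symbol missing) degrades to None with a warning,
    so the caller can continue with the driver-chosen source port.
    """
    env = os.environ if env is None else env
    if not env.get("UCCL_UDP_SPORT_BASE"):
        return None  # feature not requested: never touch libmlx5

    for name in lib_names:  # assumed candidate names, tried in order
        try:
            lib = ctypes.CDLL(name)  # dlopen
            func = getattr(lib, "mlx5dv_modify_qp_udp_sport")  # dlsym
        except (OSError, AttributeError):
            continue
        # int mlx5dv_modify_qp_udp_sport(struct ibv_qp *qp, uint16_t udp_sport)
        func.restype = ctypes.c_int
        func.argtypes = [ctypes.c_void_p, ctypes.c_uint16]
        return func

    print(
        "UCCL_UDP_SPORT_BASE is set but mlx5dv_modify_qp_udp_sport could "
        "not be loaded; leaving the UDP source port to the driver",
        file=sys.stderr,
    )
    return None
```

A caller would treat a `None` result as "feature unavailable" and skip the per-QP call, which keeps hosts without libmlx5 (or with a non-mlx5 backend NIC) working unchanged.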
Description
[EP] Add UDP source port configuration for RoCEv2 ECMP via UCCL_UDP_SPORT_BASE
Fixes #709
Type of Change
How Has This Been Tested?
Build validated on Lambda Labs GH200 (sm_90, aarch64, Ubuntu 22.04) with mlx5 detected and `-DMLX5` passed to the compiler. `libmlx5` was confirmed linked in the built wheel.
An end-to-end test was attempted on 2×GH200 instances. It failed at RDMA buffer MR registration: Lambda Labs exposes mlx5 as a VF, which does not support GPUDirect RDMA. A full end-to-end test requires bare metal with a physical mlx5 PF.
Checklist
- `format.sh` to follow the style guidelines.
- `build.sh` to verify compilation.