RuntimeError: CUDA error: misaligned address

Command used:
torchrun --nproc_per_node=1 run_demo_avatar_single_audio_to_video.py --context_parallel_size=1 --checkpoint_dir=./weights/LongCat-Video-Avatar-1.5 --stage_1=ai2v --input_json=assets/avatar/single_example_1.json --num_segments=5 --ref_img_index=10 --mask_frame_range=3 --use_distill --model_type avatar-v1.5 --use_int8
This reproduces the error reliably. Judging by the metrics, GPU memory is not fully used, so this is not an OOM.
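
The traceback further down hints at `CUDA_LAUNCH_BLOCKING=1`, because CUDA kernel errors are reported asynchronously and the Python frame shown may not be the one that actually failed. A minimal sketch of rerunning the same command with synchronous launches follows. `build_debug_run` is a hypothetical helper, not part of the repository, and the command string is copied from this report:

```python
import os
import shlex

# The exact command from this report, kept as one string.
REPRO_CMD = (
    "torchrun --nproc_per_node=1 run_demo_avatar_single_audio_to_video.py "
    "--context_parallel_size=1 --checkpoint_dir=./weights/LongCat-Video-Avatar-1.5 "
    "--stage_1=ai2v --input_json=assets/avatar/single_example_1.json "
    "--num_segments=5 --ref_img_index=10 --mask_frame_range=3 "
    "--use_distill --model_type avatar-v1.5 --use_int8"
)


def build_debug_run(cmd=REPRO_CMD, base_env=None):
    """Return (argv, env) for a rerun with synchronous CUDA launches.

    Hypothetical helper: it only prepares the arguments and does not
    start the process itself.
    """
    # Copy the environment so the caller's own environment is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    # Make kernel launches synchronous so the failing call shows up
    # at the right place in the Python stack trace.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return shlex.split(cmd), env


# To actually launch it (requires torchrun on PATH):
#   import subprocess
#   argv, env = build_debug_run()
#   subprocess.run(argv, env=env, check=False)
```

A run like this is slower, but the resulting traceback should point at the kernel launch that actually faulted, which may differ from the frame shown in the log.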




2026-06-05 01:42:30,953 - INFO - separator - Separator version 0.30.2 instantiating with output_dir: audio_temp_file/vocals, output_format: WAV
2026-06-05 01:42:30,953 - INFO - separator - Using model directory from model_file_dir parameter: ./weights/LongCat-Video-Avatar-1.5/vocal_separator
2026-06-05 01:42:30,953 - INFO - separator - Operating System: Linux #1 SMP PREEMPT_DYNAMIC Fri Aug 8 18:29:23 UTC 2025
2026-06-05 01:42:31,140 - INFO - separator - System: Linux Node: ea1f42169c47 Release: 5.14.0-570.32.1.el9_6.x86_64 Machine: x86_64 Proc: x86_64
2026-06-05 01:42:31,140 - INFO - separator - Python Version: 3.10.13
2026-06-05 01:42:31,140 - INFO - separator - PyTorch Version: 2.6.0+cu124
2026-06-05 01:42:35,529 - INFO - separator - FFmpeg installed: ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
2026-06-05 01:42:35,564 - INFO - separator - ONNX Runtime CPU package installed with version: 1.16.3
2026-06-05 01:42:35,574 - INFO - separator - CUDA is available in Torch, setting Torch device to CUDA
2026-06-05 01:42:35,574 - WARNING - separator - CUDAExecutionProvider not available in ONNXruntime, so acceleration will NOT be enabled
2026-06-05 01:42:35,575 - INFO - separator - Loading model Kim_Vocal_2.onnx...
2026-06-05 01:42:35,860 - INFO - separator - Hash of model file ./weights/LongCat-Video-Avatar-1.5/vocal_separator/Kim_Vocal_2.onnx is 970b3f9492014d18fefeedfe4773cb42
2026-06-05 01:42:36,863 - INFO - separator - Load model duration: 00:00:01
2026-06-05 01:42:48,493 - INFO - separator - Starting separation process for audio_file_path: assets/avatar/single/man.mp3
100%|██████████| 20/20 [00:46<00:00,  2.30s/it]
100%|██████████| 15/15 [00:01<00:00,  7.78it/s]
2026-06-05 01:43:41,005 - INFO - mdx_separator - Saving Vocals stem to man_(Vocals)_Kim_Vocal_2.wav...
2026-06-05 01:43:41,577 - INFO - common_separator - Audio duration is 0.02 hours (82.04 seconds).
2026-06-05 01:43:41,577 - INFO - common_separator - Using pydub for writing.
2026-06-05 01:43:42,016 - INFO - common_separator - Clearing input audio file paths, sources and stems...
2026-06-05 01:43:42,016 - INFO - separator - Separation duration: 00:00:53
Generating segment 1/5...
Denoising:   0%|          | 0/8 [00:00<?, ?it/s]
[rank0]: Traceback (most recent call last):
[rank0]:   File "/root/caoyansen/longcat-vidio/run_demo_avatar_single_audio_to_video.py", line 441, in <module>
[rank0]:     generate(args)
[rank0]:   File "/root/caoyansen/longcat-vidio/run_demo_avatar_single_audio_to_video.py", line 262, in generate
[rank0]:     output_tuple = pipe.generate_ai2v(
[rank0]:   File "/root/caoyansen/longcat-vidio/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:   File "/root/caoyansen/longcat-vidio/longcat_video/pipeline_longcat_video_avatar.py", line 1093, in generate_ai2v
[rank0]:     noise_pred = self.dit(
[rank0]:   File "/root/caoyansen/longcat-vidio/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/root/caoyansen/longcat-vidio/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/root/caoyansen/longcat-vidio/longcat_video/modules/avatar/longcat_video_dit_avatar.py", line 489, in forward
[rank0]:     block_outputs = block(
[rank0]:   File "/root/caoyansen/longcat-vidio/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/root/caoyansen/longcat-vidio/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/root/caoyansen/longcat-vidio/longcat_video/modules/avatar/longcat_video_dit_avatar.py", line 159, in forward
[rank0]:     x = x + self.cross_attn(self.pre_crs_attn_norm(x), y, y_seqlen, num_cond_latents=num_cond_latents, shape=latent_shape)
[rank0]:   File "/root/caoyansen/longcat-vidio/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/root/caoyansen/longcat-vidio/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/root/caoyansen/longcat-vidio/longcat_video/modules/attention.py", line 268, in forward
[rank0]:     output_noise = self._process_cross_attn(x_noise, cond, kv_seqlen) # [B, N_noise, C]
[rank0]:   File "/root/caoyansen/longcat-vidio/longcat_video/modules/attention.py", line 238, in _process_cross_attn
[rank0]:     cu_seqlens_q=torch.tensor([0] + [N] * B, device=q.device).cumsum(0).to(torch.int32),
[rank0]: RuntimeError: CUDA error: misaligned address
[rank0]: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[rank0]: For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[rank0]: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
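
For context, the last frame above builds the cumulative query offsets (`cu_seqlens_q`) for a batch of `B` sequences with `N` tokens each. That expression is a small tensor construction, so it is probably where the asynchronous error surfaced and not its cause; the root cause is not established here. A CPU-only sketch of what that expression computes, plus a sanity check on the result, is shown below. Both function names are hypothetical and exist only for illustration:

```python
from itertools import accumulate


def cu_seqlens(seq_len, batch_size):
    """CPU-only equivalent of torch.tensor([0] + [N] * B).cumsum(0).

    Returns the start offset of every sequence in a packed batch,
    followed by the total token count.
    """
    return list(accumulate([0] + [seq_len] * batch_size))


def offsets_are_consistent(offsets, total_tokens):
    """Check that offsets start at 0, never decrease, and end at total_tokens."""
    if not offsets or offsets[0] != 0 or offsets[-1] != total_tokens:
        return False
    return all(a <= b for a, b in zip(offsets, offsets[1:]))


# Example with N=4 tokens per sequence and B=2 sequences.
offsets = cu_seqlens(4, 2)
print(offsets)                             # [0, 4, 8]
print(offsets_are_consistent(offsets, 8))  # True
```

If the offsets check out for the real `N` and `B`, attention then shifts to the inputs of the kernel launched before this line, for example the layout or dtype of `q` under `--use_int8`. That is a guess, not a confirmed diagnosis.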


RuntimeError: CUDA error: misaligned address #124
