[fal.ai] longlive/krea-realtime-video: LoRA lost after worker reset — /tmp/.daydream-scope/assets/lora/ cleared between jobs (SUPERSUISH_LoRA_V1) #923

@livepeer-tessa

Summary

After a worker reset (session teardown + new job start), LoRA files that were previously present at /tmp/.daydream-scope/assets/lora/ are no longer available, causing longlive and krea-realtime-video pipeline load failures. The LoRA loaded fine on the first job but is missing on subsequent ones.

cc @mjh1 @emranemran

Error Messages

scope.server.pipeline_manager - ERROR - [aa6d9669] Failed to load pipeline longlive: LongLivePipeline.__init__: LoRA loading failed. File not found: /tmp/.daydream-scope/assets/lora/SUPERSUISH_LoRA_V1_000000750.safetensors. Ensure the file exists in the models/lora/ directory.. If this error persists, consider removing the models directory '/data/models' and re-downloading models.

scope.server.pipeline_manager - ERROR - [aa6d9669] Failed to load pipeline krea-realtime-video: KreaRealtimeVideoPipeline.__init__: LoRA loading failed. File not found: /tmp/.daydream-scope/assets/lora/SUPERSUISH_LoRA_V1_000000750.safetensors.

Timeline

15:14:36 - ✅ load_adapter: Loaded adapter 'SUPERSUISH_LoRA_V1_000000750' from /tmp/.daydream-scope/assets/lora/ (first job, 067a55be — success)
15:17:09 - ❌ LoRA loading failed: File not found at /tmp/.daydream-scope/assets/lora/SUPERSUISH_LoRA_V1_000000750.safetensors (second job, aa6d9669)
15:17:56 - ❌ Same error for krea-realtime-video (session aa6d9669)
15:21:46 - ❌ Same error (new job, no session prefix)
15:22:03 - ❌ Same error for longlive (new job)

Root Cause

/tmp/ is ephemeral and is cleared between fal.ai worker jobs. LoRA files uploaded by the user to /tmp/.daydream-scope/assets/lora/ persist for the duration of one job, but are gone on the next.

This is a different class of issue from:

Here, the user's uploaded LoRA was available in /tmp/ during the first job, but /tmp/ is wiped between jobs, so subsequent sessions fail with a file-not-found error even though the user changed nothing.

Affected Files

  • SUPERSUISH_LoRA_V1_000000750.safetensors — user LoRA uploaded to temp assets dir
  • Path used: /tmp/.daydream-scope/assets/lora/SUPERSUISH_LoRA_V1_000000750.safetensors
  • Pipelines affected: longlive, krea-realtime-video

Frequency (last 12h, 2026-04-12 06:09 – 18:09 UTC)

  • 6+ occurrences across session aa6d9669 and multiple sessions with no session prefix
  • Time window: 15:17–15:22 UTC
  • App: github_f1lhgmk5v76a0ev1w0u378by-scope-app--prod

Suggested Fix

  1. Persist user LoRAs to /data/models/lora/ (persistent volume) instead of /tmp/ so they survive between jobs. The error message even mentions this path: "Ensure the file exists in the models/lora/ directory".
  2. On pipeline load, check both paths, /tmp/.daydream-scope/assets/lora/ and /data/models/lora/, and fall back gracefully.
  3. Re-upload LoRA on job reconnect: if the client knows the LoRA is needed, it should re-upload on each new job connection (similar to the plugin cleanup/reinstall pattern already in place).
  4. Surface a user-friendly error distinguishing "LoRA was here but got cleaned up" from "LoRA was never uploaded".
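
A rough sketch of how fixes 1, 2, and 4 could fit together, in case it helps the discussion. This is not the existing scope code: the two directories come from this issue, but `persist_lora`, `resolve_lora`, `LoRAMissingError`, and the `uploaded_loras.json` manifest are hypothetical names made up for illustration, and it assumes /data/models/lora/ is writable and survives between jobs.

```python
import json
import shutil
from pathlib import Path
from typing import Optional, Sequence

# Directories named in this issue; every other name below is hypothetical.
TMP_LORA_DIR = Path("/tmp/.daydream-scope/assets/lora")
PERSISTENT_LORA_DIR = Path("/data/models/lora")
MANIFEST_NAME = "uploaded_loras.json"  # hypothetical record of past uploads


class LoRAMissingError(FileNotFoundError):
    """Hypothetical error whose message is meant to be shown to the user."""


def _read_manifest(manifest_path: Path) -> set:
    """Return the filenames recorded as uploaded (empty set if no manifest)."""
    if not manifest_path.is_file():
        return set()
    return set(json.loads(manifest_path.read_text()))


def persist_lora(src: Path, persistent_dir: Path = PERSISTENT_LORA_DIR) -> Path:
    """Fix 1: copy an uploaded LoRA to persistent storage and record its name."""
    persistent_dir.mkdir(parents=True, exist_ok=True)
    dest = persistent_dir / src.name
    shutil.copy2(src, dest)
    manifest_path = persistent_dir / MANIFEST_NAME
    names = _read_manifest(manifest_path)
    names.add(src.name)
    manifest_path.write_text(json.dumps(sorted(names)))
    return dest


def resolve_lora(
    filename: str,
    search_dirs: Sequence[Path] = (TMP_LORA_DIR, PERSISTENT_LORA_DIR),
    manifest_path: Optional[Path] = None,
) -> Path:
    """Fix 2: return the first existing copy of `filename` in `search_dirs`.

    Fix 4: if no copy exists, raise LoRAMissingError with a message that
    separates "uploaded earlier, since removed" from "never uploaded".
    """
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    if manifest_path is None:
        manifest_path = PERSISTENT_LORA_DIR / MANIFEST_NAME
    if filename in _read_manifest(manifest_path):
        raise LoRAMissingError(
            f"LoRA '{filename}' was previously uploaded but is no longer on disk; "
            "please re-upload it."
        )
    raise LoRAMissingError(f"LoRA '{filename}' was never uploaded.")
```

Under these assumptions, the upload handler would call `persist_lora(...)` once per uploaded file, and pipeline init would call `resolve_lora("SUPERSUISH_LoRA_V1_000000750.safetensors")` instead of building the /tmp/ path directly. Keeping the manifest on the persistent volume is what lets the error distinguish the two cases after /tmp/ has been wiped.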

Metadata

Labels: bug
