Hi! Sorry to trouble you again.
Could you please share the loss curves for all stages? I found the curves for ODE and CD in the closed issues, but Stage 1 and Stage 3 are currently missing.
By the way, I initialized the causal model directly from the bidirectional ckpt and ran inference without any training, but the results are terrible, almost pure noise. Is this normal?