This seems related to using accelerator.accumulate() together with DeepSpeed’s ZeRO Stage 2 (it also happens with Stage 1).
It might be due to a version incompatibility between accelerate and deepspeed.
Could you please share the specific versions of accelerate and deepspeed you are using?
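
If it helps, here is a small sketch for collecting those versions from the environment where training runs. It only uses the standard library's importlib.metadata; the helper name installed_versions is my own and not part of either library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("accelerate", "deepspeed")):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None. The helper name and the
    default package list are illustrative, not part of accelerate or deepspeed.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Print one "package: version" line per package, ready to paste into a reply.
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Pasting the output of `accelerate env` (and `ds_report` for DeepSpeed) should also work and gives a bit more detail about the setup.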