Feature request: FSDP2 QLoRA #3874

Quantization is extremely useful, and FSDP2 has a lot of nice features. Please add support for QLoRA under FSDP2 so that we can fully transition from FSDP1 to FSDP2.

Example script that fails with FSDP2 but not FSDP1 (modulo some FSDP1-specific setup):
```
import torch
from accelerate import Accelerator
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Initialize accelerator
accelerator = Accelerator()

# Load model and tokenizer
model_name = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_storage="bfloat16",
)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=quantization_config, torch_dtype=torch.bfloat16, use_cache=False
)
model = get_peft_model(model, peft_config, autocast_adapter_dtype=False)

# Tokenize input
inputs = tokenizer("Hello, World!", return_tensors="pt")
inputs = {k: v.to(accelerator.device) for k, v in inputs.items()}

# Create a dummy optimizer (FSDP2 requires one to be prepared together with the model)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Prepare the model and optimizer together; with FSDP2 this is the step that fails
prepared_model, optimizer = accelerator.prepare(model, optimizer)
```
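The script above relies on the launch configuration to decide which FSDP version is used. For anyone reproducing this, here is a minimal sketch of selecting the version in code instead. It assumes a recent Accelerate release in which `FullyShardedDataParallelPlugin` accepts an `fsdp_version` argument. The helper names `fsdp_plugin_kwargs` and `build_accelerator` are hypothetical and exist only for illustration. All other plugin options are left at their defaults, which may differ from the setup where the failure was observed.

```python
def fsdp_plugin_kwargs(version: int) -> dict:
    """Return keyword arguments selecting the FSDP implementation (1 or 2).

    Hypothetical helper for illustration; not part of Accelerate.
    """
    if version not in (1, 2):
        raise ValueError(f"fsdp_version must be 1 or 2, got {version}")
    return {"fsdp_version": version}


def build_accelerator(version: int = 2):
    """Build an Accelerator with an explicit FSDP plugin.

    Assumes an Accelerate release whose FullyShardedDataParallelPlugin
    accepts `fsdp_version`. Imported lazily so the helper above can be
    used without accelerate installed.
    """
    from accelerate import Accelerator, FullyShardedDataParallelPlugin

    plugin = FullyShardedDataParallelPlugin(**fsdp_plugin_kwargs(version))
    return Accelerator(fsdp_plugin=plugin)
```

In the repro, `accelerator = Accelerator()` could then be swapped for `accelerator = build_accelerator(2)`, or `build_accelerator(1)` for the FSDP1 comparison. The script still needs a multi-process launch (for example via `accelerate launch`) for sharding to take effect.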
