Fix Errors in trainer_sgdmf.py and movielens.py #779

@czzhangheng

I tried to run this example following the docs at https://federatedscope.io/docs/recommendation/ with the command:
python federatedscope/main.py --cfg federatedscope/mf/baseline/hfl-sgdmf_fedavg_standalone_on_movielens1m.yaml
However, it failed with errors raised in ./federatedscope/mf/trainer/trainer_sgdmf.py. The cause appears to be how the trainer reads the gradient of torch's Embedding module: ctx.model.embed_user.grad is incorrect, while ctx.model.embed_user.weight.grad is correct. There were also other errors, such as "add(sparse, dense)".
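To illustrate the distinction, here is a minimal standalone sketch, not taken from FederatedScope, with a toy embedding size and toy indices chosen only for the example. It shows that the gradient lives on the module's weight parameter rather than on the Embedding module itself, and that the gradient is a sparse tensor when the layer is created with sparse=True (an assumption about how the model's embeddings are configured).

```python
import torch
import torch.nn as nn

# Toy embedding; sparse=True is an assumption about the MF model's setup.
embed = nn.Embedding(num_embeddings=5, embedding_dim=3, sparse=True)

# Look up rows 0 and 2, then backpropagate a simple scalar loss.
loss = embed(torch.tensor([0, 2])).sum()
loss.backward()

# The module itself has no `grad` attribute; only its parameter does.
has_module_grad = hasattr(embed, "grad")

# The gradient is stored on the weight parameter and is sparse here.
weight_grad = embed.weight.grad
is_sparse = weight_grad.is_sparse

# Converting to dense gives a regular tensor that can be combined
# with dense noise without mixing sparse and dense operands.
dense = weight_grad.to_dense()

print(has_module_grad, is_sparse, tuple(dense.shape))
```

Only the looked-up rows carry a non-zero gradient, which is why PyTorch can store it sparsely. Adding dense noise to it is where mixing sparse and dense operands can become a problem.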

I used ChatGPT to help fix the code, and the example now runs. From my Git history, these are the changes I made:

In federatedscope/mf/dataset/movielens.py, lines 160-161:

row = [mapping_user[mid] for _, mid in data["userId"].items()]
col = [mapping_item[mid] for _, mid in data["movieId"].items()]
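For context, here is a small sketch of what these two lines compute, using a toy DataFrame and toy mappings that are not from the dataset. Series.items() iterates over (index, value) pairs. The issue does not show the original lines, so the exact cause of the failure there is not known. One possibility, stated here only as an assumption, is an older iterator such as Series.iteritems(), which pandas 2.0 removed.

```python
import pandas as pd

# Toy ratings table; the column names follow the snippet above.
data = pd.DataFrame({"userId": [10, 20, 10], "movieId": [7, 7, 9]})

# Hypothetical mappings from raw ids to contiguous 0-based indices.
mapping_user = {uid: i for i, uid in enumerate(sorted(data["userId"].unique()))}
mapping_item = {mid: i for i, mid in enumerate(sorted(data["movieId"].unique()))}

# Series.items() yields (index, value) pairs; only the value is used.
row = [mapping_user[mid] for _, mid in data["userId"].items()]
col = [mapping_item[mid] for _, mid in data["movieId"].items()]

print(row, col)
```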

In federatedscope/mf/trainer/trainer_sgdmf.py, line 70, replace the entire function hook_on_batch_backward(ctx) with:

def hook_on_batch_backward(ctx):
    """Private local updates in SGDMF

    """
    ctx.optimizer.zero_grad()
    ctx.loss_task.backward()

    if ctx.model.embed_user.weight.grad.is_sparse:
        dense_user_grad = ctx.model.embed_user.weight.grad.to_dense()
    else:
        dense_user_grad = ctx.model.embed_user.weight.grad

    if ctx.model.embed_item.weight.grad.is_sparse:
        dense_item_grad = ctx.model.embed_item.weight.grad.to_dense()
    else:
        dense_item_grad = ctx.model.embed_item.weight.grad

    # Inject noise
    dense_user_grad.data += get_random(
        "Normal",
        sample_shape=ctx.model.embed_user.weight.shape,
        params={
            "loc": 0,
            "scale": ctx.scale
        },
        device=ctx.model.embed_user.weight.device)
    dense_item_grad.data += get_random(
        "Normal",
        sample_shape=ctx.model.embed_item.weight.shape,
        params={
            "loc": 0,
            "scale": ctx.scale
        },
        device=ctx.model.embed_item.weight.device)

    ctx.model.embed_user.weight.grad = dense_user_grad.to_sparse()
    ctx.model.embed_item.weight.grad = dense_item_grad.to_sparse()
    ctx.optimizer.step()

    # Embedding clipping
    with torch.no_grad():
        embedding_clip(ctx.model.embed_user.weight, ctx.sgdmf_R)
        embedding_clip(ctx.model.embed_item.weight, ctx.sgdmf_R)
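The same densify, add noise, step, clip sequence can be tried outside FederatedScope with the rough sketch below. Everything in it is a stand-in: torch.randn_like replaces get_random, clip_rows_to_norm is a hypothetical helper and not the library's embedding_clip, and the sizes, learning rate, noise scale and norm bound are toy values. Unlike the function above, the sketch leaves the noisy gradient dense instead of calling to_sparse(). Once Gaussian noise is added to every entry the gradient has no zeros left, so a sparse layout may not save anything. Whether that is acceptable depends on the optimizer in use.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def clip_rows_to_norm(weight, max_norm):
    """Hypothetical stand-in for embedding_clip: rescale any row whose
    L2 norm exceeds max_norm so that its norm becomes max_norm."""
    norms = weight.norm(dim=1, keepdim=True).clamp_min(1e-12)
    weight.mul_(torch.clamp(max_norm / norms, max=1.0))


embed = nn.Embedding(4, 2, sparse=True)
optimizer = torch.optim.SGD(embed.parameters(), lr=0.1)
noise_scale = 0.5  # plays the role of ctx.scale (toy value)
max_norm = 1.0     # plays the role of ctx.sgdmf_R (toy value)

optimizer.zero_grad()
loss = embed(torch.tensor([1, 3])).pow(2).sum()
loss.backward()

# Densify the (possibly sparse) gradient before adding dense noise.
grad = embed.weight.grad
dense_grad = grad.to_dense() if grad.is_sparse else grad.clone()

# Inject Gaussian noise with mean 0 and standard deviation noise_scale.
noisy_grad = dense_grad + noise_scale * torch.randn_like(dense_grad)

# Keep the gradient dense here; the function above re-sparsifies instead.
embed.weight.grad = noisy_grad
optimizer.step()

# Clip the updated embeddings without tracking gradients.
with torch.no_grad():
    clip_rows_to_norm(embed.weight, max_norm)

row_norms = embed.weight.detach().norm(dim=1)
print(row_norms)
```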

The code now runs, but I'm not sure whether there are any other issues.
I rarely use GitHub, so I may need to learn how to open a pull request later.

Env.:
python 3.9
torch 1.10.1
cuda 11.3

Thank you for your work. Have a good day. :)
