[🐛BUG] AttributeError in eval_collector when using non-accuracy metrics like AveragePopularity #2194

@pareror

Describe the bug
When running an evaluation that includes certain "beyond-accuracy" metrics (like AveragePopularity, GiniIndex, or ShannonEntropy), the program crashes at the end of the first validation epoch.

The traceback points to recbole/evaluator/collector.py in the get_data_struct function. The code attempts to call .cpu() on all values collected for evaluation. However, some metrics (like AveragePopularity) require simple integer values (which are not torch.Tensor objects) to be collected.

This results in an AttributeError: 'int' object has no attribute 'cpu', which crashes the evaluation.
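
The failure mode can be reproduced without RecBole or PyTorch. The sketch below is only an illustration: `FakeTensor` is a hypothetical stand-in for `torch.Tensor`, and the dictionary keys and the value 1682 are made up rather than taken from RecBole. It shows what happens when `.cpu()` is called unconditionally on a dictionary that mixes tensor-like values with a plain int.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; it only mimics .cpu()."""

    def cpu(self):
        return self


# Hypothetical keys and values: one tensor-like entry and one plain int.
data_dict = {"example.tensor": FakeTensor(), "example.count": 1682}

error_message = None
try:
    for key in data_dict:
        # Same pattern as the failing line: .cpu() is called on every value.
        data_dict[key] = data_dict[key].cpu()
except AttributeError as exc:
    error_message = str(exc)

print(error_message)  # 'int' object has no attribute 'cpu'
```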

To Reproduce
Steps to reproduce the behavior:
Save the following as config.yaml in the same folder as run_recbole.py:

model: BPR
dataset: ml-100k
epochs: 2
eval_setting: TO_RS, 80_10_10
topk: 10

# Bug Trigger: Using 'AveragePopularity' (or GiniIndex, etc.)
metrics: [Hit, NDCG, AveragePopularity]

# Required to prevent a different KeyError
valid_metric: ndcg@10

Then run run_recbole.py in the terminal:
python run_recbole.py --config_files=config.yaml

Expected behavior
The evaluation should complete successfully for all epochs and print the results table, including the scores for Hit, NDCG, and AveragePopularity.
Instead, the run crashes with the following traceback:

Traceback (most recent call last):
  File "C:\...\run_recbole.py", line 46, in <module>
    run(
  File "C:\...\recbole\quick_start\quick_start.py", line 52, in run
    res = run_recbole(
          ^^^^^^^^^^^^
...
  File "C:\...\recbole\evaluator\collector.py", line 227, in get_data_struct
    self.data_struct._data_dict[key] = self.data_struct._data_dict[key].cpu()
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'int' object has no attribute 'cpu'

Desktop (please complete the following information):

  • OS: Windows 11
  • RecBole Version latest on GitHub
  • Python Version 3.12.3
  • PyTorch Version 2.5.1+cu118
  • cudatoolkit Version None

The Fix
After cloning the repository, open /RecBole/recbole/evaluator/collector.py
and modify the get_data_struct method so that .cpu() is only called on tensors:

value = self.data_struct._data_dict[key]
if isinstance(value, torch.Tensor):
    self.data_struct._data_dict[key] = value.cpu()

The result should look like this:

# Method of the collector class; needs `copy` and `torch` imported at module level.
def get_data_struct(self):
    """Get all the evaluation resource that been collected.
    And reset some of outdated resource.
    """
    for key in self.data_struct._data_dict:
        # ------------- START FIX -------------
        # Only tensors have .cpu(); leave ints and other plain values untouched.
        value = self.data_struct._data_dict[key]
        if isinstance(value, torch.Tensor):
            self.data_struct._data_dict[key] = value.cpu()
        # ------------- END FIX -------------

    returned_struct = copy.deepcopy(self.data_struct)
    for key in ["rec.topk", "rec.meanrank", "rec.score", "rec.items", "data.label"]:
        if key in self.data_struct:
            del self.data_struct[key]
    return returned_struct
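
To check the guard in isolation, here is a minimal standalone sketch of the same `isinstance` test. It is not RecBole code: `move_tensors_to_cpu`, `FakeTensor`, the dictionary keys, and the value 1682 are all hypothetical names and data used for illustration. The tensor type is passed as a parameter so the snippet runs without PyTorch; in real use you would pass `torch.Tensor`.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor that tracks a device name."""

    def __init__(self, device="cuda"):
        self.device = device

    def cpu(self):
        # Like Tensor.cpu(), return a copy that lives on the CPU.
        return FakeTensor(device="cpu")


def move_tensors_to_cpu(data_dict, tensor_type):
    """Call .cpu() only on values that are instances of tensor_type."""
    for key, value in data_dict.items():
        if isinstance(value, tensor_type):
            data_dict[key] = value.cpu()
    return data_dict


# Hypothetical collected data: one tensor-like value and one plain int.
collected = {"example.tensor": FakeTensor(), "example.count": 1682}
move_tensors_to_cpu(collected, FakeTensor)

print(collected["example.tensor"].device)  # cpu
print(collected["example.count"])          # 1682
```

The int passes through unchanged, so no AttributeError is raised, while the tensor-like value is still moved to the CPU.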
