Describe the bug
When running an evaluation that includes certain "beyond-accuracy" metrics (like AveragePopularity, GiniIndex, or ShannonEntropy), the program crashes at the end of the first validation epoch.
The traceback points to recbole/evaluator/collector.py in the get_data_struct function. The code attempts to call .cpu() on all values collected for evaluation. However, some metrics (like AveragePopularity) require simple integer values (which are not torch.Tensor objects) to be collected.
This results in an AttributeError: 'int' object has no attribute 'cpu', which crashes the evaluation.
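The failure mode can be reproduced in isolation, without RecBole or PyTorch installed. The sketch below is only an illustration: the dictionary, its key name, and the helper `move_all_to_cpu` are made up for this example and are not RecBole code. It mimics a loop that calls .cpu() on every collected value and shows what happens when one of them is a plain int.

```python
# Illustrative stand-in for the collector's internal dict. The key name and
# value are hypothetical, chosen only to show the failure mode.
collected = {"example.item_count": 1682}


def move_all_to_cpu(data):
    """Mimic an unguarded loop that calls .cpu() on every value."""
    for key in data:
        data[key] = data[key].cpu()


try:
    move_all_to_cpu(collected)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)  # 'int' object has no attribute 'cpu'
```

The message matches the last line of the traceback in this report, which suggests that any non-tensor value in the collected data is enough to trigger the crash.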
To Reproduce
Steps to reproduce the behavior:
Save the following as config.yaml in the same folder as run_recbole.py:
model: BPR
dataset: ml-100k
epochs: 2
eval_setting: TO_RS, 80_10_10
topk: 10
# Bug Trigger: Using 'AveragePopularity' (or GiniIndex, etc.)
metrics: [Hit, NDCG, AveragePopularity]
# Required to prevent a different KeyError
valid_metric: ndcg@10
Execute run_recbole.py in the terminal:
python run_recbole.py --config_files=config.yaml
Expected behavior
The evaluation should complete successfully for all epochs and print the results table, including the scores for Hit, NDCG, and AveragePopularity.
Instead, the run fails with the following traceback:
Traceback (most recent call last):
  File "C:\...\run_recbole.py", line 46, in <module>
    run(
  File "C:\...\recbole\quick_start\quick_start.py", line 52, in run
    res = run_recbole(
          ^^^^^^^^^^^^
  ...
  File "C:\...\recbole\evaluator\collector.py", line 227, in get_data_struct
    self.data_struct._data_dict[key] = self.data_struct._data_dict[key].cpu()
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'int' object has no attribute 'cpu'
Desktop (please complete the following information):
- OS: Windows 11
- RecBole Version: latest on GitHub
- Python Version: 3.12.3
- PyTorch Version: 2.5.1+cu118
- cudatoolkit Version: None
The Fix
After cloning the repository, open /RecBole/recbole/evaluator/collector.py.
Modify the get_data_struct method so that .cpu() is called only on torch.Tensor values, by adding this snippet:
value = self.data_struct._data_dict[key]
if isinstance(value, torch.Tensor):
    self.data_struct._data_dict[key] = value.cpu()
The result should look like this:
def get_data_struct(self):
    """Get all the evaluation resource that been collected.
    And reset some of outdated resource.
    """
    for key in self.data_struct._data_dict:
        # ------------- START FIX -------------
        value = self.data_struct._data_dict[key]
        if isinstance(value, torch.Tensor):
            self.data_struct._data_dict[key] = value.cpu()
        # ------------- END FIX -------------
    returned_struct = copy.deepcopy(self.data_struct)
    for key in ["rec.topk", "rec.meanrank", "rec.score", "rec.items", "data.label"]:
        if key in self.data_struct:
            del self.data_struct[key]
    return returned_struct
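The guard can also be checked outside RecBole with a small standalone sketch. Everything here is an assumption made for illustration: `FakeTensor` is a stand-in for torch.Tensor so the example runs without PyTorch, and `move_tensors_to_cpu` and the key "example.item_count" are hypothetical names that do not exist in RecBole. With PyTorch installed, passing torch.Tensor as `tensor_type` should behave the same way.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor that only tracks a device."""

    def __init__(self, device="cuda"):
        self.device = device

    def cpu(self):
        # Like torch.Tensor.cpu(), return a copy that lives on the CPU.
        return FakeTensor(device="cpu")


def move_tensors_to_cpu(data, tensor_type):
    """Call .cpu() only on values of tensor_type; leave the rest untouched."""
    for key, value in data.items():
        if isinstance(value, tensor_type):
            data[key] = value.cpu()
    return data


collected = {
    "rec.topk": FakeTensor(device="cuda"),  # tensor-like value
    "example.item_count": 1682,             # plain int, hypothetical key
}
result = move_tensors_to_cpu(collected, FakeTensor)

print(result["rec.topk"].device)     # cpu
print(result["example.item_count"])  # 1682
```

The isinstance check leaves integers and other non-tensor values exactly as they were collected, so metrics that rely on them still receive their original inputs.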