The test is failing on platforms:
- Linux-6.1.85+-x86_64-with-glibc2.35
  - Python 3.10.12
  - pandas 2.0.3
  - numpy 1.25.2
- Windows-11-10.0.22631-SP0
  - Python 3.12
  - pandas 2.2.2
  - numpy 2.0.0
Update: The test passes with Python 3.8 and pandas 1.3.0; it seems to be a backward compatibility issue with pandas.
Failed test:
test_GEOparse.py:626 (TestGSE.test_merge_and_average)
TypeError: agg function failed [how->mean,dtype->object]
TypeError: Could not convert string 'DNA segment, Chr 8, ERATO Doi 594, expressed' to numeric
The test fails on this line:
..\src\GEOparse\GEOTypes.py:445: in annotate_and_average
tmp_data = tmp_data.groupby(group_by_column).mean()[[expression_column]]
where:
- tmp_data is a pandas DataFrame that contains both numeric and string columns (attached: tmp_data.csv)
- expression_column = 'VALUE'
- group_by_column = 'GB_ACC'
Jupyter notebook reproducing the issue:
https://gist.github.com/olp-cs/9902b5cdc554afbf3faa7127ee602f20
Would it make sense to filter the columns first, to keep the numerical ones only?
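
For illustration, here is a minimal sketch of what such a filter could look like. It is not a tested patch for GEOparse: the small DataFrame is a made-up stand-in for tmp_data.csv, and the "Description" column name is an assumption. The likely cause is that pandas 2.0 changed the default of numeric_only in groupby aggregations to False, so string columns are no longer dropped silently before averaging.

```python
import pandas as pd

# Hypothetical stand-in for tmp_data: one string column, one numeric column.
tmp_data = pd.DataFrame(
    {
        "GB_ACC": ["A1", "A1", "B2"],
        "Description": ["first probe", "second probe", "third probe"],
        "VALUE": [1.0, 3.0, 5.0],
    }
)
group_by_column = "GB_ACC"
expression_column = "VALUE"

# Option 1: select the expression column before aggregating,
# so mean() never sees the string columns.
averaged = tmp_data.groupby(group_by_column)[[expression_column]].mean()

# Option 2: keep the original order of operations but ask pandas
# to skip non-numeric columns explicitly.
averaged_numeric = tmp_data.groupby(group_by_column).mean(numeric_only=True)[
    [expression_column]
]

print(averaged)
```

Option 1 does not depend on the numeric_only default at all, so it should behave the same on old and new pandas versions; Option 2 stays closer to the current line in annotate_and_average.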