Rename fold_change column to log2_fold_change (deprecate alias)#74

Merged
sullivanj91 merged 3 commits into
ArcInstitute:mainfrom
LeonHafner:leonhafner/fix_log2fc
May 4, 2026
@LeonHafner

Collaborator

Summary

The fold_change column in pdex output has held log2(target_mean / ref_mean) since 0.2.0 despite its name suggesting a linear ratio. This silently broke downstream consumers (notably cell-eval ≤ 0.7.0,
which double-logged the column and zeroed out roughly half of all DE values).
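To make the failure mode concrete, here is a minimal sketch of what happens when a consumer treats an already-log2 column as a linear ratio. The means are made-up numbers, not pdex output, and `log2_or_nan` is a hypothetical helper that mirrors how array libraries such as NumPy return NaN for negative input. How cell-eval actually handled the resulting non-finite values is not shown here.

```python
import math


def log2_or_nan(x: float) -> float:
    """log2 that yields NaN / -inf instead of raising, like array libraries do."""
    if x > 0:
        return math.log2(x)
    return -math.inf if x == 0 else math.nan


# Hypothetical per-gene (target_mean, ref_mean) pairs, for illustration only.
means = [(8.0, 2.0), (2.0, 8.0), (6.0, 3.0), (1.0, 4.0)]

# What the `fold_change` column actually holds: log2(target / ref).
log2fc = [math.log2(t / r) for t, r in means]

# A consumer that assumes a linear ratio applies log2 a second time.
double_logged = [log2_or_nan(v) for v in log2fc]

# Every down-regulated gene (negative log2FC) becomes non-finite.
lost = sum(not math.isfinite(v) for v in double_logged)

print(log2fc)         # [2.0, -2.0, 1.0, -2.0]
print(double_logged)  # [1.0, nan, 0.0, nan]
print(lost)           # 2
```

Note that the surviving values are wrong as well (2.0 becomes 1.0, 1.0 becomes 0.0), so the damage is not limited to the down-regulated half.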

Discussion and decision: ArcInstitute/cell-eval#232. The agreed plan is a two-release deprecation:

  • 0.2.2 (this PR): add an explicit log2_fold_change column with the correct semantics; keep fold_change as a duplicate alias for one release; emit a DeprecationWarning on every pdex(...) call.
  • 0.3.0 (follow-up): drop the fold_change alias.

Changes

  • src/pdex/_math.py — renamed the internal fold_change function to log2_fold_change (the docstring already said "log2-fold change"; pure clarity rename).
  • src/pdex/__init__.py — output pl.DataFrames now carry both fold_change and log2_fold_change (identical values); pdex() emits a DeprecationWarning directing callers to migrate.
  • pyproject.toml — version bump 0.2.1 → 0.2.2.
  • README.md, CLAUDE.md — output-schema tables updated; column order matches the actual DataFrame.
  • Tests — added a parametrised regression test (mode="ref" and mode="all") asserting log2_fold_change == log2(target_mean / ref_mean) on finite rows.
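The alias-plus-warning pattern described in the list above can be sketched roughly as follows. This is not the pdex implementation: `build_output` is a hypothetical stand-in for `pdex()`, the result is a plain dict rather than a `pl.DataFrame`, and any pseudocount is left out for brevity.

```python
import math
import warnings


def build_output(target_means, ref_means):
    """Hypothetical stand-in for pdex(); returns output columns as a dict."""
    log2fc = [math.log2(t / r) for t, r in zip(target_means, ref_means)]
    # Warn on every call, since the deprecated column is always present.
    warnings.warn(
        "The `fold_change` column is deprecated and will be removed in 0.3.0. "
        "Use `log2_fold_change` instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return {
        "target_mean": list(target_means),
        "ref_mean": list(ref_means),
        "fold_change": list(log2fc),  # deprecated alias, identical values
        "log2_fold_change": log2fc,
    }


# Capture the warning so the sketch runs quietly and can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = build_output([8.0, 1.0], [2.0, 4.0])

# Regression-style check: the new column equals log2(target_mean / ref_mean).
expected = [math.log2(t / r) for t, r in zip(out["target_mean"], out["ref_mean"])]
assert out["log2_fold_change"] == expected
assert out["fold_change"] == out["log2_fold_change"]
```

Warning once per call rather than on column access keeps the change small, at the cost of also warning callers who have already migrated.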

Migration

# old
df["fold_change"]   # values are log2(target/ref) since 0.2.0; pdex() emits a DeprecationWarning in 0.2.2

# new
df["log2_fold_change"]

The two columns are identical in 0.2.2. fold_change is removed in 0.3.0.
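Downstream code that has to work across pdex versions during the transition could pick the column name defensively. A minimal sketch, assuming the caller passes the column names of the result (for a polars DataFrame, `df.columns`); `log2fc_column` is a hypothetical helper, not part of pdex:

```python
def log2fc_column(columns):
    """Return the name of the log2 fold-change column, preferring the new name."""
    if "log2_fold_change" in columns:
        return "log2_fold_change"  # pdex 0.2.2 and later
    if "fold_change" in columns:
        # Older output: per this PR the old name has held log2 values since 0.2.0.
        return "fold_change"
    raise KeyError("no fold-change column found in pdex output")


print(log2fc_column(["target_mean", "ref_mean", "fold_change", "log2_fold_change"]))
print(log2fc_column(["target_mean", "ref_mean", "fold_change"]))
```

Usage would then look like `df[log2fc_column(df.columns)]`. The fallback is only safe for output produced by pdex 0.2.0 or later; the semantics of earlier releases are not covered by this PR.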
                                                                                                                                                                                                                   
Test plan

- uv run pytest -v passes
- uv run ruff format clean
- uv run ty check clean
- CI semver-check job passes (version bumped)

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates the package to version 0.2.2, renaming the fold_change metric to log2_fold_change and marking the original name as deprecated. The changes include documentation updates, internal logic adjustments to support both column names for backward compatibility, and the introduction of a deprecation warning. Feedback recommends updating docstring formulas to include the epsilon parameter for accuracy and using FutureWarning instead of DeprecationWarning to ensure users are properly notified of the upcoming breaking change.

Comment thread src/pdex/__init__.py Outdated
Comment on lines +243 to +244
``log2(target_mean / ref_mean)`` and ``percent_change`` is
``(target_mean - ref_mean) / ref_mean``. The MWU ``p_value`` and
``(target_mean - ref_mean) / ref_mean``.

Contributor

medium

The formulas for log2_fold_change and percent_change in the docstring should include the epsilon parameter to be accurate and consistent with the updated CLAUDE.md. Currently, they represent the case where epsilon=0, which may be misleading when a pseudocount is applied.

Suggested change
- ``log2(target_mean / ref_mean)`` and ``percent_change`` is
- ``(target_mean - ref_mean) / ref_mean``. The MWU ``p_value`` and
- ``(target_mean - ref_mean) / ref_mean``.
+ ``log2((target_mean + epsilon) / (ref_mean + epsilon))`` and ``percent_change`` is
+ ``(target_mean - ref_mean) / (ref_mean + epsilon)``.
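For reference, the two formulas in the suggestion can be written out as a short sketch. The function names, signatures, and the `epsilon=0.0` default are assumptions made for illustration; they are not taken from `src/pdex/_math.py`.

```python
import math


def log2_fold_change(target_mean: float, ref_mean: float, epsilon: float = 0.0) -> float:
    """log2((target_mean + epsilon) / (ref_mean + epsilon))."""
    return math.log2((target_mean + epsilon) / (ref_mean + epsilon))


def percent_change(target_mean: float, ref_mean: float, epsilon: float = 0.0) -> float:
    """(target_mean - ref_mean) / (ref_mean + epsilon)."""
    return (target_mean - ref_mean) / (ref_mean + epsilon)


# With epsilon = 0 the formulas reduce to the ones in the original docstring.
print(percent_change(3.0, 1.0))         # 2.0

# With a pseudocount of 1 both results shrink toward zero.
print(log2_fold_change(3.0, 1.0, 1.0))  # 1.0
print(percent_change(3.0, 1.0, 1.0))    # 1.0
```

The gap between the two cases is why a docstring that omits epsilon only describes the `epsilon=0` behaviour.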

Comment thread src/pdex/__init__.py Outdated
``fold_change`` is a **deprecated** alias for ``log2_fold_change``
(identical values). It is retained for one release to ease migration
and will be removed in pdex 0.3.0. New code should read
``log2_fold_change`` directly. A :class:`DeprecationWarning` is emitted

Contributor

medium

If the warning type is changed to FutureWarning (see suggestion below), this docstring should be updated accordingly.

Suggested change
- ``log2_fold_change`` directly. A :class:`DeprecationWarning` is emitted
+ ``log2_fold_change`` directly. A :class:`FutureWarning` is emitted

Comment thread src/pdex/__init__.py Outdated
Comment on lines +270 to +273
"The `fold_change` column in pdex output is deprecated and will be "
"removed in pdex 0.3.0. Use `log2_fold_change` instead — it contains "
"the same values (`log2(target_mean / ref_mean)`).",
DeprecationWarning,

Contributor

medium

Consider using FutureWarning instead of DeprecationWarning. DeprecationWarning is filtered out by default in many non-interactive environments, whereas FutureWarning is intended for end-users to see upcoming breaking changes. Given the goal is to ensure users migrate before the 0.3.0 release, FutureWarning provides better visibility. Additionally, I've simplified the message to avoid the inaccurate formula when epsilon > 0.

Suggested change
- "The `fold_change` column in pdex output is deprecated and will be "
- "removed in pdex 0.3.0. Use `log2_fold_change` instead — it contains "
- "the same values (`log2(target_mean / ref_mean)`).",
- DeprecationWarning,
+ "The `fold_change` column in pdex output is deprecated and will be "
+ "removed in pdex 0.3.0. Use `log2_fold_change` instead — it contains "
+ "the same values.",
+ FutureWarning,
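The visibility difference behind this suggestion can be reproduced with the standard `warnings` module. The sketch below does not rely on the interpreter's ambient filter state; it installs an explicit `ignore` filter for `DeprecationWarning`, which approximates what Python's default filters do for warnings raised from library code. The `emit` helper and the message text are illustrative only.

```python
import warnings


def emit(category):
    """Raise an illustrative deprecation message with the given category."""
    warnings.warn(
        "`fold_change` is deprecated; use `log2_fold_change`.",
        category,
        stacklevel=2,
    )


def visible_categories():
    """Return the warning categories that get past a default-like filter setup."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Approximates the default `ignore::DeprecationWarning` filter entry.
        warnings.filterwarnings("ignore", category=DeprecationWarning)
        emit(DeprecationWarning)
        emit(FutureWarning)
    return [w.category.__name__ for w in caught]


print(visible_categories())  # ['FutureWarning']
```

Only the `FutureWarning` is recorded, which matches the point that end users running scripts or pipelines are unlikely to see a `DeprecationWarning` unless they opt in.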

@sullivanj91 sullivanj91 left a comment

Contributor

Thanks Leon!

@sullivanj91 sullivanj91 merged commit 167dc1a into ArcInstitute:main May 4, 2026
8 checks passed
@LeonHafner LeonHafner deleted the leonhafner/fix_log2fc branch May 4, 2026 20:43