Verify BacDive enzyme activities use correct edge type (not metabolite utilization) #522

@turbomam

Context

PR #174 (opened 2024-06-04 by @realmarcin, now closed) removed NCBI_TO_ENZYME_EDGE and routed all enzyme activity data through NCBI_TO_METABOLITE_UTILIZATION_EDGE. Copilot review (Aug 2025) flagged this as a bug — enzyme activities and metabolite utilization are semantically different and should have distinct edge types.

The original problem

The BacDive transform was conflating two types of edges:

  • Enzyme activities (organism has enzyme capability, e.g. catalase) — should use biolink:capable_of / CAPABLE_OF relation
  • Metabolite utilization (organism uses/produces a metabolite) — should use biolink:has_phenotype or biolink:produces

PR #174 made this worse by merging them into one edge type.

Current state

The current bacdive.py has been heavily rewritten with a richer data model including METPO predicates and CAPABLE_OF relations. This issue may already be resolved, but it should be verified.

What needs to happen

  1. Check current bacdive.py enzyme activity edge handling — confirm enzyme activities get CAPABLE_OF or equivalent, not metabolite utilization predicates
  2. Verify edge predicates match the patterns established in the metatraits transform (PR #502, "Add metatraits data source transform and mappings (isolated)"): biolink:capable_of for enzymes, biolink:produces for products, biolink:has_phenotype for phenotypes
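
A rough sketch of what the check in step 1 could look like, run against the transform's edge output. Everything here is an assumption rather than the actual bacdive.py layout: it presumes a tab-separated edges file with `subject`, `predicate`, and `object` columns, and that enzyme nodes can be recognized by an `EC:` CURIE prefix. The helper name `find_mispredicated_enzyme_edges` and the sample rows are hypothetical and only illustrate the idea.

```python
import csv
import io

# Assumption: enzyme nodes are identified by an "EC:" CURIE prefix.
ENZYME_PREFIX = "EC:"
# Assumption: biolink:capable_of is the only accepted predicate; extend this
# set if the transform emits an equivalent (e.g. a METPO predicate).
ALLOWED_ENZYME_PREDICATES = {"biolink:capable_of"}


def find_mispredicated_enzyme_edges(rows):
    """Return edges whose object looks like an enzyme but whose predicate
    is not in the allowed set."""
    return [
        row
        for row in rows
        if row["object"].startswith(ENZYME_PREFIX)
        and row["predicate"] not in ALLOWED_ENZYME_PREDICATES
    ]


# Made-up sample standing in for the real edges file.
SAMPLE = (
    "subject\tpredicate\tobject\n"
    "NCBITaxon:562\tbiolink:capable_of\tEC:1.11.1.6\n"
    "NCBITaxon:562\tbiolink:has_phenotype\tEC:3.5.1.5\n"
    "NCBITaxon:562\tbiolink:produces\tCHEBI:16236\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
bad = find_mispredicated_enzyme_edges(reader)
for edge in bad:
    print(edge["subject"], edge["predicate"], edge["object"])
```

To run this against real output, replace the `io.StringIO(SAMPLE)` handle with an open file handle for the BacDive edges file. An empty result would suggest the issue is already resolved, provided the prefix assumption holds for how enzymes are actually encoded.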

Original PR

#174 by @realmarcin — closed 2026-03-18, see closing comment for full analysis.
