Questions about Uni-FEP-Benchmarks data processing: ChEMBL versioning, assay filtering, target–PDB mapping, reference PDB selection

Hi maintainers,

Thanks for releasing Uni-FEP-Benchmarks.
I have a few technical questions about its data processing pipeline (ChEMBL + PDB → benchmarks):

**1. ChEMBL version / update plan**
- Was Uni-FEP-Benchmarks built using **ChEMBL 35**?
- I ask because the latest EGFR assay, **[CHEMBL5260455](https://www.ebi.ac.uk/chembl/explore/assay/CHEMBL5260455)**, is dated **2023-04-18**, well before the **[ChEMBL 35](https://chembl.gitbook.io/chembl-interface-documentation/downloads)** release date (December 2024), so it is unclear which release the data came from.
- Do you plan to update the benchmark following new ChEMBL releases (ChEMBL is now at **v37**)?

**2. Assay / affinity endpoint filtering**
- Which affinity types are included (Ki/Kd only, or also IC50/EC50, etc.)?
- What filters are applied for assay/data quality (assay type, confidence score, curated flags, removing ambiguous units, replicates/uncertainty/outlier handling, etc.)?

**3. Target → PDB mapping**
- How is a ChEMBL target mapped to PDB structures (e.g., UniProt/SIFTS mapping, sequence alignment/similarity search)?
- How are **mutations**, engineered constructs, missing segments, or isoforms handled?

**4. Reference PDB selection**
- When multiple PDB structures are available for the same target, what is the selection strategy for the reference structure?
- For targets with distinct conformational states (e.g., **GPCRs in agonist- vs antagonist-bound states**), how do you ensure the selected structure matches the relevant state?

**5. Reproducible pipeline / scripts**
- Are there scripts or a reproducible workflow available for the **ChEMBL + PDB → benchmark** generation? If not, any pointers to the key steps would be appreciated.

Thanks
