Dear Arc Institute STATE team,
I am currently trying to use ST-HVG-Tahoe for fine-tuning and external validation with DILImap data.
I would like to know where I can find the exact 2000 HVG gene list corresponding to obsm["X_hvg"] in ST-HVG-Tahoe.
I checked both the few-shot and zero-shot model files, including:
- config.yaml
- var_dims.pkl
- data_module.torch
- hparams.yaml
- generalization.toml
- batch_onehot_map.pkl
- cell_type_onehot_map.pkl
- pert_onehot_map.pt
- best.ckpt, final.ckpt, and last.ckpt
- evaluation: adata_real.h5ad, adata_pred.h5ad, and real_de.csv files
All model files indicate that input_dim, hvg_dim, and output_dim are 2000, and that embed_key is X_hvg. However, gene_names contains 62,710 genes, and I could not find a length-2000 gene name list. In the evaluation h5ad and DE csv files, the features appear to be stored only as indices 0–1999.
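For reference, the kind of search described above can be sketched as follows. This is only an illustration: it assumes the loaded objects are nested dicts and lists, `find_by_length` is a hypothetical helper, and the `example` dict is synthetic (its layout is not taken from the actual ST-HVG-Tahoe files).

```python
from collections.abc import Mapping


def find_by_length(obj, target_len=2000, path="root"):
    """Return paths to non-mapping, sized values whose length equals target_len."""
    hits = []
    if isinstance(obj, Mapping):
        # Descend into every value of a dict-like object.
        for key, value in obj.items():
            hits.extend(find_by_length(value, target_len, f"{path}[{key!r}]"))
        return hits
    if isinstance(obj, (str, bytes)):
        # Strings have a length but are never a gene list.
        return hits
    try:
        n = len(obj)
    except TypeError:
        # Scalars such as ints have no length.
        return hits
    if n == target_len:
        hits.append(path)
    if isinstance(obj, (list, tuple)):
        # Only recurse into nested containers, not into plain elements.
        for i, value in enumerate(obj):
            if isinstance(value, (Mapping, list, tuple)):
                hits.extend(find_by_length(value, target_len, f"{path}[{i}]"))
    return hits


# Synthetic stand-in for a loaded metadata object (hypothetical layout).
example = {
    "gene_names": [f"gene_{i}" for i in range(62710)],
    "hvg_dim": 2000,
    "input_dim": 2000,
    "output_dim": 2000,
}
print(find_by_length(example, target_len=2000))
```

On an object shaped like this one the search returns no paths, which matches the situation described: the dimensions are recorded as 2000, but no list of 2000 names is stored alongside them.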
Could you please let me know where the exact X_hvg feature gene list and order can be found?
I need this information to align external datasets such as DILImap for fine-tuning and DEG overlap analysis.
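To make the intended alignment concrete, here is a minimal sketch of what I would do once the ordered gene list is known. `align_to_gene_order` is a hypothetical helper, the gene symbols are placeholders, and filling missing genes with zeros is my own assumption rather than anything confirmed for STATE.

```python
import numpy as np


def align_to_gene_order(X, genes, reference_genes):
    """Reorder the columns of X (cells x genes) to follow reference_genes.

    Genes absent from `genes` get an all-zero column (an assumption) and
    are also returned so they can be reported.
    """
    X = np.asarray(X)
    col_of = {g: j for j, g in enumerate(genes)}
    aligned = np.zeros((X.shape[0], len(reference_genes)), dtype=X.dtype)
    missing = []
    for k, g in enumerate(reference_genes):
        j = col_of.get(g)
        if j is None:
            missing.append(g)
        else:
            aligned[:, k] = X[:, j]
    return aligned, missing


# Toy example: 2 cells, 3 genes in the external dataset.
X_ext = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
ext_genes = ["A", "B", "C"]
hvg_order = ["C", "X", "A"]  # placeholder for the 2000-gene X_hvg order

X_aligned, missing = align_to_gene_order(X_ext, ext_genes, hvg_order)
print(X_aligned.shape, missing)
```

The same ordered list would also let me map the 0–1999 indices in the evaluation files back to gene names for the DEG overlap analysis.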
Thank you very much for your help.