Hi Constantin,
Thank you very much for glmGamPoi! I've been trying to get it work on a dataset I have, where I have 9 clusters for 2 genotypes, 3 replicate for each genotype. I have tried to run:
fit <- glm_gp( bm.mat_subset,
design = ~ genotype + seurat_clusters + genotype:seurat_clusters - 1,
col_data = metadat,
subsample = TRUE,
on_disk = FALSE,
reference_level = "WT" )
c0 <- test_de(fit,
contrast = genotypeKO - genotypeWT,
pseudobulk_by = mouse ,
sort_by = pval, decreasing = FALSE)
Error in handle_design_parameter(design, data, col_data, reference_level) :
The model_matrix has more columns (18) than the there are samples in the data matrix (6 columns).
Too few replicates / too many coefficients to fit model.
The head of the design matrix:
genotypeWT genotypeKO seurat_clusters1 seurat_clusters2 seurat_clusters3 seurat_clusters4 seurat_clusters5 seurat_clusters6 seurat_clusters7 seurat_clusters8
I've looked at your example and I get the same error if I don't pre-filter the data to few clusters (NK cells, B cells and T cells), as it's done in the example. The resulting fit has 16 coeficients and the data has 16 samples ( ind + stim) so it produces the same error too few replicates/too many coeficients to fit the model. Is this a bug of do we always have to prefilter the data so the number of coeficients is less than the number of samples you "pseudobulk_by"?
would you say that doing the following would be a good way to overcome the issues above?
de_res <- test_de(fit, contrast = `stimstim` + `cellCD4 T cells:stimstim`,
pseudobulk_by = paste(stim, ind, cell, sep="_" ))
Thank you very much for your help
Miriam
sce_subset <- sce[rowSums(counts(sce)) > 100,
sample(which(! is.na(sce$cell)), 1000)]
counts(sce_subset) <- as.matrix(counts(sce_subset))
sce_subset$cell <- droplevels(sce_subset$cell)
fit <- glm_gp(sce_subset, design = ~ cell + stim + stim:cell - 1,
reference_level = "NK cells")
fit
glmGamPoiFit object:
The data had 9727 rows and 1000 columns.
A model with 16 coefficient was fitted.
> de_res <- test_de(fit, contrast = `stimstim` + `cellCD4 T cells:stimstim`,
pseudobulk_by = paste0(stim, "-", ind))
Error in handle_design_parameter(design, data, col_data, reference_level) :
The model_matrix has more columns (16) than the there are samples in the data matrix (16 columns).
Too few replicates / too many coefficients to fit model.
The head of the design matrix:
cellNK cells cellB cells cellCD14+ Monocytes cellCD4 T cells cellCD8 T cells cellDendritic cells cellFCGR3A+ Monocytes cellMegakaryocytes stimstim cellB cells:stimstim cellCD14+ Monocytes:stimstim cellCD4 T cells:stimstim cellCD8 T cells:stimstim cellDendritic cells:stimstim cellFCGR3A+ Monocytes:stimstim cellMegakaryocytes:stimstim
ctrl-101 0.05714286 0.11428571 0.4000000 0.2285714 0.14285714 0.00000000 0.02857143 0.02857143 0 0 0 0 0 0 0 0
ctrl-1015 0.04761905 0.20000000 0.2761905 0.3333333 0.03809524 0.01904762
Hi Constantin,
Thank you very much for glmGamPoi! I've been trying to get it work on a dataset I have, where I have 9 clusters for 2 genotypes, 3 replicate for each genotype. I have tried to run:
I've looked at your example and I get the same error if I don't pre-filter the data to few clusters (NK cells, B cells and T cells), as it's done in the example. The resulting fit has 16 coeficients and the data has 16 samples ( ind + stim) so it produces the same error too few replicates/too many coeficients to fit the model. Is this a bug of do we always have to prefilter the data so the number of coeficients is less than the number of samples you "pseudobulk_by"?
would you say that doing the following would be a good way to overcome the issues above?
Thank you very much for your help
Miriam