Error in handle_design_parameter(design, data, col_data, reference_level) : Too few replicates / too many coefficients to fit model.

Hi Constantin,

Thank you very much for glmGamPoi! I've been trying to get it to work on a dataset with 9 clusters and 2 genotypes, with 3 replicates per genotype. I tried to run:
```
fit <- glm_gp(bm.mat_subset,
              design = ~ genotype + seurat_clusters + genotype:seurat_clusters - 1,
              col_data = metadat,
              subsample = TRUE,
              on_disk = FALSE,
              reference_level = "WT")

c0 <- test_de(fit,
              contrast = genotypeKO - genotypeWT,
              pseudobulk_by = mouse,
              sort_by = pval, decreasing = FALSE)

Error in handle_design_parameter(design, data, col_data, reference_level) : 
  The model_matrix has more columns (18) than the there are samples in the data matrix (6 columns).
Too few replicates / too many coefficients to fit model.
The head of the design matrix: 
     genotypeWT genotypeKO seurat_clusters1 seurat_clusters2 seurat_clusters3 seurat_clusters4 seurat_clusters5 seurat_clusters6 seurat_clusters7 seurat_clusters8 
```
I've looked at your example, and I get the same error if I don't pre-filter the data to a few clusters (NK cells, B cells and T cells), as is done in the example. The resulting fit has 16 coefficients and the data has 16 samples (ind + stim), so it produces the same error: too few replicates / too many coefficients to fit the model. Is this a bug, or do we always have to pre-filter the data so that the number of coefficients is less than the number of samples you "pseudobulk_by"?
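To make the counting in my own case concrete, here is a minimal base-R sketch. It uses a made-up metadata table (`toy_meta`, 6 hypothetical mice and 9 clusters) in place of my real `metadat`, and only `model.matrix()`, so it says nothing about what glmGamPoi does internally. It just compares the number of design-matrix columns with the number of groups that `pseudobulk_by = mouse` would produce:

```r
# Toy metadata standing in for the real col_data (hypothetical values):
# 2 genotypes x 3 mice each, 9 clusters, one row per mouse/cluster combination.
mice <- data.frame(
  mouse    = paste0("m", 1:6),
  genotype = factor(rep(c("WT", "KO"), each = 3), levels = c("WT", "KO"))
)
# merge() without shared columns gives the full cross product (6 x 9 = 54 rows).
toy_meta <- merge(mice, data.frame(seurat_clusters = factor(0:8)))

# Same design formula as in the glm_gp() call above.
mm <- model.matrix(~ genotype + seurat_clusters + genotype:seurat_clusters - 1,
                   data = toy_meta)

n_coef    <- ncol(mm)                        # columns of the design matrix
n_samples <- length(unique(toy_meta$mouse))  # groups when pseudobulking by mouse

n_coef     # 18
n_samples  # 6
```

With 18 coefficients and only 6 pseudobulk groups, the numbers match the ones reported in the error message.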

Would you say that the following would be a good way to overcome the issue above?

```
de_res <- test_de(fit, contrast = `stimstim` + `cellCD4 T cells:stimstim`,
                  pseudobulk_by = paste(stim, ind, cell, sep = "_"))
```
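The reason I am considering this grouping is simply the arithmetic, sketched below in base R. The table `toy` is hypothetical (8 individuals, 2 conditions and 8 placeholder cell types, every combination present), so the real data may have fewer groups if some combinations are missing. Whether pseudobulking at this finer level is statistically appropriate is exactly what I am unsure about.

```r
# Hypothetical balanced layout: 8 individuals x 2 conditions x 8 cell types.
toy <- expand.grid(
  ind  = paste0("ind", 1:8),
  stim = c("ctrl", "stim"),
  cell = paste0("type", 1:8),
  stringsAsFactors = TRUE
)

# Number of coefficients for the design used in the example fit.
n_coef <- ncol(model.matrix(~ cell + stim + stim:cell - 1, data = toy))

# Number of pseudobulk groups under the two grouping choices.
n_by_sample      <- length(unique(paste0(toy$stim, "-", toy$ind)))
n_by_sample_cell <- length(unique(paste(toy$stim, toy$ind, toy$cell, sep = "_")))

c(coefficients = n_coef,
  by_sample = n_by_sample,
  by_sample_cell = n_by_sample_cell)
```

In this toy layout, grouping by condition and individual gives as many groups as coefficients, while adding the cell type to the grouping key gives many more groups than coefficients.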


Thank you very much for your help

Miriam

```

sce_subset <- sce[rowSums(counts(sce)) > 100,
                  sample(which(!is.na(sce$cell)), 1000)]
counts(sce_subset) <- as.matrix(counts(sce_subset))
sce_subset$cell <- droplevels(sce_subset$cell)
fit <- glm_gp(sce_subset, design = ~ cell + stim + stim:cell - 1,
              reference_level = "NK cells")
fit
glmGamPoiFit object:
The data had 9727 rows and 1000 columns.
A model with 16 coefficient was fitted.
> de_res <- test_de(fit, contrast = `stimstim` + `cellCD4 T cells:stimstim`, 
                  pseudobulk_by = paste0(stim, "-", ind)) 

Error in handle_design_parameter(design, data, col_data, reference_level) : 
  The model_matrix has more columns (16) than the there are samples in the data matrix (16 columns).
Too few replicates / too many coefficients to fit model.
The head of the design matrix: 
          cellNK cells cellB cells cellCD14+ Monocytes cellCD4 T cells cellCD8 T cells cellDendritic cells cellFCGR3A+ Monocytes cellMegakaryocytes stimstim cellB cells:stimstim cellCD14+ Monocytes:stimstim cellCD4 T cells:stimstim cellCD8 T cells:stimstim cellDendritic cells:stimstim cellFCGR3A+ Monocytes:stimstim cellMegakaryocytes:stimstim
 ctrl-101   0.05714286  0.11428571           0.4000000       0.2285714      0.14285714          0.00000000            0.02857143         0.02857143        0                    0                            0                        0                        0                            0                              0                           0
ctrl-1015   0.04761905  0.20000000           0.2761905       0.3333333      0.03809524          0.01904762      

```
