library(miloR)
library(SingleCellExperiment)
library(scater)
library(scran)
library(dplyr)
library(patchwork)
library(scRNAseq)
library(scuttle)
library(irlba)
library(BiocParallel)
library(ggplot2)

1 Introduction

Our first implementation of Milo used a negative binomial GLM to perform hypothesis testing and identify differentially abundant neighbourhoods. GLMs are incredibly powerful, but they have certain limitations. Notably, they assume that the dependent variable (nhood counts) are (conditionally) independently and identically distributed - that means there can’t be any relationship between the individual counts e.g. they can’t be derived from the same individual. This creates a dependency between count observations in the same nhood and can lead to inflated type I error rates. This is especially problematic when considering genetic analyses and organisms of the same species share a genetic ancestry, which in humans only goes back ~100,000 years. In the simplest example, imagine we performed single-cell experiments on individuals from multiple families, and from each family we included siblings and parents. Within a family the individuals would share on average 50% of their DNA, so the observations for DA testing wouldn’t be independent. For more distantly related individuals the relationships are smaller, but can be non-trivial, particularly as sample sizes increase. It’s not just genetics that leads to dependencies, multiple measurements from the same individual, e.g. multiple time points or from different organs, can also introduce dependency between observations.

We have opted to use GLMM to address this problem as they are very powerful and can explicitly account for fairly arbitrary dependencies, as long as they can be encoded either as a multi-level factor variable (sometimes referred to as clusters in the mixed effect model literature) or by an nXn matrix.

For the purpose of demonstrating how to use Milo in GLMM mode I’ll use a data set KotliarovPBMC from the scRNAseq package. These data are derived from SLE patients with variable treatment responses. This should allow us to model the batching as a random effect variable while testing for differential abundance between the high and low drug responders.

2 Load data

We will use the KotliarovPBMCData data set as there are multiple groups that we can compare.

# uncomment the row below to allow multi-processing and comment out the SerialParam line.
# bpparam <- MulticoreParam(workers=4)
bpparam <- SerialParam()
register(bpparam)

pbmc.sce <- KotliarovPBMCData(mode="rna", ensembl=TRUE) # download the PBMC data from Kotliarov _et al._

## see ?scRNAseq and browseVignettes('scRNAseq') for documentation

## loading from cache
## loading from cache

## require("ensembldb")

## Warning: Unable to map 11979 of 32738 requested IDs.

## see ?scRNAseq and browseVignettes('scRNAseq') for documentation

## loading from cache

# downsample cells to reduce compute time
prop.cells <- 0.75
n.cells <- floor(ncol(pbmc.sce) * prop.cells)
set.seed(42)
keep.cells <- sample(colnames(pbmc.sce), size=n.cells)
pbmc.sce <- pbmc.sce[, colnames(pbmc.sce) %in% keep.cells]


# downsample the number of samples
colData(pbmc.sce)$ObsID <- paste(colData(pbmc.sce)$tenx_lane, colData(pbmc.sce)$sampleid, sep="_")
n.samps <- 80
keep.samps <- sample(unique(colData(pbmc.sce)$ObsID), size=n.samps)
keep.cells <- rownames(colData(pbmc.sce))[colData(pbmc.sce)$ObsID %in% keep.samps]
pbmc.sce <- pbmc.sce[, colnames(pbmc.sce) %in% keep.cells]

pbmc.sce

## class: SingleCellExperiment 
## dim: 20759 28105 
## metadata(0):
## assays(1): counts
## rownames(20759): ENSG00000284557 ENSG00000237613 ... ENSG00000274412
##   ENSG00000283767
## rowData names(1): originalName
## colnames(28105): AAACCTGAGAGCCCAA_H1B1ln1 AAACCTGCAGTATCTG_H1B1ln1 ...
##   TTTGTCATCGGTTCGG_H1B2ln6 TTTGTCATCTACCTGC_H1B2ln6
## colData names(25): nGene nUMI ... timepoint ObsID
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):

3 Data processing and normalisation

set.seed(42)
# remove sparser cells
drop.cells <- colSums(counts(pbmc.sce)) < 1000
pbmc.sce <- computePooledFactors(pbmc.sce[, !drop.cells], BPPARAM=bpparam)
pbmc.sce <- logNormCounts(pbmc.sce)

pbmc.hvg <- modelGeneVar(pbmc.sce)
pbmc.hvg$FDR[is.na(pbmc.hvg$FDR)] <- 1

pbmc.sce <- runPCA(pbmc.sce, subset_row=rownames(pbmc.sce)[pbmc.hvg$FDR < 0.1], scale=TRUE, ncomponents=50, assay.type="logcounts", name="PCA", BPPARAM=bpparam)
pbmc.sce

## class: SingleCellExperiment 
## dim: 20759 24712 
## metadata(0):
## assays(2): counts logcounts
## rownames(20759): ENSG00000284557 ENSG00000237613 ... ENSG00000274412
##   ENSG00000283767
## rowData names(1): originalName
## colnames(24712): AAACCTGAGAGCCCAA_H1B1ln1 AAACCTGCAGTATCTG_H1B1ln1 ...
##   TTTGTCATCGAGAACG_H1B2ln6 TTTGTCATCTACCTGC_H1B2ln6
## colData names(26): nGene nUMI ... ObsID sizeFactor
## reducedDimNames(1): PCA
## mainExpName: NULL
## altExpNames(0):

4 Define cell neighbourhoods

set.seed(42)
pbmc.sce <- runUMAP(pbmc.sce, n_neighbors=30, pca=50, BPPARAM=bpparam) # add a UMAP for plotting results later

pbmc.milo <- Milo(pbmc.sce) # from the SCE object
reducedDim(pbmc.milo, "UMAP") <- reducedDim(pbmc.sce, "UMAP")

plotUMAP(pbmc.milo, colour_by="adjmfc.time") + plotUMAP(pbmc.milo, colour_by="sampleid")

These UMAPs shows how the different constituent cell types of PBMCs distributed across the drug response categories (left) and samples (right). Next we build the KNN graph and define neighbourhoods to quantify cell abundance across our experimental samples.

set.seed(42)
# we build KNN graph
pbmc.milo <- buildGraph(pbmc.milo, k = 60, d = 30)

## Constructing kNN graph with k:60

pbmc.milo <- makeNhoods(pbmc.milo, prop = 0.05, k = 60, d=30, refined = TRUE, refinement_scheme="graph") # make nhoods using graph-only as this is faster

## Checking valid object

## Running refined sampling with graph

colData(pbmc.milo)$ObsID <- paste(colData(pbmc.milo)$tenx_lane, colData(pbmc.milo)$sampleid, sep="_")
pbmc.milo <- countCells(pbmc.milo, meta.data = data.frame(colData(pbmc.milo)), samples="ObsID") # make the nhood X sample counts matrix

## Checking meta.data validity

## Counting cells in neighbourhoods

pbmc.milo

## class: Milo 
## dim: 20759 24712 
## metadata(0):
## assays(2): counts logcounts
## rownames(20759): ENSG00000284557 ENSG00000237613 ... ENSG00000274412
##   ENSG00000283767
## rowData names(1): originalName
## colnames(24712): AAACCTGAGAGCCCAA_H1B1ln1 AAACCTGCAGTATCTG_H1B1ln1 ...
##   TTTGTCATCGAGAACG_H1B2ln6 TTTGTCATCTACCTGC_H1B2ln6
## colData names(26): nGene nUMI ... ObsID sizeFactor
## reducedDimNames(2): PCA UMAP
## mainExpName: NULL
## altExpNames(0):
## nhoods dimensions(2): 24712 1081
## nhoodCounts dimensions(2): 1081 80
## nhoodDistances dimension(1): 0
## graph names(1): graph
## nhoodIndex names(1): 1081
## nhoodExpression dimension(2): 1 1
## nhoodReducedDim names(0):
## nhoodGraph names(0):
## nhoodAdjacency dimension(2): 1 1

Do we have a good distribution of nhood sizes?

plotNhoodSizeHist(pbmc.milo)

The smallest nhood is 60 (we used k=60) - that should be sufficient for the number of samples (N~120)

5 Demonstrating the GLMM syntax

Now we have the pieces in place for DA testing to demonstrate how to use the GLMM. We should first consider what our random effect variable is. There is a fair bit of debate on what constitutes a random effect vs. a fixed effect. As a rule of thumb, we can ask if the groups are randomly selected from a larger population of possible groups. For instance, if we recruited patients from 5 hospitals, could we consider the hospital as a random effect if there are actually 500 hospitals that we could have chosen? For these PBMC data we don’t have a variable in the experiment that fits this decision per se, so the tenx_lane will be arbitrarily selected (assuming cells were randomly assigned to batches).

pbmc.design <- data.frame(colData(pbmc.milo))[,c("tenx_lane", "adjmfc.time", "sample", "sampleid", "timepoint", "ObsID")]
pbmc.design <- distinct(pbmc.design)
rownames(pbmc.design) <- pbmc.design$ObsID
## Reorder rownames to match columns of nhoodCounts(milo)
pbmc.design <- pbmc.design[colnames(nhoodCounts(pbmc.milo)), , drop=FALSE]
table(pbmc.design$adjmfc.time)

## 
## d0 high  d0 low 
##      41      39

We encode the fixed effect variables as normal - but the random effects are different. We encode them as (1|variable) which tells the model that this is a random intercept. There are also random slopes GLMMs, but Milo doesn’t currently work with these. There are few other arguments we have to pass to testNhoods. We need to consider what solver we use, as the parameter estimation is a little more complex. The options are ‘Fisher’, ‘HE’ or ‘HE-NNLS’; the latter refers to a constrained optimisation for the variance components. If at any point during the optimisation negative variance estimates are encountered, then Milo will produce a warning message and automatically switch to ‘HE-NNLS’. As we are estimating variances, we also want these to be unbiased, so we use restricted maximum likelihood (REML=TRUE). Note that NB-GLMMs are by construction slower than NB-GLMs as there are additional matrix inversion steps that don’t scale very nicely. While we have made every effort to reduce the computational burden we are still limited by the bounds of matrix algebra!

set.seed(42)
rownames(pbmc.design) <- pbmc.design$ObsID

da_results <- testNhoods(pbmc.milo, design = ~ adjmfc.time + (1|tenx_lane), design.df = pbmc.design, fdr.weighting="graph-overlap",
                         glmm.solver="HE-NNLS", REML=TRUE, norm.method="TMM", BPPARAM = bpparam, fail.on.error=FALSE)

## Random effects found

## Using TMM normalisation

## Running GLMM model - this may take a few minutes

## Warning in testNhoods(pbmc.milo, design = ~adjmfc.time + (1 | tenx_lane), : 75
## out of 1081 neighborhoods did not converge; increase number of iterations?

## Performing spatial FDR correction with graph-overlap weighting

table(da_results$SpatialFDR < 0.1)

## 
## FALSE  TRUE 
##   739   340

We can see that the GLMM produces a warning if parameters don’t converge - this is important because we want to know if we can trust our estimates or not. One way to handle this is to increase max.iters (default is 50) - the downside is that this will increase the compute time and doesn’t guarantee convergence. If the nhood counts are very sparse then this can cause problems as there isn’t much information/variance from which to learn the (locally) optimal parameter values. An additional point is that the GLMM may fail on some nhoods, likely due to singular Hessian matrices during parameter estimation. These nhoods will have results with all NA values.

which(is.na(da_results$logFC))

## [1] 557 611

In this analysis there are 2 nhood models that failed. If this happens to a large number of nhoods then there may be issues with the combination of variables in the model, nhood counts might have a lot of zero-values, or there could be separation. For the latter checkSeparation can be used to check for all-zero values in the testing variables of interest. If any nhoods have perfect separation between zero and non-zero counts on a variable level then these nhoods can be excluded from the analysis.

which(checkSeparation(pbmc.milo, design.df=pbmc.design, condition="adjmfc.time", min.val=5))

## 977 
## 977

Here we can see that nhood 977 can be separated into counts >= 5 and < 5 on our test variable adjmfc.time - this is the same nhood that encounters a model failure in our GLMM. We can also visualise the count distribution to confirm this using plotNhoodCounts (kindly contributed by Nick Hirschmüller).

plotNhoodCounts(pbmc.milo, which(checkSeparation(pbmc.milo, design.df=pbmc.design, condition="adjmfc.time", min.val=5)), 
                design.df=pbmc.design, condition="adjmfc.time")

This shows the extremely low counts in the d0 high group, or specifically, that only a single observation is non-zero.

As with the GLM implementation, the GLMM calculates a log fold-change and corrected P-value for each neighbourhood, which indicates whether there is significant differential abundance between conditions for NA neighbourhoods. The hypothesis testing is slightly different - for the GLM we use edgeR::glmQLFTest which performs an F-test on the quasi-likelihood negative binomial fit. In the GLMM we use a pseudo-likelihood, so we instead we opt for a Wald-type test on the fixed effect variable log-fold change; FDR correction is performed the same.

The output of testNhoods with a GLMM has some additional columns that are worth exploring.

head(da_results[!is.na(da_results$logFC), ])

##         logFC    logCPM        SE     tvalue       PValue tenx_lane variance
## 1 -0.30838314  9.964361 0.2469503 -1.2487662 2.117532e-01         0.00000001
## 2  1.61453945 10.794070 0.3458565  4.6682345 3.039227e-06         0.90358676
## 3  0.06401847 10.049771 0.2413120  0.2652933 7.907840e-01         0.00000001
## 4 -0.38652731 10.611866 0.2345900 -1.6476720 9.942265e-02         0.00000001
## 5 -0.83390442 10.391979 0.2453895 -3.3982890 6.782408e-04         0.04665016
## 6  2.51585769 10.024907 0.4579447  5.4938028 3.934121e-08         1.83859787
##   Converged Dispersion Logliklihood Nhood   SpatialFDR
## 1      TRUE  1.8018995     54.42524     1 3.680065e-01
## 2      TRUE  0.7990389    243.28398     2 1.659338e-04
## 3      TRUE  1.7280630     58.01532     3 8.645597e-01
## 4      TRUE  1.5195629     98.34669     4 2.217264e-01
## 5      TRUE  1.5993640    113.09071     5 6.482190e-03
## 6     FALSE  0.8344926    133.79600     6 6.760123e-06

Due to the way parameter estimation is performed in the GLMM, we can directly compute standard errors (SE column) - these are used to compute the subequent test statistic (tvalue) and p-value. We next have the variance parameter estimate for each random effect variable (here ‘tenx_lane variance’). As we use constrained optimisation to prevent negative estimates some of these values will be bounded at 1e-8. We then have a column that determines which nhoods have converged - this can be useful for checking if, for example, the inference is different between converged and not-converged nhoods. We also return the estimated dispersion value and the log pseudo-likelihood in addition the same columns in the results table when using the GLM.

We can also inspect our results as we would for the GLM, by using the neighbourhood graph produced by buildNhoodGraph

pbmc.milo <- buildNhoodGraph(pbmc.milo, overlap=25)

# we need to subset the plotting results as it can't handle the NAs internally.
plotUMAP(pbmc.milo, colour_by="adjmfc.time") + plotNhoodGraphDA(pbmc.milo, da_results[!is.na(da_results$logFC), ],
                                                                subset.nhoods=!is.na(da_results$logFC), alpha=0.1) +
  plot_layout(guides="auto" )

## Warning in recycleSingleBracketReplacementValue(value, x, i): number of values
## supplied is not a sub-multiple of the number of values to be replaced

We can see that there are some complex differences between the high and low responder patients. How does this compare to running the same analysis with a GLM using the batching variable as a fixed effect?

set.seed(42)
# we need to use place the test variable at the end of the formula
glm_results <- testNhoods(pbmc.milo, design = ~ tenx_lane + adjmfc.time, design.df = pbmc.design, fdr.weighting="graph-overlap",
                          REML=TRUE, norm.method="TMM", BPPARAM = bpparam)

## Using TMM normalisation

## Performing spatial FDR correction with graph-overlap weighting

table(glm_results$SpatialFDR < 0.1)

## 
## FALSE  TRUE 
##   961   120

The first and obvious difference is that we have fewer DA nhoods. We can attribute this to the fact that the GLM uses more degrees of freedom to model the batching variable which reduces the overall statistical power.

plotUMAP(pbmc.milo, colour_by="adjmfc.time") + plotNhoodGraphDA(pbmc.milo, glm_results, alpha=0.1) +
  plot_layout(guides="auto" )

We can do a side-by-side comparison of the estimated log fold changes from the GLM and GLMM.

comp.da <- merge(da_results, glm_results, by='Nhood')
comp.da$Sig <- "none"
comp.da$Sig[comp.da$SpatialFDR.x < 0.1 & comp.da$SpatialFDR.y < 0.1] <- "Both"
comp.da$Sig[comp.da$SpatialFDR.x >= 0.1 & comp.da$SpatialFDR.y < 0.1] <- "GLM"
comp.da$Sig[comp.da$SpatialFDR.x < 0.1 & comp.da$SpatialFDR.y >= 0.1] <- "GLMM"

ggplot(comp.da, aes(x=logFC.x, y=logFC.y)) +
    geom_point(data=comp.da[, c("logFC.x", "logFC.y")], aes(x=logFC.x, y=logFC.y),
               colour='grey80', size=1) +
    geom_point(aes(colour=Sig)) +
    labs(x="GLMM LFC", y="GLM LFC") +
    facet_wrap(~Sig) +
    NULL

## Warning: Removed 8 rows containing missing values or values outside the scale range
## (`geom_point()`).

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).

This shows that the parameter estimates are extremely similar - this is what we should expect to see. We can see that both models identify the nhoods with the strongest DA. The difference appears in the nhoods that are more modestly DA - the GLMM has more power to identify these.

6 A note on when to use GLMM vs. GLM

In general, GLMMs require larger samples sizes than GLMs - the power gain comes from the narrower scope of the GLMM in it’s inference that leads to (generally) smaller standard errors and thus bigger test statistics. That doesn’t mean that GLMMs are inherently better than GLMs - with great power comes great responsibilities, and it’s easy to abuse a mixed effect model. In general I wouldn’t recommend using a GLMM with fewer than 50 observations and a good case for including a variable as a random effect. The simplest case for this is where you have multiple observations per experimental individual/sample and thus the nhood counts are not i.i.d. The other obvious case, as discussed in the intro is for genetic analysis or for time course data.

Session Info

sessionInfo()

## R version 4.4.0 beta (2024-04-15 r86425)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.19-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] ensembldb_2.28.0             AnnotationFilter_1.28.0     
##  [3] GenomicFeatures_1.56.0       AnnotationDbi_1.66.0        
##  [5] BiocParallel_1.38.0          irlba_2.3.5.1               
##  [7] Matrix_1.7-0                 scRNAseq_2.17.10            
##  [9] MouseGastrulationData_1.17.1 SpatialExperiment_1.14.0    
## [11] MouseThymusAgeing_1.11.0     patchwork_1.2.0             
## [13] dplyr_1.1.4                  scran_1.32.0                
## [15] scater_1.32.0                ggplot2_3.5.1               
## [17] scuttle_1.14.0               SingleCellExperiment_1.26.0 
## [19] SummarizedExperiment_1.34.0  Biobase_2.64.0              
## [21] GenomicRanges_1.56.0         GenomeInfoDb_1.40.0         
## [23] IRanges_2.38.0               S4Vectors_0.42.0            
## [25] BiocGenerics_0.50.0          MatrixGenerics_1.16.0       
## [27] matrixStats_1.3.0            miloR_2.0.0                 
## [29] edgeR_4.2.0                  limma_3.60.0                
## [31] BiocStyle_2.32.0            
## 
## loaded via a namespace (and not attached):
##   [1] RcppAnnoy_0.0.22          splines_4.4.0            
##   [3] BiocIO_1.14.0             bitops_1.0-7             
##   [5] filelock_1.0.3            tibble_3.2.1             
##   [7] polyclip_1.10-6           XML_3.99-0.16.1          
##   [9] httr2_1.0.1               lifecycle_1.0.4          
##  [11] lattice_0.22-6            MASS_7.3-60.2            
##  [13] alabaster.base_1.4.0      magrittr_2.0.3           
##  [15] sass_0.4.9                rmarkdown_2.26           
##  [17] jquerylib_0.1.4           yaml_2.3.8               
##  [19] metapod_1.12.0            cowplot_1.1.3            
##  [21] DBI_1.2.2                 RColorBrewer_1.1-3       
##  [23] abind_1.4-5               zlibbioc_1.50.0          
##  [25] purrr_1.0.2               ggraph_2.2.1             
##  [27] BumpyMatrix_1.12.0        RCurl_1.98-1.14          
##  [29] tweenr_2.0.3              rappdirs_0.3.3           
##  [31] GenomeInfoDbData_1.2.12   ggrepel_0.9.5            
##  [33] alabaster.sce_1.4.0       dqrng_0.3.2              
##  [35] DelayedMatrixStats_1.26.0 codetools_0.2-20         
##  [37] DelayedArray_0.30.0       ggforce_0.4.2            
##  [39] tidyselect_1.2.1          UCSC.utils_1.0.0         
##  [41] farver_2.1.1              ScaledMatrix_1.12.0      
##  [43] viridis_0.6.5             BiocFileCache_2.12.0     
##  [45] GenomicAlignments_1.40.0  jsonlite_1.8.8           
##  [47] BiocNeighbors_1.22.0      tidygraph_1.3.1          
##  [49] tools_4.4.0               Rcpp_1.0.12              
##  [51] glue_1.7.0                gridExtra_2.3            
##  [53] SparseArray_1.4.0         xfun_0.43                
##  [55] HDF5Array_1.32.0          gypsum_1.0.0             
##  [57] withr_3.0.0               numDeriv_2016.8-1.1      
##  [59] BiocManager_1.30.22       fastmap_1.1.1            
##  [61] rhdf5filters_1.16.0       bluster_1.14.0           
##  [63] fansi_1.0.6               digest_0.6.35            
##  [65] rsvd_1.0.5                R6_2.5.1                 
##  [67] mime_0.12                 colorspace_2.1-0         
##  [69] gtools_3.9.5              paws.storage_0.5.0       
##  [71] RSQLite_2.3.6             utf8_1.2.4               
##  [73] tidyr_1.3.1               generics_0.1.3           
##  [75] FNN_1.1.4                 rtracklayer_1.64.0       
##  [77] graphlayouts_1.1.1        httr_1.4.7               
##  [79] S4Arrays_1.4.0            uwot_0.2.2               
##  [81] pkgconfig_2.0.3           gtable_0.3.5             
##  [83] blob_1.2.4                XVector_0.44.0           
##  [85] htmltools_0.5.8.1         bookdown_0.39            
##  [87] ProtGenerics_1.36.0       alabaster.matrix_1.4.0   
##  [89] scales_1.3.0              png_0.1-8                
##  [91] knitr_1.46                rjson_0.2.21             
##  [93] curl_5.2.1                cachem_1.0.8             
##  [95] rhdf5_2.48.0              stringr_1.5.1            
##  [97] BiocVersion_3.19.1        parallel_4.4.0           
##  [99] vipor_0.4.7               restfulr_0.0.15          
## [101] pillar_1.9.0              grid_4.4.0               
## [103] alabaster.schemas_1.4.0   vctrs_0.6.5              
## [105] BiocSingular_1.20.0       dbplyr_2.5.0             
## [107] beachmat_2.20.0           cluster_2.1.6            
## [109] beeswarm_0.4.0            evaluate_0.23            
## [111] tinytex_0.50              magick_2.8.3             
## [113] cli_3.6.2                 locfit_1.5-9.9           
## [115] compiler_4.4.0            Rsamtools_2.20.0         
## [117] rlang_1.1.3               crayon_1.5.2             
## [119] paws.common_0.7.2         labeling_0.4.3           
## [121] ggbeeswarm_0.7.2          stringi_1.8.3            
## [123] alabaster.se_1.4.0        viridisLite_0.4.2        
## [125] munsell_0.5.1             Biostrings_2.72.0        
## [127] lazyeval_0.2.2            ExperimentHub_2.12.0     
## [129] sparseMatrixStats_1.16.0  bit64_4.0.5              
## [131] Rhdf5lib_1.26.0           KEGGREST_1.44.0          
## [133] statmod_1.5.0             alabaster.ranges_1.4.0   
## [135] highr_0.10                AnnotationHub_3.12.0     
## [137] igraph_2.0.3              memoise_2.0.1            
## [139] bslib_0.7.0               bit_4.0.5

Mixed effect models for Milo DA testing