Use

To use BASiCStan, we can first use BASICS_MockSCE() to generate an toy example dataset. We will also set a seed for reproducibility.

suppressPackageStartupMessages(library("BASiCStan"))
set.seed(42)
sce <- BASiCS_MockSCE()

The interface for running MCMC using BASiCS and using alternative inference methods using Stan is very similar. It is worth noting that the joint prior between mean and over-dispersion parameters, corresponding to Regression = TRUE, is the only supported mode in BASiCStan().

amwg_fit <- BASiCS_MCMC(
    sce,
    N = 200,
    Thin = 10,
    Burn = 100,
    Regression = TRUE
)
#> altExp 'spike-ins' is assumed to contain spike-in genes.
#> see help(altExp) for details.
#> Running with spikes BASiCS sampler (regression case) ...
#> -------------------------------------------------------------
#> MCMC running time 
#> -------------------------------------------------------------
#> user: 0.341
#> system: 0.003
#> elapsed: 0.344
#> -------------------------------------------------------------
#> Output 
#> -------------------------------------------------------------
#> -------------------------------------------------------------
#> BASiCS version 2.20.0 : 
#> vertical integration (spikes case) 
#> -------------------------------------------------------------
stan_fit <- BASiCStan(sce, Method = "sampling", iter = 10)
#> Warning: There were 5 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems

The output of BASiCStan() is an object of class BASiCS_Chain, similar to the output of BASiCS_MCMC(). Therefore, you could use these as you would an object generated using Metropolis within Gibbs sampling. For example, we can plot the relationship between mean and over-dispersion estimated using the joint regression prior:

BASiCS_ShowFit(amwg_fit)

BASiCS_ShowFit(stan_fit)

Stan MCMC diagnostics

Using Stan has advantages outside of the variety of inference engines available. By returning a Stan object that we can later convert to a BASiCS_Chain object, we can leverage an even broader set of functionality. For example, Stan has the ability to easily generate MCMC diagnostics using simple functions. For example, summary() outputs a number of useful per-chain and across-chain diagnostics:

stan_obj <- BASiCStan(sce, ReturnBASiCS = FALSE, Method = "sampling", iter = 10)
#> 
#> SAMPLING FOR MODEL 'basics_regression' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0.002996 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 29.96 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: WARNING: No variance estimation is
#> Chain 1:          performed for num_warmup < 20
#> Chain 1: 
#> Chain 1: Iteration: 1 / 10 [ 10%]  (Warmup)
#> Chain 1: Iteration: 2 / 10 [ 20%]  (Warmup)
#> Chain 1: Iteration: 3 / 10 [ 30%]  (Warmup)
#> Chain 1: Iteration: 4 / 10 [ 40%]  (Warmup)
#> Chain 1: Iteration: 5 / 10 [ 50%]  (Warmup)
#> Chain 1: Iteration: 6 / 10 [ 60%]  (Sampling)
#> Chain 1: Iteration: 7 / 10 [ 70%]  (Sampling)
#> Chain 1: Iteration: 8 / 10 [ 80%]  (Sampling)
#> Chain 1: Iteration: 9 / 10 [ 90%]  (Sampling)
#> Chain 1: Iteration: 10 / 10 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.046 seconds (Warm-up)
#> Chain 1:                0.029 seconds (Sampling)
#> Chain 1:                0.075 seconds (Total)
#> Chain 1:
#> Warning: There were 5 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
head(summary(stan_obj)$summary)
#>               mean se_mean sd     2.5%      25%      50%      75%    97.5%
#> log_mu[1] 3.098420     NaN  0 3.098420 3.098420 3.098420 3.098420 3.098420
#> log_mu[2] 1.058488     NaN  0 1.058488 1.058488 1.058488 1.058488 1.058488
#> log_mu[3] 2.107672     NaN  0 2.107672 2.107672 2.107672 2.107672 2.107672
#> log_mu[4] 2.245089     NaN  0 2.245089 2.245089 2.245089 2.245089 2.245089
#> log_mu[5] 2.002693     NaN  0 2.002693 2.002693 2.002693 2.002693 2.002693
#> log_mu[6] 1.457520     NaN  0 1.457520 1.457520 1.457520 1.457520 1.457520
#>           n_eff Rhat
#> log_mu[1]   NaN  NaN
#> log_mu[2]   NaN  NaN
#> log_mu[3]   NaN  NaN
#> log_mu[4]   NaN  NaN
#> log_mu[5]   NaN  NaN
#> log_mu[6]   NaN  NaN

We can then convert this object to a BASiCS_Chain and carry on a workflow as before:

Stan2BASiCS(stan_obj)
#> An object of class BASiCS_Chain
#>  5 MCMC samples.
#>  Dataset contains 100 biological genes and 100 cells (2 batches). 
#>  Object stored using BASiCS version:  2.20.0 
#>  Parameters:  mu delta s nu theta epsilon phi beta

Session info

sessionInfo()
#> R version 4.5.0 RC (2025-04-04 r88126)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#>  [1] BASiCStan_1.10.0            rstan_2.32.7               
#>  [3] StanHeaders_2.32.10         BASiCS_2.20.0              
#>  [5] SingleCellExperiment_1.30.0 SummarizedExperiment_1.38.0
#>  [7] Biobase_2.68.0              GenomicRanges_1.60.0       
#>  [9] GenomeInfoDb_1.44.0         IRanges_2.42.0             
#> [11] S4Vectors_0.46.0            BiocGenerics_0.54.0        
#> [13] generics_0.1.3              MatrixGenerics_1.20.0      
#> [15] matrixStats_1.5.0          
#> 
#> loaded via a namespace (and not attached):
#>   [1] gridExtra_2.3             inline_0.3.21            
#>   [3] rlang_1.1.6               magrittr_2.0.3           
#>   [5] compiler_4.5.0            DelayedMatrixStats_1.30.0
#>   [7] loo_2.8.0                 vctrs_0.6.5              
#>   [9] pkgconfig_2.0.3           crayon_1.5.3             
#>  [11] fastmap_1.2.0             backports_1.5.0          
#>  [13] XVector_0.48.0            labeling_0.4.3           
#>  [15] scuttle_1.18.0            promises_1.3.2           
#>  [17] rmarkdown_2.29            UCSC.utils_1.4.0         
#>  [19] xfun_0.52                 bluster_1.18.0           
#>  [21] cachem_1.1.0              beachmat_2.24.0          
#>  [23] jsonlite_2.0.0            later_1.4.2              
#>  [25] DelayedArray_0.34.0       BiocParallel_1.42.0      
#>  [27] irlba_2.3.5.1             parallel_4.5.0           
#>  [29] cluster_2.1.8.1           R6_2.6.1                 
#>  [31] bslib_0.9.0               limma_3.64.0             
#>  [33] jquerylib_0.1.4           Rcpp_1.0.14              
#>  [35] assertthat_0.2.1          knitr_1.50               
#>  [37] splines_4.5.0             httpuv_1.6.15            
#>  [39] Matrix_1.7-3              igraph_2.1.4             
#>  [41] tidyselect_1.2.1          abind_1.4-8              
#>  [43] yaml_2.3.10               viridis_0.6.5            
#>  [45] codetools_0.2-20          miniUI_0.1.1.1           
#>  [47] curl_6.2.2                pkgbuild_1.4.7           
#>  [49] lattice_0.22-7            tibble_3.2.1             
#>  [51] withr_3.0.2               shiny_1.10.0             
#>  [53] posterior_1.6.1           coda_0.19-4.1            
#>  [55] evaluate_1.0.3            RcppParallel_5.1.10      
#>  [57] pillar_1.10.2             tensorA_0.36.2.1         
#>  [59] checkmate_2.3.2           distributional_0.5.0     
#>  [61] ggplot2_3.5.2             sparseMatrixStats_1.20.0 
#>  [63] rstantools_2.4.0          munsell_0.5.1            
#>  [65] scales_1.3.0              xtable_1.8-4             
#>  [67] glue_1.8.0                metapod_1.16.0           
#>  [69] tools_4.5.0               hexbin_1.28.5            
#>  [71] BiocNeighbors_2.2.0       ScaledMatrix_1.16.0      
#>  [73] locfit_1.5-9.12           scran_1.36.0             
#>  [75] cowplot_1.1.3             grid_4.5.0               
#>  [77] QuickJSR_1.7.0            edgeR_4.6.0              
#>  [79] colorspace_2.1-1          GenomeInfoDbData_1.2.14  
#>  [81] BiocSingular_1.24.0       cli_3.6.4                
#>  [83] rsvd_1.0.5                S4Arrays_1.8.0           
#>  [85] viridisLite_0.4.2         dplyr_1.1.4              
#>  [87] V8_6.0.3                  glmGamPoi_1.20.0         
#>  [89] gtable_0.3.6              sass_0.4.10              
#>  [91] digest_0.6.37             SparseArray_1.8.0        
#>  [93] dqrng_0.4.1               farver_2.1.2             
#>  [95] htmltools_0.5.8.1         lifecycle_1.0.4          
#>  [97] httr_1.4.7                statmod_1.5.0            
#>  [99] mime_0.13                 ggExtra_0.10.1           
#> [101] MASS_7.3-65

An introduction to BASiCStan

A Stan implementation of BASiCS

Outline

Use

Stan MCMC diagnostics

Session info