5. Creating a Multi-Scale Sequence Track

This example shows how to build an interactive genomic visualization with the shiny.gosling package inside a Shiny app. The shiny package provides the user interface and interactivity.

The app visualizes the SARS-CoV-2 reference genome (NC_045512.2). It overlays two tracks: bars colored by DNA base, and text annotations showing the base letters, which are displayed only when the zoom conditions defined on the track are met.

Load libraries

library(shiny)
library(shiny.gosling)

Fetching Data

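Below, we use the track_data() function to describe the data source at the specified URL. The data represents base counts for the SARS-CoV-2 virus genome. In the call, row names the field that holds the category (base), column names the field that holds the genomic position (position), value names the field that holds the count (count), and categories lists the four bases (A, T, G, C).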

“Multivec” refers to a data format for storing several numerical signals along the same genomic coordinates. It comes from the HiGlass ecosystem, which Gosling builds on, and is typically used for experiments such as ChIP-seq or ATAC-seq, where a measurement is collected at each genomic position for multiple samples.

Multivec data is essentially a matrix: one dimension corresponds to genomic positions or bins, and the other to samples, experiments, or categories. Each entry holds the value for one position and one sample. Because positions are expressed as genomic coordinates (chromosome name and base-pair position), the data can be aligned with the genome for visualization and analysis.

Related formats such as bigWig and bedGraph store a single signal along the genome, while multivec holds several at once. Genome browsers such as the UCSC Genome Browser and IGV (Integrative Genomics Viewer) display this kind of signal data, and “shiny.gosling” can render multivec tilesets directly.
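To make the matrix layout concrete, here is a minimal base R sketch with invented counts (not the real SARS-CoV-2 data). It stores one row per base and one column per position, which is the orientation implied by row = "base" and column = "position" in the track_data() call further down, and then unrolls the matrix into a long table with base, position, and count fields. The names toy_counts and toy_long are hypothetical.

```r
# Toy multivec-like matrix: rows are categories (bases), columns are positions.
# The counts are invented for illustration only.
bases <- c("A", "T", "G", "C")
toy_counts <- matrix(
  c(1, 0, 0, 0,   # position 1 holds an A
    0, 1, 0, 0,   # position 2 holds a T
    0, 1, 0, 0),  # position 3 holds a T
  nrow = length(bases),
  dimnames = list(base = bases, position = as.character(1:3))
)

# Unroll the matrix into one record per (base, position) pair.
# as.vector() walks the matrix column by column, so base varies fastest.
toy_long <- data.frame(
  base = rep(bases, times = ncol(toy_counts)),
  position = rep(seq_len(ncol(toy_counts)), each = nrow(toy_counts)),
  count = as.vector(toy_counts)
)

print(toy_long)
```

Each row of toy_long carries the three fields that the track later maps to visual channels: base for color and text, position for the x-axis, and count for the bar height.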

Here are some resources and links where you can learn more about multivec data and how it’s used in genomics research:

UCSC Genome Browser:

The UCSC Genome Browser is a widely used tool for visualizing genomic data, including multivec data. Tutorial on visualizing multivec data in the UCSC Genome Browser.

IGV (Integrative Genomics Viewer):

IGV is another popular genome visualization tool that supports multivec data. Tutorial on loading and visualizing multivec data in IGV.

BedGraph and BigWig Formats:

These are common file formats used for representing multivec data. Explanation of the BedGraph format. Explanation of the BigWig format.


track1_data <- track_data(
  url = "https://server.gosling-lang.org/api/v1/tileset_info/?d=NC_045512_2-multivec",
  type = "multivec",
  row = "base",
  column = "position",
  value = "count",
  categories = c("A", "T", "G", "C"),
  start = "start",
  end = "end"
)

Creating Tracks

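Here, we define the two tracks (track1 and track2) that will be displayed in the visualization. track1 draws the count of each DNA base as a bar mark. track2 draws text marks spanning start to end, after filtering out rows whose count is 0. Its visibility rules show the text only when it fits within the mark width and the zoom-level condition (threshold 40) is met.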


track1 <- add_single_track(
  mark = "bar",
  y = visual_channel_y(
    field = "count", type = "quantitative", axis = "none"
  )
)

track2 <- add_single_track(
  dataTransform = track_data_transform(
    type = "filter",
    field = "count",
    oneOf = list(0),
    not = TRUE
  ),
  mark = "text",
  x = visual_channel_x(
    field = "start", type = "genomic"
  ),
  xe = visual_channel_x(
    field = "end", type = "genomic"
  ),
  size = 24,
  color = "white",
  visibility = list(
    list(
      operation = "less-than",
      measure = "width",
      threshold = "|xe-x|",
      transitionPadding = 30,
      target = "mark"
    ),
    list(
      operation = "LT",
      measure = "zoomLevel",
      threshold = 40,
      target = "track"
    )
  )
)
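The filter transform in track2 combines oneOf = list(0) with not = TRUE, which reads as: drop every row whose count is one of the listed values. The base R sketch below applies the same selection logic to an invented table. It is only an analogy for the rule, not a description of how Gosling evaluates the transform internally, and the names toy, one_of, and negate are hypothetical.

```r
# Invented records for a single genomic position: only "A" has a non-zero count.
toy <- data.frame(
  base = c("A", "T", "G", "C"),
  position = c(1L, 1L, 1L, 1L),
  count = c(1, 0, 0, 0)
)

one_of <- c(0)   # mirrors oneOf = list(0)
negate <- TRUE   # mirrors not = TRUE

# Rows matching one of the listed values; with negation, the rest are kept.
keep <- toy$count %in% one_of
if (negate) keep <- !keep

filtered <- toy[keep, ]
print(filtered)
```

If the counts are one-hot per position, as in this toy table, only the base that is actually present survives the filter, so the text mark has a single letter to draw at each position.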

Defining Visual Channels

Now, let's define the visual channels and style for the sequence track. track1_x maps the genomic position to the x-axis, track1_color assigns a color to each DNA base and adds a legend, track1_text uses the base letter as the text label, and track1_style turns on the inline legend.


track1_x <- visual_channel_x(
  field = "position", type = "genomic"
)

track1_color <- visual_channel_color(
  field = "base",
  type = "nominal",
  domain = c("A", "T", "G", "C"),
  legend = TRUE
)

track1_text <- visual_channel_text(
  field = "base", type = "nominal"
)

track1_style <- default_track_styles(
  inlineLegend = TRUE
)

Creating the Combined Track

This code chunk overlays the previously defined tracks (track1 and track2) in a single track (track3) by setting alignment = "overlay". The title, data, visual channels, style, and size are defined once at this level and shared by the overlaid tracks.


track3 <- add_single_track(
  title = "NC_045512.2 Sequence",
  alignment = "overlay",
  data = track1_data,
  tracks = add_multi_tracks(
    track1, track2
  ),
  x = track1_x,
  color = track1_color,
  text = track1_text,
  style = track1_style,
  width = 800, height = 40
)

Creating the View

Let's create a view (view1) that contains the combined track (track3). The call sets the multi-view mode, the x-axis domain (the full genome, positions 1 to 29903), the stack alignment, and a linkingId ("detail") for linking.


view1 <- compose_view(
  multi = TRUE,
  centerRadius = 0,
  xDomain = list(interval = c(1, 29903)),
  linkingId = "detail",
  alignment = "stack",
  tracks = add_multi_tracks(
    track3
  )
)

Arranging the View

Next, we arrange views using the arrange_views function. It sets the title, subtitle, assembly information, layout, spacing, and includes the previously defined view1.


combined_view <- arrange_views(
  title = "SARS-CoV-2",
  subtitle = "Data Source: WashU Virus Genome Browser, NCBI, GISAID",
  assembly = list(list("NC_045512.2", 29903)),
  layout = "linear",
  spacing = 50,
  views = list(view1),
  listify = FALSE
)
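The genome length 29903 appears twice above: in xDomain when composing the view and in assembly when arranging the views. One way to keep the two in sync is to define the value once and build both lists from it. The sketch below uses only base R. The names genome_name, genome_length, x_domain, assembly_spec, and zoomed_domain are hypothetical, and the resulting lists are meant to be passed as the xDomain and assembly arguments in the calls shown above.

```r
# Define the reference once so the view domain and the assembly cannot drift apart.
genome_name <- "NC_045512.2"
genome_length <- 29903

# Same structure as xDomain = list(interval = c(1, 29903)).
x_domain <- list(interval = c(1, genome_length))

# Same structure as assembly = list(list("NC_045512.2", 29903)).
assembly_spec <- list(list(genome_name, genome_length))

# An alternative, narrower starting window (an assumption, not part of the original app).
zoomed_domain <- list(interval = c(1, 100))

# Guard against a window that runs past the end of the genome.
stopifnot(zoomed_domain$interval[1] >= 1)
stopifnot(zoomed_domain$interval[2] <= genome_length)
```

Starting from a narrow window such as zoomed_domain might make the base letters visible on load, since track2 only shows its text under the zoom conditions defined earlier. Whether it does depends on the rendered width and the zoom level.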

Shiny App

Finally, we define the Shiny user interface (UI) using the fluidPage function. It includes the goslingOutput function to create a placeholder for the visualization. We also define the Shiny server logic. It uses the renderGosling function to render the interactive visualization using the combined_view defined earlier.


ui <- fluidPage(
  use_gosling(),
  fluidRow(
    column(6, goslingOutput("gosling_plot"))
  )
)


server <- function(input, output, session) {
  output$gosling_plot <- renderGosling({
    gosling(
      component_id = "sars_cov2",
      combined_view
    )
  })
}

shinyApp(ui, server)

Session Info


sessionInfo()
#> R version 4.4.0 RC (2024-04-16 r86468)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.20-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#>  [1] sessioninfo_1.2.2                  ggbio_1.53.0                      
#>  [3] ggplot2_3.5.1                      StructuralVariantAnnotation_1.21.0
#>  [5] VariantAnnotation_1.51.0           Rsamtools_2.21.0                  
#>  [7] Biostrings_2.73.0                  XVector_0.45.0                    
#>  [9] SummarizedExperiment_1.35.0        Biobase_2.65.0                    
#> [11] MatrixGenerics_1.17.0              matrixStats_1.3.0                 
#> [13] rtracklayer_1.65.0                 GenomicRanges_1.57.0              
#> [15] GenomeInfoDb_1.41.0                IRanges_2.39.0                    
#> [17] S4Vectors_0.43.0                   BiocGenerics_0.51.0               
#> [19] shiny_1.8.1.1                      shiny.gosling_1.1.0               
#> 
#> loaded via a namespace (and not attached):
#>   [1] RColorBrewer_1.1-3       rstudioapi_0.16.0        jsonlite_1.8.8          
#>   [4] magrittr_2.0.3           GenomicFeatures_1.57.0   rmarkdown_2.26          
#>   [7] fs_1.6.4                 BiocIO_1.15.0            zlibbioc_1.51.0         
#>  [10] vctrs_0.6.5              memoise_2.0.1            RCurl_1.98-1.14         
#>  [13] base64enc_0.1-3          progress_1.2.3           htmltools_0.5.8.1       
#>  [16] S4Arrays_1.5.0           curl_5.2.1               SparseArray_1.5.0       
#>  [19] Formula_1.2-5            sass_0.4.9               bslib_0.7.0             
#>  [22] fontawesome_0.5.2        htmlwidgets_1.6.4        httr2_1.0.1             
#>  [25] plyr_1.8.9               cachem_1.0.8             GenomicAlignments_1.41.0
#>  [28] shiny.react_0.3.0        mime_0.12                lifecycle_1.0.4         
#>  [31] pkgconfig_2.0.3          Matrix_1.7-0             R6_2.5.1                
#>  [34] fastmap_1.1.1            GenomeInfoDbData_1.2.12  digest_0.6.35           
#>  [37] colorspace_2.1-0         GGally_2.2.1             AnnotationDbi_1.67.0    
#>  [40] OrganismDbi_1.47.0       Hmisc_5.1-2              RSQLite_2.3.6           
#>  [43] filelock_1.0.3           fansi_1.0.6              httr_1.4.7              
#>  [46] abind_1.4-5              compiler_4.4.0           bit64_4.0.5             
#>  [49] withr_3.0.0              htmlTable_2.4.2          backports_1.4.1         
#>  [52] BiocParallel_1.39.0      DBI_1.2.2                ggstats_0.6.0           
#>  [55] biomaRt_2.61.0           rappdirs_0.3.3           DelayedArray_0.31.0     
#>  [58] rjson_0.2.21             tools_4.4.0              foreign_0.8-86          
#>  [61] httpuv_1.6.15            nnet_7.3-19              glue_1.7.0              
#>  [64] restfulr_0.0.15          promises_1.3.0           grid_4.4.0              
#>  [67] checkmate_2.3.1          cluster_2.1.6            reshape2_1.4.4          
#>  [70] generics_0.1.3           gtable_0.3.5             BSgenome_1.73.0         
#>  [73] tidyr_1.3.1              ensembldb_2.29.0         hms_1.1.3               
#>  [76] data.table_1.15.4        xml2_1.3.6               utf8_1.2.4              
#>  [79] pillar_1.9.0             stringr_1.5.1            later_1.3.2             
#>  [82] dplyr_1.1.4              BiocFileCache_2.13.0     lattice_0.22-6          
#>  [85] bit_4.0.5                biovizBase_1.53.0        RBGL_1.81.0             
#>  [88] tidyselect_1.2.1         knitr_1.46               gridExtra_2.3           
#>  [91] ProtGenerics_1.37.0      xfun_0.43                stringi_1.8.3           
#>  [94] UCSC.utils_1.1.0         lazyeval_0.2.2           yaml_2.3.8              
#>  [97] evaluate_0.23            codetools_0.2-20         tibble_3.2.1            
#> [100] graph_1.83.0             BiocManager_1.30.22      cli_3.6.2               
#> [103] rpart_4.1.23             xtable_1.8-4             munsell_0.5.1           
#> [106] jquerylib_0.1.4          dichromat_2.0-0.1        Rcpp_1.0.12             
#> [109] dbplyr_2.5.0             png_0.1-8                XML_3.99-0.16.1         
#> [112] parallel_4.4.0           assertthat_0.2.1         blob_1.2.4              
#> [115] prettyunits_1.2.0        AnnotationFilter_1.29.0  bitops_1.0-7            
#> [118] txdbmaker_1.1.0          pwalign_1.1.0            scales_1.3.0            
#> [121] purrr_1.0.2              crayon_1.5.2             rlang_1.1.3             
#> [124] KEGGREST_1.45.0