1 Introduction

biodbLipidmaps is a biodb extension package that implements a connector to Lipidmaps Structure (Sud et al. 2007).

2 Installation

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install('biodbLipidmaps')

3 Initialization

The first step in using biodbLipidmaps is to create an instance of the biodb class BiodbMain from the main biodb package. This is done by calling the constructor of the class:

mybiodb <- biodb::newInst()

During this step the configuration is set up, the cache system is initialized and extension packages are loaded.

As shown at the end of this vignette, the biodb instance must be terminated with a call to the terminate() method once you are done with it.

4 Creating a connector to Lipidmaps Structure

In biodb, connections to databases are handled by connector instances that you obtain from the factory. Here is the code to instantiate a connector to the Lipidmaps Structure database:

conn <- mybiodb$getFactory()$createConn('lipidmaps.structure')
## Loading required package: biodbLipidmaps

5 Accessing entries

To get the number of entries stored inside the database, run:

conn$getNbEntries()
## [1] NA

To get some of the first entry IDs (accession numbers) from the database, run:

ids <- conn$getEntryIds(2)
ids
## [1] "LMFA00000001" "LMFA00000002"

To retrieve entries, use:

entries <- conn$getEntry(ids)
entries
## [[1]]
## Biodb LIPID MAPS Structure entry instance LMFA00000001.
## 
## [[2]]
## Biodb LIPID MAPS Structure entry instance LMFA00000002.
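
Each element of the returned list is an entry object. As a minimal sketch, assuming that entry objects provide the getFieldValue() method of the biodb entry class (it is not demonstrated in this vignette), a single field can be read from every entry as follows. The helper name getAccessions is hypothetical.

```r
# Hypothetical helper: read the 'accession' field of each entry in a list.
# Assumes each entry provides getFieldValue(), as biodb entry objects do.
getAccessions <- function(entries) {
    vapply(entries, function(e) {
        # Return NA for a NULL element, in case an ID could not be retrieved
        if (is.null(e)) {
            NA_character_
        } else {
            as.character(e$getFieldValue('accession'))
        }
    }, FUN.VALUE = character(1))
}

# Usage with the entries retrieved above:
# getAccessions(entries)
```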

To convert a list of entries into a data frame, run:

x <- mybiodb$entriesToDataframe(entries)
## Loading required package: biodbChebi
x
##      accession chebi.id ncbi.pubchem.comp.id
## 1 LMFA00000001   178363             10930192
## 2 LMFA00000002   137783             42607281
##                                 comp.iupac.name.syst monoisotopic.mass
## 1 2-methoxy-12-methyloctadec-17-en-5-ynoyl anhydride          626.4910
## 2                    N-(3S-hydroxydecanoyl)-L-serine          275.1733
##     formula                                                                name
## 1  C40H66O5 2-methoxy-12-methyloctadec-17-en-5-ynoyl anhydride;Acetylenic acids
## 2 C13H25NO5                                                     Serratamic acid
##   lipidmaps.structure.id
## 1           LMFA00000001
## 2           LMFA00000002
##                                                                                                                                                                                       inchi
## 1 InChI=1S/C40H66O5/c1-7-9-11-23-29-35(3)31-25-19-15-13-17-21-27-33-37(43-5)39(41)45-40(42)38(44-6)34-28-22-18-14-16-20-26-32-36(4)30-24-12-10-8-2/h7-8,35-38H,1-2,9-16,19-20,23-34H2,3-6H3
## 2                                                             InChI=1S/C13H25NO5/c1-2-3-4-5-6-7-10(16)8-12(17)14-11(9-15)13(18)19/h10-11,15-16H,2-9H2,1H3,(H,14,17)(H,18,19)/t10-,11-/m0/s1
##                      inchikey molecular.mass
## 1 VOGBKCAANIAXCI-UHFFFAOYSA-N        626.963
## 2 NDDJIMSGSZNACM-QWRGUYRKSA-N        275.342
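
In the output above, the name column packs several names of one entry into a single string, separated by ";". The following is a minimal base R sketch of splitting such values. It uses a small data frame (xDemo, a hypothetical name) copied by hand from the output above rather than a live query.

```r
# Two columns copied from the data frame printed above
xDemo <- data.frame(
    accession = c("LMFA00000001", "LMFA00000002"),
    name = c("2-methoxy-12-methyloctadec-17-en-5-ynoyl anhydride;Acetylenic acids",
             "Serratamic acid"),
    stringsAsFactors = FALSE
)

# Split each name string on ';' to get one character vector per entry
namesPerEntry <- strsplit(xDemo$name, ";", fixed = TRUE)
names(namesPerEntry) <- xDemo$accession

# Number of names recorded for each entry
lengths(namesPerEntry)
```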

6 Running the “LMSDSearch” web service

You can access the web service “LMSDSearch” directly with the wsLmsdSearch method:

ids <- conn$wsLmsdSearch(mode='ProcessStrSearch', name="fatty", retfmt="ids")
ids
## [1] "LMFA01010000" "LMFA01140081" "LMFA01140082" "LMFA01140083" "LMFA01140084"
## [6] "LMFA01140085" "LMFA05000000" "LMFA06000000"
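
The returned identifiers can be grouped without further queries. The sketch below assumes the usual LIPID MAPS identifier layout ("LM", then a two-letter category code, then a two-digit main class code); this layout is an assumption and is not stated in this vignette. The identifiers are copied from the output above into a hypothetical variable searchIds.

```r
# IDs copied from the search result printed above
searchIds <- c("LMFA01010000", "LMFA01140081", "LMFA01140082", "LMFA01140083",
               "LMFA01140084", "LMFA01140085", "LMFA05000000", "LMFA06000000")

# Assumed layout: characters 3-4 are the category, characters 5-6 the main class
category  <- substr(searchIds, 3, 4)
mainClass <- substr(searchIds, 5, 6)

# Count the returned identifiers per main class code
table(mainClass)
```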

From this list of identifiers, we can obtain the full entry objects:

entries <- conn$getEntry(ids)

These entries can then be converted into a data frame:

entriesDf <- mybiodb$entriesToDataframe(entries)

The resulting data frame is shown in Table 1.


Table 1: The entries listed in the result of the search.
| accession | chebi.id | kegg.compound.id | comp.iupac.name.syst | name | lipidmaps.structure.id | inchi | inchikey | molecular.mass | ncbi.pubchem.comp.id | monoisotopic.mass | formula |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LMFA01010000 | 35366 | C00162 | fatty acid | fatty acid | LMFA01010000 | NA | NA | 45.0174 | NA | NA | NA |
| LMFA01140081 | NA | NA | 2-[5]-ladderane ethanoic acid | 2-[5]-ladderane ethanoic acid;C14-[5]-ladderane fatty acid | LMFA01140081 | NA | NA | NA | 137323820 | 218.1307 | C14H18O2 |
| LMFA01140082 | 187485 | NA | 2-[3]-ladderane ethanoic acid | 2-[3]-ladderane ethanoic acid;C14-[3]-ladderane fatty acid | LMFA01140082 | InChI=1S/C14H20O2/c15-12(16)6-7-1-2-10-11(5-7)14-9-4-3-8(9)13(10)14/h7-11,13-14H,1-6H2,(H,15,16) | MZLSFWGEQLSKRL-UHFFFAOYSA-N | 220.3120 | 137323821 | 220.1463 | C14H20O2 |
| LMFA01140083 | NA | NA | 8-[1]-ladderane octanoic acid | 8-[1]-ladderane octanoic acid;C20-[1]-ladderane fatty acid | LMFA01140083 | NA | NA | NA | 137323822 | 302.2246 | C20H30O2 |
| LMFA01140084 | NA | NA | 8-[1]-ladderane octanoic acid | 8-[1]-ladderane octanoic acid;C20-[1]-ladderane fatty acid | LMFA01140084 | NA | NA | NA | 137323823 | 304.2402 | C20H32O2 |
| LMFA01140085 | NA | NA | 6-[1]-ladderane hexanoic acid | 6-[1]-ladderane hexanoic acid;C18-[1]-ladderane fatty acid | LMFA01140085 | NA | NA | NA | 137323824 | 274.1933 | C18H26O2 |
| LMFA05000000 | 142622 | NA | NA | Fatty Alcohol | LMFA05000000 | NA | NA | 31.0340 | NA | NA | NA |
| LMFA06000000 | 35746 | NA | NA | Fatty Aldehyde | LMFA06000000 | NA | NA | 29.0180 | NA | NA | NA |
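
Several rows of Table 1 (for example LMFA01010000) have NA in the formula and monoisotopic.mass columns. A minimal base R sketch for keeping only the rows that carry a formula is given here. It works on a few values copied by hand from Table 1 into a hypothetical data frame tab1; with a live session the same subsetting could be applied to entriesDf.

```r
# A subset of Table 1, copied by hand
tab1 <- data.frame(
    accession = c("LMFA01010000", "LMFA01140081", "LMFA01140082", "LMFA05000000"),
    formula = c(NA, "C14H18O2", "C14H20O2", NA),
    monoisotopic.mass = c(NA, 218.1307, 220.1463, NA),
    stringsAsFactors = FALSE
)

# Keep only the rows for which a formula is available
withFormula <- tab1[!is.na(tab1$formula), ]
withFormula$accession
```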

7 Closing biodb instance

When you are done with your biodb instance, you have to terminate it in order to release its resources (file handles, database connections, etc.):

mybiodb$terminate()
## INFO  [16:07:24.965] Closing BiodbMain instance...
## INFO  [16:07:24.967] Connector "lipidmaps.structure" deleted.
## INFO  [16:07:24.975] Connector "chebi" deleted.
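
If an error interrupts a script before terminate() is reached, the instance is left open. One way to guard against this is sketched below, using base R's on.exit(). The helper name withBiodb and its newInst argument are hypothetical; only biodb::newInst() and terminate() come from this vignette.

```r
# Hypothetical helper: run a function with a biodb instance and always
# terminate the instance afterwards, even if the function raises an error.
# 'newInst' defaults to the biodb constructor used earlier in this vignette.
withBiodb <- function(fun, newInst = biodb::newInst) {
    mybiodb <- newInst()
    # Registered now, executed when withBiodb() exits for any reason
    on.exit(mybiodb$terminate(), add = TRUE)
    fun(mybiodb)
}

# Usage:
# ids <- withBiodb(function(mybiodb) {
#     conn <- mybiodb$getFactory()$createConn('lipidmaps.structure')
#     conn$getEntryIds(2)
# })
```

Because the default value of newInst is only evaluated when no constructor is supplied, the helper can also be exercised with a stand-in object that merely provides a terminate() function.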

8 Session information

sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.18-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] biodbChebi_1.8.0     biodbLipidmaps_1.8.0 BiocStyle_2.30.0    
## 
## loaded via a namespace (and not attached):
##  [1] rappdirs_0.3.3       sass_0.4.7           utf8_1.2.4          
##  [4] generics_0.1.3       bitops_1.0-7         stringi_1.7.12      
##  [7] RSQLite_2.3.1        hms_1.1.3            digest_0.6.33       
## [10] magrittr_2.0.3       evaluate_0.22        bookdown_0.36       
## [13] fastmap_1.1.1        blob_1.2.4           plyr_1.8.9          
## [16] jsonlite_1.8.7       progress_1.2.2       DBI_1.1.3           
## [19] BiocManager_1.30.22  httr_1.4.7           fansi_1.0.5         
## [22] XML_3.99-0.14        jquerylib_0.1.4      cli_3.6.1           
## [25] rlang_1.1.1          chk_0.9.1            crayon_1.5.2        
## [28] dbplyr_2.3.4         bit64_4.0.5          withr_2.5.1         
## [31] cachem_1.0.8         yaml_2.3.7           tools_4.3.1         
## [34] memoise_2.0.1        biodb_1.10.0         dplyr_1.1.3         
## [37] filelock_1.0.2       curl_5.1.0           vctrs_0.6.4         
## [40] R6_2.5.1             BiocFileCache_2.10.0 lifecycle_1.0.3     
## [43] stringr_1.5.0        bit_4.0.5            pkgconfig_2.0.3     
## [46] pillar_1.9.0         bslib_0.5.1          glue_1.6.2          
## [49] Rcpp_1.0.11          lgr_0.4.4            xfun_0.40           
## [52] tibble_3.2.1         tidyselect_1.2.0     knitr_1.44          
## [55] htmltools_0.5.6.1    rmarkdown_2.25       compiler_4.3.1      
## [58] prettyunits_1.2.0    askpass_1.2.0        RCurl_1.98-1.12     
## [61] openssl_2.1.1

References

Sud, Manish, Eoin Fahy, Dawn Cotter, Alex Brown, Edward A. Dennis, Christopher K. Glass, Alfred H. Merrill Jr., et al. 2007. “LMSD: LIPID Maps Structure Database.” Nucleic Acids Research 35 (Database issue): D527–D532. https://doi.org/10.1093/nar/gkl838.