
This vignette was revised for BioC 3.14, in which the individual data packages (LRBase.XXX.eg.db) were deprecated and LRBase data began to be provided in the AnnotationHub style.

1 Specification change of LRBase and scTensor from BioC 3.14 (Nov. 2021)

This section is for users of the previous LRBase.XXX.eg.db-type packages and scTensor. The specifications of LRBase.XXX.eg.db and scTensor changed significantly in BioC 3.14. Specifically, the LRBase.XXX.eg.db-type packages are no longer distributed. Instead, the data is placed on a cloud server called AnnotationHub, and users retrieve it only when they actually need it. The following are the advantages of this AnnotationHub style.

2 Introduction

2.1 About Cell-Cell Interaction (CCI) databases

Due to the rapid development of single-cell RNA-Seq (scRNA-Seq) technologies, a wide variety of cell types have been found, for example in multiple organs of healthy individuals, in stem cell niches, and among cancer stem cells. Such complex systems are shaped by communication between cells (cell-cell interaction, or CCI).

Many CCI studies use the ligand-receptor (L-R) pair list of the FANTOM5 project (Jordan A. Ramilowski, A draft network of ligand-receptor-mediated multicellular signaling in human, Nature Communications, 2015) as the evidence of CCI (http://fantom.gsc.riken.jp/5/suppl/Ramilowski_et_al_2015/data/PairsLigRec.txt). The project proposed the L-R candidate genes on the following two bases.

  1. Subcellular Localization
    1. Known Annotation (UniProtKB and HPRD) : The term “Secreted” for candidate ligand genes and “Plasma Membrane” for candidate receptor genes
    2. Computational Prediction (LocTree3 and PolyPhobius)
  2. Physical Binding of Proteins : Experimentally validated PPI (protein-protein interaction) information of HPRD and STRING
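The two criteria above can be combined as a simple filter. The following is a minimal sketch in base R that uses made-up toy tables. The gene names, the column names, and the `localization` and `ppi` data frames are hypothetical and are not taken from FANTOM5, UniProtKB, HPRD, or STRING. A pair is kept only when one partner is annotated as "Secreted", the other as "Plasma Membrane", and the two are linked by a PPI record.

```r
# Toy localization annotation (hypothetical genes and labels).
localization <- data.frame(
  gene = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  location = c("Secreted", "Plasma Membrane", "Nucleus", "Plasma Membrane"),
  stringsAsFactors = FALSE
)

# Toy PPI table: each row stands for one experimentally supported protein pair.
ppi <- data.frame(
  protein1 = c("GENE_A", "GENE_A", "GENE_C"),
  protein2 = c("GENE_B", "GENE_C", "GENE_D"),
  stringsAsFactors = FALSE
)

# Criterion 1: subcellular localization defines candidate ligands and receptors.
ligands <- localization$gene[localization$location == "Secreted"]
receptors <- localization$gene[localization$location == "Plasma Membrane"]

# Criterion 2: keep only candidate pairs supported by a PPI record,
# checking both orientations because PPI rows are undirected.
forward <- ppi[ppi$protein1 %in% ligands & ppi$protein2 %in% receptors, ]
reverse <- ppi[ppi$protein2 %in% ligands & ppi$protein1 %in% receptors, ]

lr_pairs <- unique(rbind(
  data.frame(ligand = forward$protein1, receptor = forward$protein2,
             stringsAsFactors = FALSE),
  data.frame(ligand = reverse$protein2, receptor = reverse$protein1,
             stringsAsFactors = FALSE)
))

print(lr_pairs)
```

In this toy input, only the GENE_A/GENE_B pair satisfies both criteria. GENE_A/GENE_C fails on localization, and GENE_C/GENE_D has no secreted partner.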

The project also merged the data with earlier L-R databases such as IUPHAR, DLRP, and HPMR, and filtered out the pairs without PMIDs. More recent L-R databases such as CellPhoneDB and SingleCellSignalR also contain manually curated L-R pairs that are not listed in IUPHAR, DLRP, or HPMR. The Bader Laboratory has also predicted many putative L-R pairs according to its own criteria. In our framework, we expanded such L-R databases to 134 organisms based on ortholog relationships. For details, see the summary of rikenbit/lrbase-workflow (https://github.com/rikenbit/lrbase-workflow#summary), which is the Snakemake workflow used to create the LRBase data for each biannual update of Bioconductor.
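To illustrate what an ortholog-based expansion means, the sketch below maps a toy human L-R list to another organism through a toy ortholog table. It only illustrates the idea and is not the implementation used in lrbase-workflow. All gene identifiers, column names, and the `lr_human` and `orthologs` data frames are hypothetical.

```r
# Toy human L-R pairs (hypothetical identifiers).
lr_human <- data.frame(
  ligand = c("H_L1", "H_L2"),
  receptor = c("H_R1", "H_R2"),
  stringsAsFactors = FALSE
)

# Toy ortholog table between human and another organism.
# H_R2 has no ortholog here, so its pair cannot be transferred.
orthologs <- data.frame(
  human = c("H_L1", "H_R1", "H_L2"),
  other = c("M_L1", "M_R1", "M_L2"),
  stringsAsFactors = FALSE
)

# Map the ligand side, then the receptor side.
mapped <- merge(lr_human, orthologs, by.x = "ligand", by.y = "human")
names(mapped)[names(mapped) == "other"] <- "ligand_other"
mapped <- merge(mapped, orthologs, by.x = "receptor", by.y = "human")
names(mapped)[names(mapped) == "other"] <- "receptor_other"

# A pair is transferred only when both partners have an ortholog.
lr_other <- unique(mapped[, c("ligand_other", "receptor_other")])
print(lr_other)
```

Because `merge()` performs an inner join by default, pairs with a missing ortholog on either side are dropped rather than kept with missing values.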

2.2 LRBase and scTensor framework

Our L-R databases (LRBase) are provided through a cloud server called AnnotationHub, and users retrieve the data only when they actually need it. The downloaded data is stored as a cache file on the user's local machine by the BiocFileCache mechanism. The data is then converted to an LRBase object by LRBaseDbi. We also developed scTensor, a method that detects CCIs and the CCI-related L-R pairs simultaneously. This document describes how to use LRBaseDbi, LRBase objects, and scTensor (Figure 1).
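The retrieval and conversion steps described above can be sketched as follows. This is a minimal example that assumes the AnnotationHub and LRBaseDbi packages are installed and that a network connection is available. The query terms "LRBaseDb" and "Homo sapiens" and the object name `LRBase.Hsa.eg.db` are assumptions made for illustration. The set of matching records depends on the current AnnotationHub snapshot, so inspect the query result before selecting one.

```r
library("AnnotationHub")

# Connect to AnnotationHub (metadata is cached locally via BiocFileCache).
ah <- AnnotationHub()

# Search for LRBase records of one organism.
# The query terms are assumptions; print `hits` to see what is available.
hits <- query(ah, c("LRBaseDb", "Homo sapiens"))
print(hits)

# Download (or reuse from the cache) the first matching record.
dbfile <- hits[[1]]

# Convert the downloaded data into an LRBase object.
LRBase.Hsa.eg.db <- LRBaseDbi::LRBaseDb(dbfile)
print(LRBase.Hsa.eg.db)
```

On later runs the same record is read from the local cache instead of being downloaded again.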