Bioconductor FAQ




Frequently Asked Questions on Bioconductor

Version 1.1.2, 10 February 2006

Robert Gentleman, A.J. Rossini, and Sandrine Dudoit



Introduction

This document contains answers to some of the most frequently asked questions about Bioconductor.



Legalese

This document is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version.

This document is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

A copy of the GNU General Public License is available via WWW at

http://www.gnu.org/copyleft/gpl.html.

You can also obtain it by writing to the Free Software Foundation, Inc., 59 Temple Place -- Suite 330, Boston, MA 02111-1307, USA.



Obtaining this document

The latest version of this document is always available from

http://www.bioconductor.org/docs/faq/


Citing this document

In publications, please refer to this FAQ as Gentleman, Rossini, Dudoit and Hornik (2003), "The Bioconductor FAQ" and give the above, official URL.



Feedback

Feedback is most welcome.



Bioconductor Basics



What is Bioconductor?

Bioconductor is an open source and open development software project to provide tools for the analysis and comprehension of genomic data (bioinformatics).

The project was started in the Fall of 2001. The Bioconductor core team is based primarily at the Computational Biology Group in the division of Public Health Sciences at the Fred Hutchinson Cancer Research Center. Other members come from various US and international institutions.

The broad goals of the project are to

  • provide access to a wide range of powerful statistical and graphical methods for the analysis of genomic data;
  • facilitate the integration of biological metadata in the analysis of experimental data: e.g. literature data from PubMed, annotation data from LocusLink;
  • allow the rapid development of extensible, scalable, and interoperable software;
  • promote high-quality documentation and reproducible research;
  • provide training in computational and statistical methods for the analysis of genomic data.


Bioconductor Packages

The first Bioconductor software release occurred on May 2nd, 2002. Although initial efforts focused primarily on DNA microarray data analysis, many of the software tools are general and can be used broadly for the analysis of genomic data, such as SAGE, sequence, or SNP data.

There are two main types of Bioconductor packages. One set is designed to provide basic infrastructure support that will help other developers produce high-quality software for the analysis of genomic data. The other set provides innovative methodology for analyzing genomic data. We anticipate that packages of the second type may from time to time migrate to become packages of the first type.

Bioconductor packages may be downloaded in their released or development versions.

General tools

  • Biobase: Object-oriented representation and manipulation of genomic data (S4 class structure).
  • Biostrings: Class definitions and generics for biological sequences, along with pattern matching algorithms.
  • convert: Defines coerce methods for microarray data objects.
  • ctc: Tools for export and import of Tree and Cluster to other programs.
  • DynDoc: Functionality to create and interact with dynamic documents, vignettes, and other navigable documents.
  • Icens: Many functions for computing the NPMLE for censored and truncated data.
  • Ruuid: Creates Universally Unique ID values (UUIDs) in R.

Analysis

  • daMA: Functions for the efficient design of factorial two-color microarray experiments and for the statistical analysis of factorial microarray data.
  • edd: Expression density diagnostics: graphical methods and pattern recognition algorithms for distribution shape classification.
  • factDesign: A set of tools for analyzing data from factorial designed microarray experiments. The functions can be used to evaluate appropriate tests of contrast and perform single outlier detection.
  • genefilter: Tools for sequentially filtering genes using a wide variety of filtering functions. Examples of filters include: number of missing values, coefficient of variation of expression measures, ANOVA p-value, Cox model p-values. Filtering functions are applied to genes sequentially.
  • globaltest: Testing globally whether a group of genes is significantly related to some clinical variable of interest.
  • gpls: Classification using generalized partial least squares for two-group and multi-group (more than 2 group) classification.
  • limma: Linear models for microarray data.
  • RMAGEML: Used to handle MAGE-ML documents in Bioconductor.
  • MeasurementError.cor: Two-stage measurement error model for correlation estimation with smaller bias than the usual sample correlation.
  • multtest: Multiple testing procedures for controlling the family-wise error rate (FWER) and the false discovery rate (FDR). Tests can be based on t- or F-statistics for one- and two-factor designs, and permutation procedures are available to estimate adjusted p-values.
  • pamr: Some functions for sample classification in microarrays.
  • ROC: Receiver Operating Characteristic (ROC) approach for identifying genes that are differentially expressed in two types of samples.
  • siggenes: Identifying differentially expressed genes and estimating the False Discovery Rate with both the Significance Analysis of Microarrays and the Empirical Bayes Analyses of Microarrays.
  • splicegear: A set of tools to work with alternative splicing.

Annotation

  • annotate: Associate experimental data in real time to biological metadata from web databases such as GenBank, LocusLink and PubMed. Process and store query results. Generate HTML reports of analyses.
  • AnnBuilder: Assemble and process genomic annotation data, from databases such as GenBank, the Gene Ontology Consortium, LocusLink, UniGene, the UCSC Human Genome Project.
  • Data packages: XML and R annotation data packages, providing mappings between different probe identifiers (e.g. Affy IDs, LocusLink, PubMed).
  • Resourcerer: Allows the user either to read an annotation data file from TIGR Resourcerer as a matrix or to convert the file into a Bioconductor annotation data package using the AnnBuilder package.
  • SNPtools: Provides bindings to the XML RPC services of chip.org::SNPper.

Database Interaction

  • Rdbi: Generic framework for database access in R.
  • RdbiPgSQL: Provides methods for accessing data stored in PostgreSQL.
  • SAGElyzer: Locates genes based on SAGE tags.

Graphics & User Interface

  • affylmGUI: A Graphical User Interface for affy analysis using the limma Microarray package by Gordon Smyth.
  • geneplotter: Graphical tools for genomic data, for example for plotting expression data along a chromosome or producing color images of expression data matrices.
  • hexbin: Binning functions, in particular hexagonal bins for graphing.
  • limmaGUI: A Graphical User Interface for the limma Microarray package.
  • tkWidgets: Widgets in Tcl/Tk that provide functionality for Bioconductor packages.
  • webbioc: An integrated web interface for doing microarray analysis using several of the Bioconductor packages. It is intended to be deployed as a centralized bioinformatics resource for use by many users. (Currently only Affymetrix oligonucleotide analysis is supported.)
  • widgetTools: Tools for creating Tcl/Tk widgets, i.e., small-scale graphical user interfaces.

Graphs

  • graph: Classes and tools for creating and manipulating graphs within R.
  • RBGL: An interface between the graph package and the Boost graph libraries, allowing for fast manipulation of graph objects in R.
  • Rgraphviz: Provides an interface with Graphviz for plotting graph objects in R.
  • SNAData: Data from Wasserman & Faust (1999) "Social Network Analysis".

Proteomics

  • gpls: Classification using generalized partial least squares for two-group and multi-group (more than 2 group) classification.
  • PROcess: A package for processing protein mass spectrometry data.
  • apComplex: Functions to estimate a bipartite graph representing protein complex membership using data from AP-MS technology.

arrayCGH

  • DNAcopy: Segments DNA copy number data using circular binary segmentation to detect regions with abnormal copy number.
  • aCGH: Functions for reading aCGH data from image analysis output files and clone information files, and for creating aCGH S3 objects for storing these data. Basic methods for accessing/replacing, subsetting, printing and plotting aCGH objects.
  • GLAD: Analysis of array CGH data: detection of breakpoints in genomic profiles and assignment of a status (gain, normal or lost) to each chromosomal region identified.

Pre-processing

  • affy, affycomp, affydata, affypdnn, affyPLM, gcrma, makecdfenv: Diagnostic plots, expression measures, and normalization for Affymetrix chip data.
  • annaffy: Functions for handling data from Bioconductor Affymetrix annotation data packages. Produces compact HTML and text reports including experimental data and URL links to many online databases.
  • marray: Diagnostic plots and normalization for cDNA microarray data.
  • matchprobes: Tools for sequence matching of probes on arrays.
  • vsn: Calibration and variance stabilizing transformations for both Affymetrix and cDNA array data.

Ontologies

  • GOstats: A set of tools for interacting with GO and microarray data. A variety of basic manipulation tools for graphs, hypothesis testing and other simple calculations.
  • goTools: Wrapper functions for description/comparison of oligo ID lists using the Gene Ontology database.
  • ontoTools: Tools for working with ontologies.


Main Features of Bioconductor



Use of R

R and the R package system are the main vehicles for designing and releasing software. R (www.r-project.org) is a widely used open source language and environment for statistical computing and graphics - GNU's S-Plus. It provides a high-level programming environment together with a sophisticated packaging and testing paradigm. It has a number of mechanisms that allow it to interact directly with software that has been written in many different languages (see Omega Project). These tools allow users to incorporate modules based on other work. Viewed in that context, adopting R as a vehicle does not exclude other development environments and paradigms. R can, in those cases, provide a glue or connectivity linking what might otherwise be different products. Finally, R is under very active development by a dedicated team of researchers with a strong commitment to good documentation and software design.



Documentation and reproducible research

One of the goals of the project is to provide high-quality documentation and encourage reproducible research.

Each package contains at least one vignette, which is a document that provides a textual, task-oriented description of the package's functionality and that can be used interactively. Package vignettes come in several forms. Many are simple "HowTo"s, that is, they are designed to demonstrate how a particular task can be accomplished with that package's software. Others provide a more thorough overview of the package, or might even discuss general issues related to the package. In the future, we hope to provide vignettes that are not tied to a specific package, but instead demonstrate more complex concepts. As with all aspects of the Bioconductor project, users are encouraged to participate in this effort.

The vignettes are generated using the Sweave function from the R package utils. They are documents that intermix text, code, and output (textual and graphical) and can be regenerated automatically whenever the data or analyses change. Additional supporting software for vignettes will help users obtain data and sample code, step through specific analyses, and apply these analyses to their own data. Vignette sources are in the inst/doc directory of the packages.
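
As a rough illustration of how a vignette source intermixes text and code, the sketch below writes a tiny Sweave file to a temporary directory and weaves it into LaTeX. The file name example.Rnw and the chunk label are made up for this example; real vignette sources live inside a package.

```r
# Write a minimal Sweave source: LaTeX text with one R code chunk.
# "example.Rnw" and the chunk label are illustrative, not from any package.
work_dir <- tempdir()
rnw_file <- file.path(work_dir, "example.Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mean of the integers 1 to 10 is computed below.",
  "<<mean-chunk>>=",
  "mean(1:10)",
  "@",
  "\\end{document}"
), rnw_file)

# Sweave() writes its output into the current working directory,
# so switch to the temporary directory for the duration of the call.
old_dir <- setwd(work_dir)
Sweave(rnw_file, quiet = TRUE)
setwd(old_dir)

# The woven LaTeX file now holds the text, the code, and the code's output.
tex_file <- file.path(work_dir, "example.tex")
woven <- readLines(tex_file)
```

Changing the code in the chunk and weaving again regenerates the output, which is what makes such documents reproducible.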

The tkWidgets package provides functions and widgets for viewing and testing vignette code chunks interactively (e.g. vExplorer).



Statistical and graphical methods

The Bioconductor project aims to provide access to a wide range of powerful statistical and graphical methods for the analysis of genomic data. Analysis packages are available for: pre-processing Affymetrix and cDNA array data; identifying differentially expressed genes; graph theoretical analyses; plotting genomic data. In addition, the R package system itself provides implementations for a broad range of state-of-the-art statistical and graphical techniques, including linear and non-linear modeling, cluster analysis, prediction, resampling, survival analysis, and time-series analysis.



Annotation

The Bioconductor project provides software for associating microarray and other genomic data in real time to biological metadata from web databases such as GenBank, LocusLink and PubMed (annotate package). Functions are also provided for incorporating the results of statistical analysis in HTML reports with links to annotation WWW resources.

Software tools are available for assembling and processing genomic annotation data, from databases such as GenBank, the Gene Ontology Consortium, LocusLink, UniGene, the UCSC Human Genome Project (AnnBuilder package).

Data packages are distributed to provide mappings between different probe identifiers (e.g. Affy IDs, LocusLink, PubMed). Customized annotation libraries can also be assembled.



Graphical user interface

Perhaps the largest problem with using a language such as R is that first-time users can be discouraged by the complexity of the language. Another major focus of the project is therefore to provide some form of graphical user interface (GUI) for selected tasks. The approach is programmatic to a large extent, thereby allowing any package developer to use these tools to provide a graphical user interface for their package.

To overcome this barrier to entry we are designing a widget mechanism to provide interactive access to much of the Bioconductor functionality. A widget can be thought of as a small-scale GUI. It builds on the tcltk package which provides an interface and language bindings to Tcl/Tk GUI elements in R. The tkWidgets package was used to generate widgets for file browsing and data input in the affy and marrayInput packages. Many more extensions are planned for the near future.



Bioconductor short courses

The Bioconductor project has developed a program of short courses on software and statistical methods for the analysis of genomic data. Courses have been given for audiences with backgrounds in either biology or statistics. All course materials (lectures and computer labs) are available on the WWW. Customized short courses may also be designed for interested parties.



Open source

Bioconductor has a commitment to full open source discipline, with distribution via a SourceForge-like platform. All contributions are expected to exist under an open source license such as GPL2 or BSD. There are many different reasons why open-source software is beneficial to the analysis of microarray data and to computational biology in general. The reasons include:

  • full access to algorithms and their implementation
  • the ability to fix bugs and extend and improve the supplied software
  • to encourage good scientific computing and statistical practice by providing appropriate tools and instruction
  • to provide a workbench of tools that allow researchers to explore and expand the methods used to analyze biological data
  • to ensure that the international scientific community is the owner of the software tools needed to carry out research
  • to lead and encourage commercial support and development of those tools that are successful
  • to promote reproducible research by providing open and accessible tools with which to carry out that research [reproducible research is distinct from independent verification]


Open development

Bioconductor is a collaborative research effort. Users are encouraged to become developers by contributing either Bioconductor-compliant packages or documentation.

Additionally Bioconductor will provide a mechanism for linking together different groups with common goals. Hopefully this will foster collaboration on software, possibly at the level of shared development. But at the least it will provide information on which projects are already under development.



What is the current version of Bioconductor?

The current released version is announced on the Bioconductor home page. A Bioconductor 'release' is a snapshot of (almost) all Bioconductor packages. The release is meant to be as bug-free and stable as possible. A Bioconductor release coincides with each major R release, and occurs approximately every six months. Each release incorporates major changes to existing packages made since the previous release, and includes new contributed packages.



How can Bioconductor be obtained?

Sources, binaries, and documentation for released and development versions of Bioconductor packages can be downloaded from the Bioconductor website. Additional packages may be downloaded from the "Comprehensive R Archive Network" (CRAN).



For Unix/Linux

Installing R

  1. Download the most recent version of R (currently 2.0.1) from CRAN by following the links to the appropriate distribution for your system or by downloading the source code R-2.0.1.tgz. The link to "FAQs" under the "Documentation" section on the R website provides detailed information on how to download and install R for different systems.
  2. Start the R program by typing "R" at the shell prompt. To start the R help browser type "help.start()" . For help on any function, e.g. the "mean" function, type "? mean".

Installing Bioconductor packages using biocLite.R

  • Users are encouraged to use biocLite.R to obtain, install and update their packages. Information on how to use this script can be found in the biocLite section of this document.

Installing Bioconductor packages using R CMD INSTALL

  • Alternatively, you may download and install Bioconductor packages like any other R add-on package, using the command
    "R CMD INSTALL /path/to/pkg_version.tar.gz" at the shell prompt. Instructions for doing so are given in the R FAQ on R add-on packages.

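A source package can also be installed from within an R session instead of from the shell. The sketch below reuses the placeholder path from the command above; it is not a real file, so the call is guarded and does nothing unless a tarball is actually present.

```r
# Placeholder path from the text above; substitute a real downloaded tarball.
pkg_file <- "/path/to/pkg_version.tar.gz"

if (file.exists(pkg_file)) {
  # repos = NULL tells install.packages() to install from the local file
  # rather than from a repository; type = "source" marks it as a source package.
  install.packages(pkg_file, repos = NULL, type = "source")
} else {
  message("No tarball found at ", pkg_file, "; nothing installed.")
}
```
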

For Windows

Installing R

  1. Download the most recent version of R (currently 2.0.1) from CRAN by following the links to "Windows (95 and later)" and then "base". Consult the file "ReadMe.rwxxxx" for detailed instructions.
  2. Save the file "SetupR.exe" on your desktop, double click on the icon and follow the installation instructions. This file contains all the R components, and you can select what you want installed.
  3. Start the R program by double clicking on "Rgui.exe". To start the R help browser type "help.start()" or use the menu. For help on any function, e.g. the "mean" function, type "? mean".

Installing Bioconductor packages using the installation script biocLite.R

  • Users are encouraged to use biocLite.R to obtain, install and update their packages. Information on how to use these functions can be found in the biocLite section of this document.

Installing Bioconductor packages using install.packages

  • Alternatively, you may download and install Bioconductor packages like any other R add-on package. Instructions for doing so are given in the R for Windows FAQ on Packages. This involves downloading the pre-compiled Windows versions of the packages as .zip files and using the "install.packages" function or the menu option "Install package from local zip file ..." under "Packages".
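
For the "install.packages" route, a minimal sketch follows. The zip file location is hypothetical, and the call is guarded so that it only runs on Windows when the file exists.

```r
# Hypothetical location of a downloaded pre-compiled Windows package.
zip_file <- "C:/Downloads/pkg_version.zip"

if (.Platform$OS.type == "windows" && file.exists(zip_file)) {
  # repos = NULL installs from the local file; "win.binary" marks it
  # as a pre-compiled Windows package rather than a source tarball.
  install.packages(zip_file, repos = NULL, type = "win.binary")
} else {
  message("Skipping: not on Windows, or ", zip_file, " does not exist.")
}
```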


Using Raqua

Go to the R menu, select Packages&Data->Package Installer. In that window, choose Bioconductor (binaries) and click Get List. Choose the packages that you wish to install and click Install/Update.



Using biocLite to obtain Bioconductor packages

  1. biocLite is the simplest and fastest way for users to get started with Bioconductor. biocLite requires R version 2.1.0 or later.
    From your R session, type:
    source("http://bioconductor.org/biocLite.R")
    
    This will download the biocLite functionality into your R session.
  2. To install Bioconductor packages, use the function "biocLite" by typing:
    biocLite()
    biocLite installs a core subset of packages. A list of these packages can be obtained with the R commands below (the first line needs to be entered only once per session):
    source("http://bioconductor.org/biocLite.R")
    biocinstallPkgGroups("lite")
    
    You may also call "biocLite" with other arguments, including
    • destdir: the directory where the downloaded packages will be stored.
    • lib: character vector giving the library directories under which packages may be installed. Recycled as needed.
    • pkgs: character vector of Bioconductor packages to install.

    The script will try to install the downloaded packages, and prints "Installation complete" and TRUE on the screen when the installation is successful.
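
Putting these arguments together, a call might look like the sketch below. The argument names are the ones listed above and the package names come from the package list earlier in this document; the two directories are placeholders chosen for illustration. The call is only attempted if biocLite has already been loaded into the session with source(), as in step 1.

```r
# Packages to request; names taken from the package list in this FAQ.
pkgs <- c("affy", "limma")

# Placeholder directories: where to keep downloads and where to install.
download_dir <- file.path(tempdir(), "bioc-downloads")
library_dir  <- .libPaths()[1]

if (exists("biocLite", mode = "function")) {
  dir.create(download_dir, showWarnings = FALSE)
  # Argument names as described in the text (pkgs, lib, destdir).
  biocLite(pkgs = pkgs, lib = library_dir, destdir = download_dir)
} else {
  message("biocLite is not loaded; source biocLite.R first (see step 1).")
}
```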


    Downloading All Packages From A Repository

    To download all of the packages from a repository into a directory, use the download.packages2 function. To do this, select the repository you wish to download the packages from (using repositories for instance) and run download.packages2(repEntry=REP, destDir=DIR), where REP is your ReposEntry object and DIR is the directory you'd like them downloaded to (e.g. ".").



    Repositories Currently Available

    These are the default repositories that come built into biocLite. The actual URLs might vary depending on your use of mirrors.

    • /CRANrepository: A reposTools style repository consisting of the CRAN packages.
    • /packages/bioc/stable/src/contrib/Source: A repository consisting of the source packages for BioC 1.5
    • /packages/bioc/stable/src/contrib/Win32: A repository consisting of the windows packages for BioC 1.5
    • /repository/devel/package/Source: A repository consisting of the current source versions of the developmental packages
    • /repository/devel/package/Win32: A repository consisting of the current windows versions of the developmental packages
    • /data/metaData: This repository contains the Bioconductor annotation packages
    • /data/metaData-devel: This repository contains the developmental versions of the Bioconductor annotation packages.
    • /data/experimental/repos: A repository for the Bioconductor experimental data packages.
    • /repository/Courses: This repository contains the packages for various short courses from Bioconductor.
    • /data/cdfenvs/repos: A repository for the Bioconductor CDF data packages.
    • /data/probes/Packages: A repository for the Bioconductor probeset packages.
    • /repository/Omegahat: A repository for various Omegahat (www.omegahat.org) packages.
    • /repository/lindsey: A repository of Jim Lindsey's packages


    Other Notes

    Some users with proxies and firewalls might have difficulties in downloading packages properly with these functions. This is covered in the R FAQ question "The Internet Download Functions Fail". If you have further difficulty with network-enabled R functions, this is best brought up on the R-help mailing list.

    Some Bioconductor packages require special libraries to be installed first in order to work correctly. The current list of these packages is:

    • limmaGUI and affylmGUI: These require some specific Tcl/Tk setup on your machine. Please see the WEHI page for more specific instructions.
    • Rgraphviz: Requires the Graphviz libraries. Users of Rgraphviz need Graphviz version 1.12 or later. This is available at the Graphviz download site. Note that odd-numbered versions of Graphviz are developmental/unstable versions, and may not work properly with the current version of Rgraphviz. It is recommended that users download and build Graphviz from source. However, if you choose to use RPMs or other package tools, please make sure to download both the primary Graphviz package and the developer version (e.g., if using RPMs, download the graphviz and graphviz-devel RPMs). Also, please note that Rgraphviz currently does not work with Windows systems.

    Without proper installation of the required libraries, these packages will fail to install on your system.



    What documentation exists for Bioconductor?

    Extensive documentation for R and Bioconductor is available on the WWW.

    • Online help. Online documentation for most of the functions and variables in Bioconductor packages exists, and can be printed on-screen by typing help(name) (or ?name) at the R prompt, where name is the name of the topic for which help is sought. (In the case of unary and binary operators and control-flow special forms, the name may need to be quoted.)
      This documentation can also be made available as one reference manual for on-line reading in HTML and PDF formats, and as hardcopy via LaTeX.
    • Vignettes. Each Bioconductor package contains at least one vignette, which is a document that provides a textual, task-oriented description of the package's functionality and that can be used interactively. Package vignettes come in several forms and are discussed in greater detail in "Documentation and reproducible research".
    • Short courses. Lectures and computer labs from Bioconductor short courses are available.
    • Publications. Articles and technical reports providing in-depth discussions of Bioconductor software are available in the "Publications" section of this website.
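
The online help calls mentioned in the first item can be sketched as follows, using only base R topics so that no Bioconductor package needs to be installed. At the prompt, help(name) displays the page immediately; here the results are stored so that the example also runs unattended in a script.

```r
# Look up the help page for an ordinary function; equivalent to ?mean.
mean_help <- help("mean")

# Operators and control-flow keywords need quotes around the name.
plus_help <- help("+")
if_help   <- help("if")

# Printing a stored result displays the page, as typing ?mean would:
# print(mean_help)
```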

    For general documentation on R consult the "What documentation exists for R?" section of this FAQ.



    Citing Bioconductor

    At this point, to cite Bioconductor in publications, please use the following article,

    @Article{BIOC,
    author = {Robert C Gentleman and Vincent J. Carey and Douglas M. Bates and
    Ben Bolstad and Marcel Dettling and Sandrine Dudoit and Byron Ellis and
    Laurent Gautier and Yongchao Ge and Jeff Gentry and Kurt Hornik and
    Torsten Hothorn and Wolfgang Huber and Stefano Iacus and Rafael Irizarry
    and Friedrich Leisch and Cheng Li and Martin Maechler and Anthony J. Rossini
    and Gunther Sawitzki and Colin Smith and Gordon Smyth and Luke Tierney
    and Jean Y. H. Yang and Jianhua Zhang},
    title = {Bioconductor: Open software development for
    computational biology and bioinformatics},
    journal = {Genome Biology},
    volume = {5},
    year = {2004},
    pages = {R80},
    url = {http://genomebiology.com/2004/5/10/R80}
    }


    What mailing lists exist for Bioconductor?

    Thanks to Martin Maechler, there is a mailing list devoted to Bioconductor.

    bioconductor
    This list is for general discussion of issues, libraries, and ideas related to functional genomics.

    Information about this list, how to subscribe, how to post and how to access the archives is available at the Bioconductor mailing list page.

    It is recommended that you send mail to bioconductor rather than to any particular member of the team. Core developers are all subscribed to the list, of course.

    In the case of bug reports, it is very helpful to include code that reliably reproduces the problem. Also, make sure that you include information on the system and on the versions of R and Bioconductor being used. See Bioconductor Bugs for more details.
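
One way to collect the system and version details for such a report is sketched below, using base R functions only. The package name "stats" is a stand-in chosen because it is always installed; substitute the Bioconductor package in question, e.g. "Biobase".

```r
# Version of R and the platform it is running on.
cat(R.version.string, "\n")
cat("Platform:", R.version$platform, "\n")

# sessionInfo() also lists the attached packages and their versions,
# which covers any Bioconductor packages loaded in the session.
info <- sessionInfo()
print(info)

# Version of one specific installed package ("stats" is only a stand-in).
pkg_version <- as.character(packageVersion("stats"))
cat("stats version:", pkg_version, "\n")
```

Pasting this output into the message along with the failing code gives readers of the list what they need to reproduce the problem.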



    R and Bioconductor



    What is R?

    R (www.r-project.org) is a widely used open source language and environment for statistical computing and graphics. It is available for Linux, Unix, Windows, and Macintosh computers. More information on R is available in the "R Basics" section of the R FAQ.



    What documentation exists for R?

    Extensive documentation for R is available on the WWW. Resources include

    R Help Pages
    R FAQ
    R Manuals
    R Contributed Documentation

    For more detail, consult the documentation section of the R FAQ.



    What is CRAN?

    The "Comprehensive R Archive Network" (CRAN) is a collection of sites which carry identical material, consisting of the R distribution(s), the contributed extensions, documentation for R, and binaries.

    The CRAN master site at TU Wien, Austria, can be found at the URL

    http://cran.r-project.org/

    and is currently being mirrored daily at

    http://cran.at.r-project.org/ (TU Wien, Austria)
    http://cran.au.r-project.org/ (PlanetMirror, Australia)
    http://cran.ch.r-project.org/ (ETH Zürich, Switzerland)
    http://cran.dk.r-project.org/ (SunSITE, Denmark)
    http://cran.hu.r-project.org/ (Semmelweis U, Hungary)
    http://cran.uk.r-project.org/ (U of Bristol, United Kingdom)
    http://cran.us.r-project.org/ (U of Wisconsin, USA)
    http://cran.za.r-project.org/ (Rhodes U, South Africa)

    Please use the CRAN site closest to you to reduce network load.
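
    One way to select a CRAN site for the current R session is sketched below. It assumes a version of R that supports the repos option, and it uses the master site URL given above only as a placeholder; substitute the mirror nearest you.

```r
# Sketch: point this R session at a particular CRAN site.
# The URL is the master site named above; substitute the mirror nearest you.
cran_url <- "http://cran.r-project.org/"
options(repos = c(CRAN = cran_url))

# Confirm which repository later install.packages() calls would use.
print(getOption("repos")[["CRAN"]])
```

    In an interactive session, chooseCRANmirror() offers a menu of mirrors instead.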

    From CRAN, you can obtain the latest official release of R and daily snapshots of R (copies of the current CVS trees) as gzipped and bzipped tar files, a wealth of additional contributed code, and prebuilt binaries for various operating systems (Linux, Digital Unix, and MS Windows). CRAN also provides access to documentation on R, existing mailing lists, and the R Bug Tracking system.


    Node:How can I create an Bioconductor compliant package?, Next:, Previous:What is CRAN?, Up:R and Bioconductor

    How can I create a Bioconductor compliant package?

    Software contributions to Bioconductor should be in the form of standard R packages. Guidelines on creating your own R packages are provided in the "Writing R Extensions" manual. Packages must pass R CMD check without warnings or errors. They must work with the current version of R (which will be no earlier than R 1.5).
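
    As a starting point, the layout of a standard R package can be generated with package.skeleton() from the utils package. The sketch below is only an illustration: the package name myPkg and the function hello() are hypothetical, and the skeleton is written to a temporary directory.

```r
# Sketch: generate the skeleton of a standard R package.
# "myPkg" and hello() are hypothetical names used only for illustration.
pkg_env <- new.env()
assign("hello", function() cat("Hello from myPkg\n"), envir = pkg_env)

pkg_parent <- tempdir()   # write the skeleton somewhere disposable
utils::package.skeleton(name = "myPkg", list = "hello",
                        environment = pkg_env, path = pkg_parent,
                        force = TRUE)

pkg_dir <- file.path(pkg_parent, "myPkg")
print(list.files(pkg_dir))

# After editing DESCRIPTION and the help files under man/, build and
# check the package from a shell (not from within R):
#   R CMD build myPkg
#   R CMD check myPkg_<version>.tar.gz
```

    The generated help files are templates and need to be completed by hand before the package can be expected to pass the check cleanly.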

    In addition, each package should contain a directory inst/doc that includes LaTeX documentation. This documentation is intended to describe the overall functionality of your package. Note that it is separate from, and in addition to, the standard R help files documenting individual functions.

    Details on Bioconductor's approach to documentation are given in "Documentation and reproducible research". In addition, we recommend that you consult released Bioconductor packages for examples.

    We encourage the use of the Bioconductor classes and methods, especially those in the Biobase and annotate packages.


    Node:How can I contribute to Bioconductor?, Previous:How can I create an Bioconductor compliant package?, Up:R and Bioconductor

    How can I contribute to Bioconductor?

    Bioconductor is in active development and there is always a risk of bugs creeping in. Also, the developers do not have access to all possible machines capable of running Bioconductor. So, simply using it and communicating problems is certainly of great value.

    The Bioconductor Development page acts as an intermediate repository for more or less finalized ideas and plans for Bioconductor packages. It contains (pointers to) TODO lists, RFCs, various other write-ups, ideas lists, and CVS miscellanea.

    Ideally, the development page will provide details and contact information for each major initiative. Please feel free to contact developers if you want to contribute to, or simply discuss, a particular project. If you are interested in starting new projects under this umbrella, please notify Robert Gentleman.


    Node:Bioconductor Bugs, Next:, Previous:R and Bioconductor, Up:Top

    Bioconductor Bugs


    Node:What is a bug?, Previous:Bioconductor Bugs, Up:Bioconductor Bugs

    What is a bug?

    Taking forever to complete a command can be a bug, but you must make certain that it was really Bioconductor's fault. Some commands simply take a long time. If the input was such that you know it should have been processed quickly, report a bug. If you don't know whether the command should take a long time, find out by looking in the manual or by asking for assistance.

    For example, suppose that on a data set which you know to be quite large the command

    R> genefilter(exprSet1, filterfun1)

    never returns. Do not report that genefilter() fails for large data sets. Try to see what the real problem is. See the R FAQ for help on deciding whether the bug is an R bug or a Bioconductor bug.

    It is very useful to try to find simple examples that produce apparently the same bug, and somewhat useful to find simple examples that might be expected to produce the bug but actually do not. If you want to debug the problem and find exactly what caused it, that is wonderful. You should still report the facts as well as any explanations or solutions. Please include an example that reproduces the problem, preferably the simplest one you have found.
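
    One way to look for a simpler example is to time the suspect command on progressively larger subsets of the input. The sketch below uses simulated data and a hypothetical stand-in function, slow_step(), in place of the real command; neither comes from Bioconductor.

```r
# Sketch: check whether a slow command is merely expensive or actually
# stuck by timing it on progressively larger subsets of the input.
# slow_step() is a hypothetical stand-in for the command under suspicion.
slow_step <- function(x) {
  apply(x, 1, function(row) sd(row) > 1)   # one test per row (gene)
}

set.seed(1)                                 # make the example reproducible
x <- matrix(rnorm(2000 * 10), nrow = 2000)  # simulated 2000 genes x 10 samples

sizes <- c(250, 500, 1000, 2000)
elapsed <- sapply(sizes, function(n) {
  system.time(slow_step(x[seq_len(n), , drop = FALSE]))[["elapsed"]]
})

# Roughly proportional growth suggests the command is just slow on large
# input; a subset that never finishes points at a specific problem.
print(data.frame(rows = sizes, seconds = elapsed))
```

    The smallest subset that still shows the problem is usually the best candidate to include in a report.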

    Invoking R with the --vanilla option may help in isolating a bug. This ensures that the site profile and saved data files are not read.

    Bug reports on packages should generally be sent to the package maintainer rather than to r-bugs.


    Node:Acknowledgments, Previous:Bioconductor Bugs, Up:Top

    Acknowledgments

    Thanks to the Dana Farber Cancer Institute for supporting the initial development of the Bioconductor project.

    Thanks to the developers of R for providing us with a system in which to work (and to John Chambers for providing the precursor to R).

    Special thanks go to the members of Bioconductor who have helped me improve this FAQ; especially to Kurt Hornik for providing the template.
