Bioconductor FAQ
Frequently Asked Questions on Bioconductor
Version 1.1.2, 10 February 2006
Robert Gentleman, A.J. Rossini, and Sandrine Dudoit
Introduction
This document contains answers to some of the most frequently asked questions about Bioconductor.
Legalese
This document is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version.
This document is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
A copy of the GNU General Public License is available via WWW at
http://www.gnu.org/copyleft/gpl.html.
You can also obtain it by writing to the Free Software Foundation, Inc., 59 Temple Place -- Suite 330, Boston, MA 02111-1307, USA.
Obtaining this document
The latest version of this document is always available from
http://www.bioconductor.org/docs/faq/
Citing this document
In publications, please refer to this FAQ as Gentleman, Rossini, Dudoit and Hornik (2003), "The Bioconductor FAQ" and give the above, official URL.
Feedback
Feedback is most welcome.
Bioconductor Basics
- What is Bioconductor?:
- Bioconductor Packages:
- Main Features of Bioconductor:
- What is the current version of Bioconductor?:
- How can Bioconductor be obtained?:
- What documentation exists for Bioconductor?:
- Citing Bioconductor:
- What mailing lists exist for Bioconductor?:
What is Bioconductor?
Bioconductor is an open source and open development software project to provide tools for the analysis and comprehension of genomic data (bioinformatics).
The project was started in the Fall of 2001. The Bioconductor core team is based primarily at the Computational Biology Group in the division of Public Health Sciences at the Fred Hutchinson Cancer Research Center. Other members come from various US and international institutions.
The broad goals of the project are to
- provide access to a wide range of powerful statistical and graphical methods for the analysis of genomic data;
- facilitate the integration of biological metadata in the analysis of experimental data: e.g. literature data from PubMed, annotation data from LocusLink;
- allow the rapid development of extensible, scalable, and interoperable software;
- promote high-quality documentation and reproducible research;
- provide training in computational and statistical methods for the analysis of genomic data.
Bioconductor Packages
The first Bioconductor software release occurred on May 2nd, 2002. Although initial efforts focused primarily on DNA microarray data analysis, many of the software tools are general and can be used broadly for the analysis of genomic data, such as SAGE, sequence, or SNP data.
There are two main types of Bioconductor packages. The first type provides basic infrastructure support that helps other developers produce high-quality software for the analysis of genomic data. The second type provides innovative methodology for analyzing genomic data. We anticipate that packages of the second type may from time to time migrate to become packages of the first type.
Bioconductor packages may be downloaded in their released or development versions.
| General tools | ||
| Biobase | Object-oriented representation and manipulation of genomic data (S4 class structure). | |
| Biostrings | Class definitions and generics for biological sequences along with pattern matching algorithms | |
| convert | Define coerce methods for microarray data objects | |
| ctc | Tools for export and import of Tree and Cluster to other programs | |
| DynDoc | Functionality to create and interact with dynamic documents, vignettes, and other navigable documents. | |
| Icense | Many functions for computing the NPMLE for censored and truncated data. | |
| Ruuid | Creates Universally Unique ID values (UUIDs) in R | |
| Analysis | ||
| daMA | Contains functions for the efficient design of factorial two-color microarray experiments and for the statistical analysis of factorial microarray data. | |
| edd | Expression density diagnostics: graphical methods and pattern recognition algorithms for distribution shape classification. | |
| factDesign | Provides a set of tools for analyzing data from factorial designed microarray experiments. The functions can be used to evaluate appropriate tests of contrast and perform single outlier detection | |
| genefilter | Tools for sequentially filtering genes using a wide variety of filtering functions. Examples of filters include: number of missing values, coefficient of variation of expression measures, ANOVA p-value, Cox model p-values. | |
| globaltest | Testing globally whether a group of genes is significantly related to some clinical variable of interest. | |
| gpls | Classification using generalized partial least squares for two-group and multi-group (more than 2 group) classification. | |
| limma | Linear models for microarray data | |
| RMAGEML | Used to handle MAGE-ML documents in Bioconductor | |
| MeasurementError.cor | Two-stage measurement error model for correlation estimation with smaller bias than the usual sample correlation | |
| multtest | Multiple testing procedures for controlling the family-wise error rate (FWER) and the false discovery rate (FDR). Tests can be based on t- or F-statistics for one- and two-factor designs, and permutation procedures are available to estimate adjusted p-values. | |
| pamr | Some functions for sample classification in microarrays | |
| ROC | Receiver Operating Characteristic (ROC) approach for identifying genes that are differentially expressed in two types of samples. | |
| siggenes | Identifying differentially expressed genes and estimating the False Discovery Rate with both the Significance Analysis of Microarrays and the Empirical Bayes Analyses of Microarrays | |
| splicegear | A set of tools to work with alternative splicing | |
| Annotation | ||
| annotate | Associate experimental data in real time to biological metadata from web databases such as GenBank, LocusLink and PubMed. Process and store query results. Generate HTML reports of analyses. | |
| AnnBuilder | Assemble and process genomic annotation data, from databases such as GenBank, the Gene Ontology Consortium, LocusLink, UniGene, the UCSC Human Genome Project. | |
| Data packages | XML and R annotation data packages, providing mappings between different probe identifiers (e.g. Affy IDs, LocusLink, PubMed) | |
| Resourcerer | This package allows the user either to read an annotation data file from TIGR Resourcerer as a matrix or to convert the file into a Bioconductor annotation data package using the AnnBuilder package. | |
| SNPtools | Provides bindings to the XML RPC services of chip.org::SNPper | |
| Database Interaction | ||
| Rdbi | Generic framework for database access in R | |
| RdbiPgSQL | Provides methods for accessing data stored in PostgreSQL | |
| SAGElyzer | Locates genes based on SAGE tags | |
| Graphics & User Interface | ||
| affylmGUI | A Graphical User Interface for affy analysis using the limma Microarray package by Gordon Smyth. | |
| geneplotter | Graphical tools for genomic data, for example for plotting expression data along a chromosome or producing color images of expression data matrices. | |
| hexbin | Binning functions, in particular hexagonal bins for graphing. | |
| limmaGUI | A Graphical User Interface for the limma Microarray package | |
| tkWidgets | Widgets in Tcl/Tk that provide functionality for Bioconductor packages. | |
| webbioc | An integrated web interface for doing microarray analysis using several of the Bioconductor packages. It is intended to be deployed as a centralized bioinformatics resource for use by many users. (Currently only Affymetrix oligonucleotide analysis is supported.) | |
| widgetTools | Tools for creating Tcl/Tk widgets, i.e., small-scale graphical user interfaces. | |
| Graphs | ||
| graph | Classes and tools for creating and manipulating graphs within R. | |
| RBGL | A package that creates an interface between the graph package and the Boost graph libraries, allowing for fast manipulation of graph objects in R. | |
| Rgraphviz | Provides an interface with Graphviz for plotting graph objects in R. | |
| SNAData | Data from Wasserman & Faust (1999) "Social Network Analysis" | |
| Proteomics | ||
| gpls | Classification using generalized partial least squares for two-group and multi-group (more than 2 group) classification. | |
| PROcess | A package for processing protein mass spectrometry data. | |
| apComplex | This package contains functions to estimate a bipartite graph representing protein complex membership using data from AP-MS technology. | |
| arrayCGH | ||
| DNAcopy | Segments DNA copy number data using circular binary segmentation to detect regions with abnormal copy number. | |
| aCGH | Functions for reading aCGH data from image analysis output files and clone information files, creation of aCGH S3 objects for storing these data. Basic methods for accessing/replacing, subsetting, printing and plotting aCGH objects. | |
| GLAD | Analysis of array CGH data: detection of breakpoints in genomic profiles and assignment of a status (gain, normal or lost) to each chromosomal region identified. | |
| Pre-processing | ||
| affy, affycomp, affydata, affypdnn, affyPLM, gcrma, makecdfenv | Diagnostic plots, expression measures, and normalization for Affymetrix chip data. | |
| annaffy | Functions for handling data from Bioconductor Affymetrix annotation data packages. Produces compact HTML and text reports including experimental data and URL links to many online databases. | |
| marray | Diagnostic plots and normalization for cDNA microarray data. | |
| matchprobes | Tools for sequence matching of probes on arrays | |
| vsn | Calibration and variance stabilizing transformations for both Affymetrix and cDNA array data. | |
| Ontologies | ||
| GOstats | A set of tools for interacting with GO and microarray data. A variety of basic manipulation tools for graphs, hypothesis testing and other simple calculations. | |
| goTools | Wrapper functions for description/comparison of oligo ID lists using the Gene Ontology database | |
| ontoTools | Tools for working with ontologies | |
Main Features of Bioconductor
- Use of R:
- Documentation and reproducible research:
- Statistical and graphical methods.:
- Annotation:
- Graphical user interface:
- Bioconductor short courses:
- Open source:
- Open development:
Use of R
R and the R package system are the main vehicles for designing and releasing software. R (www.r-project.org) is a widely used open source language and environment for statistical computing and graphics - GNU's S-Plus. It provides a high-level programming environment together with a sophisticated packaging and testing paradigm. It has a number of mechanisms that allow it to interact directly with software that has been written in many different languages (see Omega Project). These tools allow users to incorporate modules based on other work. Viewed in that context, adopting R as a vehicle does not exclude other development environments and paradigms. R can, in those cases, provide a glue or connectivity linking what might otherwise be different products. Finally, R is under very active development by a dedicated team of researchers with a strong commitment to good documentation and software design.
Documentation and reproducible research
One of the goals of the project is to provide high-quality documentation and encourage reproducible research.
Each package contains at least one vignette, which is a document that provides a textual, task-oriented description of the package's functionality and that can be used interactively. Package vignettes come in several forms. Many are simple "HowTo"s, that is, they are designed to demonstrate how a particular task can be accomplished with that package's software. Others provide a more thorough overview of the package, or might even discuss general issues related to the package. In the future, we are looking towards providing vignettes that are not specifically tied to a package, but rather demonstrate more complex concepts. As with all aspects of the Bioconductor project, users are encouraged to participate in this effort.
The vignettes are generated using the Sweave function from the R package tools. They are documents that intermix text, code, and output (textual and graphical) and can be regenerated automatically whenever the data or analyses change. Additional supporting software for vignettes will aid users in obtaining data and sample code, stepping through specific analyses, and applying these analyses to their own data. Vignette sources are in the inst/doc directory of the packages.
The tkWidgets package provides functions and widgets for viewing and testing vignette code chunks interactively (e.g. vExplorer).
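As a rough illustration of how vignettes can be located from within R, the sketch below lists the vignettes shipped with the packages installed on your system. It assumes an R version that provides the vignette function in the utils package; the names vigs and vig_table are only illustrative.

```r
# List the vignettes found in all installed packages.
# vignette() called without arguments returns an object whose
# 'results' component is a character matrix, one row per vignette.
vigs <- vignette()
vig_table <- vigs$results[, c("Package", "Item", "Title"), drop = FALSE]

# Show the first few entries (package, vignette name, title).
print(head(vig_table))

# To open a particular vignette, pass its name and its package,
# for example (commented out because it launches a viewer):
# vignette("grid", package = "grid")
```

The list you see depends entirely on which packages are installed, so it may be short on a fresh R installation.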
Statistical and graphical methods
The Bioconductor project aims to provide access to a wide range of powerful statistical and graphical methods for the analysis of genomic data. Analysis packages are available for: pre-processing Affymetrix and cDNA array data; identifying differentially expressed genes; graph theoretical analyses; plotting genomic data. In addition, the R package system itself provides implementations for a broad range of state-of-the-art statistical and graphical techniques, including linear and non-linear modeling, cluster analysis, prediction, resampling, survival analysis, and time-series analysis.
Annotation
The Bioconductor project provides software for associating microarray and other genomic data in real time to biological metadata from web databases such as GenBank, LocusLink and PubMed (annotate package). Functions are also provided for incorporating the results of statistical analysis in HTML reports with links to annotation WWW resources.
Software tools are available for assembling and processing genomic annotation data, from databases such as GenBank, the Gene Ontology Consortium, LocusLink, UniGene, the UCSC Human Genome Project (AnnBuilder package).
Data packages are distributed to provide mappings between different probe identifiers (e.g. Affy IDs, LocusLink, PubMed). Customized annotation libraries can also be assembled.
Graphical user interface
Perhaps the largest problem with using a language such as R is that first time users can be discouraged by the complexity of the language. Another major focus of the project is therefore to provide some form of graphical user interface (GUI) for selected tasks. The approach is programmatic to a large extent, thereby allowing any package developer to use these tools to provide a graphical user interface for their package.
To overcome this barrier to entry we are designing a widget mechanism to provide interactive access to much of the Bioconductor functionality. A widget can be thought of as a small-scale GUI. It builds on the tcltk package which provides an interface and language bindings to Tcl/Tk GUI elements in R. The tkWidgets package was used to generate widgets for file browsing and data input in the affy and marrayInput packages. Many more extensions are planned for the near future.
Bioconductor short courses
The Bioconductor project has developed a program of short courses on software and statistical methods for the analysis of genomic data. Courses have been given for audiences with backgrounds in either biology or statistics. All course materials (lectures and computer labs) are available on the WWW. Customized short courses may also be designed for interested parties.
Open source
Bioconductor has a commitment to full open source discipline, with distribution via a SourceForge-like platform. All contributions are expected to exist under an open source license such as GPL2 or BSD. There are many different reasons why open-source software is beneficial to the analysis of microarray data and to computational biology in general. The reasons include:
- full access to algorithms and their implementation
- the ability to fix bugs and extend and improve the supplied software
- to encourage good scientific computing and statistical practice by providing appropriate tools and instruction
- to provide a workbench of tools that allow researchers to explore and expand the methods used to analyze biological data
- to ensure that the international scientific community is the owner of the software tools needed to carry out research
- to lead and encourage commercial support and development of those tools that are successful
- to promote reproducible research by providing open and accessible tools with which to carry out that research [reproducible research is distinct from independent verification]
Open development
Bioconductor is a collaborative research effort. Users are encouraged to become developers by contributing either Bioconductor compliant packages or documentation.
Additionally Bioconductor will provide a mechanism for linking together different groups with common goals. Hopefully this will foster collaboration on software, possibly at the level of shared development. But at the least it will provide information on which projects are already under development.
What is the current version of Bioconductor?
The current released version is announced on the Bioconductor home page. A Bioconductor 'release' is a snapshot of (almost) all Bioconductor packages. The release is meant to be as bug-free and stable as possible. A Bioconductor release coincides with each major R release, and occurs approximately every six months. Each release incorporates major changes to existing packages made since the previous release, and includes new contributed packages.
How can Bioconductor be obtained?
Sources, binaries, and documentation for released and development versions of Bioconductor packages can be downloaded from the Bioconductor website. Additional packages may be downloaded from the "Comprehensive R Archive Network" (CRAN).
- For Unix/Linux:
- For Windows:
- For Raqua:
- biocLite:
- Downloading An Entire Repository:
- Available Repositories:
- Other Notes:
For Unix/Linux
Installing R
- Download the most recent version of R (currently 2.0.1) from CRAN by following the links to the appropriate distribution for your system or by downloading the source code R-2.0.1.tgz. The link to "FAQs" under the "Documentation" section on the R website provides detailed information on how to download and install R for different systems.
- Start the R program by typing "R" at the shell prompt. To start the R help browser type "help.start()" . For help on any function, e.g. the "mean" function, type "? mean".
Installing Bioconductor packages using biocLite.R
- Users are encouraged to use biocLite.R to obtain, install and update their packages. Information on how to use this function can be found in the biocLite section of this document.
Installing Bioconductor packages using R INSTALL
- Alternatively, you may download and install Bioconductor packages like any other R add-on packages using the command "R CMD INSTALL /path/to/pkg_version.tar.gz" at the shell prompt. Instructions for doing so are given in the R FAQ on R add-on packages.
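The same installation can also be started from inside an R session. The following is a minimal sketch, assuming the package tarball has already been downloaded; the helper name install_local_source is hypothetical, and the path in the commented call is the same placeholder used above.

```r
# Hypothetical helper: install a source package from a local tarball.
# install.packages() with repos = NULL treats 'tarball' as a file path
# rather than as a package name to look up in a repository.
install_local_source <- function(tarball, lib = .libPaths()[1]) {
  if (!file.exists(tarball)) {
    stop("no such file: ", tarball)
  }
  install.packages(tarball, lib = lib, repos = NULL, type = "source")
}

# Example call (commented out because the path is only a placeholder):
# install_local_source("/path/to/pkg_version.tar.gz")
```

Checking that the file exists first gives a clearer error message than letting the installation fail part-way through.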
For Windows
Installing R
- Download the most recent version of R (currently 2.0.1) from CRAN by following the links to "Windows (95 and later)" and then "base". Consult the file "ReadMe.rwxxxx" for detailed instructions.
- Save the file "SetupR.exe" on your desktop, double click on the icon and follow the installation instructions. This file contains all the R components, and you can select what you want installed.
- Start the R program by double clicking on "Rgui.exe". To start the R help browser type "help.start()" or use the menu. For help on any function, e.g. the "mean" function, type "? mean".
Installing Bioconductor packages using the installation script biocLite.R
- Users are encouraged to use biocLite.R to obtain, install and update their packages. Information on how to use these functions can be found in the biocLite section of this document.
Installing Bioconductor packages using install.packages
- Alternatively, you may download and install Bioconductor packages like any other R add-on packages. Instructions for doing so are given in the R for Windows FAQ on Packages. This involves downloading the pre-compiled Windows versions of the packages as .zip files and using the "install.packages" function or the menu option "Install package from local zip file ..." under "Packages".
Using Raqua
Go to the R menu and select Packages & Data -> Package Installer. In that window, choose Bioconductor (binaries) and click Get List. Choose the packages that you wish to install and click Install/Update.
Using biocLite to obtain Bioconductor packages
- biocLite is the simplest and fastest way for users to get started with Bioconductor. biocLite requires an R version >= 2.1.0. From your R session, type:
    source("http://bioconductor.org/biocLite.R")
  This will download the biocLite functionality into your R session.
- To install Bioconductor packages, use the function "biocLite" by typing:
    biocLite()
  biocLite installs a core subset of packages. A list of these packages can be obtained with the R commands (the first line needs to be entered only once per session):
    source("http://bioconductor.org/biocLite.R")
    biocinstallPkgGroups("lite")
- Arguments:
  - destdir: the directory where the downloaded packages will be stored.
  - lib: character vector giving the library directories under which packages may be installed. Recycled as needed.
  - pkgs: character vector of Bioconductor packages to install.
- The script will try to install the downloaded packages and print "Installation complete" and TRUE on the screen when the installation was successful.
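Before calling biocLite it can be useful to check which packages are actually missing. The sketch below is one possible way to do that; the package names are taken from the table of packages earlier in this document, and the pkgs argument is the one described above. The code only calls biocLite if it has already been loaded into the session with the source command shown above, and otherwise just prints a reminder.

```r
# Packages we would like to have available.
wanted <- c("Biobase", "limma", "genefilter")

# Work out which of them are not yet installed in any library.
installed <- rownames(installed.packages())
to_install <- setdiff(wanted, installed)

if (length(to_install) == 0) {
  message("All requested packages are already installed.")
} else if (exists("biocLite", mode = "function")) {
  # biocLite is only defined after biocLite.R has been sourced.
  biocLite(pkgs = to_install)
} else {
  message("Source biocLite.R first, then install: ",
          paste(to_install, collapse = ", "))
}
```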
Downloading All Packages From A Repository
To download all of the packages from a repository into a directory, use the download.packages2 function. To do this, select the repository you wish to download the packages from (using repositories, for instance) and run download.packages2(repEntry=REP, destDir=DIR), where REP is your ReposEntry object and DIR is the directory you'd like them downloaded to (e.g. ".").
Repositories Currently Available
These are the default repositories that come built into biocLite. The actual URLs might vary depending on your use of mirrors.
- /CRANrepository: A reposTools style repository consisting of the CRAN packages.
- /packages/bioc/stable/src/contrib/Source: A repository consisting of the source packages for BioC 1.5
- /packages/bioc/stable/src/contrib/Win32: A repository consisting of the windows packages for BioC 1.5
- /repository/devel/package/Source: A repository consisting of the current source versions of the developmental packages
- /repository/devel/package/Win32: A repository consisting of the current windows versions of the developmental packages
- /data/metaData: This repository contains the Bioconductor annotation packages
- /data/metaData-devel: This repository contains the developmental versions of the Bioconductor annotation packages.
- /data/experimental/repos: A repository for the Bioconductor experimental data packages.
- /repository/Courses: This repository contains the packages for various short courses from Bioconductor.
- /data/cdfenvs/repos: A repository for the Bioconductor CDF data packages.
- /data/probes/Packages: A repository for the Bioconductor probeset packages.
- /repository/Omegahat: A repository for various Omegahat (www.omegahat.org) packages.
- /repository/lindsey: A repository of Jim Lindsey's packages
Other Notes
Some users with proxies and firewalls might have difficulties in downloading packages properly with these functions. This is covered in the R FAQ question "The Internet Download Functions Fail". If you have further difficulty with network-enabled R functions, this is best brought up on the R-help mailing list.
There are some Bioconductor packages which require special libraries to be installed first in order to work correctly. The current listing of these packages:
- limmaGUI and affylmGUI: These require some specific Tcl/Tk setup on your machine. Please see the WEHI page for more specific instructions.
- Rgraphviz: Requires the Graphviz libraries. Users of Rgraphviz need Graphviz version 1.12 or later, which is available at the Graphviz download site. Note that an odd version number of Graphviz implies a developmental/unstable version, which may not work properly with the current version of Rgraphviz. It is recommended that users download and build Graphviz from source. However, if you choose to use RPMs or other package tools, please make sure to download both the primary Graphviz package and the developer version (e.g. if using RPMs, download the graphviz and graphviz-devel RPMs). Also, please note that Rgraphviz currently does not work on Windows systems.
Without proper installation of the required libraries, these packages will fail to install on your system.
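The Graphviz version rule described above can be written down as a small check. This is only a sketch: the function name graphviz_version_ok is hypothetical, and it assumes that "odd version number" refers to the second (minor) component of the version string, which the text does not state explicitly.

```r
# Hypothetical helper: TRUE if a Graphviz version string is at least
# 'minimum' and its minor number is even (assumed to mean "stable").
graphviz_version_ok <- function(version, minimum = "1.12") {
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  if (length(parts) < 2 || any(is.na(parts))) {
    stop("expected a version string such as \"1.12\"")
  }
  # numeric_version compares component by component, so 1.12 > 1.9.
  new_enough <- numeric_version(version) >= minimum
  stable <- parts[2] %% 2 == 0
  new_enough && stable
}

print(graphviz_version_ok("1.12"))
print(graphviz_version_ok("1.13"))
```

The version string itself has to come from your Graphviz installation, for instance from the output of its command-line tools.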
What documentation exists for Bioconductor?
Extensive documentation for R and Bioconductor is available on the WWW.
- Online help. Online documentation for most of the functions and variables in Bioconductor packages exists, and can be printed on-screen by typing help(name) (or ?name) at the R prompt, where name is the name of the topic help is sought for. (In the case of unary and binary operators and control-flow special forms, the name may need to be quoted.) This documentation can also be made available as one reference manual for on-line reading in HTML and PDF formats, and as hardcopy via LaTeX.
- Vignettes. Each Bioconductor package contains at least one vignette, which is a document that provides a textual, task-oriented description of the package's functionality and that can be used interactively. Package vignettes come in several forms and are discussed in greater detail in "Documentation and reproducible research".
- Short courses. Lectures and computer labs from Bioconductor short courses are available.
- Publications. Articles and technical reports providing in depth discussions of Bioconductor software are available in the "Publications" section of this website.
For general documentation on R consult the "What documentation exists for R?" section of this FAQ.
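The online help calls mentioned above look as follows in practice. This is a small sketch using functions from base R as stand-ins, since any installed Bioconductor function would be looked up in the same way; the variable names are illustrative.

```r
# Look up help for an ordinary function; help("mean") and ?mean
# refer to the same topic.
h_mean <- help("mean")

# Operators and control-flow keywords need to be quoted.
h_plus <- help("+")
h_if <- help("if")

# At the prompt, typing help("mean") displays the page directly.
# Here we only show where the matching help files were found.
print(as.character(h_mean))
```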
Citing Bioconductor
At this point, to cite Bioconductor in publications, please use the following article,
@Article{BIOC,
  author = {Robert C Gentleman and Vincent J. Carey and Douglas M. Bates and
            Ben Bolstad and Marcel Dettling and Sandrine Dudoit and Byron Ellis and
            Laurent Gautier and Yongchao Ge and Jeff Gentry and Kurt Hornik and
            Torsten Hothorn and Wolfgang Huber and Stefano Iacus and Rafael Irizarry and
            Friedrich Leisch and Cheng Li and Martin Maechler and Anthony J. Rossini and
            Gunther Sawitzki and Colin Smith and Gordon Smyth and Luke Tierney and
            Jean Y. H. Yang and Jianhua Zhang},
  title = {Bioconductor: Open software development for computational biology and bioinformatics},
  journal = {Genome Biology},
  volume = {5},
  year = {2004},
  pages = {R80},
  url = {http://genomebiology.com/2004/5/10/R80}
}
What mailing lists exist for Bioconductor?
Thanks to Martin Maechler, there is a mailing list devoted to Bioconductor.
bioconductor: This list is for general discussion of issues, libraries, and ideas related to functional genomics.
Information about this list, how to subscribe, how to post and how to access the archives is available at the Bioconductor mailing list page.
It is recommended that you send mail to bioconductor rather than to any particular member of the team. Core developers are all subscribed to the list, of course.
Of course, in the case of bug reports it would be very helpful to have code which reliably reproduces the problem. Also, make sure that you include information on the system and the versions of R and Bioconductor being used. See Bioconductor Bugs for more details.
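One way to collect the system and version information requested above is sketched below. It assumes an R version that provides sessionInfo and packageVersion in the utils package; the package name "stats" is only a stand-in for whichever Bioconductor package the report concerns.

```r
# Version of R itself.
print(R.version.string)

# Platform, locale, and the attached packages with their versions.
info <- sessionInfo()
print(info)

# Version of one specific installed package ("stats" is a stand-in).
pkg_version <- as.character(packageVersion("stats"))
print(pkg_version)
```

Pasting this output into a message to the list, together with the code that triggers the problem, usually saves a round of follow-up questions.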
R and Bioconductor
- What is R?:
- What documentation exists for R?:
- What is CRAN?:
- How can I create a Bioconductor compliant package?:
- How can I contribute to Bioconductor?:
What is R?
R (www.r-project.org) is a widely used open source language and environment for statistical computing and graphics. It is available for Linux, Unix, Windows, and Macintosh computers. More information on R is available in the "R Basics" section of the R FAQ.
Node:What documentation exists for R?, Next:What is CRAN?, Previous:What is R?, Up:R and Bioconductor
What documentation exists for R?
Extensive documentation for R is available on the WWW. Resources include
R Help Pages
R FAQ
R Manuals
R Contributed Documentation
For more detail, consult the documentation section of the R FAQ.
Node:What is CRAN?, Next:How can I create an Bioconductor compliant package?, Previous:What documentation exists for R?, Up:R and Bioconductor
What is CRAN?
The "Comprehensive R Archive Network" (CRAN) is a collection of sites which carry identical material, consisting of the R distribution(s), the contributed extensions, documentation for R, and binaries.
The CRAN master site at TU Wien, Austria, can be found at the URL
http://cran.r-project.org/
and is currently being mirrored daily at
http://cran.at.r-project.org/ (TU Wien, Austria)
http://cran.au.r-project.org/ (PlanetMirror, Australia)
http://cran.ch.r-project.org/ (ETH Zürich, Switzerland)
http://cran.dk.r-project.org/ (SunSITE, Denmark)
http://cran.hu.r-project.org/ (Semmelweis U, Hungary)
http://cran.uk.r-project.org/ (U of Bristol, United Kingdom)
http://cran.us.r-project.org/ (U of Wisconsin, USA)
http://cran.za.r-project.org/ (Rhodes U, South Africa)
Please use the CRAN site closest to you to reduce network load.
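One way to make R use a nearby mirror is to set the repos option before installing packages. This is a minimal sketch; the mirror URL is simply one entry from the list above and may not suit your location or still be in service.

```r
# Point the CRAN entry of the "repos" option at a chosen mirror.
# options() returns the previous settings, so they can be restored later.
old <- options(repos = c(CRAN = "http://cran.us.r-project.org/"))

# Confirm which mirror later install.packages() calls would use.
chosen <- getOption("repos")[["CRAN"]]
print(chosen)

# To go back to the earlier setting:
# options(old)
```

In an interactive session, chooseCRANmirror() offers a menu of mirrors instead of a hard-coded URL.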
From CRAN, you can obtain the latest official release of R, daily snapshots of R (copies of the current CVS trees), as gzipped and bzipped tar files, a wealth of additional contributed code, as well as prebuilt binaries for various operating systems (Linux, Digital Unix, and MS Windows). CRAN also provides access to documentation on R, existing mailing lists and the R Bug Tracking system.
Node:How can I create an Bioconductor compliant package?, Next:How can I contribute to Bioconductor?, Previous:What is CRAN?, Up:R and Bioconductor
How can I create a Bioconductor compliant package?
Software contributions to Bioconductor should be in the form of standard R packages. Guidelines on creating your own R packages are provided in the "Writing R Extensions" manual. Packages must pass R CMD check without warnings or errors, and they must work with the current version of R (which will be no earlier than R 1.5).
In addition, each package should contain a directory
inst/doc that includes LaTeX documentation. The
documentation here is intended to describe the overall
functionality of your package. Note that this is separate from and
in addition to the standard R help files documenting individual
functions.
Details on Bioconductor's approach to documentation are given in "Documentation and reproducible research". In addition, we recommend that you consult released Bioconductor packages for examples.
We encourage the use of the Bioconductor classes and methods, especially those in the Biobase and annotate packages.
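The layout of a standard R package can be sketched with package.skeleton() from the utils package. This is only an illustration under stated assumptions: the function rowSpread and the package name myBiocPkg are hypothetical, and the generated stubs still have to be filled in by hand before the package would pass R CMD check.

```r
# A hypothetical function that the package will contain.
rowSpread <- function(x) apply(x, 1, function(r) max(r) - min(r))

# Work in a temporary directory so nothing is written to the current one.
pkg_parent <- tempfile("pkgdemo")
dir.create(pkg_parent)

# package.skeleton() writes a DESCRIPTION file plus R/ and man/ stubs.
utils::package.skeleton(name = "myBiocPkg", list = "rowSpread",
                        environment = environment(), path = pkg_parent)

# Add the directory for package-level documentation; R installs
# vignettes from inst/doc.
pkg_dir <- file.path(pkg_parent, "myBiocPkg")
doc_dir <- file.path(pkg_dir, "inst", "doc")
dir.create(doc_dir, recursive = TRUE)

# Show what was generated.
print(list.files(pkg_dir))

# After editing the stubs, run the check from a shell:
#   R CMD check myBiocPkg
```

The help-file stubs under man/ cover individual functions; the overall package documentation goes in the documentation directory created at the end.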
Node:How can I contribute to Bioconductor?, Previous:How can I create an Bioconductor compliant package?, Up:R and Bioconductor
How can I contribute to Bioconductor?
Bioconductor is in active development and there is always a risk of bugs creeping in. Also, the developers do not have access to all possible machines capable of running Bioconductor. So, simply using it and communicating problems is certainly of great value.
The Bioconductor Development page acts as an intermediate repository for more or less finalized ideas and plans for Bioconductor packages. It contains (pointers to) TODO lists, RFCs, various other write-ups, ideas lists, and CVS miscellanea.
Ideally, the development page will provide details and contact information for each major initiative. Please feel free to contact developers if you want to contribute to, or simply discuss, a particular project. If you are interested in starting new projects under this umbrella, please notify Robert Gentleman.
Node:Bioconductor Bugs, Next:Acknowledgments, Previous:R and Bioconductor, Up:Top
Bioconductor Bugs
Node:What is a bug?, Previous:Bioconductor Bugs, Up:Bioconductor Bugs
What is a bug?
Taking forever to complete a command can be a bug, but you must make certain that it was really Bioconductor's fault. Some commands simply take a long time. If the input was such that you know it should have been processed quickly, report a bug. If you don't know whether the command should take a long time, find out by looking in the manual or by asking for assistance.
For example, suppose that on a data set which you know to be quite large the command
R> genefilter(exprSet1, filterfun1)
never returns. Do not report that genefilter()
fails for large data sets. Try to see what the real problem is. See
the R FAQ for help on deciding whether the bug is an R bug or a
Bioconductor bug.
It is very useful to try and find simple examples that produce apparently the same bug, and somewhat useful to find simple examples that might be expected to produce the bug but actually do not. If you want to debug the problem and find exactly what caused it, that is wonderful. You should still report the facts as well as any explanations or solutions. Please include an example that reproduces the problem, preferably the simplest one you have found.
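One way to look for a simple example is to time the same step on simulated inputs of increasing size, which shows whether the running time merely grows with the data or whether the command really hangs. The sketch below uses base R only; slow_filter is a hypothetical stand-in for the slow step and is not the genefilter interface.

```r
# Hypothetical stand-in for the step under suspicion: keep the rows
# whose standard deviation exceeds 1.
slow_filter <- function(mat) {
  keep <- apply(mat, 1, function(row) sd(row) > 1)
  mat[keep, , drop = FALSE]
}

# A fixed seed lets other people regenerate exactly the same data.
set.seed(1)

# Time the step on progressively larger simulated matrices.
sizes <- c(100, 1000, 10000)
timings <- sapply(sizes, function(n) {
  mat <- matrix(rnorm(n * 10), nrow = n)
  system.time(slow_filter(mat))[["elapsed"]]
})

# Elapsed seconds per input size; include such a table in the report.
print(data.frame(rows = sizes, seconds = timings))
```

If the smallest input already fails to return, that input plus the seed makes a compact example to send along with the report.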
Invoking R with the --vanilla option may help in
isolating a bug. This ensures that the site profile and saved data
files are not read.
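The effect of --vanilla can be checked from within R by launching a second, clean R process. This is a sketch that assumes the Rscript front end shipped with the running R installation is available; the expression passed to it is only an illustration.

```r
# Locate the Rscript executable that belongs to the running R.
rscript <- file.path(R.home("bin"), "Rscript")

# Start a fresh process with --vanilla and count the objects in its
# workspace. No profile or saved workspace is read, so none are loaded.
out <- system2(rscript,
               c("--vanilla", "-e", shQuote("cat(length(ls()))")),
               stdout = TRUE)
print(out)
```

Replacing the expression with the code that triggers the suspected bug shows whether the problem persists without any user profile or saved data.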
Bug reports on packages should be sent to the package maintainer rather than to r-bugs.
Node:Acknowledgments, Previous:Bioconductor Bugs, Up:Top
Acknowledgments
Thanks to the Dana-Farber Cancer Institute for supporting the initial development of the Bioconductor project.
Thanks to the developers of R for providing us with a system in which to work (and to John Chambers for providing the precursor to R).
Special thanks go to the members of Bioconductor who have helped improve this FAQ, and especially to Kurt Hornik for providing the template.