The Bioconductor project promotes high-quality, well-documented, and interoperable software. These guidelines help to achieve this objective; they are not meant to put undue burden on package authors, and authors having difficulty satisfying the guidelines should seek advice on the bioc-devel mailing list.
Package maintainers are urged to follow these guidelines as closely as possible when developing Bioconductor packages.
General instructions for producing packages can be found in the Writing R Extensions manual, available from within R (RShowDoc("R-exts")) or on the R web site.
Most packages contributed by users are software packages that perform analytic calculations. Users also contribute annotation and experiment data packages. Annotation packages are database-like packages that provide information linking identifiers (e.g., Entrez gene names or Affymetrix probe ids) to other information (e.g., chromosomal location, Gene Ontology category). Experiment data packages provide data sets that are used, often by software packages, to illustrate particular analyses. An excellent practice is to develop a software package, and to provide or use an existing experiment data package to give a comprehensive illustration of the methods in the software package. The guidelines below apply to all packages, but annotation and experiment data packages are not required to conform to the space limitations of software packages. Developers wishing to contribute annotation or experiment data packages should seek additional support associated with package submission.
Bioconductor packages must pass R CMD build (or R CMD INSTALL --build) and pass R CMD check with no errors and no warnings using a recent R-devel.
Authors should also try to address all notes that arise during build or check.
Do not use filenames that differ only in case, as not all file systems are case sensitive.
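One way to spot such collisions before submission is to compare lower-cased file names. The following is a minimal sketch; the helper name find_case_collisions and the example file names are made up for illustration.

```r
# Hypothetical helper: return the file names that differ only in case.
find_case_collisions <- function(paths) {
    lowered <- tolower(paths)
    # A name collides if its lower-cased form occurs more than once
    collides <- lowered %in% lowered[duplicated(lowered)]
    paths[collides]
}

# Example on a made-up listing; for a real package one might pass
# list.files("MyPackage", recursive = TRUE) instead.
find_case_collisions(c("R/plot.R", "R/Plot.R", "R/utils.R"))
```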
The source package resulting from running R CMD build should occupy less than 4MB on disk. The package should require less than 5 minutes to run R CMD check --no-rebuild-vignettes. Using the --no-rebuild-vignettes option ensures that the Sweave vignette is built only once.
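The size limit can be checked from within R once the source package has been built. This is a rough sketch: the helper name tarball_within_limit is hypothetical, and treating 1MB as 1024^2 bytes is an assumption of the example rather than something the guidelines state.

```r
# Hypothetical helper: TRUE if the file is smaller than the size limit.
# The default of 4 (MB) follows the guideline; 1 MB = 1024^2 bytes is
# an assumption of this sketch.
tarball_within_limit <- function(tarball, limit_mb = 4) {
    stopifnot(file.exists(tarball))
    size_mb <- file.size(tarball) / 1024^2
    size_mb < limit_mb
}

# Demonstration with a small temporary file standing in for the
# tarball produced by R CMD build
tmp <- tempfile(fileext = ".tar.gz")
writeBin(raw(1024), tmp)
tarball_within_limit(tmp)
unlink(tmp)
```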
Choose a descriptive name. An easy way to check whether the name is already in use is to check that the following command fails:
source("http://bioconductor.org/biocLite.R")
biocLite("MyPackage")
Avoid names that are easily confused with existing package names, or
that imply a temporal (e.g., ExistingPackage2) or qualitative (e.g.,
ExistingPackagePlus) relationship.
The "License:" field in the DESCRIPTION file should preferably refer to a standard license (see opensource.org or Wikipedia) using one of R's standard specifications. Be specific about any version that applies (e.g., GPL-2). Core Bioconductor packages are typically licensed under Artistic-2.0. To specify a non-standard license, include a file named LICENSE in your package (containing the full terms of your license) and use the string "file LICENSE" (without the double quotes) in the "License:" field of your DESCRIPTION file.
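To confirm what a DESCRIPTION file declares, the field can be read back with read.dcf(). The sketch below parses a made-up DESCRIPTION fragment from memory; the package name and version are placeholders, and in practice one would pass the path to the real DESCRIPTION file.

```r
# A made-up DESCRIPTION fragment; the package name is a placeholder.
description_text <- c(
    "Package: MyPackage",
    "Version: 0.99.0",
    "License: Artistic-2.0"
)

# read.dcf() parses the 'Field: value' format used by DESCRIPTION files
con <- textConnection(description_text)
fields <- read.dcf(con, fields = c("Package", "License"))
close(con)

# Extract the License field as a plain character string
license <- fields[[1, "License"]]
license
```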
Packages must
Reuse, rather than re-implement or duplicate, well-tested functionality from other packages. Specify package dependencies in the DESCRIPTION file, listed as follows
Packages should specify the R version on which they depend. This is usually the current development version.
Re-use existing S4 classes and generics where possible. This encourages interoperability and simplifies your own package development. If your data requires a new representation or function, carefully design an S4 class or generic so that other package developers with similar needs will be able to re-use your hard work, and so that users of related packages will be able to seamlessly use your data structures. Do not hesitate to ask on the Bioc-devel mailing list for advice.
We recommend the following structure/layout:
show methods would go in R/show-methods.R. This is not written in stone,
but tends to provide a useful organization. Sometimes a collection of methods
that provide the interface to a class is best put in a SomeClass-accessors.R
file.
A Collate: field in the DESCRIPTION file may be necessary to order class and method definitions appropriately during package installation.
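As an illustration of these recommendations, the sketch below defines a small S4 class, attaches methods to the existing length and show generics, and introduces a new generic only for behaviour that has no existing counterpart. The class name ScoreSet, its slots, and the scores generic are invented for this example and do not come from any Bioconductor package.

```r
library(methods)

# Hypothetical class for illustration: a named set of numeric scores.
setClass("ScoreSet",
    representation(ids = "character", scores = "numeric"),
    validity = function(object) {
        if (length(object@ids) != length(object@scores))
            "'ids' and 'scores' must have the same length"
        else
            TRUE
    })

# Re-use existing generics (length, show) instead of defining new ones
setMethod("length", "ScoreSet", function(x) length(x@ids))

setMethod("show", "ScoreSet", function(object) {
    cat("ScoreSet with", length(object), "scores\n")
})

# Define a new generic only for genuinely new behaviour
setGeneric("scores", function(object, ...) standardGeneric("scores"))
setMethod("scores", "ScoreSet", function(object, ...) {
    setNames(object@scores, object@ids)
})

x <- new("ScoreSet", ids = c("a", "b"), scores = c(0.5, 1.5))
length(x)
scores(x)
```

Following the layout described above, the show method in this sketch would live in R/show-methods.R.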
Many R operations are performed on the whole object, not just the elements of the object (e.g., sum(x), not x[1] + x[2] + ...). In particular, relatively few situations require an explicit for loop.
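A small comparison may make the point concrete. The values in x are arbitrary and chosen only for illustration.

```r
x <- c(1.5, 2.5, 3.0)

# Explicit loop: works, but is verbose
total <- 0
for (xi in x) {
    total <- total + xi
}

# Whole-object equivalent
total_vec <- sum(x)

# Element-wise operations are vectorized as well
squared <- x^2
```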
message() communicates diagnostic messages (e.g., progress during lengthy computations) during code evaluation.
warning() communicates unusual situations handled by your code.
stop() indicates an error condition.
cat() or print() are used only when displaying an object to the user, e.g., in a show method.
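The following sketch shows the first three functions used together. The function safe_log and its behaviour are hypothetical and serve only to illustrate when each kind of condition might be signalled.

```r
# Hypothetical function illustrating each kind of condition.
safe_log <- function(x) {
    if (!is.numeric(x))
        stop("'x' must be numeric")               # error: cannot continue
    bad <- which(x <= 0)
    if (length(bad) > 0L) {
        warning("non-positive values set to NA")  # unusual, but handled
        x[bad] <- NA
    }
    message("taking log of ", length(x), " values")  # diagnostic
    log(x)
}

# Callers can silence diagnostics and warnings when they wish
res <- suppressMessages(suppressWarnings(safe_log(c(1, -1, exp(1)))))
```

Because the diagnostics are signalled as conditions, callers can suppress or handle them with suppressMessages(), suppressWarnings(), or tryCatch(), which is not possible for output written with cat() or print().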
Use dev.new() to start a graphics device if necessary. Avoid using x11()
or X11(), which can only be called on machines that have access to an X
server.
A vignette demonstrates how to accomplish non-trivial tasks embodying the core
functionality of your package. A Sweave vignette is an .Rnw file that contains
LaTeX and chunks of R code. The R code chunk starts with a line <<>>=, and ends
with @. Each chunk is evaluated during R CMD build, prior to LaTeX
compilation. Refer to
Writing package vignettes
for technical details.
A vignette provides reproducibility: the vignette produces the same results as copying the corresponding commands into an R session. It is therefore essential that the vignette embed R code between <<>>= and @; short-cuts (e.g., using a LaTeX verbatim environment, or using the Sweave eval=FALSE flag) undermine the benefit of vignettes.
All packages are expected to have at least one Sweave vignette.
Appropriate citations must be included in help pages (e.g., in the see also section) and vignettes; this aspect of documentation is no different from any scientific endeavor. The file inst/CITATION can be used to specify how a package is to be cited.
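An inst/CITATION file contains R code, typically one or more calls to bibentry() from the utils package. The sketch below shows the general shape; every bibliographic detail in it is a placeholder and not a real reference.

```r
# Sketch of inst/CITATION contents; every bibliographic detail below is
# a placeholder, not a real reference.
entry <- bibentry(
    bibtype = "Article",
    title   = "An example method for analysing example data",
    author  = person("Jane", "Doe"),
    journal = "Journal of Examples",
    year    = "2010"
)
```

Once the package is installed, citation("MyPackage") reports the entries defined in this file (MyPackage again being a placeholder name).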
All Bioconductor packages use an x.y.z version scheme. The following rules apply:
When first submitted to Bioconductor, a package usually has version 0.99.0. For more details, see Version Numbering Standards.
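Version strings in the x.y.z scheme can be parsed and compared in R with package_version(). This short sketch only demonstrates the mechanics of the scheme; the versions other than 0.99.0 are arbitrary examples.

```r
# package_version() parses and compares version strings
v <- package_version("0.99.0")

# Extract the x, y and z components
parts <- unlist(v)

# Comparisons are made component by component, not as text
package_version("0.99.10") > package_version("0.99.9")
v < package_version("1.0.0")
```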
If the package contains C or Fortran code, it should adhere to the standards and methods described in the System and foreign language interfaces section of the Writing R Extensions manual. In particular:
Use of external libraries whose functionality is redundant with libraries already supported is strongly discouraged. In cases where the external library is complex the author may need to supply pre-built binary versions for some platforms.
Authors are strongly discouraged from placing their package into both CRAN and Bioconductor. This avoids burdening the author with extra work and confusing the user.
Acceptance of packages into Bioconductor brings with it ongoing responsibility for package maintenance. These responsibilities include:
All authors mentioned in the package DESCRIPTION file are entitled to modify package source code. Changes to package authorship require consent of all authors.
Source Code & Build Reports
Source code is stored in svn (user: readonly, pass: readonly).
Software packages are built and checked nightly. Build reports: