Organizing Code in Functions, Files, and Packages

Martin Morgan
October 29, 2014


R is a programming language

From script to package

Scripts contain a combination of data and transformations. Often the transformations are idiosyncratic and rely heavily on functions provided by various packages. Sometimes a script contains a useful chunk of code that could be reused in different places. Examples we've encountered in this course include computing the GC content of DNA sequences (or is there a function for that already? Check out the Biostrings package!) and creating a simple 'map' from one type of annotation to another.

It is easy and beneficial to create a package.

What is an R package?

Check out the RStudio package wizard!



Write a function that takes a DNAStringSet and returns the GC content.

Modify the function using a conditional statement to work whether provided a DNAString or a DNAStringSet. Test the function.
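One possible sketch of such a function is shown here. It assumes the Biostrings package is installed and relies on its letterFrequency() function; the function name gcContent is a hypothetical choice for illustration, not something defined elsewhere in the course.

```r
library(Biostrings)

## Hypothetical helper: fraction of G or C letters in each sequence.
gcContent <- function(x) {
    ## Conditional: promote a single DNAString to a length-1 DNAStringSet
    if (is(x, "DNAString")) {
        x <- DNAStringSet(x)
    } else if (!is(x, "DNAStringSet")) {
        stop("'x' must be a DNAString or DNAStringSet")
    }
    ## letterFrequency() returns a one-column matrix ("G|C");
    ## as.prob = TRUE divides the counts by sequence width
    gc <- letterFrequency(x, "GC", as.prob = TRUE)
    ## Return a plain numeric vector, keeping sequence names if present
    setNames(as.vector(gc), names(x))
}

## Quick checks on toy sequences
dna <- DNAStringSet(c("GGCC", "AATT", "ACGT"))
gcContent(dna)
gcContent(DNAString("ACGT"))
```

Converting the DNAString up front keeps a single code path for the calculation, so both input types return the same kind of value (a numeric vector).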

Save the function in a file on your AMI.


Write a function that takes as its argument Ensembl gene identifiers (like the rownames() of the SummarizedExperiment object in the RNASeq vignette yesterday) and uses the select() method and an annotation package to return a named character vector, where the names of the vector are the Ensembl identifiers and the values are the corresponding gene SYMBOLs. Adopt some simple-to-implement policy for handling Ensembl identifiers that map to more than one gene symbol. Save this function to another file.
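A minimal sketch of one way to approach this follows. It assumes human data and the org.Hs.eg.db annotation package (the exercise does not name a specific package), and the function name ensemblToSymbol is hypothetical. The policy adopted here for identifiers that map to several symbols is simply to keep the first symbol returned by select().

```r
library(org.Hs.eg.db)   # assumption: human genes; swap in another org.* package as needed

## Hypothetical helper: map Ensembl gene ids to gene symbols.
ensemblToSymbol <- function(ids, db = org.Hs.eg.db) {
    ids <- as.character(ids)
    ## One row per (ENSEMBL, SYMBOL) pair; unmapped ids get SYMBOL NA
    map <- AnnotationDbi::select(db, keys = unique(ids),
                                 columns = "SYMBOL", keytype = "ENSEMBL")
    ## Policy: keep only the first symbol reported for each Ensembl id
    map <- map[!duplicated(map$ENSEMBL), ]
    ## Re-order to match the input and name the result by Ensembl id
    setNames(map$SYMBOL[match(ids, map$ENSEMBL)], ids)
}

symbols <- ensemblToSymbol(c("ENSG00000141510", "ENSG00000012048"))
symbols
```

Other policies are equally easy to implement, for example dropping identifiers with more than one symbol or pasting the symbols together; keeping the first match has the advantage that the result always has the same length and order as the input.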

A package

Use the RStudio wizard to create a package from the files containing your GC-content and annotation-helper functions.
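If the wizard is not available, a similar starting point can be produced from the R prompt with package.skeleton() from the utils package. The sketch below is only an illustration: the package name CourseHelpers, the file names, and the placeholder function bodies are all hypothetical, and the files are written to a temporary directory so the example is self-contained. In practice you would point code_files at the files you saved earlier.

```r
## Stand-ins for the two files saved in the earlier exercises
srcDir <- file.path(tempdir(), "course-src")
dir.create(srcDir, showWarnings = FALSE)
gcFile <- file.path(srcDir, "gcContent.R")
mapFile <- file.path(srcDir, "ensemblToSymbol.R")
writeLines('gcContent <- function(x) stop("replace with your GC-content function")',
           gcFile)
writeLines('ensemblToSymbol <- function(ids) stop("replace with your mapping function")',
           mapFile)

## Create the package skeleton around those files
outDir <- file.path(tempdir(), "course-pkg")
dir.create(outDir, showWarnings = FALSE)
package.skeleton(name = "CourseHelpers", code_files = c(gcFile, mapFile),
                 path = outDir, force = TRUE)

## Inspect what was created: DESCRIPTION, NAMESPACE, R/ and man/
pkgDir <- file.path(outDir, "CourseHelpers")
list.files(pkgDir, recursive = TRUE)
```

The generated DESCRIPTION and man/ pages are templates that still need editing by hand before the package can be built and checked.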