Introduction to R
========================================================
author: Martin Morgan (mtmorgan@fhcrc.org), Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
date: 24 August 2014

```{r setup, include=FALSE}
options(width=44)
opts_chunk$set(cache=TRUE)
```

Outline
========================================================

Part I

- Vectors (data)
- Functions
- Help!

Part II

- Classes (objects)
- Generics & methods
- Help!

***

Part III

- Packages
- Help!

Part I: Vectors (data)
========================================================

```{r}
1                 # vector of length 1
c(1, 1, 2, 3, 5)  # vector of length 5
```

Part I: Vectors (data)
========================================================

- logical `c(TRUE, FALSE)`, integer, numeric, complex, character `c("A", "beta")`
- list `list(c(TRUE, FALSE), c("A", "beta"))`
- Statistical concepts: `factor`, `NA`

Assignment and names

```{r}
x <- c(1, 1, 2, 3, 5)
y = c(5, 5, 3, 2, 1)
z <- c(Female=12, Male=3)
```

- `=` and `<-` both assign at top level; inside a function call, `=` names an argument

Part I: Vectors (data)
========================================================

Operations

```{r}
x + y       # vectorized
x / 5       # ...recycling
x[c(3, 1)]  # subset
```

Part I: Functions
========================================================

Examples: `c()`, concatenate values; `rnorm()`, generate random normal deviates; `plot()`

```{r}
x <- rnorm(1000)   # 1000 normal deviates
y <- x + rnorm(1000, sd = 0.5)
```

- Optional, named arguments; positional matching

```{r}
args(rnorm)
```

Part I: Functions
========================================================

```{r}
plot(x, y)
```

- `formula`: another way to specify the plot, `plot(y ~ x)`

Part I: Help!
========================================================

Within R

```{r, eval=FALSE}
?rnorm
```

RStudio

- "Help" tab, search for "rnorm"

Main sections

- Title, Description, Usage, Arguments, Details, Value (result), See also, Examples

Part II: Classes (objects)
========================================================

Motivation: manipulate complicated data

- e.g., `x` and `y` from the previous example are related to one another: same length, and element `i` of `y` is a transformation of element `i` of `x`

Solution: a "data frame" to coordinate access

```{r}
df <- data.frame(X=x, Y=y)
head(df, 3)
```

Part II: Generics & methods
========================================================

```{r}
class(df)      # plain function
dim(df)        # generic & method for data.frame
head(df$X, 4)  # column access
```

Part II: Generics & methods
========================================================

```{r}
## create or update 'Z'
df$Z <- sqrt(abs(df$Y))
## subset rows and / or columns
head(df[df$X > 0, c("X", "Z")])
```

Part II: Generics & methods
========================================================

```{r}
plot(Y ~ X, df)    # Y ~ X, values from 'df'
## lm(): linear model, returns class 'lm'
fit <- lm(Y ~ X, df)
abline(fit)        # plot regression line
```

Part II: Generics & methods
========================================================

```{r}
anova(fit)
```

Part II: Generics & methods
========================================================

- `fit`: object of class `lm`
- `anova()`: generic, with a method for class `lm`

```{r}
methods(anova)
```

Part II: Help!
========================================================

```{r, eval=FALSE}
## class of object
class(fit)
## method discovery
methods(class=class(fit))
methods(anova)
## help on generic, and specific method
?anova
?anova.lm
```

Part III: Packages
========================================================

Installed

- Base & recommended
- Additional packages

```{r}
length(rownames(installed.packages()))
```

Available

- [CRAN](http://cran.r-project.org/web/packages/available_packages_by_name.html), [Bioconductor](http://bioconductor.org/packages/release/BiocViews.html#___Software);
- Also: [github](http://github.com), [rforge](https://r-forge.r-project.org/), ...

Part III: Packages
========================================================

'Attached' (installed and available for use):

```{r, eval=FALSE}
search()              # attached packages
ls("package:stats")   # functions in 'stats'
```

Attaching (make installed package available for use)

```{r, eval=FALSE}
library(ggplot2)
```

Installing CRAN or Bioconductor packages

```{r, eval=FALSE}
source("http://bioconductor.org/biocLite.R")
biocLite("GenomicRanges")
```

Part III: Help!
========================================================

Packages

- Available packages. CRAN: [Package index](http://cran.r-project.org/web/packages/available_packages_by_name.html), [Task Views](http://cran.r-project.org/web/views/); Bioconductor: [BiocViews](http://bioconductor.org/packages/release/BiocViews.html#___Software)
- Package descriptions ('landing pages'), e.g., [ggplot2](http://cran.fhcrc.org/web/packages/ggplot2/index.html), [GenomicRanges](http://bioconductor.org/packages/release/bioc/html/GenomicRanges.html)
- Vignettes: narrative descriptions of how to use the package, e.g., [minfi](http://bioconductor.org/packages/release/bioc/html/minfi.html)

Part IV: Help!
========================================================

Best bet

- Other R users you know!


R

- [StackOverflow](http://stackoverflow.com/questions/tagged/r), search for `[R]`; R-help [mailing list](http://www.r-project.org/mail.html)

Bioconductor

- [Web site](http://bioconductor.org)
- [Mailing list](http://bioconductor.org/help/mailing-list/)
- Soon: [support site](http://support.bioconductor.org)

Acknowledgements
========================================================

Funding

- US NIH / NHGRI 2U41HG004059; NSF 1247813

People

- Seattle Bioconductor team: Sonali Arora, Marc Carlson, Nate Hayden, Valerie Obenchain, Hervé Pagès, Dan Tenenbaum
- Vincent Carey, Robert Gentleman, Rafael Irizarry, Sean Davis, Kasper Hansen, Michael Lawrence, Levi Waldron
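
Appendix: generics & methods, a sketch
========================================================

The `anova()` / `anova.lm` pairing from Part II follows R's S3 convention: a generic calls `UseMethod()`, and R looks for a function named `generic.class`. The sketch below is one way to see that in miniature. The generic `describe()` and the class `reading` are hypothetical names made up for illustration; they are not part of R or of any package mentioned in these slides.

```{r}
## hypothetical generic: dispatch on class(x)
describe <- function(x, ...)
    UseMethod("describe")

## fallback when no specific method exists
describe.default <- function(x, ...)
    paste("object of length", length(x))

## method for the hypothetical class 'reading'
describe.reading <- function(x, ...)
    paste("reading with mean", mean(x))

r <- structure(1:3, class = "reading")
describe(r)        # uses describe.reading
describe(c(1, 2))  # uses describe.default
```

- `methods(describe)` should then list both methods, just as `methods(anova)` lists `anova.lm`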