lab3Marray.Rnw

%
% NOTE -- ONLY EDIT THE .Rnw FILE!!!  The .tex file is
% likely to be overwritten.
%
%\VignetteIndexEntry{EMBO03 Lab 3}
%\VignetteDepends{marrayNorm}
%\VignetteKeywords{Microarray, Pre-processing}
\documentclass[12pt]{article}

\usepackage{amsmath,pstricks} \usepackage[authoryear,round]{natbib} \usepackage{hyperref}

\textwidth=6.2in
\textheight=8.5in
%\parskip=.3cm
\oddsidemargin=.1in
\evensidemargin=.1in
\headheight=-.3in

\newcommand{\scscst}{\scriptscriptstyle} \newcommand{\scst}{\scriptstyle}

\bibliographystyle{plainnat}

\title{EMBO03 Lab 3: Introduction to Bioconductor \texttt{marray} Packages}

\author{Sandrine Dudoit and Robert Gentleman}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{document}

\maketitle

In this lab, we demonstrate the main functions in the \texttt{marray} suite of packages for diagnostic plots and normalization of two-color spotted microarray data. Efforts are underway to interact (read and write) with MAGE-ML documents. A brief description of the four main \texttt{marray} packages is given next.

\begin{description}
\item \texttt{marrayClasses}. This package contains class definitions and associated methods for pre- and post-normalization intensity data for batches of arrays. Methods are provided for the creation and modification of microarray objects, basic computations, printing, subsetting, and class conversions.
\item \texttt{marrayInput}. This package provides functionality for reading microarray data into R, such as intensity data from image processing output files (e.g., \texttt{.spot} and \texttt{.gpr} files for the \texttt{Spot} and \texttt{GenePix} packages, respectively) and textual information on probes and targets (e.g., from \texttt{.gal} files and god lists). \texttt{tcltk} widgets are supplied to facilitate and automate data input and the creation of microarray-specific R objects for storing these data.
\item \texttt{marrayPlots}. This package provides functions for diagnostic plots of microarray spot statistics, such as boxplots, scatterplots, and spatial color images. Examination of diagnostic plots of intensity data is important in order to identify printing, hybridization, and scanning artifacts that can lead to biased inferences concerning gene expression.
\item \texttt{marrayNorm}. This package implements robust adaptive location and scale normalization procedures, which correct for different types of dye biases (e.g., intensity, spatial, plate biases) and allow the use of control sequences spotted onto the array and possibly spiked into the mRNA samples. Normalization is needed to ensure that observed differences in intensities are indeed due to differential expression and not experimental artifacts; fluorescence intensities should therefore be normalized before any analysis that involves comparisons among gene expression measures within or between arrays.
\end{description}

To load the packages
<<>>=
library(marrayNorm)
@

For a more detailed introduction, consult the package vignettes, which can be listed with the command \texttt{openVignette()}. A demo for \texttt{marrayPlots} can also be accessed with \texttt{demo(marrayPlots)}. We will work with the sample dataset \texttt{swirl}; for a description of \texttt{swirl}, type \texttt{? swirl}. To load this dataset

<<>>=
data(swirl)
@

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%% marrayClasses

\section{Basic classes and methods: \texttt{marrayClasses} package}

One of the main classes in \texttt{marrayClasses} is the \texttt{marrayLayout} class; it is used to keep track of important layout parameters, such as the total number of spotted probe sequences on the array, the dimensions of the spot and grid matrices, the plate origin of the probes, and information on spotted control sequences. For details on this class, consult the help file, \texttt{? marrayLayout}. Two other important classes are \texttt{marrayRaw} and \texttt{marrayNorm}, which represent, respectively, pre-normalization and post-normalization intensity data for a batch of spotted microarrays. Methods for manipulating instances of these classes are also described in the help files.

The object \texttt{swirl} is an instance of the class \texttt{marrayRaw}. Try the following commands to obtain information on this object

<<>>=
class(swirl)
slotNames(swirl)
swirl
@

To access individual slots

<<>>=
maLayout(swirl)
maGnames(swirl)
@

As with other microarray objects in Bioconductor packages, you can use subsetting commands for \texttt{marrayRaw} objects. For data on the first 100 genes in the second array in the \texttt{swirl} batch
<<>>=
sw <- swirl[1:100, 2]
class(sw)
sw
@

You can access red and green foreground and background intensities, and log ratios as follows
<<>>=
Gb <- maGb(swirl)
dim(Gb)
Gb[1:5, ]
Rf <- maRf(swirl)
dim(Rf)
Rf[1:5, ]
M <- maM(swirl)
dim(M)
@
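
The accessors above are related in a simple way. Writing $R = R_f - R_b$ and $G = G_f - G_b$ for the background-corrected red and green intensities, the log ratio is $M = \log_2(R/G)$ and the average log intensity is $A = \frac{1}{2}\log_2(RG)$. The unevaluated chunk below is a minimal sketch that recomputes both quantities by hand. It assumes that the accessors \texttt{maRb}, \texttt{maGf}, and \texttt{maA} are available alongside \texttt{maGb} and \texttt{maRf}, and that \texttt{maM} and \texttt{maA} subtract background in this way (see their help pages); the object names \texttt{M.byhand} and \texttt{A.byhand} are illustrative only.
<<eval=FALSE>>=
## Background-corrected intensities in each channel
R <- maRf(swirl) - maRb(swirl)
G <- maGf(swirl) - maGb(swirl)
## Log ratio and average log intensity computed by hand
M.byhand <- log2(R) - log2(G)
A.byhand <- (log2(R) + log2(G)) / 2
## Compare with the values returned by the accessors
all.equal(M.byhand, maM(swirl), check.attributes = FALSE)
all.equal(A.byhand, maA(swirl), check.attributes = FALSE)
@
Spots whose background-corrected intensity is not positive have no finite logarithm, so the two computations may differ in how such spots are reported.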

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%% marrayInput

\section{Reading in data: \texttt{marrayInput} package}

This package provides functionality for reading microarray data into R, such as intensity data from image processing output files (e.g., \texttt{.spot} and \texttt{.gpr} files for the \texttt{Spot} and \texttt{GenePix} packages, respectively) and textual information on probes and targets (e.g., from \texttt{.gal} files and god lists). \texttt{tcltk} widgets are supplied to facilitate and automate data input and the creation of microarray-specific R objects for storing these data.\\ See for example \texttt{? read.marrayRaw} or \texttt{? widget.marrayRaw}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%% marrayPlots

\section{Diagnostic plots: \texttt{marrayPlots} package}

The \texttt{marrayPlots} package provides functions for diagnostic plots of microarray spot statistics, such as boxplots, scatterplots, and spatial color images. To produce a spatial image of background intensities for the Cy3 channel in the third array
<<>>=
tmp <- maImage(swirl[, 3], x = "maGb", bar = FALSE)
@

To produce a spatial image of log ratios for the first array in the batch
<<>>=
tmp <- maImage(swirl[, 1], col = maPalette(low = "blue", high = "yellow"), bar = FALSE)
@

To produce boxplots of log ratios by sector for the first array in the batch
<<>>=
maBoxplot(swirl[, 1])
@

To produce boxplots of log ratios by plate for the second array in the batch
<<>>=
maPlate(swirl) <- maCompPlate(swirl, n = 384)
maBoxplot(swirl[, 2], x = "maPlate", names = NULL)
@

For boxplots of log ratios for all four arrays
<<>>=
maBoxplot(swirl)
@
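
The scatterplots mentioned at the start of this section can be drawn in the same style. The unevaluated chunk below is a sketch that assumes the \texttt{maPlot} function of \texttt{marrayPlots} is called with its default arguments, which are expected to plot log ratios $M$ against average log intensities $A$; consult \texttt{? maPlot} for the actual defaults and for the options controlling fitted lines, highlighted points, and legends.
<<eval=FALSE>>=
## Scatterplot of log ratios against average log intensities
## for the first array in the batch (default arguments assumed)
maPlot(swirl[, 1])
@
A plot of this kind is useful for judging whether the log ratios depend on overall spot intensity, which is the type of dye bias that location normalization is meant to remove.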

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%% marrayNorm

\section{Normalization: \texttt{marrayNorm} package}

The \texttt{marrayNorm} package implements robust adaptive location and scale normalization procedures, which correct for different types of dye biases (e.g., intensity, spatial, plate biases). The main location and scale normalization function is \texttt{maNormMain}. Simpler wrapper functions are provided in \texttt{maNorm} and \texttt{maNormScale}. The functions operate on objects of class \texttt{marrayRaw} (or possibly \texttt{marrayNorm}, if normalization is performed in several steps) and return objects of class \texttt{marrayNorm}. For within-print-tip-group loess location normalization of the batch \texttt{swirl}

<<>>=
swirl.norm <- maNormMain(swirl)
@

For boxplots of post-normalization log ratios
<<>>=
maBoxplot(swirl.norm[, 1])
@

<<>>=
maBoxplot(swirl.norm)
@
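
Location normalization replaces each log ratio $M$ by $M - c$, where $c$ is an estimate of the dye bias: a single constant per array for median normalization, or a fitted function $c_i(A)$ of average log intensity within print-tip group $i$ for within-print-tip-group loess normalization. The unevaluated chunk below is a sketch of how the simpler wrapper \texttt{maNorm} might be used for comparison. It assumes that the \texttt{norm} argument accepts the values \texttt{"median"} and \texttt{"printTipLoess"} as described in \texttt{? maNorm}; the object names \texttt{swirl.med} and \texttt{swirl.ptl} are illustrative only.
<<eval=FALSE>>=
## Global median location normalization (assumed option name)
swirl.med <- maNorm(swirl, norm = "median")
## Within-print-tip-group loess location normalization via the wrapper
swirl.ptl <- maNorm(swirl, norm = "printTipLoess")
## Compare the resulting log ratios across all arrays
maBoxplot(swirl.med)
maBoxplot(swirl.ptl)
@
Comparing the two sets of boxplots gives a rough indication of how much of the dye bias depends on intensity and print-tip group rather than being constant across the array.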

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%% marrayTools

\section{Miscellaneous tools: \texttt{marrayTools} package}

The \texttt{marrayTools} package provides additional functions for handling two-color spotted microarray data, including a number of user-friendly wrapper functions for performing standard analyses.\\

The \texttt{spotTools} and \texttt{gpTools} functions in the development version of \texttt{marrayTools} start from Spot (\texttt{.spot} and \texttt{.gal}) and GenePix (\texttt{.gpr} and \texttt{.gal}) image analysis output files, respectively. They automatically read these data into R, perform standard normalization (within-print-tip-group loess), and create a directory with a standard set of diagnostic plots (JPEG format), Excel files of quality measures, and tab-delimited files of normalized log ratios $M$ and average log intensities $A$. In addition, an object of class \texttt{marrayRaw} or \texttt{marrayNorm} is returned. The package also includes functions for computing various gene statistics and for generating HTML pages for gene lists (\texttt{htmlPage}).

%%%<<>>=
\begin{Sinput}
> datadir <- system.file("data", package="marrayInput")
> normdata <- spotTools(path=datadir, quality=FALSE)
\end{Sinput}
%%%@

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\end{document}
