European Bioinformatics Institute
Array of plenty - analysis of a 4-base-resolution yeast genome tiling array
Conventional microarrays measure the abundance of transcripts using a set of probes whose sequences are designed on the basis of prior knowledge about the transcriptome. More recently, high-resolution tiling microarrays have made it possible to probe the whole sequence content of a genome. This enables the discovery of new, unexpected transcripts and the observation of transcript structure, including untranslated regions (UTRs) and exon-intron structure. I will describe the analysis of a yeast genome tiling array on which 6.5 million probes, each 25 bases long, tile the complete genome of Saccharomyces cerevisiae (12 megabases) in steps of 4 bases.
One of the analytical challenges is that of segmentation, or change point detection. A typical yeast chromosome presents around 120,000 sample points, along which we expect around 1,000 change points (e.g. transcript boundaries). I will describe a dynamic programming algorithm that globally optimizes the likelihood of a piecewise constant curve, and discuss the model selection problem, that is, how to choose the number of segments.
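
To make the segmentation idea concrete, here is a minimal sketch in Python. It assumes independent Gaussian noise with a common variance, so that maximizing the likelihood of a piecewise constant curve is the same as minimizing the residual sum of squares (RSS). The function names segment, change_points and bic are hypothetical, and the BIC-style penalty is only one common way to approach model selection, not necessarily the one discussed in the talk.

```python
import numpy as np

def segment(y, max_segments):
    """Dynamic programming for least-squares piecewise constant fits.

    Returns (best, back), where best[k, j] is the minimal RSS of fitting
    y[:j] with k + 1 segments and back[k, j] is the start index of the
    last segment in that optimal fit.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prefix sums give the RSS of any interval y[i:j] in constant time.
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y * y)))

    def cost(i, j):
        # RSS of y[i:j] around its own mean (i may be an array of starts).
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / (j - i)

    best = np.full((max_segments, n + 1), np.inf)
    back = np.zeros((max_segments, n + 1), dtype=int)
    for j in range(1, n + 1):
        best[0, j] = cost(0, j)
    for k in range(1, max_segments):
        for j in range(k + 1, n + 1):
            starts = np.arange(k, j)  # candidate starts of the last segment
            total = best[k - 1, starts] + cost(starts, j)
            a = int(np.argmin(total))
            best[k, j] = total[a]
            back[k, j] = starts[a]
    return best, back

def change_points(back, n_segments, n):
    """Trace back the change points of the optimal fit with n_segments."""
    cps, j = [], n
    for level in range(n_segments - 1, 0, -1):
        j = int(back[level, j])
        cps.append(j)
    return cps[::-1]

def bic(best, n):
    """BIC-style score per segment count (assumed penalty: 2k parameters)."""
    k = np.arange(1, best.shape[0] + 1)
    return n * np.log(best[:, n] / n) + 2 * k * np.log(n)

if __name__ == "__main__":
    # Synthetic example: three levels plus Gaussian noise (made-up data).
    rng = np.random.default_rng(0)
    y = np.concatenate([np.full(30, 0.0), np.full(30, 2.0), np.full(30, 0.5)])
    y = y + rng.normal(0.0, 0.3, size=y.size)
    best, back = segment(y, max_segments=6)
    k_hat = int(np.argmin(bic(best, len(y)))) + 1
    print(k_hat, change_points(back, k_hat, len(y)))
```

This sketch costs on the order of S * n^2 operations for n sample points and S segments, which is fine for a toy series but heavy for a chromosome with around 120,000 points. One plausible way to reduce the cost, stated here as an assumption, would be to cap the allowed segment length so that each step only considers a bounded window of candidate starts.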