Modelling and Resampling Based Multiple Testing with Applications to Genetics
Author: Yifan Huang
Publisher:
Total Pages:
Release: 2005
Genre: Bootstrap (Statistics)
ISBN:

Abstract: Multiple hypothesis testing is a common problem in practice. For instance, in microarray experiments, whether the goal is to select maintenance genes for normalization or to identify differentially expressed genes between samples, multiple genes are under consideration. Multiplicity inflates the type I error rate of hypothesis testing, so the testing procedure must be adjusted to control the overall error rate. My research focuses on strong control of the Familywise Error Rate (FWER). There are two main types of approaches to multiple testing: modelling based and non-modelling based. Modelling based approaches fit models to the data so that the joint distribution of the test statistics is tractable. Non-modelling based approaches consist of inequality based methods and resampling based methods; they require little or no information about the joint distribution of the test statistics. I show in Chapter 1 that the frequently used Hochberg's step-up method is a special case of partition testing based on Simes' test. This is a new result. Hochberg's step-up method is thus an inequality based non-modelling partition testing method. Modelling based partition testing is applicable whether the joint distribution of the test statistics is known or not. By applying modelling based partition testing when the joint distribution of the test statistics is known, I illustrate that modelling based approaches are often more powerful than inequality based non-modelling approaches. In Chapter 2, I construct counterexamples to the validity of permutation tests, demonstrating that resampling based methods are often invalid. My results support recommending modelling based approaches. When the joint distribution of the test statistics is intractable, modelling followed by bootstrap can be applied. I use modelling followed by bootstrap in Chapter 3 to select maintenance genes for normalizing gene expression data.
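For readers unfamiliar with the Hochberg step-up method named in the abstract, the following is a minimal sketch of its adjusted p-values in the standard textbook form. It is not code from the thesis and does not reproduce the partition-testing derivation; the function name `hochberg_adjust` is a hypothetical choice for illustration.

```python
def hochberg_adjust(pvalues):
    """Return Hochberg step-up adjusted p-values, in the input order.

    Assumption: standard textbook form. With p-values sorted ascending as
    p_(1) <= ... <= p_(m), the adjusted value for p_(i) is
    min over k >= i of (m - k + 1) * p_(k), capped at 1.
    """
    m = len(pvalues)
    # Visit hypotheses from the largest p-value down to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # starting at 1.0 also caps every adjusted value at 1
    for position, idx in enumerate(order):
        # The largest p-value gets multiplier 1, the next one 2, and so on.
        candidate = (position + 1) * pvalues[idx]
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


def hochberg_reject(pvalues, alpha=0.05):
    """Flag hypotheses whose adjusted p-value is at most alpha."""
    return [p_adj <= alpha for p_adj in hochberg_adjust(pvalues)]
```

A hypothesis is rejected at level alpha when its adjusted p-value does not exceed alpha, which is how `hochberg_reject` uses the adjusted values.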

Resampling-based Multiple Testing with Applications to Microarray Data Analysis
Author: Dongmei Li
Publisher:
Total Pages: 120
Release: 2009
Genre: DNA microarrays
ISBN:

Abstract: In microarray data analysis, resampling methods are widely used to discover significantly differentially expressed genes under different biological conditions when the distributions of test statistics are unknown. When the sample size is small, however, simultaneous testing of thousands, or even millions, of null hypotheses in microarray data analysis brings challenges to the multiple hypothesis testing field. We study the small-sample behavior of three commonly used resampling methods in multiple hypothesis testing: permutation tests, post-pivot resampling methods, and pre-pivot resampling methods. We show that the model-based pre-pivot resampling methods have the largest maximum number of unique resampled test statistic values, and so tend to produce more reliable P-values than the other two resampling methods. To avoid problems with the application of the three resampling methods in practice, we propose new conditions, based on the Partitioning Principle, to control the multiple testing error rates in fixed-effects general linear models. Meanwhile, from both theoretical results and simulation studies, we show the discrepancies between the true expected values of order statistics and the expected values of order statistics estimated by permutation in the Significance Analysis of Microarrays (SAM) procedure. Moreover, we show the conditions under which the permutation-based SAM procedure controls the expected number of false rejections. We also propose a more powerful adaptive two-step procedure that controls the expected number of false rejections with larger critical values than the Bonferroni procedure.
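To make the small-sample concern concrete, here is a minimal sketch of a complete two-group permutation test on the difference in means. It is not taken from the thesis and covers only the permutation case, not the post-pivot or pre-pivot methods; the function names and the toy data in the comments are hypothetical. With n_x and n_y observations there are at most C(n_x + n_y, n_x) distinct relabelings, so the p-value can never fall below 1 over that count.

```python
from itertools import combinations
from math import comb


def permutation_diffs(x, y):
    """Difference in group means for every relabeling of the pooled data."""
    pooled = list(x) + list(y)
    n_x, n_y = len(x), len(y)
    total = sum(pooled)
    diffs = []
    # Each choice of n_x positions for the first group is one relabeling.
    for group in combinations(range(len(pooled)), n_x):
        sum_x = sum(pooled[i] for i in group)
        diffs.append(sum_x / n_x - (total - sum_x) / n_y)
    return diffs


def permutation_pvalue(x, y):
    """Two-sided permutation p-value for the difference in means."""
    diffs = permutation_diffs(x, y)
    observed = sum(x) / len(x) - sum(y) / len(y)
    # A small tolerance guards against floating-point ties.
    extreme = sum(1 for d in diffs if abs(d) >= abs(observed) - 1e-12)
    return extreme / len(diffs)


def max_relabelings(n_x, n_y):
    """Upper bound on the number of distinct permutation statistics."""
    return comb(n_x + n_y, n_x)
```

With three arrays per condition, `max_relabelings(3, 3)` is 20, so even perfectly separated groups such as `[1, 2, 3]` and `[4, 5, 6]` cannot produce a two-sided p-value below 2/20. That floor is far above the thresholds needed after adjusting for thousands of genes.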

Multiple Testing Procedures with Applications to Genomics
Author: Sandrine Dudoit
Publisher: Springer Science & Business Media
Total Pages: 611
Release: 2007-12-18
Genre: Science
ISBN: 0387493174

This book establishes the theoretical foundations of a general methodology for multiple hypothesis testing and discusses its software implementation in R and SAS. These are applied to a range of problems in biomedical and genomic research, including identification of differentially expressed and co-expressed genes in high-throughput gene expression experiments; tests of association between gene expression measures and biological annotation metadata; sequence analysis; and genetic mapping of complex traits using single nucleotide polymorphisms. The procedures are based on a test statistics joint null distribution and provide Type I error control in testing problems involving general data generating distributions, null hypotheses, and test statistics.

Multiple Testing Procedures with Applications to Genomics
Author: Sandrine Dudoit
Publisher: Springer
Total Pages: 0
Release: 2008-11-01
Genre: Science
ISBN: 9780387517094

Resampling-Based Multiple Testing
Author: Peter H. Westfall
Publisher: John Wiley & Sons
Total Pages: 382
Release: 1993-01-12
Genre: Mathematics
ISBN: 9780471557616

Combines recent developments in resampling technology (including the bootstrap) with new methods for multiple testing that are easy to use, convenient to report and widely applicable. Software from SAS Institute is available to execute many of the methods and programming is straightforward for other applications. Explains how to summarize results using adjusted p-values which do not necessitate cumbersome table look-ups. Demonstrates how to incorporate logical constraints among hypotheses, further improving power.

Multiple Hypothesis Testing
Author: Houston Nash Gilbert
Publisher:
Total Pages: 372
Release: 2009
Genre:
ISBN:

Statistical Modeling for Biological Systems
Author: Anthony Almudevar
Publisher: Springer Nature
Total Pages: 361
Release: 2020-03-11
Genre: Medical
ISBN: 3030346757

This book commemorates the scientific contributions of distinguished statistician, Andrei Yakovlev. It reflects upon Dr. Yakovlev’s many research interests including stochastic modeling and the analysis of micro-array data, and throughout the book it emphasizes applications of the theory in biology, medicine and public health. The contributions to this volume are divided into two parts. Part A consists of original research articles, which can be roughly grouped into four thematic areas: (i) branching processes, especially as models for cell kinetics, (ii) multiple testing issues as they arise in the analysis of biologic data, (iii) applications of mathematical models and of new inferential techniques in epidemiology, and (iv) contributions to statistical methodology, with an emphasis on the modeling and analysis of survival time data. Part B consists of methodological research reported as a short communication, ending with some personal reflections on research fields associated with Andrei and on his approach to science. The Appendix contains an abbreviated vitae and a list of Andrei’s publications, complete as far as we know. The contributions in this book are written by Dr. Yakovlev’s collaborators and notable statisticians including former presidents of the Institute of Mathematical Statistics and of the Statistics Section of the AAAS. Dr. Yakovlev’s research appeared in four books and almost 200 scientific papers, in mathematics, statistics, biomathematics and biology journals. Ultimately this book offers a tribute to Dr. Yakovlev’s work and recognizes the legacy of his contributions in the biostatistics community.

Statistical Bioinformatics with R
Author: Sunil K. Mathur
Publisher: Academic Press
Total Pages: 337
Release: 2009-12-21
Genre: Mathematics
ISBN: 0123751055

Statistical Bioinformatics provides a balanced treatment of statistical theory in the context of bioinformatics applications. Designed for a one- or two-semester senior undergraduate or graduate bioinformatics course, the text takes a broad view of the subject, covering not just gene expression and sequence analysis but statistical theory across bioinformatics applications. The inclusion of R and SAS code, together with the development of advanced methodology such as Bayesian and Markov models, gives students the foundation needed to conduct bioinformatics.
• Integrates biological, statistical and computational concepts
• Includes R and SAS code
• Covers complex statistical methods in context, with applications in bioinformatics
• Exercises and examples, presented at the right level, aid teaching and learning
• Bayesian methods and modern multiple testing principles in one convenient book

Simultaneous Statistical Inference
Author: Thorsten Dickhaus
Publisher: Springer Science & Business Media
Total Pages: 182
Release: 2014-01-23
Genre: Science
ISBN: 3642451829

This monograph provides an in-depth mathematical treatment of modern multiple test procedures controlling the false discovery rate (FDR) and related error measures, particularly addressing applications to fields such as genetics, proteomics, neuroscience and general biology. The book also includes a detailed description of how to implement these methods in practice. Moreover, new developments focusing on non-standard assumptions are included, especially multiple tests for discrete data. The book primarily addresses researchers and practitioners but will also be beneficial for graduate students.
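As a point of reference for FDR control, the sketch below computes adjusted p-values for the Benjamini-Hochberg step-up procedure, the most widely used FDR-controlling method. It is an illustration in standard textbook form, not code from the monograph, and the function name `benjamini_hochberg` is a hypothetical choice.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: standard textbook form. With p-values sorted ascending as
    p_(1) <= ... <= p_(m), the adjusted value for p_(i) is
    min over k >= i of (m / k) * p_(k), capped at 1.
    """
    m = len(pvalues)
    # Visit hypotheses from the largest p-value down to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # starting at 1.0 also caps every adjusted value at 1
    for position, idx in enumerate(order):
        rank = m - position  # ascending rank k of this p-value
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Rejecting every hypothesis whose adjusted value is at most q corresponds to applying the step-up procedure at FDR level q. The validity of that level depends on assumptions about the dependence among the test statistics.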

Modeling Dose-Response Microarray Data in Early Drug Development Experiments Using R
Author: Dan Lin
Publisher: Springer Science & Business Media
Total Pages: 285
Release: 2012-08-27
Genre: Mathematics
ISBN: 3642240070

This book focuses on the analysis of dose-response microarray data in pharmaceutical settings, the goal being to cover this important topic for early drug development experiments and to provide user-friendly R packages that can be used to analyze this data. It is intended for biostatisticians and bioinformaticians in the pharmaceutical industry, biologists, and biostatistics/bioinformatics graduate students.

Part I of the book is an introduction, in which we discuss the dose-response setting and the problem of estimating normal means under order restrictions. In particular, we discuss the pooled-adjacent-violator (PAV) algorithm and isotonic regression, as well as inference under order restrictions and non-linear parametric models, which are used in the second part of the book.

Part II is the core of the book, in which we focus on the analysis of dose-response microarray data. Methodological topics discussed include:
• Multiplicity adjustment
• Test statistics and procedures for the analysis of dose-response microarray data
• Resampling-based inference and use of the SAM method for small-variance genes in the data
• Identification and classification of dose-response curve shapes
• Clustering of order-restricted (but not necessarily monotone) dose-response profiles
• Gene set analysis to facilitate the interpretation of microarray results
• Hierarchical Bayesian models and Bayesian variable selection
• Non-linear models for dose-response microarray data
• Multiple contrast tests
• Multiple confidence intervals for selected parameters adjusted for the false coverage-statement rate

All methodological issues in the book are illustrated using real-world examples of dose-response microarray datasets from early drug development experiments.
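The pooled-adjacent-violator (PAV) algorithm mentioned in Part I can be sketched in a few lines. The book itself works in R; this Python version is an independent illustration, not the book's implementation. It computes the weighted least-squares non-decreasing fit to a sequence of dose-level means, and the function name `pava` and its arguments are hypothetical.

```python
def pava(values, weights=None):
    """Non-decreasing least-squares fit via pool-adjacent-violators.

    values  : observed means, ordered by increasing dose
    weights : optional positive weights (for example, replicates per dose)
    """
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [weighted mean, total weight, number of points pooled].
    blocks = []
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Pool backwards while the last two blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, weight_b, count_b = blocks.pop()
            mean_a, weight_a, count_a = blocks.pop()
            total = weight_a + weight_b
            pooled_mean = (mean_a * weight_a + mean_b * weight_b) / total
            blocks.append([pooled_mean, total, count_a + count_b])
    # Expand each block back to one fitted value per original point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted
```

For example, `pava([1, 3, 2, 4])` pools the two middle doses, whose means violate the ordering, into their average. A non-increasing fit can be obtained by negating the inputs and the output.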