ASA2022
Invited Speakers
Keynote Speakers
- Julia O'Neill
- Distinguished Fellow, CMC Modeling & Statistics Lead at Moderna
- Title: The Impact of Statistics on Vaccine Development: Moderna Lessons Learned
- Abstract: Responding to the COVID-19 pandemic galvanized the technical team at Moderna to prepare the vaccine mRNA-1273 for emergency use in less than 12 months from sequencing to authorization. The foundation for this acceleration was the deep knowledge and expertise established during ten years of work on the mRNA (messenger RNA) product platform. Quality by Design (QbD) enabled acceleration while maintaining the highest standards of safety, efficacy, and robust manufacturing and quality controls. Certain elements of the QbD framework were essential, including patient-centric quality standards and process characterization employing Design of Experiments (DoE). The extensive application of DoE on a fast-paced timeline would traditionally require a large team of statistical specialists to support the design and analysis of many experiments. But Moderna scientists and engineers from the vibrant statistics Community of Practice are empowered to design and analyze their own experiments, leaning on supportive dialogue with statistical specialists when needed. This community is nurtured by the organization’s culture, access to tools, and management support for continued learning of new skills. This statistics community of practice is founded on the principle of teaching experimenters to fish (statistically speaking) – “democratizing statistics.” This presentation will cover some highlights of Moderna’s work in design of experiments and principles behind the highly engaged statistics community of practice.
- Ying Yuan
- Bettyann Asche Murray Distinguished Professor, Department of Biostatistics, University of Texas MD Anderson Cancer Center
- Title: Model-Assisted Designs: A Novel Paradigm to Improve and Accelerate Drug Development
- Abstract: The drug development enterprise is struggling due to prohibitively high costs and slow progress. There is an urgent need to adopt novel adaptive designs to improve the efficiency and success of clinical trials. A major barrier is that novel adaptive designs are often difficult to understand, require complicated statistical modelling, demand complex computation, and need expensive infrastructure for implementation. The model-assisted design framework has been developed to address this barrier. Model-assisted designs enjoy superior performance comparable to that of more complicated, model-based adaptive designs, but their decision rules can be pre-tabulated and included in the protocol, and thus implemented as simply as conventional designs. I will review state-of-the-art model-assisted designs for early phase clinical trials. Freely available user-friendly software and trial examples will be provided.
Invited Session Speakers - Statistical Methods for Big Data Analysis
- Jonathan R. Bradley
- Associate Professor, Department of Statistics, Florida State University
- Title: Bypassing Markov Chains for Bayesian Generalized Linear Mixed Effects Models
- Abstract: Markov chain Monte Carlo (MCMC) is an all-purpose tool that allows one to generate dependent replicates from a posterior distribution for effectively any Bayesian hierarchical model. As such, MCMC has become a standard in Bayesian statistics. However, convergence issues, tuning, and the effective sample size of the MCMC are nontrivial considerations that are often overlooked or can be difficult to assess. Moreover, these practical issues can produce a significant computational burden. This motivates us to consider finding closed-form expressions of the posterior distribution that are computationally straightforward to sample from directly. We focus on a broad class of Bayesian generalized linear mixed-effects models (GLMM) that allows one to jointly model data of different types (e.g., Gaussian, Poisson, and binomial distributed observations). Exact sampling from the posterior distribution for Bayesian GLMMs is such a difficult problem that it is now arguably overlooked as a possible problem to solve. To solve this problem, we derive a new class of distributions that gives one the flexibility to specify the prior on fixed and random effects to be any conjugate multivariate distribution. We refer to this new distribution as the generalized conjugate multivariate (GCM) distribution. The expression of the exact posterior distribution is given along with the steps to obtain direct independent simulations from the posterior distribution. These direct simulations have an efficient projection/regression form, and hence, we refer to our method as Exact Posterior Regression (EPR). Several illustrations are provided.
- Kshitij Khare
- Associate Professor, Department of Statistics, University of Florida
- Title: Bayesian inference in high-dimensional mixed frequency regression
- Abstract: Technological advancements in recent years have enabled organizations to collect, organize, store and analyze very large amounts of data from variables that are available at different temporal frequencies - e.g. monthly, weekly, daily. Such data is commonly referred to as mixed frequency time series data. In this talk, we will focus on mixed frequency regression, where the response variable and the covariates are available at different frequencies (for example, quarterly vs. monthly). We will present novel Bayesian methodology for (sparse) estimation of the regression coefficients and of the (autoregressive) lag length using a nested spike-and-slab framework. This is joint work with Satyajit Ghosh and George Michailidis.
- Jiwoong Kim
- Assistant Professor, Department of Statistics, University of South Florida
- Title: Simultaneous Predictions of Both eQTLs and Chromatin Interactions Through Using a Bayesian Approach
- Abstract: Expression quantitative trait locus (eQTL) mapping investigates how single nucleotide polymorphisms (SNPs) can affect gene expression. Having been successful in identifying associations between risk loci and genes for various diseases, eQTL analysis - especially using a Bayesian approach which employs the posterior probability as a criterion for predicting eQTLs - has been popular. We introduce a more advanced Bayesian approach which rigorously leverages underlying genetic features, that is, incorporates chromatin interactions into the analysis, thereby yielding a better prediction of eQTLs when compared with other methods. For comparison, we used human leukemia data, on which our proposed method outperforms competing methods. Furthermore, the application of our method was extended to other analyses: transcription factor (TF) binding peaks and GWAS SNPs. Our findings suggest that (i) eQTL SNPs identified by our method reside spatially more proximal to clusters of leukemia-relevant TF binding sites and SNPs and (ii) the posterior probability (the final result of our Bayesian approach) is a valid marker which indicates leukemia-relevant genes.
Invited Session Speakers - Statistical Methods for Medical Studies
- Rhonda Bacher
- Assistant Professor, Department of Biostatistics, University of Florida
- Title: Using R/Shiny for visualizations of DNA methylation and nucleosome occupancy in single-molecule and single-cell data
- Abstract: Alterations in DNA methylation have been detected in nearly all cancer types. In addition, both cancer- and tissue-specific differences exist in nucleosome positioning and occupancy, as well as transcription factor binding activity, which together determine chromatin accessibility. Experiments able to simultaneously capture nucleosome occupancy and methylation states at single-molecule and single-cell resolution are now possible. We present an approach we call methylscaper, a framework implemented in R/Shiny to simultaneously visualize DNA methylation and nucleosome occupancy states from joint experiments. We demonstrate methylscaper’s ability to reveal biological patterns on both single-molecule and single-cell datasets.
- Rickey E. Carter
- Professor of Biostatistics, Department of Quantitative Health Sciences, Mayo Clinic
- Title: Building a team during a pandemic: lessons learned from the United States Convalescent Plasma Expanded Access Program
- Abstract: In April 2020, the United States Convalescent Plasma Study initiated enrollment under a regulatory process known as expanded access. This real-world clinical trial allowed experimental access to convalescent plasma to obtain necessary safety and preliminary efficacy data on whether convalescent plasma could provide some signal for therapeutic benefit in the treatment of COVID-19. The scale of the study shifted from an initial sample size projection of 5,000 participants to over 100,000 participants over five months of enrollment prior to the issuance of the emergency use authorization in August 2020. In this presentation, lessons learned from this monumental undertaking will be shared, focusing on the roles that statisticians and data scientists played during the study.
- Xiaoqing Yu
- Assistant Professor, Department of Biostatistics and Bioinformatics, Moffitt Cancer Center
- Title: Automatic single cell type identification using wide and deep learning
- Abstract: A deep understanding of the immune cell composition at the site of tumor lesions is key to predicting response to immune checkpoint blockade and to developing personalized immunotherapies. Identification of highly specific cell types is now possible with the development of single-cell RNA-sequencing (scRNA-seq) technology. However, cell annotation in scRNA-seq faces two challenges: 1) it relies heavily on the researchers’ knowledge of cell-type-specific signature genes, and 2) transcription profiles are difficult to transfer between different platforms. I will introduce a wide and deep learning framework my group has developed to automatically classify cells from scRNA-seq data. I will discuss how wide and deep learning tackled the above challenges and the lessons we have learned.