Empirical Likelihood Methods in Nonignorable Covariate-missing Data Problems

Author: Yanmei Xie
Publisher:
Total Pages: 125
Release: 2019
Genre: Estimation theory
ISBN:

Missing covariate data occur often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. This dissertation contains three topics in nonignorable covariate-missing data problems, in which we study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. First, by exploiting a probability model of missingness and a working conditional score model from a semiparametric perspective, we propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. These unbiased estimating equations naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. Based on the proposed estimating equations, we introduce three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with those of existing competitors. By applying the proposed empirical likelihood method to a data set from the US National Health and Nutrition Examination Survey (NHANES), we study the effect of daily alcohol consumption on hypertension. Second, we explore unconstrained and constrained empirical likelihood ratio statistics to construct empirical likelihood confidence regions for the underlying regression parameters without and with constraints. We establish the asymptotic distributions of the proposed empirical likelihood ratio statistics. The proposed empirical likelihood methods have better finite-sample performance than existing competitors in terms of coverage probability and interval length. 
An analysis of the data set from the US NHANES demonstrates that increased alcohol consumption per day is significantly associated with increased systolic blood pressure. In addition, higher body mass index and older age are associated with a significantly higher risk of hypertension. Third, we propose a pseudo empirical likelihood ratio statistic and show that it asymptotically follows a chi-squared distribution. Our proposed method allows for confidence interval construction without variance estimation and thus is more computationally feasible. Simulation results suggest that the proposed empirical likelihood confidence interval has better finite-sample performance than the corresponding Wald-based competitor in terms of coverage probability and interval length. Moreover, the proposed empirical likelihood ratio test had higher power than the Wald method throughout our simulation studies.

Empirical Likelihood Methods in Missing Response Problems and Causal Inference

Author: Kaili Ren
Publisher:
Total Pages: 114
Release: 2016
Genre: Causation
ISBN:

This manuscript contains three topics in missing data problems and causal inference. First, we propose an empirical likelihood estimator as an alternative to Qin and Zhang (2007) in missing response problems under the missing at random (MAR) assumption. A likelihood-based method is used to obtain the mean propensity score instead of a moment-based method. Our proposed estimator shares the double-robustness property and achieves the semiparametric efficiency lower bound when the regression model and the propensity score model are both correctly specified. Our proposed estimator has better performance when the propensity score is correctly specified. In addition, we extend our proposed method to the estimation of the ATE in observational causal inference. By applying the proposed method to a dataset from the CORAL clinical trial, we study the causal effect of cigarette smoking on renal function in patients with ARAS. The higher cystatin C and lower CKD-EPI GFR for smokers demonstrate the negative effect of smoking on renal function in patients with ARAS. Second, we explore a more efficient approach in missing response problems under the MAR assumption. Instead of using one propensity score model and one working regression model, we postulate multiple working regression and propensity score models. Moreover, rather than maximizing the conditional likelihood, we maximize the full likelihood under constraints with respect to the postulated parametric functions. Our proposed estimator is consistent if one of the propensity scores is correctly specified, and it achieves the semiparametric efficiency lower bound when one of the working regression models is correctly specified as well. This estimator is more efficient than other current estimators when one of the propensity scores is correctly specified. Finally, we propose empirical likelihood confidence intervals in missing data problems, which make very weak distributional assumptions. 
We show that the -2 empirical log-likelihood ratio function follows a scaled chi-squared distribution if either the working propensity score or the working regression model is correctly specified. If the two models are both correctly specified, the -2 empirical log-likelihood ratio function follows a chi-squared distribution. Empirical likelihood confidence intervals perform better than Wald confidence intervals based on the AIPW estimator when the sample size is small and the distribution of the response is highly skewed. In addition, empirical likelihood confidence intervals for the ATE can also be built in causal inference.
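
For orientation, the AIPW estimator that the abstract uses as a benchmark can be sketched as follows. This is the standard textbook form of the augmented inverse probability weighted mean, not the empirical likelihood estimator proposed in the manuscript. The function name `aipw_mean` and its arguments are hypothetical, and the fitted propensity scores `pi_hat` and regression predictions `m_hat` are assumed to come from working models fitted elsewhere.

```python
import numpy as np

def aipw_mean(y, r, pi_hat, m_hat):
    """AIPW estimate of E[Y] when Y is missing at random.

    y      : responses (entries with r == 0 may be NaN; they are never used)
    r      : response indicators, 1 = observed, 0 = missing
    pi_hat : fitted propensity scores P(R = 1 | X), assumed given
    m_hat  : fitted regression predictions E[Y | X], assumed given
    """
    y = np.asarray(y, dtype=float)
    r = np.asarray(r, dtype=float)
    pi_hat = np.asarray(pi_hat, dtype=float)
    m_hat = np.asarray(m_hat, dtype=float)

    # Zero out missing responses so NaN placeholders cannot propagate.
    y_obs = np.where(r == 1, y, 0.0)

    # Inverse-probability-weighted term plus the augmentation term.
    ipw_term = r * y_obs / pi_hat
    augmentation = (r - pi_hat) / pi_hat * m_hat
    return float(np.mean(ipw_term - augmentation))
```

The double-robustness property mentioned in the abstract corresponds to this estimator remaining consistent when either `pi_hat` or `m_hat` comes from a correctly specified model.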

Multiply Robust Empirical Likelihood Inference for Missing Data and Causal Inference Problems

Author: Shixiao Zhang
Publisher:
Total Pages: 119
Release: 2019
Genre: Medical statistics
ISBN:

Missing data are ubiquitous in many social and medical studies. A naive complete-case (CC) analysis that simply ignores the missing data commonly leads to invalid inferential results. This thesis aims to develop statistical methods addressing important issues concerning both missing data and causal inference problems. One of the major concepts explored in this thesis is multiple robustness, where multiple working models can be properly accommodated, thereby improving robustness against possible model misspecification. Chapter 1 serves as a brief introduction to missing data problems and causal inference. In this chapter, we highlight two major statistical concepts that we repeatedly adopt in subsequent chapters, namely empirical likelihood and calibration. We also describe some of the problems that will be investigated in this thesis. There exists an extensive literature on using calibration methods with empirical likelihood in missing data and causal inference. However, researchers in different areas may not realize the conceptual similarities and connections between their approaches. In Chapter 2, we provide a brief literature review of calibration methods, aiming to highlight some of the desirable properties that calibration methods offer. In Chapter 3, we consider a simple scenario of estimating the means of some response variables that are subject to missingness. A crucial first step is to determine if the data are missing completely at random (MCAR), in which case a complete-case analysis would suffice. We propose a unified approach to testing MCAR and the subsequent estimation. Upon rejecting MCAR, the same set of weights used for testing can then be used for estimation. The resulting estimators are consistent if the missingness of each response variable depends only on a set of fully observed auxiliary variables and the true outcome regression model is among the user-specified functions for deriving the weights. 
The proposed testing procedure is compared with existing alternative methods, which do not provide a method for subsequent estimation once MCAR is rejected. In Chapter 4, we consider the widely adopted pretest-posttest studies in causal inference. The proposed test extends the existing methods for randomized trials to observational studies. We propose a dual method for testing and estimation of the average treatment effect (ATE). We also consider the case where the potential outcomes are missing at random (MAR). The proposed approach postulates multiple models for the propensity score of treatment assignment, the missingness probability and the outcome regression. The calibrated empirical probabilities are constructed by maximizing the empirical likelihood function subject to constraints derived from carefully chosen population moment conditions. The proposed method proceeds in two steps. The first step obtains preliminary calibration weights that are asymptotically equivalent to the true propensity score of treatment assignment. The second step forms a set of weights incorporating the estimated propensity score and multiple models for the missingness probability and the outcome regression. The proposed EL ratio test is valid and the resulting estimator is consistent if one of the multiple models for the propensity score, as well as one of the multiple models for the missingness probability or the outcome regression, is correctly specified. Chapter 5 extends Chapter 4's results to testing the equality of the cumulative distribution functions of the potential outcomes between the two intervention groups. We propose an empirical likelihood based Mann-Whitney test and an empirical likelihood ratio test which are multiply robust in the same sense as the multiply robust estimator and the empirical likelihood ratio test for the average treatment effect in Chapter 4. 
We conclude this thesis in Chapter 6 with some additional remarks on major results presented in the thesis along with several interesting topics worthy of further exploration in the future.

Empirical Likelihood Inference for Two-sample Problems

Author: Ying Yan
Publisher:
Total Pages: 40
Release: 2010
Genre:
ISBN:

In this thesis, we are interested in empirical likelihood (EL) methods for two-sample problems, with focus on the difference of the two population means. A weighted empirical likelihood method (WEL) for two-sample problems is developed. We also consider a scenario where sample data on auxiliary variables are fully observed for both samples but values of the response variable are subject to missingness. We develop an adjusted empirical likelihood method for inference of the difference of the two population means for this scenario where missing values are handled by a regression imputation method. Bootstrap calibration for WEL is also developed. Simulation studies are conducted to evaluate the performance of naive EL, WEL and WEL with bootstrap calibration (BWEL) with comparison to the usual two-sample t-test in terms of power of the tests and coverage accuracies. Simulation for the adjusted EL for the linear regression model with missing data is also conducted.
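
As a rough illustration of the regression imputation step mentioned above, the sketch below fills in missing responses from a fitted line and then takes the difference of the two imputed sample means. It assumes a single auxiliary variable and a simple linear working model, neither of which is stated in the abstract, and it does not implement the adjusted empirical likelihood inference itself. The function names are hypothetical.

```python
import numpy as np

def imputed_mean(x, y, observed):
    """Mean of y after regression imputation of the missing entries.

    x        : fully observed auxiliary variable
    y        : response (entries where observed is False may be NaN)
    observed : boolean mask, True where y was observed
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = np.asarray(observed, dtype=bool)

    # Fit y = intercept + slope * x on the complete cases only.
    slope, intercept = np.polyfit(x[observed], y[observed], 1)

    # Keep observed responses; replace missing ones by fitted values.
    y_filled = np.where(observed, y, intercept + slope * x)
    return float(y_filled.mean())

def imputed_mean_difference(sample1, sample2):
    """Difference of imputed means; each sample is a tuple (x, y, observed)."""
    return imputed_mean(*sample1) - imputed_mean(*sample2)
```

Treating the imputed values as if they were observed understates the variability of this point estimate, which is one reason an adjusted inference procedure is needed.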

Missing Data in Longitudinal Studies

Author: Michael J. Daniels
Publisher: CRC Press
Total Pages: 324
Release: 2008-03-11
Genre: Mathematics
ISBN: 1420011189

Drawing from the authors' own work and from the most recent developments in the field, Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis describes a comprehensive Bayesian approach for drawing inference from incomplete data in longitudinal studies. To illustrate these methods, the authors employ

Biased Sampling, Over-identified Parameter Problems and Beyond

Author: Jing Qin
Publisher: Springer
Total Pages: 626
Release: 2017-06-14
Genre: Business & Economics
ISBN: 9811048568

This book is devoted to biased sampling problems (also called choice-based sampling in econometrics parlance) and over-identified parameter estimation problems. Biased sampling problems appear in many areas of research, including medicine, epidemiology and public health, the social sciences and economics. The book addresses a range of important topics, including case and control studies, causal inference, missing data problems, meta-analysis, renewal process and length biased sampling problems, capture and recapture problems, case cohort studies, exponential tilting genetic mixture models, etc. The goal of this book is to make it easier for Ph.D. students and new researchers to get started in this research area. It will be of interest to all those who work in the health, biological, social and physical sciences, as well as those who are interested in survey methodology and other areas of statistical science.

Empirical Likelihood

Author: Art B. Owen
Publisher: Chapman and Hall/CRC
Total Pages: 304
Release: 2001-05-18
Genre: Mathematics
ISBN: 9781584880714

Empirical likelihood provides inferences whose validity does not depend on specifying a parametric model for the data. Because it uses a likelihood, the method has certain inherent advantages over resampling methods: it uses the data to determine the shape of the confidence regions, and it makes it easy to combine data from multiple sources. It also facilitates incorporating side information, and it simplifies accounting for censored, truncated, or biased sampling. One of the first books published on the subject, Empirical Likelihood offers an in-depth treatment of this method for constructing confidence regions and testing hypotheses. The author applies empirical likelihood to a range of problems, from those as simple as setting a confidence region for a univariate mean under IID sampling, to problems defined through smooth functions of means, regression models, generalized linear models, estimating equations, or kernel smooths, and to sampling with non-identically distributed data. Abundant figures offer visual reinforcement of the concepts and techniques. Examples from a variety of disciplines and detailed descriptions of algorithms (also posted on a companion Web site) illustrate the methods in practice. Exercises help readers to understand and apply the methods. The method of empirical likelihood is now attracting serious attention from researchers in econometrics and biostatistics, as well as from statisticians. This book is your opportunity to explore its foundations, its advantages, and its application to a myriad of practical problems.
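
The simplest case named above, a confidence region for a univariate mean under IID sampling, can be sketched in a few lines. The code below is an illustrative implementation and is not taken from the book or its companion site. It computes the statistic -2 log R(mu) = 2 * sum(log(1 + lam * (x_i - mu))), where the Lagrange multiplier lam solves sum((x_i - mu) / (1 + lam * (x_i - mu))) = 0. The function names and the choice of bisection as the root finder are assumptions made for this sketch. The default cutoff 3.841 is the approximate 95% quantile of the chi-squared distribution with one degree of freedom.

```python
import numpy as np

def el_log_ratio(x, mu, tol=1e-12, max_iter=200):
    """Return -2 log R(mu), the empirical likelihood ratio statistic
    for the hypothesis that the mean of the 1-D sample x equals mu."""
    z = np.asarray(x, dtype=float) - mu

    # mu must lie strictly inside the range of the data;
    # otherwise the empirical likelihood is zero.
    if z.min() >= 0.0 or z.max() <= 0.0:
        return float("inf")

    # All weights are positive only if 1 + lam * z_i > 0 for every i,
    # which restricts lam to the open interval below.
    lo, hi = -1.0 / z.max(), -1.0 / z.min()
    shrink = 1e-10 * (hi - lo)
    lo, hi = lo + shrink, hi - shrink

    # g(lam) = sum z_i / (1 + lam * z_i) is strictly decreasing in lam,
    # so bisection locates its unique root.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.sum(z / (1.0 + mid * z)) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    lam = 0.5 * (lo + hi)
    return float(2.0 * np.sum(np.log1p(lam * z)))

def in_confidence_interval(x, mu, cutoff=3.841):
    """True if mu lies in the approximate 95% EL interval for the mean."""
    return el_log_ratio(x, mu) <= cutoff
```

Scanning `in_confidence_interval` over a grid of candidate values of `mu` traces out the interval. Its endpoints need not be symmetric about the sample mean, which reflects the remark that the data determine the shape of the region.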

Statistical Methods for Handling Incomplete Data

Author: Jae Kwang Kim
Publisher: CRC Press
Total Pages: 380
Release: 2021-11-19
Genre: Mathematics
ISBN: 1000466299

Due to recent theoretical findings and advances in statistical computing, there has been a rapid development of techniques and applications in the area of missing data analysis. Statistical Methods for Handling Incomplete Data covers the most up-to-date statistical theories and computational methods for analyzing incomplete data.
Features:
Uses the mean score equation as a building block for developing the theory for missing data analysis
Provides comprehensive coverage of computational techniques for missing data analysis
Presents a rigorous treatment of imputation techniques, including multiple imputation and fractional imputation
Explores the most recent advances of the propensity score method and estimation techniques for nonignorable missing data
Describes a survey sampling application
Updated with a new chapter on Data Integration
Now includes a chapter on Advanced Topics, including kernel ridge regression imputation and neural network model imputation
The book is primarily aimed at researchers and graduate students from statistics, and could be used as a reference by applied researchers with a good quantitative background. It includes many real data examples and simulated examples to help readers understand the methodologies.