- A Comparison of Polynomial Regression, Generalized Additive Models, and Functional Regression for Sparse Data: The Children’s Height and Weight Case Study
Alireza Abadi,1 Ali-Asghar Kolahi,2,* Soheila Khodakarim,3 Mohammad Fayaz,4
1. Professor of Biostatistics, Department of Community Medicine, Faculty of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
2. Social Determinants of Health Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
3. Associate Professor, School of Allied Medical Sciences, School of Public Health and Safety, Shahid Beheshti University of Medical Sciences, Tehran, Iran
4. PhD Student of Biostatistics, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Introduction: Functional data analysis is a branch of statistics that models the curves underlying observed data, using dimension reduction methods. Observations may be recorded densely or sparsely over the domain. In this research, we compare polynomial regression, generalized additive models, and functional regression for sparse data, using a dataset of children’s height and weight.
- Methods: Weight and height were not recorded for every child at every time point, so the curves contain missing values. Standard statistical models have limitations in this setting and do not capture the behavior of the curves. We find the optimal bandwidth for polynomial regression and generalized additive models by grid search: we randomly split the dataset into training and test sets and compare the models by mean squared prediction error. We also register and align the curves, then approximate them with B-spline basis functions and smooth them using generalized cross-validation.
- Results: Outliers were removed, and the underlying curves for the children’s height and weight were estimated using the optimal bandwidth (polynomial regression and generalized additive models) and generalized cross-validation (functional regression). The estimated curves show an increasing pattern.
- Conclusion: Sparse functional data are common across disciplines. We compared three statistical approaches for modeling them: polynomial regression, splines in generalized additive models, and B-splines as basis functions for functional regression models. We conclude that the estimates from the three approaches do not differ much from each other.
- Keywords: Height, Weight, Children, Sparse Data
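The bandwidth selection described in the Methods (a grid search scored by mean squared prediction error on a random train/test split) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the Gaussian kernel, the local linear fit, the 30% test fraction, the bandwidth grid, the function names `local_poly_predict` and `select_bandwidth`, and the synthetic age and height values.

```python
import numpy as np


def local_poly_predict(x_train, y_train, x_new, bandwidth, degree=1):
    """Local polynomial regression with a Gaussian kernel (assumed choice)."""
    preds = np.empty(len(x_new))
    for i, x0 in enumerate(x_new):
        # Kernel weights centred at the prediction point x0.
        w = np.exp(-0.5 * ((x_train - x0) / bandwidth) ** 2)
        # Design matrix in powers of (x - x0); the intercept is the fit at x0.
        X = np.vander(x_train - x0, degree + 1, increasing=True)
        sw = np.sqrt(w)
        # Weighted least squares via ordinary least squares on scaled rows.
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y_train * sw, rcond=None)
        preds[i] = beta[0]
    return preds


def select_bandwidth(x, y, grid, test_fraction=0.3, seed=0):
    """Pick the bandwidth with the lowest held-out mean squared prediction error."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_test = int(round(test_fraction * len(x)))
    test, train = idx[:n_test], idx[n_test:]
    mspe = {}
    for h in grid:
        pred = local_poly_predict(x[train], y[train], x[test], h)
        mspe[h] = float(np.mean((y[test] - pred) ** 2))
    best = min(mspe, key=mspe.get)
    return best, mspe


# Synthetic, hypothetical data: age in months and a noisy increasing height curve.
rng = np.random.default_rng(42)
age = rng.uniform(0, 60, size=300)
height = 50 + 8 * np.sqrt(age) + rng.normal(0, 2, size=300)

grid = [1.0, 2.0, 4.0, 8.0, 16.0]
best_h, mspe = select_bandwidth(age, height, grid)
print("selected bandwidth:", best_h)
print("held-out MSPE:", mspe[best_h])
```

A single random split keeps the sketch short; repeating the split or using k-fold cross-validation would give a less noisy choice of bandwidth.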