The Effect of Cluster Size for Model Performance in High-Dimensional Longitudinal Studies: A Simulation Study
| dc.contributor.author | Şengül, Merve Türkegün | |
| dc.contributor.author | Tasdelen, Bahar | |
| dc.contributor.author | Yologlu, Saim | |
| dc.date.accessioned | 2026-01-24T12:01:34Z | |
| dc.date.available | 2026-01-24T12:01:34Z | |
| dc.date.issued | 2023 | |
| dc.department | Alanya Alaaddin Keykubat Üniversitesi | |
| dc.description.abstract | Objective: To prevent model estimation errors and deviations in high-dimensional longitudinal studies, risk models are established through penalized methods. The aim of this study is to examine the effect of small cluster sizes on the performance of generalized estimating equations (GEE) and penalized GEE (PGEE) models in high-dimensional longitudinal data. Material and Methods: A simulation study was designed to compare GEE and PGEE model performance, Type I error rates, and power in two-period longitudinal data structures with different cluster sizes (n=20, 30, 50, 100, 200), different numbers of predictors (p=10, 20, 50), and different correlation levels between predictors (r=0.20, 0.50, 0.80). Results: With insufficient cluster sizes and high correlations between predictors, the GEE coefficient estimates were misleading and inconsistent, the Type I error rates were high, and the power of the test was weak. Even when the number of predictors and the cluster size were balanced (p=10; n=100, 200), the Type I error rates obtained for GEE were high. Increasing the cluster size was not enough to reduce the Type I error rate of GEE. The PGEE produced more successful results than GEE in all conditions. The power of PGEE increased to over 80% in all scenarios. Conclusion: The PGEE yielded more consistent results by controlling the relationships both within the cluster and between the predictors. In high-dimensional longitudinal studies, the use of PGEE was observed to be more effective than GEE. | |
| dc.identifier.doi | 10.5336/biostatic.2023-98699 | |
| dc.identifier.endpage | 170 | |
| dc.identifier.issn | 1308-7894 | |
| dc.identifier.issn | 2146-8877 | |
| dc.identifier.issue | 3 | |
| dc.identifier.startpage | 161 | |
| dc.identifier.trdizinid | 1258730 | |
| dc.identifier.uri | https://search.trdizin.gov.tr/tr/yayin/detay/1258730 | |
| dc.identifier.uri | https://doi.org/10.5336/biostatic.2023-98699 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12868/4424 | |
| dc.identifier.volume | 15 | |
| dc.indekslendigikaynak | TR-Dizin | |
| dc.language.iso | en | |
| dc.relation.ispartof | Türkiye Klinikleri Biyoistatistik Dergisi | |
| dc.relation.publicationcategory | Makale - Ulusal Hakemli Dergi - Kurum Öğretim Elemanı | |
| dc.rights | info:eu-repo/semantics/openAccess | |
| dc.snmz | KA_TR-Dizin_20260121 | |
| dc.subject | Model selection | |
| dc.subject | Generalized estimating equations | |
| dc.subject | penalized generalized estimating equations | |
| dc.subject | penalized methods | |
| dc.subject | high dimensional longitudinal data | |
| dc.title | The Effect of Cluster Size for Model Performance in High-Dimensional Longitudinal Studies: A Simulation Study | |
| dc.type | Article |
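
The simulation design summarized in the abstract can be sketched as a factorial grid over cluster size, number of predictors, and predictor correlation. The Python sketch below is illustrative only and is not the authors' code: the function names `design_grid` and `simulate_predictors` are hypothetical, and the data-generating choices (standard-normal predictors with a single common pairwise correlation r, held constant across the two periods) are assumptions that the abstract does not specify. Outcome generation and the GEE/PGEE fits are deliberately left out.

```python
import itertools

import numpy as np

# Design factors exactly as listed in the abstract.
CLUSTER_SIZES = (20, 30, 50, 100, 200)   # n
N_PREDICTORS = (10, 20, 50)              # p
CORRELATIONS = (0.20, 0.50, 0.80)        # r, correlation between predictors
N_PERIODS = 2                            # two-period longitudinal structure


def design_grid():
    """Return every (n, p, r) scenario of the full factorial design."""
    return list(itertools.product(CLUSTER_SIZES, N_PREDICTORS, CORRELATIONS))


def simulate_predictors(n, p, r, rng):
    """Draw an (n * N_PERIODS, p) predictor matrix.

    ASSUMPTION (not stated in the abstract): predictors are standard normal
    with one common pairwise correlation r and do not vary between periods.
    """
    cov = np.full((p, p), r)
    np.fill_diagonal(cov, 1.0)
    x = rng.multivariate_normal(np.zeros(p), cov, size=n)
    # Stack the periods: each subject's row is repeated once per period.
    return np.repeat(x, N_PERIODS, axis=0)


grid = design_grid()
rng = np.random.default_rng(0)
X = simulate_predictors(n=20, p=10, r=0.50, rng=rng)
```

Here `grid` holds the 5 × 3 × 3 = 45 scenarios implied by the listed factor levels, and `X` has one row per subject per period for the smallest configuration.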












