This was a retrospective case–control study using data available in the All of Us database. The database includes patients 18 years and older recruited within the period of May 2018 to January 2022. In the All of Us platform, volunteers may complete health-related surveys and consent to sharing their electronic health records (EHRs). The health data of each participant were transferred to the database; they were then deidentified and made accessible to registered users for research purposes.37
Data used in this study are available to registered researchers of the All of Us Researcher Workbench. For information about access, please visit https://www.researchallofus.org/.
This study selected two cohorts of adult patients with associated EHRs using the All of Us data analysis workbench: KC patients and matched non-KC patients. The KC patients were identified by the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) term “keratoconus,” based on the International Classification of Diseases (ICD) codes for KC. The KC cohort was built to include all patients with at least one KC diagnosis in their EHRs (n = 572). Age, ethnicity, and sex data were extracted from the survey data available in the database. Each participant in the KC cohort was matched with three controls based on age ± 1 year, ethnicity, and sex to create the matched non-KC cohort (n = 1716).
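The 1:3 matching step described above can be sketched as follows. This is a minimal Python illustration, not the authors' workbench code (the study itself was run in R). The record fields (`id`, `age`, `ethnicity`, `sex`), the greedy selection, matching without replacement, and the policy of skipping cases with too few eligible controls are all assumptions made for the example.

```python
import random
from collections import defaultdict

def match_controls(cases, pool, ratio=3, age_tol=1, seed=0):
    """Greedy 1:ratio matching on ethnicity, sex, and age within +/- age_tol years.

    Assumption: controls are drawn without replacement, so each control
    is matched to at most one case.
    """
    rng = random.Random(seed)

    # Group candidate controls by the exactly matched variables.
    by_stratum = defaultdict(list)
    for person in pool:
        by_stratum[(person["ethnicity"], person["sex"])].append(person)
    for candidates in by_stratum.values():
        rng.shuffle(candidates)  # avoid always picking the first-listed controls

    used = set()
    matches = {}
    for case in cases:
        candidates = by_stratum[(case["ethnicity"], case["sex"])]
        eligible = [
            c for c in candidates
            if c["id"] not in used and abs(c["age"] - case["age"]) <= age_tol
        ]
        chosen = eligible[:ratio]
        if len(chosen) < ratio:
            continue  # hypothetical policy: leave incompletely matched cases out
        used.update(c["id"] for c in chosen)
        matches[case["id"]] = [c["id"] for c in chosen]
    return matches

# Toy records, invented purely for illustration.
cases = [
    {"id": "k1", "age": 34, "ethnicity": "A", "sex": "F"},
    {"id": "k2", "age": 52, "ethnicity": "B", "sex": "M"},
]
pool = (
    [{"id": f"c{i}", "age": 33 + i % 3, "ethnicity": "A", "sex": "F"} for i in range(6)]
    + [{"id": f"d{i}", "age": 51 + i % 3, "ethnicity": "B", "sex": "M"} for i in range(6)]
    + [{"id": "x1", "age": 40, "ethnicity": "A", "sex": "F"}]  # outside the age window
)
matches = match_controls(cases, pool)
```

With three controls per case, a fully matched KC cohort of 572 participants yields 572 × 3 = 1716 controls, which agrees with the cohort sizes reported above.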
ICD codes were used to identify covariates, including a history of diabetes, sleep apnea, obesity, dry eye, eczema, allergic rhinitis, allergic or atopic conjunctivitis, tetracycline medication, vitamin C supplementation, pregnancy, and estrogen-containing medication. History of smoking was identified using a combination of survey data and ICD codes. On the basis of the survey results, participants who reported smoking more than 100 cigarettes in their lifetime, smoking for more than 5 years, or smoking a cigarette in the last month were defined as smokers. Because of the significant overlap in their pathophysiology and the previously reported strong evidence for dry eye, allergic or atopic conjunctivitis, and eczema as risk factors, all participants with any of these conditions were grouped into one category referred to as “ocular surface disease.”
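The covariate definitions above can be expressed as two simple rules. The Python sketch below is illustrative only: the survey field names and condition labels are hypothetical, and because the text does not specify how survey data and ICD codes were combined for smoking, the sketch assumes that either source is sufficient.

```python
# Hypothetical condition labels for the combined "ocular surface disease" group.
OSD_CONDITIONS = {"dry eye", "allergic or atopic conjunctivitis", "eczema"}

def is_smoker(survey, has_smoking_icd_code):
    """Classify a participant as a smoker.

    Survey criteria follow the text: more than 100 lifetime cigarettes,
    more than 5 years of smoking, or a cigarette in the last month.
    Assumption: a smoking-related ICD code alone also counts.
    """
    by_survey = (
        survey.get("lifetime_cigarettes", 0) > 100
        or survey.get("years_smoked", 0) > 5
        or survey.get("smoked_last_month", False)
    )
    return bool(by_survey or has_smoking_icd_code)

def has_ocular_surface_disease(conditions):
    """True if the participant has at least one of the grouped conditions."""
    return bool(OSD_CONDITIONS & set(conditions))

# Invented example participants.
smoker_flag = is_smoker({"lifetime_cigarettes": 150}, has_smoking_icd_code=False)
osd_flag = has_ocular_surface_disease(["eczema", "diabetes"])
```

Grouping the three conditions into a single binary flag means a participant with both dry eye and eczema is counted once rather than twice.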
Statistical analyses were performed within the All of Us Researcher Workbench on the Jupyter Notebook web-based platform using the R programming language (R Foundation for Statistical Computing, Vienna, Austria). Tests were performed using the “oddsratio()” function from the “epitools” package and the “glm()” function from the “stats” package. Univariable odds ratios (ORs) were calculated for the association between KC and each collected element of the participants’ health histories (ocular surface disease, diabetes, sleep apnea, obesity, smoking, allergic rhinitis, pregnancy, estrogen-containing medications, tetracyclines, and vitamin C supplementation). A logistic regression including all of these variables as covariates was performed to assess their associations with KC and to calculate multivariable ORs. The 95% confidence intervals (CIs) and P values were also calculated. All analyses exploring the history of pregnancy and estrogen-containing medication included only female KC participants (n = 316) and their matched controls (n = 948).
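To make the univariable OR calculation concrete, the following Python sketch computes an odds ratio from a 2 × 2 table with a Wald 95% CI and a two-sided P value from the normal approximation. It is an illustration under stated assumptions rather than a reproduction of the analysis: the study used “oddsratio()” in R, which offers several estimation methods, so its output may differ from this Wald form. The function name and the counts are invented for the example and are not study data.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval from a 2x2 table.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    Returns (odds_ratio, ci_lower, ci_upper, p_value).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("All four cell counts must be positive for the Wald method.")
    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)
    # Standard error of log(OR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_lower = math.exp(log_or - z * se)
    ci_upper = math.exp(log_or + z * se)
    # Two-sided P value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2))
    return odds_ratio, ci_lower, ci_upper, p_value

# Invented counts, not taken from the study.
result = odds_ratio_wald(a=20, b=80, c=30, d=270)
```

The multivariable ORs come from a different calculation: each one is the exponentiated coefficient of the corresponding covariate in the fitted logistic regression, so it is adjusted for the other covariates in the model.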