Healthcare, Vol. 14, Pages 500: Predicting Anticipated Telehealth Use: Development of the CONTEST Score and Machine Learning Models Using a National U.S. Survey
Healthcare doi: 10.3390/healthcare14040500
Authors:
Richard C. Wang
Usha Sambamoorthi
Objectives: Anticipated telehealth use is an important determinant of whether telehealth can function as a durable component of hybrid care models, yet few practical tools exist to identify patients at risk of discontinuing telehealth. We aimed to (1) identify factors associated with anticipated telehealth use; (2) develop a risk stratification tool (CONTEST); (3) compare its performance with machine learning (ML) models; and (4) evaluate model fairness across sex and race/ethnicity.

Methods: We conducted a retrospective cross-sectional analysis of the 2024 Health Information National Trends Survey 7 (HINTS 7), including U.S. adults with ≥1 telehealth visit in the prior 12 months. The primary outcome was anticipated telehealth use. Survey-weighted multivariable logistic regression informed a Framingham-style point score (CONTEST). ML models (XGBoost, random forest, and logistic regression) were trained and evaluated using the area under the receiver operating characteristic curve (AUROC), precision, and recall. Global model interpretation used SHAP values. Fairness was assessed with group metrics (Disparate Impact, Equal Opportunity) and individual counterfactual flip rates (CFRs).

Results: Approximately one-third of adults reported at least one telehealth visit in the prior year, and nearly one in ten of these users expressed an unwillingness to continue using telehealth in the future. Four telehealth experience factors were independently associated with unwillingness to continue: lower perceived convenience, technical problems, lower perceived quality compared with in-person care, and unwillingness to recommend telehealth. CONTEST demonstrated strong discrimination for identifying individuals with lower anticipated telehealth use (AUROC 0.876; 95% CI, 0.843–0.908). XGBoost performed best among the ML models (AUROC 0.902 with all features). Restricted to the same four top features, an ML-informed point score achieved an AUROC of 0.872 (95% CI, 0.839–0.904), and a four-feature XGBoost model yielded an AUROC of 0.893 (95% CI, 0.821–0.948; p > 0.05). Group fairness metrics revealed disparities across sex and race/ethnicity, whereas individual counterfactual analyses indicated low flip rates (sex CFR, 0.024; race/ethnicity CFR, 0.013).

Conclusions: A parsimonious, interpretable score (CONTEST) and feature-matched ML models provide comparable discrimination for stratifying risk of lower anticipated telehealth use. Sustained engagement hinges on convenience, technical reliability, perceived quality, and patient advocacy. Implementation should pair prediction with operational support and routine fairness monitoring to mitigate subgroup disparities.
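To make two of the abstract's methods concrete, here is a minimal sketch (not the authors' code) of (a) deriving a Framingham-style point score from logistic regression coefficients and (b) computing an individual counterfactual flip rate. The coefficients, feature names, and binary coding below are hypothetical placeholders, not the published CONTEST weights.

```python
import numpy as np
import pandas as pd

# Hypothetical coefficients standing in for the survey-weighted logistic
# regression betas; the published CONTEST weights are not reproduced here.
betas = {
    "lower_convenience": 1.8,    # lower perceived convenience
    "technical_problems": 0.9,   # technical problems during visits
    "lower_quality": 1.3,        # quality perceived lower than in-person care
    "would_not_recommend": 2.2,  # unwilling to recommend telehealth
}

# Framingham-style scoring: anchor on the smallest beta and round each
# coefficient to the nearest integer multiple of that anchor.
anchor = min(betas.values())
points = {name: round(beta / anchor) for name, beta in betas.items()}

def contest_score(responses: dict) -> int:
    """Sum the points for each risk factor the respondent endorses (True)."""
    return sum(points[name] for name, present in responses.items() if present)

def counterfactual_flip_rate(model, X: pd.DataFrame, attr: str) -> float:
    """Fraction of individuals whose predicted class changes when only a
    binary-coded protected attribute (e.g., sex) is flipped."""
    X_cf = X.copy()
    X_cf[attr] = 1 - X_cf[attr]
    return float(np.mean(model.predict(X) != model.predict(X_cf)))

# Example: a respondent reporting low convenience and lower perceived quality.
print(points)
print(contest_score({"lower_convenience": True, "technical_problems": False,
                     "lower_quality": True, "would_not_recommend": False}))
```

In a Framingham-style score, each beta is divided by an anchor increment (here the smallest coefficient) and rounded to an integer, trading a little discrimination for bedside usability. The flip-rate function shown applies only to a binary-coded attribute such as sex; a multi-category attribute such as race/ethnicity would require per-category counterfactuals.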
Source: www.mdpi.com
