Abstract
Introduction This study seeks to determine the incidence, comorbidities and drivers of new HIV infections, and to develop, test and validate a risk prediction model for screening for new cases of HIV.
Methods and analysis The study has two components: a cross-sectional study to develop the prediction model using the HIV dataset from the Kenya AIDS and STI Control Programme and a 15-month prospective study for the validation of the model. Inferential analysis will be conducted using algorithms that perform best in disease prediction: Extreme Gradient Boosting (XGBoost) and Multilayer Perceptron. Model sensitivity and specificity will be examined using the receiver operating characteristic curve, and performance will be evaluated using metrics: accuracy, precision, recall and F1 score.
Ethics and dissemination The study obtained ethical approval (JKU/ISERC/02321/1421) from the Jomo Kenyatta University of Agriculture and Technology Ethical and Research Board and a research licence (NACOSTI/P/24/414749) from the National Commission for Science, Technology and Innovation.
Introduction
HIV remains one of the most life-threatening infectious diseases worldwide.1 A recent report by WHO shows that 85.6 million people have been infected with HIV globally. Sub-Saharan Africa bears the greatest burden, with 25.6 million people living with HIV. In Kenya, 1.4 million people are reported to be infected with HIV, and 13 000 adults aged 15 years and above were newly infected with HIV during the year 2023.2
Screening is an effective method for identifying new HIV infections.3 Community-based screening, despite its cost, is the most effective method for identifying new infections. Universal screening, which involves testing all persons seeking primary healthcare services, appears to be a possible alternative to community-based screening.4 However, HIV testing faces challenges such as inadequate personnel and resource limitations.5 6
HIV incidence stands at about 8.5% and 13% for males and females, respectively. Predictors of new HIV infection include level of education, economic capability, alcohol use, age at first sex, gender and relationship to the head of the family.7 In addition, morbidities such as oral candidiasis, lymphadenopathy and persistent fever are more common among people with new HIV infections.8
A recent review9 identified a few studies that attempted to develop HIV risk-scoring functions using retrospective study designs. One study10 compared the performance of eight machine learning algorithms, but its results were limited by a small sample with imbalanced classes. Two others7 11 used large secondary datasets (87 000 and 83 000, respectively) with a high degree of missing data,12 and their model performance was inconclusive because the data were self-reported.
This study seeks to determine the incidence, comorbidities and drivers of new HIV infections among persons aged 15 years and above attending outpatient clinics at selected hospitals in Kenya, and to develop, test and validate a risk-scoring function for screening for new HIV infections. Although this study uses a retrospective dataset to develop the machine learning (ML) model, we plan to ensure that the dataset is both representative and of good quality. Imbalanced data produce skewed, over-fitted models that generalise poorly and therefore have low recall for the minority class. Our study aims to bridge these gaps by developing a predictive model for HIV screening that improves overall performance while avoiding low recall.
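To illustrate why class imbalance matters for a screening model, the following minimal Python sketch computes accuracy, precision, recall and F1 score for two hypothetical classifiers on an imbalanced sample. The sample size, the number of positive cases and both sets of predictions are invented for illustration and are not drawn from the study dataset; the function names are likewise assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = new HIV infection)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def screening_metrics(y_true, y_pred):
    """Return accuracy, precision, recall (sensitivity), specificity and F1 score."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical imbalanced sample: 20 positives among 1000 records.
y_true = [1] * 20 + [0] * 980

# Classifier A (hypothetical): always predicts "negative".
pred_majority = [0] * 1000

# Classifier B (hypothetical): finds 15 of the 20 positives, with 50 false alarms.
pred_sensitive = [1] * 15 + [0] * 5 + [1] * 50 + [0] * 930

majority = screening_metrics(y_true, pred_majority)
sensitive = screening_metrics(y_true, pred_sensitive)

for name, m in (("always-negative", majority), ("sensitive", sensitive)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

In this invented example the always-negative classifier reaches 98% accuracy while detecting none of the positive cases (recall of 0), which is why recall and F1 score are reported alongside accuracy when the classes are imbalanced.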




