Risk Factors and Prevalence of Malaria in District Malakand: A Statistical Analysis Using Logistic Regression and K-Means Clustering

Authors

  • Fazal Shakoor
  • Muhammad Ismail
  • Kifayat Ullah
  • Syed Ishtiaq Ahmad
  • Sehran Hassan

Abstract

Malaria remains a significant public health burden in tropical regions such as District Malakand, Pakistan, where Plasmodium falciparum and P. vivax are endemic. This study uses binary logistic regression and K-means clustering to identify and analyze key risk factors associated with malaria occurrence. The logistic regression results show that gender (p = 0.016, OR = 5.03), type of malaria (p = 0.010, OR = 56.15), symptoms (p = 0.041, OR = 34.19), first appearance of malaria (p = 0.011, OR = 121.75), and route of transmission (p = 0.001, OR = 32.62) are statistically significant predictors. Environmental factors (p = 0.001, OR = 32.62) and the presence of water and toilet facilities (p = 0.017, OR = 0.002) are also significantly associated with disease prevalence. A separate model indicates a significant negative relationship between age and malaria risk (p = 0.048), with older individuals being less susceptible, while recurrence of malaria is highly predictive of future cases (p < 0.001).
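As a reading aid for the odds ratios quoted above, the following minimal sketch converts each reported OR back to the logistic regression coefficient it implies, using the standard relation OR = exp(β). The dictionary values are copied from the abstract; the helper names `coefficient_from_or` and `direction` are hypothetical, and the reference category behind each OR is not stated in the abstract, so the direction labels are only relative to whatever baseline the authors used.

```python
import math

# Odds ratios copied from the abstract; no other study data is assumed.
reported_or = {
    "gender": 5.03,
    "type of malaria": 56.15,
    "symptoms": 34.19,
    "first appearance of malaria": 121.75,
    "route of transmission": 32.62,
    "water and toilet facilities": 0.002,
}

def coefficient_from_or(odds_ratio):
    """Return the logistic coefficient beta implied by an odds ratio (beta = ln OR)."""
    return math.log(odds_ratio)

def direction(odds_ratio):
    """Label whether the odds rise or fall relative to the (unstated) reference category."""
    if odds_ratio > 1:
        return "higher odds"
    if odds_ratio < 1:
        return "lower odds"
    return "no change"

for name, odds_ratio in reported_or.items():
    beta = coefficient_from_or(odds_ratio)
    print(f"{name}: OR = {odds_ratio}, beta = {beta:.3f}, {direction(odds_ratio)}")
```

An OR below 1, such as the 0.002 reported for water and toilet facilities, corresponds to a negative coefficient, which is consistent with reading that factor as protective.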

To uncover underlying patterns in the dataset, K-means clustering was applied. The optimal number of clusters was determined using the Elbow method, Silhouette analysis, and the Gap Statistic, which collectively assessed within-cluster compactness and between-cluster separation. These methods confirmed distinct groupings based on key risk attributes, supporting the identification of high-risk populations. Overall, the study underscores the need for targeted interventions, improved public health infrastructure, and enhanced awareness programs to mitigate malaria transmission in District Malakand.
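The cluster-count selection described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the eight 2-D points are hypothetical stand-ins for the survey data, the deterministic farthest-point seeding replaces the random initialization that K-means normally uses, and the Gap Statistic is left out because it requires reference datasets drawn from a null distribution. The sketch computes the within-cluster sum of squares (the quantity plotted in the Elbow method) and the mean silhouette width for several values of k.

```python
import math

# Hypothetical 2-D feature vectors (NOT the study's data): two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8), (8, 9), (9, 8), (9, 9)]

def farthest_point_init(points, k):
    """Deterministic seeding (a simplification): start at points[0], then
    repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the old centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def wcss(points, labels, centroids):
    """Within-cluster sum of squared distances (compactness, used by the Elbow method)."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

def mean_silhouette(points, labels):
    """Mean silhouette width; a point alone in its cluster scores 0 by convention."""
    cluster_ids = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance to the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == c) / labels.count(c)
            for c in cluster_ids if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

wcss_by_k, silhouette_by_k = {}, {}
for k in range(1, 5):
    labels, centroids = kmeans(points, k)
    wcss_by_k[k] = wcss(points, labels, centroids)
    if k >= 2:  # the silhouette is undefined for a single cluster
        silhouette_by_k[k] = mean_silhouette(points, labels)

best_k = max(silhouette_by_k, key=silhouette_by_k.get)
print("WCSS by k:", wcss_by_k)
print("Mean silhouette by k:", silhouette_by_k)
print("k with the highest mean silhouette:", best_k)
```

On real survey data the features would need to be numerically encoded and scaled first, and the Elbow and silhouette criteria may not agree as cleanly as they do on this toy example.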

Key Words: Malaria prevalence, Risk factors, Binary logistic regression, Odds Ratio, K-Means Clustering.

Published

2025-08-15

How to Cite

Fazal Shakoor, Muhammad Ismail, Kifayat Ullah, & Syed Ishtiaq Ahmad. (2025). Risk Factors and Prevalence of Malaria in District Malakand: A Statistical Analysis Using Logistic Regression and K-Means Clustering. Dialogue Social Science Review (DSSR), 3(8), 85–102. Retrieved from https://dialoguessr.com/index.php/2/article/view/847

Section

Applied Sciences