
Mortality Prediction from Hospital-Acquired Infections in Trauma Patients Using an Unbalanced Dataset

Healthcare Informatics Research, 2020, Vol. 26, No. 4, pp. 284-294
Karajizadeh Mehrdad, Nasiri Mahdi, Yadollahi Mahnaz, Zolfaghari Amir Hussain, Pakdam Ali
Affiliation details
 ( Karajizadeh Mehrdad ) - Shiraz University of Medical Sciences School of Management and Information Sciences
 ( Nasiri Mahdi ) - Shiraz University of Medical Sciences School of Management and Information Sciences
 ( Yadollahi Mahnaz ) - Shiraz University of Medical Sciences Shahid Rajaee Emtiaz Trauma Hospital Trauma Research Center
 ( Zolfaghari Amir Hussain ) - Laurentian University Department of Computer Science
 ( Pakdam Ali ) - Shiraz University of Medical Sciences School of Management and Information Sciences

Abstract


Objectives: Machine learning has been widely used to predict diseases and to derive useful knowledge in the healthcare domain. Our objective was to predict in-hospital mortality from hospital-acquired infections in trauma patients using an unbalanced dataset.

Methods: This cross-sectional study analyzed trauma patients with hospital-acquired infections who were admitted to Shiraz Trauma Hospital from March 20, 2017, to March 21, 2018. The study data were obtained from the hospital infection surveillance database. The data included sex, age, mechanism of injury, body region injured, severity score, type of intervention, day of infection onset after admission, and the microorganisms causing the infections. We developed mortality prediction models for these patients using random under-sampling, random over-sampling, clustering (k-means)-C5.0, SMOTE-C5.0, ADASYN-C5.0, SMOTE-SVM, ADASYN-SVM, SMOTE-ANN, and ADASYN-ANN. All mortality predictions were conducted in IBM SPSS Modeler 18.
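
The two simplest balancing strategies named above, random under-sampling and random over-sampling, can be sketched in plain Python. This is only an illustration: the study itself used IBM SPSS Modeler 18, and the function names, the toy rows, and the 90/10 class split below are assumptions, not the study's data or code. SMOTE and ADASYN differ in that they synthesize new minority-class records rather than duplicating existing ones, which this sketch does not attempt.

```python
import random
from collections import Counter


def _group_by_label(rows, labels):
    """Collect rows into a dict keyed by class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows until every class has as many rows as the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = min(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        out_rows.extend(rng.sample(members, target))  # sample without replacement
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_over_sample(rows, labels, seed=0):
    """Randomly duplicate rows until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = max(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: 90 "survived" rows and 10 "died" rows (not the study's counts).
toy_rows = [[i] for i in range(100)]
toy_labels = ["survived"] * 90 + ["died"] * 10

under_rows, under_labels = random_under_sample(toy_rows, toy_labels)
over_rows, over_labels = random_over_sample(toy_rows, toy_labels)
print(Counter(under_labels))
print(Counter(over_labels))
```

On the toy data, under-sampling leaves 10 rows per class and over-sampling leaves 90 rows per class. Under-sampling discards majority-class information, while over-sampling repeats minority-class rows exactly, which is one reason synthetic methods such as SMOTE are often tried alongside them.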

Results: We studied 549 individuals with hospital-acquired infections in a trauma hospital in Shiraz during 2017 and 2018. Before the dataset was balanced, prediction accuracy was 86.16%. In contrast, the prediction accuracy for the balanced dataset achieved by random under-sampling, random over-sampling, clustering (k-means)-C5.0, SMOTE-C5.0, ADASYN-C5.0, and SMOTE-SVM was 70.69%, 94.74%, 93.02%, 93.66%, 90.93%, and 100%, respectively.
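
When reading accuracy figures on an unbalanced dataset, it can help to recall that accuracy alone may hide poor detection of the minority class. The sketch below uses hypothetical confusion-matrix counts (they are not taken from the study, which reports only accuracy in this abstract) to show a classifier that predicts "survived" for every patient.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """Fraction of actual positives (here: deaths) that are detected."""
    positives = tp + fn
    return tp / positives if positives else 0.0


# Hypothetical counts: 100 patients, 10 deaths, and a model that never predicts death.
tp, tn, fp, fn = 0, 90, 0, 10

baseline_accuracy = accuracy(tp, tn, fp, fn)
baseline_sensitivity = sensitivity(tp, fn)
print(f"accuracy={baseline_accuracy:.2f}, sensitivity={baseline_sensitivity:.2f}")
```

In this made-up case the accuracy looks high even though no death is detected, so class-specific measures such as sensitivity are worth checking next to accuracy whenever the classes are unbalanced.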

Conclusions: Our findings demonstrate that cleaning an unbalanced dataset increases the accuracy of the classification model. Also, predicting mortality by a clustered under-sampling approach was more precise in comparison to random under-sampling and random over-sampling methods.

Keywords

Machine Learning; Mortality; Injuries; Healthcare Associated Infections; Data Mining; Decision Tree; C5.0
