根據衛生福利部國民健康署統計,全國約有200多萬名糖尿病的病友,且每年以25,000名的速度持續增加。由於疾病特性,糖尿病患者相較於一般住院患者,更容易因感染併發症而再次入院且住院天數較長。除了構成過度的醫療負擔,再入院通常與死亡率增加相關。因此,對於影響國人甚鉅的糖尿病,本研究試圖尋找影響糖尿病患者出院後再入院之關鍵因素。
本研究使用UCI網站1999-2008年間美國130家醫院臨床護理紀錄進行實驗,運用資料探勘技術進行分析,模型建立採用決策樹、貝氏分類及類神經網路,以及集成法中的隨機森林、AdaBoostM1及Vote等方式,亦同時探討不同特徵篩選技術搭配不同分類方式之整合結果,以找尋最佳預測模型組合。
本研究結果發現病患入院前醫療就診狀況、入院時血糖檢測結果,以及入院後執行檢驗與診斷的數量,皆與其是否再次入院有高度關聯,對於建立糖尿病患者再次入院之預測模型有其重要性;並且驗證結合特徵選取及分類模型技術,的確能有效提高模型準確率。
According to the Health Promotion Administration, Ministry of Health and Welfare, more than 2 million people nationwide have diabetes, and the number continues to grow by about 25,000 per year. Because of the nature of the disease, diabetic patients are more likely than other inpatients to be readmitted for infection-related complications, and they tend to have longer hospital stays. Besides imposing an excessive burden on the health care system, readmission is often associated with increased mortality. This study therefore seeks to identify the key factors that affect the readmission of diabetic patients after discharge.
We used a dataset from the UCI repository containing clinical care records from 130 hospitals in the United States from 1999 to 2008, and analyzed it with data mining techniques. Prediction models were built with decision trees, naive Bayes, and neural networks, as well as the ensemble methods random forest, AdaBoostM1, and Vote. We also explored how different feature selection techniques perform when paired with different classification methods, in order to find the best combination for prediction.
The results show that patients' medical visits before admission, their blood glucose test results at admission, and the number of tests and diagnoses performed after admission are all strongly associated with readmission, and are therefore important for building a readmission prediction model for diabetic patients. The results also confirm that combining feature selection with classification models can effectively improve model accuracy.
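As a rough illustration of pairing feature selection with an ensemble classifier, the sketch below ranks categorical features by information gain and combines several models' predictions by majority vote. It is a minimal sketch under stated assumptions, not the study's implementation: the abstract does not name the feature selection techniques used, so information gain is only one plausible choice, and majority voting is only one possible combination rule for a Vote-style ensemble. The feature names (`prior_visits`, `glucose_test`, `gender`) and the eight toy records are invented for illustration and are not taken from the UCI dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the records on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_top_k(rows, labels, k):
    """Return the k feature names with the highest information gain."""
    ranked = sorted(rows[0], key=lambda f: information_gain(rows, labels, f), reverse=True)
    return ranked[:k]


def majority_vote(predictions):
    """Combine the predictions of several classifiers for one record."""
    return Counter(predictions).most_common(1)[0][0]


# Toy records (hypothetical, NOT from the UCI data): label = readmitted or not.
rows = [
    {"prior_visits": "yes", "glucose_test": "high", "gender": "F"},
    {"prior_visits": "yes", "glucose_test": "high", "gender": "M"},
    {"prior_visits": "yes", "glucose_test": "high", "gender": "F"},
    {"prior_visits": "yes", "glucose_test": "normal", "gender": "M"},
    {"prior_visits": "no", "glucose_test": "high", "gender": "F"},
    {"prior_visits": "no", "glucose_test": "normal", "gender": "M"},
    {"prior_visits": "no", "glucose_test": "normal", "gender": "F"},
    {"prior_visits": "no", "glucose_test": "normal", "gender": "M"},
]
labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]

selected = select_top_k(rows, labels, 2)
print("selected features:", selected)
# Three hypothetical classifiers disagree on one record; the ensemble takes the majority.
print("ensemble prediction:", majority_vote(["yes", "no", "yes"]))
```

On these toy records the ranking keeps `prior_visits` and `glucose_test` and drops `gender`, whose information gain is zero. In a full pipeline the selected features would then be passed to each classifier before the votes are combined.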