Classification on imbalanced datasets remains a challenge in the machine learning community, because class imbalance degrades the performance of many classifiers. This study introduces a two-stage intelligent data preprocessing approach to tackle the class imbalance problem. First, the penalty parameter of a support vector machine (SVM) is adjusted so that the decision boundary moves toward the majority class, which causes some majority class examples to be misclassified as minority class examples. Each such misclassification effectively adds one example to the minority class, so running the SVM as a preprocessor can be used to reduce the class imbalance. Subsequently, the modified dataset is passed through a random forest to counter the curse of dimensionality. Finally, the preprocessed data are fed into a rule-based classifier to generate comprehensive decision rules. According to the empirical results, the presented architecture is a promising alternative for the class imbalance problem.
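The SVM relabeling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear classifier trained by subgradient descent on a class-weighted hinge loss as a simplified stand-in for an SVM with per-class penalty parameters, and the toy data, the function names (make_data, train_weighted_linear_svm, relabel_majority_errors), and all hyperparameter values are hypothetical. The random forest and rule-based stages are not shown.

```python
import random


def make_data(n_major=200, n_minor=20, seed=0):
    """Toy 2-D data (assumed for illustration): majority class -1, minority class +1."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_major):
        X.append((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)))
        y.append(-1)
    for _ in range(n_minor):
        X.append((rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)))
        y.append(1)
    return X, y


def train_weighted_linear_svm(X, y, c_minor, c_major=1.0,
                              lam=0.01, epochs=300, lr=0.1):
    """Full-batch subgradient descent on
    lam/2 * ||w||^2 + mean(C_i * max(0, 1 - y_i * (w.x_i + b))),
    where C_i is c_minor for minority examples and c_major otherwise."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        gw = [lam * w[0], lam * w[1]]
        gb = 0.0
        for (x1, x2), yi in zip(X, y):
            margin = yi * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1.0:  # example violates the margin
                c = c_minor if yi == 1 else c_major
                gw[0] -= c * yi * x1 / n
                gw[1] -= c * yi * x2 / n
                gb -= c * yi / n
        w[0] -= lr * gw[0]
        w[1] -= lr * gw[1]
        b -= lr * gb
    return w, b


def relabel_majority_errors(X, y, w, b):
    """Majority examples that the classifier predicts as minority are
    relabeled as minority; all other labels are left unchanged."""
    new_y = []
    for (x1, x2), yi in zip(X, y):
        pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
        new_y.append(1 if (yi == -1 and pred == 1) else yi)
    return new_y


if __name__ == "__main__":
    X, y = make_data()
    # A heavier penalty on minority errors pushes the boundary toward the majority.
    w, b = train_weighted_linear_svm(X, y, c_minor=10.0)
    new_y = relabel_majority_errors(X, y, w, b)
    print("minority before:", y.count(1), "after:", new_y.count(1))
```

In this sketch, raising c_minor relative to c_major makes errors on minority examples more costly, which tends to shift the boundary into the majority region and so relabel more majority examples. How the penalty is chosen in the paper is not stated in the abstract.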
Relation:
International Journal of Machine Learning and Cybernetics, vol. 8, no. 6, pp. 1981-1992