
Abstract

One of the essential problems in data mining is the removal of negligible variables from the data set. This paper proposes a hybrid approach that uses rough set theory based algorithms to reduce the attributes selected from the data set and then uses the resulting reducts to raise the classification success of three learning methods: multinomial logistic regression, support vector machines, and random forest, each evaluated using 5-fold cross-validation. The performance of the hybrid approach is measured by related statistics. The results show that the hybrid approach is effective, as it improved accuracy by 6-12% for the three learning methods.
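
The attribute-reduction step can be illustrated with a minimal sketch. The abstract does not say which reduct algorithm is used, so the code below assumes a Johnson-style greedy heuristic over discernibility sets, which yields an approximate reduct that is not guaranteed to be minimal. The toy decision table, its column meanings, and the function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def discernibility_sets(rows, decisions):
    """For each pair of objects with different decisions, collect the
    indices of the condition attributes on which the two objects differ."""
    sets = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if decisions[i] != decisions[j]:
            diff = frozenset(k for k in range(len(a)) if a[k] != b[k])
            if diff:  # an empty set means the pair is inconsistent; skip it
                sets.append(diff)
    return sets


def greedy_reduct(rows, decisions):
    """Assumed Johnson-style heuristic: repeatedly pick the attribute that
    occurs in the most discernibility sets not yet covered."""
    remaining = discernibility_sets(rows, decisions)
    chosen = []
    while remaining:
        counts = {}
        for s in remaining:
            for k in s:
                counts[k] = counts.get(k, 0) + 1
        # Ties are broken by the lowest attribute index for determinism.
        best = min(counts, key=lambda k: (-counts[k], k))
        chosen.append(best)
        remaining = [s for s in remaining if best not in s]
    return sorted(chosen)


# Hypothetical toy table; columns: outlook, windy, humidity, parity flag.
rows = [
    ("sunny", "no", "high", 0),
    ("sunny", "yes", "high", 1),
    ("rainy", "no", "high", 0),
    ("rainy", "yes", "normal", 1),
    ("sunny", "no", "normal", 1),
]
decisions = ["no", "no", "yes", "yes", "yes"]

reduct = greedy_reduct(rows, decisions)
reduced_rows = [tuple(r[k] for k in reduct) for r in rows]
print(reduct)  # -> [0, 2]
```

In a hybrid pipeline of the kind the abstract describes, a table restricted to the selected columns (here `reduced_rows`) would then be passed to the classifiers and scored with 5-fold cross-validation in place of the full attribute set.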

Keywords

Rough set, Reduction, Performance, Accuracy

Article Details

Author Biography

Betul Kan Kilinc, Eskisehir Technical University, Department of Statistics, 26470 Eskisehir, Turkey

How to Cite
Kan Kilinc, B., & Yazirli, Y. (2020). Performance of the Hybrid Approach based on Rough Set Theory. Pakistan Journal of Statistics and Operation Research, 16(2), 217-224. https://doi.org/10.18187/pjsor.v16i2.3069
