Abstract
One of the essential problems in data mining is the removal of negligible variables from the data set. This paper proposes a hybrid approach that uses rough set theory based algorithms to compute reducts of the attributes in the data set and then uses those reducts to raise the classification success of three learning methods: multinomial logistic regression, support vector machines, and random forest, each evaluated with 5-fold cross validation. The performance of the hybrid approach is measured by related statistics. The results show that the hybrid approach is effective, as it improved accuracy by 6-12% for the three learning methods.
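To make the idea of a reduct concrete, the following is a minimal sketch of rough set attribute reduction on a small decision table. It is an illustration only and not the algorithm used in the paper: the toy table, the attribute names `a`, `b`, `c`, `d`, and the helper functions `positive_region_size` and `greedy_reduct` are all hypothetical, and the greedy forward selection shown here may return a superset of a minimal reduct on other data.

```python
from collections import defaultdict

# Hypothetical toy decision table: a, b, c are condition attributes,
# d is the decision attribute (here d is "yes" when a or b equals 1).
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]


def positive_region_size(rows, attrs, decision):
    """Count rows whose equivalence class under `attrs` has one decision value."""
    classes = defaultdict(list)
    for row in rows:
        # Rows with identical values on `attrs` are indiscernible.
        key = tuple(row[a] for a in attrs)
        classes[key].append(row[decision])
    return sum(len(v) for v in classes.values() if len(set(v)) == 1)


def greedy_reduct(rows, conditions, decision):
    """Add attributes greedily until the positive region of the full set is reached."""
    target = positive_region_size(rows, conditions, decision)
    reduct = []
    while positive_region_size(rows, reduct, decision) < target:
        # Pick the attribute that enlarges the positive region the most;
        # ties are resolved in favour of the earliest attribute.
        best = max(
            (a for a in conditions if a not in reduct),
            key=lambda a: positive_region_size(rows, reduct + [a], decision),
        )
        reduct.append(best)
    return reduct


reduct = greedy_reduct(rows, ["a", "b", "c"], "d")
print(reduct)
```

In this toy table the attribute `c` carries no information about `d` beyond what `a` and `b` already provide, so it is left out of the selected set. In the hybrid approach described above, only the selected columns would then be passed to the classifiers under 5-fold cross validation; that step is omitted from this sketch.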
This work is licensed under a Creative Commons Attribution 4.0 International License.