JAMC

Article http://dx.doi.org/10.26855/jamc.2018.02.003

Performance analysis of classification Algorithms: A case study of Naïve Bayes and J48 in Big Data

Festim Halili1, Festim Kamberi2,*

1Department of Informatics, State University of Tetovo, SUT Tetovo, Macedonia.

2Department of Computer Engineering, International Balkan University, IBU Skopje, Macedonia.

*Corresponding author: Festim Kamberi

Published: February 27, 2018

Abstract

In the world of technology, the term Big Data has emerged along with new opportunities and challenges in dealing with massive amounts of data. Big Data refers to collections of massive and complex data sets, encompassing huge data volumes, data management capabilities, social media analytics, and real-time data. To extract useful information from this data for organizations, businesses, companies, and vendors, we need to analyze and classify it. In this paper, we first provide an in-depth analysis of the 5Vs characteristics of Big Data. We then compare two well-known classification algorithms, Naïve Bayes and the J48 decision tree, using three QoS parameters: accuracy, sensitivity, and specificity. The results help us conclude which of the two algorithms performs better.
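As a minimal sketch of the three evaluation measures named above, the following Python snippet computes accuracy, sensitivity, and specificity from the four cells of a binary confusion matrix, using their standard definitions. The function name `binary_metrics` and the example counts are illustrative assumptions, not values or code from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) for a binary classifier.

    tp, fp, tn, fn are the confusion-matrix counts: true positives,
    false positives, true negatives, and false negatives.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("each class needs at least one instance")
    accuracy = (tp + tn) / total       # share of all instances classified correctly
    sensitivity = tp / (tp + fn)       # true positive rate (recall)
    specificity = tn / (tn + fp)       # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (not results from the paper).
acc, sens, spec = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Computing the same three numbers for each classifier on the same test split is one straightforward way to compare two algorithms such as Naïve Bayes and J48 side by side.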

How to cite this paper

Festim Halili, Festim Kamberi. (2018). Performance analysis of classification Algorithms: A case study of Naïve Bayes and J48 in Big Data. Journal of Applied Mathematics and Computation, 2(2), 50-57.

DOI: http://dx.doi.org/10.26855/jamc.2018.02.003