Comparison Of Data Collection Algorithms Using Some Mixed Distributions
Abstract
This research compares data clustering algorithms, focusing on algorithms that use mixture distributions. It discusses how these algorithms work, their advantages and disadvantages, and their efficiency in clustering data that vary in size and structure. The algorithms are also applied to multiple data sets to evaluate the performance, efficiency, and accuracy of clustering under mixture distributions. Three distributions were chosen as applications for studying the clustering algorithms: the Mixture Exponential distribution, the Mixture Weibull distribution, and the Mixture Pareto distribution. The comparison is motivated by two points. First, mixture distributions are statistical models that combine several probability distributions to represent data. Second, clustering algorithms are widely used to analyze data and extract patterns from it, which makes them a powerful tool in practical applications such as classification, pattern recognition, and statistical prediction. After a detailed presentation of the clustering algorithms, the mechanism for evaluating them, and some mixture distributions, the researcher concluded that each algorithm serves a role that cannot be dispensed with or replaced, and that every algorithm is highly efficient in its own field of work provided its conditions and specifications are met. The researcher therefore recommends using each algorithm for the task it is suited to in order to obtain the best results.
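To make the idea of clustering with a mixture distribution concrete, the following is a minimal sketch, not the method used in the study. It assumes a two-component Mixture Exponential distribution and uses the Expectation-Maximization (EM) algorithm as one possible clustering approach; the function names, starting values, and simulated parameters are all hypothetical choices made for illustration.

```python
import math
import random


def sample_exp_mixture(n, weight, rate1, rate2, seed=0):
    """Draw n values from weight*Exp(rate1) + (1 - weight)*Exp(rate2)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        rate = rate1 if rng.random() < weight else rate2
        data.append(rng.expovariate(rate))
    return data


def em_exp_mixture(data, iters=200):
    """Fit a two-component exponential mixture with EM.

    Returns (weight, rate1, rate2). The starting values are a crude,
    illustrative choice based on the sample mean.
    """
    n = len(data)
    mean = sum(data) / n
    w, r1, r2 = 0.5, 2.0 / mean, 0.5 / mean
    for _ in range(iters):
        # E-step: probability that each point belongs to component 1.
        resp = []
        for x in data:
            p1 = w * r1 * math.exp(-r1 * x)
            p2 = (1.0 - w) * r2 * math.exp(-r2 * x)
            resp.append(p1 / (p1 + p2))
        # M-step: update the weight and the two rates.
        s1 = sum(resp)
        s2 = n - s1
        w = s1 / n
        r1 = s1 / sum(g * x for g, x in zip(resp, data))
        r2 = s2 / sum((1.0 - g) * x for g, x in zip(resp, data))
    return w, r1, r2


if __name__ == "__main__":
    # Simulated data with assumed parameters: weight 0.4, rates 5.0 and 0.5.
    data = sample_exp_mixture(2000, 0.4, 5.0, 0.5)
    w, r1, r2 = em_exp_mixture(data)
    print(f"weight={w:.3f} rate1={r1:.3f} rate2={r2:.3f}")
```

Each observation can then be assigned to the component with the larger posterior probability, which gives a cluster label. The same E-step and M-step structure would apply to Weibull or Pareto components, although their M-steps generally need numerical optimization rather than a closed-form update.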