Feature selection on large-scale issues using clustering and meta-algorithms

  • Fardin Akhlaghian Tab, Faculty of Communication and Modern Languages, Universiti Putra Malaysia (UPM)
  • Shabnam Amiri, Faculty of Communication and Modern Languages, Universiti Putra Malaysia (UPM)
Keywords: Feature selection, clustering, meta-algorithms.

Abstract

Selecting appropriate input features has a direct and significant effect on the efficiency of data mining algorithms. More precisely, feature selection facilitates the extraction of knowledge from problem data in three ways: reducing data volume, eliminating redundant features, and eliminating irrelevant features. Given this need, extensive research following a variety of approaches (statistical, algorithmic, and learning-based) has been carried out in recent years. Among these, meta-algorithms such as genetic algorithms have attracted the attention of many researchers. In this research, we combine clustering with a genetic algorithm in order to achieve higher efficiency and reduce computational time.
To this end, a new genetic algorithm representation suited to this problem is presented, and its operators are defined accordingly. To use clustering efficiently in this study, it was also necessary to provide a relatively new algorithm for fast clustering. To validate the proposed methods and determine their efficiency on real problems, several experiments were carried out on standard datasets. The results of these experiments were then compared with those of various algorithms reported in recent, reputable articles. The comparisons show that the proposed methods improve on competing methods in terms of classification accuracy and feature reduction. According to the analysis, this improvement is due to two factors: clustering, which lets the genetic algorithm search the problem space faster, and the adapted representation.
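
The abstract does not specify the clustering algorithm, the chromosome representation, or the genetic operators, so the following is only a minimal sketch of one way clustering and a genetic algorithm could be combined for feature selection. Everything in it is an assumption made for illustration: the correlation-based greedy clustering, the one-gene-per-cluster encoding, the leave-one-out nearest-neighbour fitness, and all names and parameter values (`cluster_features`, `ga_select`, `threshold`, `penalty`, and so on) are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 0.0 if va == 0 or vb == 0 else cov / (va * vb) ** 0.5

def cluster_features(X, threshold=0.9):
    """Assumed fast clustering: one greedy pass; a feature joins the first
    cluster whose leader it correlates with (|r| >= threshold)."""
    cols = list(zip(*X))
    clusters = []
    for j, col in enumerate(cols):
        for c in clusters:
            if abs(pearson(col, cols[c[0]])) >= threshold:
                c.append(j)
                break
        else:
            clusters.append([j])
    return clusters

def decode(chrom, clusters):
    """Assumed encoding: gene 0 skips a cluster, gene k > 0 picks its k-th member."""
    return [c[g - 1] for g, c in zip(chrom, clusters) if g > 0]

def accuracy(X, y, feats):
    """Leave-one-out 1-nearest-neighbour accuracy using only the chosen features."""
    if not feats:
        return 0.0
    hits = 0
    for i, xi in enumerate(X):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: sum((xi[f] - X[k][f]) ** 2 for f in feats))
        hits += y[nearest] == y[i]
    return hits / len(X)

def fitness(chrom, clusters, X, y, penalty=0.01):
    """Reward accuracy, lightly penalise the number of selected features."""
    feats = decode(chrom, clusters)
    return accuracy(X, y, feats) - penalty * len(feats)

def ga_select(X, y, clusters, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Elitist GA with uniform crossover and per-gene random-reset mutation."""
    rng = random.Random(seed)
    rand_gene = lambda c: rng.randint(0, len(c))
    score = lambda ch: fitness(ch, clusters, X, y)
    pop = [[rand_gene(c) for c in clusters] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]          # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [rand_gene(c) if rng.random() < p_mut else g
                     for g, c in zip(child, clusters)]
            children.append(child)
        pop = elite + children
    return sorted(decode(max(pop, key=score), clusters))

# Synthetic demo data (hypothetical): feature 0 is informative, feature 1 is a
# rescaled duplicate of feature 0, and feature 2 is unrelated noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(30)]
X = []
for label in y:
    f0 = 2.0 * label + data_rng.gauss(0, 0.1)
    X.append([f0, 3.0 * f0 + 1.0, data_rng.random()])

clusters = cluster_features(X)
selected = ga_select(X, y, clusters)
print("clusters:", clusters)
print("selected features:", selected)
```

In this sketch the clustering step shrinks the search space: the genetic algorithm chooses at most one feature per cluster instead of deciding on every feature independently, which is one plausible reading of how clustering could speed up the search.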

Author Biographies

Fardin Akhlaghian Tab, Faculty of Communication and Modern Languages Universiti Putra Malaysia (UPM)

English Language Department, Faculty of Communication and Modern Languages
Universiti Putra Malaysia (UPM)

Shabnam Amiri, Faculty of Communication and Modern Languages Universiti Putra Malaysia (UPM)

English Language Department, Faculty of Communication and Modern Languages
Universiti Putra Malaysia (UPM)

Published
2018-04-30
How to Cite
Tab, F., & Amiri, S. (2018). Feature selection on large-scale issues using clustering and meta-algorithms. Amazonia Investiga, 7(13), 17-30. Retrieved from https://www.amazoniainvestiga.info/index.php/amazonia/article/view/490
Section
Articles