The International Arab Journal of Information Technology (IAJIT)


A Hybrid Grey Wolf-Whale Optimization Algorithm for Classification of Corona Virus Genome Sequences using Deep Learning

Genome sequence data is widely regarded as complex, and its volume is still growing at an exponential rate. Classification of genome sequences plays a crucial role because it has applications in biology, medicine, forensics and related areas. For classification, genome sequences can be represented in terms of features. A large number of insignificant features lowers the accuracy of the classification task. Feature selection addresses this issue by selecting the most important features, which helps to improve accuracy and reduces computational complexity. In this research, a Hybrid Grey Wolf-Whale Optimization Algorithm (HGWWOA) is proposed for genome sequence classification. The proposed algorithm is evaluated using 23 benchmark objective functions along with a Convolutional Neural Network classifier, and its efficiency is verified using a novel metric, namely the “Feature Reduction Rate”. The proposed optimization algorithm can be applied to any optimization problem. In this research work, it is used for classification of Corona Virus genome sequences. A performance comparison of the proposed and existing algorithms was carried out, and it shows that the proposed algorithm outperforms the existing algorithms, achieving an accuracy of 98.2%.
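
The abstract does not define the “Feature Reduction Rate” or the objective used during feature selection, so the following Python sketch is only an illustration under stated assumptions. It assumes the rate is the fraction of candidate features discarded by a binary selection mask, and it uses a weighted wrapper-style fitness (classification error plus a penalty on the share of selected features) of the kind often used in metaheuristic feature selection. The names `feature_reduction_rate`, `fitness` and `alpha` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def feature_reduction_rate(mask: Sequence[bool]) -> float:
    """Fraction of candidate features discarded by a selection mask.

    ASSUMPTION: this definition is illustrative; the paper's exact
    formula is not given in the abstract.
    """
    total = len(mask)
    if total == 0:
        raise ValueError("mask must contain at least one feature")
    selected = sum(1 for keep in mask if keep)
    return (total - selected) / total


def fitness(error_rate: float, mask: Sequence[bool], alpha: float = 0.99) -> float:
    """Weighted objective to minimise during wrapper feature selection.

    ASSUMPTION: a weighted sum of classifier error and the share of
    selected features; `alpha` balances the two terms. Lower is better.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    selected_share = 1.0 - feature_reduction_rate(mask)
    return alpha * error_rate + (1.0 - alpha) * selected_share


if __name__ == "__main__":
    # Hypothetical mask: 2 of 4 candidate features are kept.
    mask = [True, False, False, True]
    print(feature_reduction_rate(mask))
    print(fitness(error_rate=0.1, mask=mask, alpha=0.9))
```

In a sketch like this, an optimizer such as HGWWOA would propose candidate masks, a classifier would supply `error_rate` for each mask, and the mask with the lowest fitness would be retained.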
