The International Arab Journal of Information Technology (IAJIT)



Constrained Convolutional Neural Network Models for Optimizing Fully Connected Layer Weights in CNNs

This study proposes constrained convolutional neural network models for determining the initial connection weights of the Fully Connected Network (FCN) layer within a Convolutional Neural Network (CNN), thereby improving the CNN's performance. A literature review indicates that constrained methods have been used in conjunction with CNNs; however, previous studies have typically applied the constrained method before feature selection in the CNN. In contrast, this study calculates the initial values of the connection weights, one of the hyperparameters of the FCN, by applying the constrained method between the feature-selection stage and the FCN layer. Five models are proposed: the constrained Difference CNN (D-CNN), the sample Constrained CNN (C-CNN), the constrained Sum CNN (S-CNN), the Random Sum CNN (RS-CNN), and the constrained Mixed CNN (M-CNN). These proposed models, together with a classical CNN, were applied to the Modified National Institute of Standards and Technology (MNIST), Fashion-MNIST, and CIFAR-10 datasets, and the results were examined. According to the average accuracy results, the C-CNN model achieved the highest performance on the MNIST dataset, with an accuracy of 99.03%. On Fashion-MNIST, the best result was obtained by the D-CNN model, with an accuracy of 91.80%; likewise, the D-CNN model achieved the highest performance on CIFAR-10, with an accuracy of 71.44%. Overall, the D-CNN and C-CNN models outperformed the other proposed models and the classical CNN. The D-CNN model, which performed well on Fashion-MNIST and CIFAR-10, was also compared with other recent studies in the literature. The stronger performance of D-CNN is attributed to its weight calculation being based on the difference operation between samples of two different classes.
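The abstract does not give implementation details for the difference-based initialization behind D-CNN, but the idea of building initial FC weights from differences between feature vectors of samples drawn from two different classes can be sketched as below. This is a hypothetical, constrained-ELM-style reconstruction, not the authors' exact method; the function name `difference_init` and all parameters are illustrative assumptions.

```python
import numpy as np

def difference_init(features, labels, n_hidden, rng=None):
    """Build an initial (dim, n_hidden) FC weight matrix where each column
    is the difference between feature vectors of two samples drawn from
    two different classes (hedged sketch of a difference-constrained init)."""
    rng = np.random.default_rng(rng)
    classes = np.unique(labels)
    dim = features.shape[1]
    W = np.empty((dim, n_hidden))
    for j in range(n_hidden):
        # Pick two distinct classes, then one random sample from each.
        c1, c2 = rng.choice(classes, size=2, replace=False)
        x1 = features[rng.choice(np.flatnonzero(labels == c1))]
        x2 = features[rng.choice(np.flatnonzero(labels == c2))]
        W[:, j] = x1 - x2  # difference vector points between the classes
    return W

# Toy usage: 100 samples of 32-dim extracted features, 10 classes, 64 hidden units.
feats = np.random.default_rng(0).normal(size=(100, 32))
labs = np.arange(100) % 10
W0 = difference_init(feats, labs, n_hidden=64, rng=0)
print(W0.shape)  # (32, 64)
```

Such a matrix would replace the random initial weights of the first FC layer; subsequent training proceeds as usual, so the constraint only shapes the starting point of optimization.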
