North Sulawesi Single Local Fruit Detection Using Efficient Attention Module Based on Deep Learning Architecture

Authors

  • Vecky C. Poekoel Sam Ratulangi University
  • Dwisnanto Putro Sam Ratulangi University
  • Jane Litouw Sam Ratulangi University
  • Rivaldo Karel Sam Ratulangi University
  • Pinrolinvic D. K. Manembu Sam Ratulangi University
  • Abdul Haris Junus Ontowirjo Sam Ratulangi University
  • Feisy D. Kambey Sam Ratulangi University
  • Reynold F. Robot Sam Ratulangi University

DOI:

https://doi.org/10.23887/janapati.v12i2.54754

Keywords:

local fruits, detection system, convolutional neural network, efficient architecture, attention module

Abstract

Local fruit detection is an agricultural vision task that can be applied to increase the profit of a commodity. North Sulawesi has a variety of local fruits that are widely consumed in the region and have a high selling value. Sorting is an essential process for agricultural robots, which must separate fruit sequentially, one by one. This automation requires an accurate vision system that detects and separates fruit quickly and precisely. In addition, practical deployment demands a method that works in real time on low-cost devices. This work designs a single local fruit detection system for North Sulawesi by applying a deep learning architecture to achieve high performance. The architecture consists of an effective backbone that rapidly extracts distinctive features, an efficient attention module that improves feature extraction performance, and a classifier module that estimates the probability of each local fruit category. The designed model achieves accuracies of 99.27% and 99.57% on the Fruits-360 and local datasets, respectively, outperforming other lightweight architectures. In addition, the designed model yields higher efficiency than competing models and runs quickly at 100.488 frames per second.
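The abstract describes a three-stage pipeline: a lightweight backbone, an efficient attention module, and a probabilistic classifier. The paper's exact configuration is not reproduced on this page, so the sketch below is only an illustrative PyTorch assembly of such a pipeline, using an ECA-style channel attention block (Wang et al., 2020, listed in the references) between a small depthwise-separable backbone and a softmax classifier head. All layer widths, depths, and the class count are placeholder assumptions, not the authors' design.

```python
# Illustrative sketch only: every width, depth, and class count below is a
# placeholder, not the configuration reported in the paper.
import torch
import torch.nn as nn


class EfficientChannelAttention(nn.Module):
    """ECA-style channel attention (Wang et al., 2020): global average pooling
    followed by a lightweight 1D convolution across the channel dimension."""

    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) -> per-channel descriptor (N, 1, C)
        y = x.mean(dim=(2, 3)).unsqueeze(1)
        # Local cross-channel interaction, then gate the feature maps.
        w = self.sigmoid(self.conv(y)).transpose(1, 2).unsqueeze(-1)  # (N, C, 1, 1)
        return x * w


def dw_separable(in_ch: int, out_ch: int, stride: int = 1) -> nn.Sequential:
    """Depthwise-separable block, a common building block of light backbones."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1, groups=in_ch, bias=False),
        nn.BatchNorm2d(in_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_ch, out_ch, 1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class SingleFruitClassifier(nn.Module):
    """Backbone -> efficient attention -> classifier, mirroring the pipeline
    described in the abstract (illustrative configuration only)."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(16),
            nn.ReLU(inplace=True),
            dw_separable(16, 32, stride=2),
            dw_separable(32, 64, stride=2),
        )
        self.attention = EfficientChannelAttention()
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.attention(self.backbone(x))
        return self.classifier(feats).softmax(dim=1)  # per-class probabilities


if __name__ == "__main__":
    model = SingleFruitClassifier(num_classes=10)   # placeholder class count
    probs = model(torch.randn(1, 3, 100, 100))      # Fruits-360 images are 100x100
    print(probs.shape)                              # torch.Size([1, 10])
```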

References

Badan Pusat Statistik Provinsi Sulawesi Utara. (2021). Produksi Buah–Buahan dan Sayuran Tahunan Menurut Jenis Tanaman (Kuintal), 2019-2021. https://sulut.bps.go.id/statictable/2022/06/24/200/produksi-buah-buahan-dan-sayuran-tahunan-menurut-jenis-tanaman-kuintal-2019-2021.html.

Alem, A., & Kumar, S. (2022). Deep Learning Models Performance Evaluations for Remote Sensed Image Classification. IEEE Access, 10, 111784–111793.

Falahkhi, B., Achmal, E. F., Rizaldi, M., Rizki, R., & Yudistira, N. (2018). Comparison of AlexNet and ResNet Models in Flower Image Classification Utilizing Transfer Learning. Jurnal Ilmu Komputer Dan Agri-Informatika, 9, 70–78.

Putro, M. D., Nguyen, D.-L., & Jo, K.-H. (2022). A Fast CPU Real-Time Facial Expression Detector Using Sequential Attention Network for Human–Robot Interaction. IEEE Transactions on Industrial Informatics, 18(11), 7665–7674.

Miranda, N. D., Novamizanti, L., & Rizal, S. (2020). Convolutional Neural Network Pada Klasifikasi Sidik Jari Menggunakan Resnet-50. Jurnal Teknik Informatika (Jutif), 1(2), 61–68.

Hu, H.-C., Chang, S.-Y., Wang, C.-H., Li, K.-J., Cho, H.-Y., Chen, Y.-T., Lu, C.-J., Tsai, T.-P., & Lee, O. K.-S. (2021). Deep Learning Application for Vocal Fold Disease Prediction Through Voice Recognition: Preliminary Development Study. Journal of Medical Internet Research, 23(6), e25247.

Putro, M. D., Kurnianggoro, L., & Jo, K.-H. (2021). High Performance and Efficient Real-Time Face Detector on Central Processing Unit Based on Convolutional Neural Network. IEEE Transactions on Industrial Informatics, 17(7), 4449–4457.

Yu, H., Xu, Z., Zheng, K., Hong, D., Yang, H., & Song, M. (2022). MSTNet: A Multilevel Spectral–Spatial Transformer Network for Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing, 60, 1–13.

Dandekar, M., Punn, N. S., Sonbhadra, S. K., Agarwal, S., & Kiran, R. U. (2021). Fruit classification using deep feature maps in the presence of deceptive similar classes. Proceedings of the International Joint Conference on Neural Networks, 2021-July, 0–5.

Himabindu, D. D., & Praveen Kumar, S. (2020). A comprehensive analytic scheme for classification of novel models. Proceedings of the 3rd International Conference on Intelligent Sustainable Systems, ICISS 2020, 564–569.

Albardi, F., Kabir, H. M. Di., Bhuiyan, M. M. I., Kebria, P. M., Khosravi, A., & Nahavandi, S. (2021). A Comprehensive Study on Torchvision Pre-trained Models for Fine-grained Inter-species Classification. Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics, 2767–2774.

Kabir, H. M. D., Abdar, M., Khosravi, A., Jalali, S. M. J., Atiya, A. F., Nahavandi, S., & Srinivasan, D. (2022). SpinalNet: Deep Neural Network With Gradual Input. IEEE Transactions on Artificial Intelligence.

Rathnayake, N., Rathnayake, U., Dang, T. L., & Hoshino, Y. (2022). An Efficient Automatic Fruit-360 Image Identification and Recognition Using a Novel Modified Cascaded-ANFIS Algorithm. Sensors, 22(12).

Pande, A., Munot, M., Sreeemathy, R., & Bakare, R. V. (2019). An Efficient Approach to Fruit Classification and Grading using Deep Convolutional Neural Network. 2019 IEEE 5th International Conference for Convergence in Technology, I2CT 2019, 2–8.

Srivastava, H., & Sarawadekar, K. (2020). A Depthwise Separable Convolution Architecture for CNN Accelerator. 2020 IEEE Applied Signal Processing Conference (ASPCON), 1–5.

Shadin, N. S., Sanjana, S., & Lisa, N. J. (2021). COVID-19 Diagnosis from Chest X-ray Images Using Convolutional Neural Network(CNN) and InceptionV3. 2021 International Conference on Information Technology (ICIT), 799–804.

Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. 3rd International Conference on Learning Representations (ICLR 2015), 1–14.

Zhuge, M., Fan, D.-P., Liu, N., Zhang, D., Xu, D., & Shao, L. (2023). Salient Object Detection via Integrity Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(3), 3738–3752.

Iandola, F. N., Han, S., Moskewicz, M. W., Ashraf, K., Dally, W. J., & Keutzer, K. (2016). SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv preprint arXiv:1602.07360, 1–13.

Xiong, B., Fan, S., He, X., Xu, T., & Chang, Y. (2022). Small Logarithmic Floating-Point Multiplier Based on FPGA and Its Application on MobileNet. IEEE Transactions on Circuits and Systems II: Express Briefs, 69(12), 5119–5123.

Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. (2018). MobileNetV2: Inverted Residuals and Linear Bottlenecks. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 4510–4520.

Howard, A., Sandler, M., Chu, G., Chen, L.-C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., & Vasudevan, V. (2019). Searching for MobileNetV3. Proceedings of the IEEE International Conference on Computer Vision, 1314–1324.

Zhang, X., Zhou, X., Lin, M., & Sun, J. (2018). ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 6848–6856.

Ma, N., Zhang, X., Zheng, H.-T., & Sun, J. (2018). ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11218.

Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., & Xu, C. (2020). GhostNet: More features from cheap operations. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1577–1586.

Hu, J., Shen, L., Albanie, S., Sun, G., & Wu, E. (2020). Squeeze-and-Excitation Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(8), 2011–2023.

Hou, Q., Zhou, D., & Feng, J. (2021). Coordinate attention for efficient mobile network design. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 13708–13717.

Woo, S., Park, J., Lee, J. Y., & Kweon, I. S. (2018). CBAM: Convolutional block attention module. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11211 LNCS, 3–19.

Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., & Hu, Q. (2020). ECA-Net: Efficient channel attention for deep convolutional neural networks. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 11531–11539.

Mureşan, H., & Oltean, M. (2018). Fruit recognition from images using deep learning. Acta Universitatis Sapientiae, Informatica, 10(1), 26–42.

Suzuki, S., & Abe, K. (1985). Topological structural analysis of digitized binary images by border following. Computer Vision, Graphics, and Image Processing, 30(1), 32–46.

Canny, J. (1986). A Computational Approach to Edge Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(6), 679–698.

Published

2023-07-31

How to Cite

Vecky C. Poekoel, Dwisnanto Putro, Jane Litouw, Rivaldo Karel, Pinrolinvic D. K. Manembu, Abdul Haris Junus Ontowirjo, Feisy D. Kambey, & Reynold F. Robot. (2023). North Sulawesi Single Local Fruit Detection Using Efficient Attention Module Based on Deep Learning Architecture. Jurnal Nasional Pendidikan Teknik Informatika : JANAPATI, 12(2), 213–222. https://doi.org/10.23887/janapati.v12i2.54754

Issue

Vol. 12 No. 2 (2023)

Section

Articles