Effect of Word2Vec Weighting with CNN-BiLSTM Model on Emotion Classification

Merinda Lestandy; Abdurrahim

doi:10.23887/janapati.v12i1.58571

Authors

Merinda Lestandy, Muhammadiyah Malang University
Abdurrahim, Universitas Islam Indonesia

Keywords:

Emotion, CNN, BiLSTM, word2vec

Abstract

Emotion influences human behavior and, in turn, decision-making. Detecting human emotion is useful in many areas, including social settings and product quality assessment. Evaluating and categorizing the emotions expressed in text requires a classification method, and this study uses a CNN-BiLSTM model for that purpose. Word2vec serves as the pre-trained word embedding that supplies the model's word weights, and the study looks for the configuration that yields the highest accuracy. The data are split into training and testing sets and labeled with six emotion categories: surprise, sadness, anger, fear, love, and joy. The best emotion classification accuracy achieved by the CNN-BiLSTM model is 92.85%.
