Improving Sentiment Analysis and Topic Extraction in Indonesian Travel App Reviews Through BERT Fine-Tuning

Oky Ade Irmawan; Indra Budi; Aris Budi Santoso; Prabu Kresna Putra

doi:10.23887/janapati.v13i2.77028

Authors

Oky Ade Irmawan University of Indonesia
Indra Budi Faculty of Computer Science, University of Indonesia
Aris Budi Santoso Faculty of Computer Science, University of Indonesia
Prabu Kresna Putra National Research and Innovation Agency

Keywords:

Online Travel Agent, Sentiment Analysis, Topic Modeling, Bidirectional Encoder Representations from Transformers (BERT), LDA

Abstract

The growing use of the internet in Indonesia has driven the rise of Online Travel Agents (OTAs). Through an OTA application, users can book transportation tickets and accommodation more easily and quickly. Increasingly intense competition requires companies such as PT XYZ to respond to their customers' needs and problems in online ticket booking. Many customers review the PT XYZ application on the Play Store and App Store, so a technique is needed to group thousands of reviews and automatically detect the topics customers discuss. In this study, we classified the sentiment of reviews of the Android and iOS applications using a BERT model fine-tuned from IndoBERT, and modeled topics with LDA, using the coherence score to evaluate the topics for each sentiment. Comparing hyperparameter settings, the best classification configuration was 4 epochs with a learning rate of 5e-5, which gave an accuracy of 0.91 and an F1-score of 0.74. We also compared BERT with traditional machine learning algorithms; the best of these was Logistic Regression with TF-IDF features, which achieved an accuracy of 0.890 and an F1-score of 0.865. We therefore infer that the accuracy of the fine-tuned IndoBERT classifier is sufficiently high for classifying PT XYZ reviews. Based on the coherence score, the most optimal numbers of topics were 29 for positive reviews, 6 for neutral reviews, and 3 for negative reviews. These findings can serve as evaluation material for PT XYZ in providing the best service to its customers.
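
The abstract reports accuracy and F1-score side by side, and the two can diverge noticeably when the sentiment classes are imbalanced. The sketch below illustrates the arithmetic only. It assumes macro-averaged F1 (the abstract does not state which averaging was used), and the labels are hypothetical toy data, not the PT XYZ reviews.

```python
def accuracy(y_true, y_pred):
    """Fraction of reviews whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Guard against division by zero when a class is never predicted or never present.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Hypothetical, imbalanced toy labels: mostly positive reviews.
y_true = ["positive"] * 8 + ["neutral"] + ["negative"]
# The single neutral review is misclassified as positive.
y_pred = ["positive"] * 8 + ["positive"] + ["negative"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy case a single error on a rare class leaves accuracy high while pulling macro F1 down sharply, which is one possible reason the two figures differ in a review dataset dominated by one sentiment.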
