Comparison of K-NN, SVM, and Random Forest Algorithm for Detecting Hoax on Indonesian Election 2024
DOI:
https://doi.org/10.23887/janapati.v13i1.76079
Keywords:
Indonesian Election 2024, TF-IDF, K-NN, TWEET, HOAX DETECTION
Abstract
In 2022, the Indonesian National Police (POLRI) received 113 reports concerning the spread of hoax news about the 2024 Indonesian Election (PEMILU). Few hoax detection tools currently exist in Indonesia. This research builds a system that detects hoax news in Indonesian tweets about the 2024 Indonesian Election (PEMILU) and compares three classification methods: K-NN, SVM, and Random Forest. The tweets used to build the model are labeled by validating them against ground-truth data from cekfakta.tempo, cekfakta.kompas, and turnbackhoax.id. We also examine how different distance measures affect the K-NN algorithm. TF-IDF is used for feature extraction. The experiments show that the highest accuracy, 86.36%, is obtained with SVM and with K-NN using Euclidean distance. The best precision, 86.95%, is obtained with K-NN using Manhattan distance.
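The pipeline summarized in the abstract (TF-IDF features, K-NN under two distance measures, scored by accuracy and precision) can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenized example texts, the labels, the choice of k = 3, and the smoothed IDF formula are all assumptions made for illustration, and the toy data are invented rather than taken from the paper's dataset. SVM and Random Forest are omitted to keep the sketch dependency-free.

```python
import math
from collections import Counter

def tfidf_fit(docs):
    """Learn a vocabulary and smoothed IDF weights from tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)
    # Assumed variant: smoothed IDF, ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    return vocab, idf

def tfidf_transform(doc, vocab, idf):
    """Map one tokenized document to a dense vector of term count * IDF."""
    tf = Counter(doc)
    return [tf[t] * idf[t] for t in vocab]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_vecs, train_labels, query, k, distance):
    """Majority vote among the k training vectors closest to the query."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred, positive="hoax"):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if tp + fp else 0.0

# Invented toy data (hypothetical, not from the paper).
train = [
    ("ballot boxes burned secret".split(), "hoax"),
    ("secret ballot fraud leaked".split(), "hoax"),
    ("commission announces voting schedule".split(), "valid"),
    ("commission publishes official schedule".split(), "valid"),
]
test = [
    ("leaked secret ballot".split(), "hoax"),
    ("official voting schedule".split(), "valid"),
]

vocab, idf = tfidf_fit([doc for doc, _ in train])
train_vecs = [tfidf_transform(doc, vocab, idf) for doc, _ in train]
train_labels = [label for _, label in train]
y_true = [label for _, label in test]

results = {}
for name, distance in (("euclidean", euclidean), ("manhattan", manhattan)):
    y_pred = [knn_predict(train_vecs, train_labels,
                          tfidf_transform(doc, vocab, idf), 3, distance)
              for doc, _ in test]
    results[name] = (accuracy(y_true, y_pred), precision(y_true, y_pred))
    print(name, results[name])
```

On data this small both distance measures classify the two test items identically, so the sketch only shows where the choice of distance enters the computation; differences such as those reported in the abstract would only appear on a realistic dataset.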
License
Copyright (c) 2024 Indra, Agus Umar Hamdani, Suci Setiawati, Zena Dwi Mentari, Mauridhy Hery Purnomo
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with Janapati agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work. (See The Effect of Open Access)