Stance Analysis of Policies Related to Emission Test Obligations using Twitter Social Media Data

Dwi Retnoningrum; Dea Annisayanti Putri; Indra Budi; Aris Budi Santoso; Prabu Kresna Putra

doi:10.23887/janapati.v12i3.69004

Authors

Dwi Retnoningrum Universitas Indonesia
Dea Annisayanti Putri Universitas Indonesia
Indra Budi Universitas Indonesia
Aris Budi Santoso Universitas Indonesia
Prabu Kresna Putra Universitas Indonesia


Keywords:

Emission Test Policy, Social Media, Stance Analysis, Machine Learning, Feature Extraction

Abstract

Social media is widely used to share information, feelings, and opinions, including public opinion on government policies such as mandatory emission tests. Public opinion gathered from social media in real time can help the government evaluate and improve the policies it implements, in particular emission testing of motorized vehicles. This research applies stance analysis to evaluate emission test policy based on public opinion. It also combines several machine learning methods with several feature extraction methods to find the best combination in terms of accuracy, training time, and prediction time. The most accurate model is the combination of Decision Tree and BERT, which reaches an accuracy of 66%. The fastest model to train is the Ridge Classifier with fastText text representation. For prediction time, three combinations perform best: Decision Tree with Word2Vec, SVM with Word2Vec, and Logistic Regression with fastText text representation.
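The comparison described in the abstract, every feature extraction method paired with every classifier and scored on accuracy, training time, and prediction time, can be sketched as a small benchmark grid. This is a minimal illustration only, not the authors' code: the two bag-of-tokens extractors and the `MajorityClass` and `NearestCentroid` classifiers are toy stand-ins for the paper's Word2Vec, fastText, and BERT representations and its Decision Tree, SVM, Logistic Regression, and Ridge models, and the example tweets and labels are invented for demonstration.

```python
import time
from collections import Counter
from itertools import product


def make_bag_extractor(tokenize):
    """Return a factory that fits a count-vector transform on training texts."""
    def factory(train_texts):
        vocab = sorted({tok for text in train_texts for tok in tokenize(text)})

        def transform(texts):
            vectors = []
            for text in texts:
                counts = Counter(tokenize(text))
                vectors.append([counts[tok] for tok in vocab])
            return vectors
        return transform
    return factory


def words(text):
    return text.lower().split()


def char_trigrams(text):
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


class MajorityClass:
    """Toy baseline: always predicts the most frequent training label."""
    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestCentroid:
    """Toy classifier: predicts the label whose mean vector is closest."""
    def fit(self, X, y):
        grouped = {}
        for row, label in zip(X, y):
            grouped.setdefault(label, []).append(row)
        self.centroids_ = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()
        }
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids_, key=lambda lab: dist(row, self.centroids_[lab]))
                for row in X]


def benchmark(extractors, classifiers, train, test):
    """Score every extractor/classifier pair on accuracy and wall-clock times."""
    (train_texts, train_labels), (test_texts, test_labels) = train, test
    rows = []
    for (fx_name, fx), (clf_name, clf_cls) in product(extractors.items(), classifiers.items()):
        transform = fx(train_texts)
        X_train, X_test = transform(train_texts), transform(test_texts)
        clf = clf_cls()
        start = time.perf_counter()
        clf.fit(X_train, train_labels)
        train_time = time.perf_counter() - start
        start = time.perf_counter()
        predicted = clf.predict(X_test)
        predict_time = time.perf_counter() - start
        accuracy = sum(p == y for p, y in zip(predicted, test_labels)) / len(test_labels)
        rows.append({"features": fx_name, "model": clf_name, "accuracy": accuracy,
                     "train_time": train_time, "predict_time": predict_time})
    return rows


# Invented toy data (hypothetical tweets and stance labels, not from the paper).
train = (["emission tests keep the air clean", "i support the emission test rule",
          "clean air needs emission tests", "the emission test fine is unfair",
          "i reject this unfair rule"],
         ["favor", "favor", "favor", "against", "against"])
test = (["emission tests mean clean air", "this fine is unfair"], ["favor", "against"])

extractors = {"word_counts": make_bag_extractor(words),
              "char_trigrams": make_bag_extractor(char_trigrams)}
classifiers = {"majority": MajorityClass, "nearest_centroid": NearestCentroid}

results = benchmark(extractors, classifiers, train, test)
for row in sorted(results, key=lambda r: -r["accuracy"]):
    print(row)
```

Timing `fit` and `predict` separately mirrors the abstract's point that the most accurate combination is not necessarily the fastest to train or to predict with, so each criterion can name a different winner.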
