STANCE ANALYSIS OF POLICIES RELATED TO EMISSION TEST OBLIGATIONS USING TWITTER SOCIAL MEDIA DATA

Social media is currently widely used to disseminate various kinds of information, including feelings and opinions. Public opinion on government policies, such as the implementation of emission tests, is no exception, and it reflects the conditions that exist in society. Public opinion data obtained from social media in real time can assist the government in evaluating policies and improving the quality of currently implemented policies, particularly the implementation of emission tests on motorized vehicles. In this research, stance analysis is used to evaluate emission test policies based on public opinion. In addition, this research combines several machine learning methods and feature extraction methods to find the best combination based on accuracy, training time, and prediction time. The best model based on accuracy is a combination of Decision Tree and BERT, which reaches a value of 66%. Meanwhile, based on training time, the best model is the Ridge Classifier with fasttext text representation. Based on prediction time, three combinations perform best, namely Decision Tree with Word2Vec, SVM with Word2Vec, and Logistic Regression with fasttext text representation.


INTRODUCTION
The development of technology and information has increased rapidly and has entered various aspects of human life, such as social, cultural, and economic activities [1]. Through technology, social activities that previously required physical contact can now be carried out remotely via social media [2]. This shift is happening all over the world: 60% of the world's population is connected to the Internet and 53% has access to social media [3].
Social media is currently widely used to disseminate various kinds of information, including feelings and opinions. For consumers, social media can be used to convey positive feedback about the products they use [4]. For the government, social media can be used to find out the public's response to an issue.
Twitter is a social media platform from which a large amount of data is collected as research material. Apart from providing a lot of data, Twitter is also the social media platform with the most active users in Indonesia. According to [3], approximately 108 million Indonesians are active Twitter users. Corporate communication, customer service, and product campaigns are conducted via Twitter. Research topics are also very broad, such as politics [5], education [6], and film [7].
With important information for companies available on social media such as Twitter, analysis is no longer done only manually but also uses text mining techniques in which texts are grouped into several classes [8]. Stance detection was widely used before the advent of social media, when it was applied to analyze issues using data taken from websites, blogs, or surveys. Stance detection can be used to determine the condition of certain aspects of society. Various approaches are used in stance analysis. Popular examples include [9], which uses Word2Vec and LSTM for stance classification, [10], which uses Word2Vec and SVM, and [11], which uses GloVe, Word2Vec, and CNN.
Not only private companies have taken advantage of social media; government agencies have also opened official accounts on Facebook, Twitter, and Instagram. The government, through the related ministries, distributes information and appeals to the public, among others through social media channels. This includes one of the government programs for handling global warming, namely the emission test regulations.
Figure 1. Tweet from the Ministry of Environment and Forestry regarding Emission Tests
The government has made a policy on exhaust emission testing. Followed by regular and orderly motor vehicle maintenance that is carried out correctly, it has the potential to minimize motor vehicle exhaust gases and provides many benefits, namely improving air quality and maintaining public health. One such policy is contained in DKI Jakarta Governor Regulation (Pergub) Number 66 of 2020. The DKI Jakarta Government has also created a system to monitor the implementation of this emission test, the Emission Test Information System (https://ujiemisi.jakarta.go.id/).
Information on social media can be used to evaluate emission test policies and explore public opinion about both the policies and their implementation, so the information obtained can describe the situation in society. In addition, public opinion data obtained from social media in real time can assist the government in evaluating policies and improving the quality of currently implemented policies, especially the application of emission tests on motorized vehicles. With ever-growing volumes of data, the methods used need to be updated regularly. The best method is therefore needed to conduct stance analysis and gauge public opinion regarding emission test policies. To address this problem, the researchers formulate two questions: "How do various algorithms compare in implementing social media listening on Twitter based on accuracy, training time, and prediction time?" and "What is the public's response to the emission test policy?".

Social Media
On social media, users can interact and convey various ideas, content, opinions, and information [12]. There are various types of social media with their respective advantages. Examples are Facebook, which is used to share daily activities; Twitter, which is used to quickly share textual information; YouTube, which is used to share videos; and Instagram, which is used to share photos.
The growth of social media platforms goes hand in hand with an increasing amount of data, which is called social media big data. The data comes from activities carried out by users, such as expressing opinions or complaining about products or services. Users are not restricted from creating content or sharing information [13]. Data available on social media is analyzed for various purposes and with various methods.

Twitter
Twitter is the social media platform with the most active users in Indonesia. According to [3], approximately 108 million Indonesians are active Twitter users. Twitter generates an average of over 500 million tweets per day [14]. Twitter can be used as a source to find out what is going on right now because news travels quickly worldwide. As a result, many individuals, companies, product vendors, and organizations now use public opinion on social media as a basis for decision-making. Corporate communication, customer service, and product campaigns are conducted via Twitter. Research topics are also very broad, such as politics [5], education [6], and film [7].

Stance Analysis
Stance analysis is a technique for automatically determining the author's view expressed in a text, whether the author agrees (pro), is neutral, or disagrees (con). In general, the stance analysis process can be divided into two stages: the formation of word vectors and stance classification. The popularity of word vector models is due to their ability to determine the similarity of meaning between words. This similarity information is obtained by observing the similarity of the words around the target word. Stance analysis is a task within Natural Language Processing, a field that focuses on analyzing and understanding the meaning contained in text. The general methods used for stance analysis are Decision Tree, Support Vector Machine (SVM), Regression, and Ensemble.

METHOD
The stages of the research were designed to meet the research objective, namely to compare five classification algorithms and five text representation algorithms for stance analysis of Twitter opinion about the emission test policy. The methodology is shown in Figure 2.

Data Collecting
Data were collected by scraping, a technical process of collecting data from a site through information extraction using the Hypertext Transfer Protocol (HTTP). Data were collected from October 1, 2020 to October 1, 2022. The keywords were "uji emisi" and "emisi ribet". A total of 4,500 tweets were collected. The tweets were then manually labelled into three classes, namely positive, negative, and neutral, by three people. To maintain consistency and quality, a cross-check was carried out, in which each person labelled 1,500 tweets that were then re-checked by another person.
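The labelling and cross-check arrangement can be sketched as follows. This is only an illustration: the annotator names and the rotation rule (each batch is re-checked by the next annotator) are assumptions, since the text only states that another person re-checks each batch.

```python
def assign_batches(n_items, annotators):
    """Split item indices into contiguous batches, one per annotator,
    and pair each batch with a different annotator as reviewer."""
    k = len(annotators)
    size = n_items // k
    plan = []
    for i, labeller in enumerate(annotators):
        start = i * size
        # The last batch absorbs any remainder.
        end = n_items if i == k - 1 else start + size
        # Assumed rotation: the next annotator re-checks this batch.
        reviewer = annotators[(i + 1) % k]
        plan.append({"labeller": labeller, "reviewer": reviewer,
                     "items": range(start, end)})
    return plan


plan = assign_batches(4500, ["A", "B", "C"])
for batch in plan:
    print(batch["labeller"], batch["reviewer"], len(batch["items"]))
```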

Pre-Processing
Pre-processing aims to minimize the slang vocabulary or terms used in the text [15]. Social media users tend to replace formal words with slang or informal terms. Examples include using numbers in place of letters, repeating vowel characters, and using nonstandard words. Case folding changes all letters to lowercase. Tokens are punctuation marks, terms, numbers, etc. [16], while tokenization divides the text into such parts [17]. Word filtering is needed to normalize the text of tweets by removing URLs, mentions, and hashtags. Stop-word removal clears the text of conjunctions, which research has shown do not carry useful information.
The data produced by the labelling process must first be cleaned, because the raw data is free text that Twitter users can write in any form. It contains many unnecessary words, characters or emojis, and meaningless punctuation that would interfere with the subsequent process. The steps taken include case folding to convert all characters into lowercase, tokenizing text into words, filtering meaningless characters, and removing stop-words. In addition to cleaning the content, information about Twitter users is also deleted to maintain social media ethics.
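The cleaning steps described above can be sketched in a few lines of Python. This is a minimal sketch and not the authors' pipeline: the stop-word list is a tiny illustrative sample, the rule for collapsing repeated characters is an assumption, and the example tweet and its handle are made up.

```python
import re

# Tiny illustrative stop-word list (assumption); a real list would be much longer.
STOP_WORDS = {"yang", "dan", "di", "ke", "dari", "itu", "ini"}


def preprocess(tweet):
    text = tweet.lower()                               # case folding
    text = re.sub(r"https?://\S+", " ", text)          # remove URLs
    text = re.sub(r"[@#]\w+", " ", text)               # remove mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)              # drop digits, punctuation, emojis
    text = re.sub(r"(.)\1{2,}", r"\1", text)           # collapse characters repeated 3+ times
    tokens = text.split()                              # whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]  # stop-word removal


example = "Uji EMISI di Jakarta ribeeeet bgt!!! @some_user #ujiemisi https://t.co/abc"
print(preprocess(example))
```

Removing the mention in this step also drops the user handle from the text, which is in line with the deletion of user information mentioned above.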

Feature Extraction
An important part of text processing is converting text values into numeric values, either as vectors or other representations [18]. The determination of features for the machine learning approach is adjusted to text categorization. According to [19], feature extraction reduces the original features by removing irrelevant ones, which aims to increase accuracy and reduce machine learning processing time.
The first method is the bag of words (BOW). The BOW approach is a popular feature extraction method for sentences and documents. The text representation will be based on a
Features can also be extracted using the Word2vec model. Word2vec is a word embedding technique introduced by Mikolov for word representations that capture the meaning and context of words in a document, and it includes two learning algorithms, namely continuous bag of words (CBOW) and skip-gram [21] [22]. The similarity between words is calculated via the cosine similarity of their word vectors in word2vec, which encode the meanings of the words in the document [23]. Several studies on sentiment analysis or emotion classification have used word2vec.
The biggest advantage of BERT is that it does not need a large text corpus to train a model. BERT is pre-trained, so users only need to fine-tune the BERT model on specific (manually annotated) training data. In addition to classification, BERT can also be used to obtain vector representations. Sentence-BERT (SBERT) is a modification of a trained BERT network that uses siamese and triplet network structures to derive semantically meaningful representations [31].
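The cosine similarity used to compare word vectors can be illustrated with a short sketch. The three-dimensional vectors below are made up for illustration and are not taken from any trained word2vec model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Made-up toy vectors (assumption), standing in for learned embeddings.
vectors = {
    "emisi": [0.9, 0.1, 0.3],
    "polusi": [0.8, 0.2, 0.4],
    "film": [0.1, 0.9, 0.0],
}
print(cosine_similarity(vectors["emisi"], vectors["polusi"]))
print(cosine_similarity(vectors["emisi"], vectors["film"]))
```

With these toy values, the first pair scores much higher than the second, which is the behaviour expected of words that appear in similar contexts.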
Data from the cleaning process is then converted into numeric data. The resulting numerical representation takes the form of a vector, that is, the same number of numerical values for every text. The methods used to generate this vector are bag of words, TF-IDF, word2vec, fasttext, and BERT. All algorithms are obtained using Python and the gensim library.
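As an illustration of the two count-based representations, the sketch below builds bag-of-words and TF-IDF vectors in plain Python. It assumes the plain weighting tf × log(N/df); library implementations often use smoothed variants, and the study's exact settings are not stated in the text.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight raw counts by log(N / df), the unsmoothed idf (assumption)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weights = [[row[j] * idf[j] for j in range(len(vocab))] for row in counts]
    return vocab, weights


docs = [["uji", "emisi", "ribet"], ["uji", "emisi", "bagus"], ["uji", "emisi", "uji"]]
vocab, bow = bag_of_words(docs)
_, weights = tf_idf(docs)
print(vocab)
print(bow)
```

Terms that occur in every document, such as the search keywords themselves, receive a weight of zero under this weighting, while rarer terms keep a positive weight.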
The word2vec model uses the gensim library and is trained on Indonesian Wikipedia data. Likewise, the fasttext model is used through the gensim library and is trained on data from Indonesian-language sites and Wikipedia. This model is available on the fasttext site. For BERT, the sentence-bert library and the IndoBERT model from Hugging Face are used.
To obtain sentence-level vector representations from the word models (word2vec and fasttext), the vectors of the words are combined by taking their average. For the sentence model (BERT), the representation is produced directly at the sentence level. This vector represents the features of the text and serves as the input of the classification model. The word2vec vector size is 1000, fasttext is 300, and BERT is 768.
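The averaging step for the word models can be sketched as follows. The toy two-dimensional embeddings are made up, and the handling of out-of-vocabulary words (skipped, with a zero vector when no word is known) is an assumption, since the text does not say how such words are treated.

```python
def sentence_vector(tokens, embeddings, dim):
    """Average the vectors of the known tokens into one sentence vector."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        # Assumed fallback when no token has an embedding.
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Made-up toy embeddings (assumption), standing in for word2vec or fasttext vectors.
toy_embeddings = {"uji": [1.0, 0.0], "emisi": [0.0, 1.0]}
print(sentence_vector(["uji", "emisi", "ribet"], toy_embeddings, dim=2))
```

Whatever the sentence length, the output has the same size as a single word vector, which is what allows it to be fed to a classifier.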

Model Training
Sentiment classification is done by classifying text into two classes, positive or negative [32]. Neutral classes can be used, but most studies do not use them. Sentiment classification is a form of general text classification, namely assigning a previously defined label to a text. Examples include labelling news into categories such as sports or politics. In topic classification, topic-related words are the main features. In its sentiment-level derivative, however, the label is positive or negative, according to the subjectivity of the writer.
According to [33], machine learning is a part of research in intelligent computing that aims to create programs that can mimic human intelligence without being explicitly programmed. The approach is to analyze the data and the data around it to find patterns. There are two types of machine learning methods, namely supervised and unsupervised. In supervised learning, the training data already has labels or groups. In contrast, the training data in unsupervised learning has no labels or clusters. The result of this process is the ability to assign labels or group values to unlabelled data based on patterns found in the surrounding data. There are several machine learning methods, but the methods used in this research are Decision Tree, Support Vector Machine (SVM), Logistic Regression, and Ridge.
Decision Tree is a well-known data mining and machine learning technique that takes a series of attribute values as input and outputs a Boolean conclusion [34]. In practice, every path of a decision tree represents a decision rule that is easily translatable into either a programming language or human language. Taking all paths (rules) together, the complete tree corresponds to a compound Boolean expression that uses disjunction and conjunction to produce a Boolean judgment. Decision Tree may be preferred since it is straightforward and simple to interpret [34].
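The correspondence between tree paths and a compound Boolean expression can be shown with a small hand-written tree. The tree and its three Boolean features are hypothetical and were not learned from the study's data; the output stands for "the tweet opposes the policy".

```python
from itertools import product


def tree_predict(has_ribet, has_mahal, has_dukung):
    """Hypothetical tree written as nested tests, one root-to-leaf path per branch."""
    if has_ribet:
        return True
    if has_mahal:
        return not has_dukung
    return False


def rule_predict(has_ribet, has_mahal, has_dukung):
    """The same tree as a disjunction of its two True paths, each a conjunction."""
    return has_ribet or (not has_ribet and has_mahal and not has_dukung)


# Both forms agree on every possible input.
agreement = all(
    tree_predict(*case) == rule_predict(*case)
    for case in product([False, True], repeat=3)
)
print(agreement)
```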
SVM is a suitable method for text analysis, as Naive Bayes requires training data to get the right results [35]. On the other hand, SVM requires more computation than Naive Bayes, but the results are faster and more accurate [36].
Logistic regression is a popular generalized linear regression analysis model, widely used in data mining, automated disease diagnosis, economic forecasting, and other fields [37]. The goal of this method is to select the more informative variables to identify the type of sample estimation and build a model with the lowest probability of error [38].
Ridge regression is an approach for analyzing regression data that suffer from multicollinearity. When multicollinearity is present, the least squares estimate is unbiased, but its variance is large, so it tends to be far from the true value. Ridge regression produces a more reliable estimate by adding some bias to the regression estimate, which reduces the standard errors. Ridge regression is a very flexible and subjective regression assessment, but it is an analytical approach combining qualitative and quantitative assessments. It is specific to fixing multicollinearity problems and is regularly used in a wide range of research [39].
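For reference, the standard textbook form of the ridge estimate is given below. The symbols are assumptions introduced here and do not appear in the original text: X is the feature matrix, y the target vector, β the coefficient vector, and λ ≥ 0 the penalty strength.

```latex
\hat{\beta}_{\text{ridge}}
  = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2
  = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} y
```

Setting λ = 0 recovers the least squares estimate, while a larger λ shrinks the coefficients, which is the added bias that reduces variance under multicollinearity.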
The model is built to classify tweets into negative (disagree), positive (agree), and neutral attitudes. The model architectures built are Decision Tree, SVM, Logistic Regression, Ridge Classifier, and a combined method (ensemble).
Before conducting experiments and comparing models, the data must be divided into training, validation, and test data. Training data is used to train the model to do its job correctly, and validation data is used to validate performance during training. Test data is used to test whether the trained model can predict correctly, and it is evaluated with a metric such as accuracy.
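A three-way split of this kind can be sketched as follows. The 80/10/10 ratios and the fixed seed are assumptions for illustration; the text does not state the proportions used in the study.

```python
import random


def split_data(items, train=0.8, val=0.1, seed=42):
    """Shuffle once, then cut into training, validation, and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    train_part = shuffled[:n_train]
    val_part = shuffled[n_train:n_train + n_val]
    test_part = shuffled[n_train + n_val:]  # the remainder is the test set
    return train_part, val_part, test_part


train_set, val_set, test_set = split_data(range(4500))
print(len(train_set), len(val_set), len(test_set))
```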

Evaluation
Five text representation models (feature extraction models) and five classification models are combined into 25 models, shown in Table 1.
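The experimental grid can be enumerated with a short sketch. The names are taken from the methods listed in this paper; the order of enumeration is arbitrary and carries no meaning.

```python
from itertools import product

REPRESENTATIONS = ["BOW", "TF-IDF", "Word2Vec", "fastText", "BERT"]
CLASSIFIERS = ["Decision Tree", "SVM", "Logistic Regression",
               "Ridge Classifier", "Ensemble"]

# Every representation is paired with every classifier.
combinations = list(product(REPRESENTATIONS, CLASSIFIERS))
print(len(combinations))
for representation, classifier in combinations[:3]:
    print(f"{classifier} + {representation}")
```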

RESULT AND DISCUSSION
Testing several models with several feature extraction methods gave the following results:

Model Accuracy
According to Table 2, two models achieve the highest F1 value of 66%, namely SVM and Ensemble, both of which use BERT representation vectors. Based on the average results over all text representation models, the best classification model is the Decision Tree, with an average F1 value of 56%. Even though it is the best, this classification model is not significantly different from the other models, which differ by only 1%. Meanwhile, based on the average results over all classification models, the best text representation model is BERT, with an average F1 value of 63%. This representation model shows the best performance, with a fairly large difference of 3% compared to TF-IDF.
BOW takes the most time to train since it needs to build the vocabulary and calculate the frequencies. BERT, with its complex architecture, takes the second longest training time. Meanwhile, fasttext is the fastest due to its vector size, which is only 300.
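For readers unfamiliar with the F1 metric used in this comparison, the sketch below computes per-class F1 and its unweighted (macro) average for three stance classes. Macro averaging is an assumption here, since the text does not state which averaging was used, and the labels and predictions are made up.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 = harmonic mean of precision and recall for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 values."""
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Made-up example data (assumption).
y_true = ["pro", "neutral", "contra", "neutral", "contra", "pro"]
y_pred = ["pro", "neutral", "neutral", "neutral", "contra", "contra"]
score = macro_f1(y_true, y_pred, ["pro", "neutral", "contra"])
print(round(score, 3))
```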

Prediction Time
The prediction times are reported in Table 4 and Figure 3.

Public's Response To The Emission Test Policy
Based on the most accurate model, SVM-BERT, the classification of all community tweets about emission tests gives the following results:
1. 246 (5%) tweets in favor of the government
2. 2,778 (62%) neutral tweets
3. 1,475 (33%) tweets against the government
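The percentages above follow directly from the class counts, as this short check shows. Rounding to whole percent is an assumption about how the reported figures were obtained.

```python
counts = {"pro": 246, "neutral": 2778, "contra": 1475}
total = sum(counts.values())

# Share of each stance class, rounded to whole percent.
shares = {label: round(100 * n / total) for label, n in counts.items()}
print(total, shares)
```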

CONCLUSION
Experiments were conducted to answer the research questions regarding the best combination of text representation and classification models. It can be concluded that the best model based on accuracy is a combination of Decision Tree and BERT, which reaches a value of 66%. Meanwhile, based on training time, the best model is the Ridge Classifier with fasttext text representation. Based on prediction time, three combinations perform best, namely Decision Tree with Word2Vec, SVM with Word2Vec, and Logistic Regression with fasttext text representation.
In addition, the public response to the emission test policy is dominated by neutral opinions (62%), followed by opinions against emission tests (33%) and tweets that agree with the government (5%).
The model produced in this study is expected to be a reference for the government in building a system that evaluates emission test policies and explores public opinion regarding both the policies and their implementation, so that the information obtained can describe the situation in society. This system can be implemented in real time and updated regularly to maintain good model quality.
A suggestion for further research is to explore the correlation between people's attitudes and particular incidents. In addition, topic modeling can be applied to find out the reasons behind each attitude. Finally, future research can explore other approaches, such as lexicon-based models, neural network-based models, and BERT-based classification models.
[24] [25]. Besides word2vec, another feature extraction method is fasttext. Fasttext was developed by Facebook's research team to learn word representations and categorize text efficiently [26]. The most significant contribution of the fasttext model is that it considers the internal structure of words when building word representations. Fasttext is especially useful for languages with rich morphology [27]. The approach to word representation in the fasttext embedding model is quite different from other word representations, such as word2vec [28]. Whereas word2vec uses each word as the smallest unit, fasttext assumes that a word is composed of character n-grams, whose length can vary. The advantage of this method is that, since it stores word vectors as character n-grams, it can find vector representations for words that are not directly in the dictionary [28]. A significant advance in neural network models was made by models using the Transformer architecture, which is based on the self-attention mechanism [29]. Such advantages have led to the development of new models, such as Bidirectional Encoder Representations from Transformers (BERT), which uses context and word embeddings to overcome the limitations of RNNs and LSTMs and significantly improves the performance of sentiment analysis [30].

Figure 3. Comparison of training time and prediction time
Three models are able to make the fastest predictions (0.03 seconds), namely Decision Tree and SVM, both with Word2Vec text representation, and Logistic Regression with fasttext text representation. Based on the average results over all text representation models, the best classification model based on prediction time is the Decision Tree, with an average time of 0.430 seconds. Based on the average results over all classification models, the best text representation model based on training time is fasttext, with an average time of 0.064 seconds. For the same reason as with training time, fasttext outperformed the others due to its smaller vector size.
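Training and prediction times of this kind can be measured by wrapping the fit and predict calls with a wall-clock timer. In the sketch below, `MajorityClassifier` is a hypothetical stand-in so that the example runs without external libraries; it is not one of the models evaluated in the study, and the data are made up.

```python
import time
from collections import Counter


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def timed(fn, *args):
    """Run fn(*args) and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Made-up feature vectors and labels (assumption).
X_train = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
y_train = ["neutral", "neutral", "contra"]
X_test = [[0.2, 0.1], [0.9, 0.8]]

model, train_seconds = timed(MajorityClassifier().fit, X_train, y_train)
predictions, predict_seconds = timed(model.predict, X_test)
print(f"train: {train_seconds:.6f} s, predict: {predict_seconds:.6f} s")
```

Because single runs are noisy, timings are usually repeated and averaged before models are compared.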

Table 3. Comparison of Training Time

Table 4. Comparison of Prediction Time

Table 5. Comparison of classification models based on average text representation models

Table 6. Comparison of text representation models based on average classification model

On average, according to Table 5, Decision Tree is the best classification model because it achieves the best performance across all components (accuracy, training time, and prediction time). Meanwhile, based on the average over classification models in Table 6, FastText is the best text representation model. Even though it does not have a good level of accuracy, FastText has an advantage in training and prediction speed.