Pendidikan Teknik Informatika : JANAPATI
IMPLEMENTATION OF CHATBOT FOR MERDEKA BELAJAR KAMPUS MERDEKA PROGRAM USING LONG SHORT-TERM MEMORY

Good service helps an organization improve the efficiency and effectiveness of its operations. Optimal service also improves the customer experience and adds value to the organization that provides it. One service that can be optimized is the Merdeka Belajar Kampus Merdeka (MBKM) program, a learning program organized by the Ministry of Education, Culture, Research, and Technology (Kemendikbudristek), and in particular the MBKM services at the Institut Teknologi Telkom Purwokerto (ITTP). The MBKM service at ITTP is not yet optimal because it is not readily accessible to everyone and because so many programs are available. Therefore, this study aims to implement a Chatbot service for the MBKM program at ITTP. The Chatbot is built with the Long Short-Term Memory (LSTM) deep learning algorithm, a type of artificial neural network architecture that is well suited to text data. The results show an accuracy of 100% and a loss of 0.121%. Further evaluation using the weighted-average precision, recall, and F1-score gives 100%, 100%, and 100%, respectively.


INTRODUCTION
The development of information and communication technology has been of great help and has brought rapid change to science [1] in several fields, such as health [2]-[4], agriculture [5]-[8], business [9], and education [10], [11]. This technology is used to make services more optimal so that consumers or users can obtain information more easily, especially in universities.
Higher education is one of the sectors that utilizes technological developments, for example by using social media as a medium for services and information dissemination [12]. Social media and university websites are also sources of information for students. However, existing services tend not to be optimal because the admin operators of social media accounts and websites cannot provide service 24 hours a day. As a result, the service is inflexible and responses seem slow. Technology is therefore needed to accommodate all forms of services, such as those for the Merdeka Belajar Kampus Merdeka (MBKM) program, which is widely followed at every university.
Merdeka Belajar Kampus Merdeka (MBKM) is one of the components of the Merdeka Belajar policy of the Ministry of Education, Culture, Research, and Technology (Kemendikbudristek). It gives all students the opportunity to hone their abilities according to their talents and interests by entering the workforce directly as a step in preparing for a career. To automate the services of the MBKM program, artificial intelligence (AI)-based technology is needed [13]. Artificial intelligence is a science that studies how to create a system or computer that can act as humans do [14]. One example is a Chatbot, or chatterbot, which is used to handle question-and-answer services [15]. Chatbots are designed to mimic human conversational skills using voice, text, or both [16].
Nowadays, the popularity of Chatbots in many sectors is continually increasing because they have great potential to automate user services and reduce human effort or intervention. In education, Chatbot technology is beneficial, particularly in colleges, where it helps students obtain relevant information. With the help of Chatbot technology, campus services have reached a higher caliber.
Previous research [17], entitled "Dinus Intelligent Assistance (DINA) Chatbot for University Admission Services", addressed problems related to student admission services by creating a Chatbot called DINA (Dinus Intelligent Assistance). DINA uses a Deep Learning-based knowledge approach and obtained good results, successfully answering questions from each student's perspective.
Another study [18], entitled "NEUchatbot: Chatbot for admission of National Economics University", also addressed problems related to student admission at the National Economics University in Vietnam using a chatbot called NEU-chatbot. NEU-chatbot is deployed on the official admissions fan page of the National Economics University on Facebook, a well-known social network in Vietnam. NEU-chatbot obtained good results, with almost 98.61% of 1000 clients reported as happy with the chatbot's answers.
Another study [19], entitled "A Chatbot Using LSTM-based Multi-Layer Embedding for Elderly Care", created a chatbot using Long Short-Term Memory (LSTM). The dataset consists of daily conversations of elderly people. The experimental results showed that the proposed method achieved an accuracy of 79.96% for the top 1, 93.14% for the top 5, and 94.85% for the top 10 matching message pairs. These results show that the proposed system matches message pairs much better than the baseline system. Therefore, this study aims to build an AI Chatbot using the Long Short-Term Memory algorithm for MBKM program services at the Institut Teknologi Telkom Purwokerto. LSTM was selected because it is a relevant algorithm for Chatbots [19]. LSTM is a recurrent neural network with state memory and multi-layer cell structures [20]. LSTM is often used for translation, modeling, and time series prediction because it can capture long-term dependencies in sequential data. It has also been used in various other applications, including speech recognition, natural language processing, and machine translation.

MATERIALS AND METHODS
A. Academic Services
The Merdeka Belajar Kampus Merdeka (MBKM) program aims to improve the quality of higher education in Indonesia by encouraging creativity and innovation among students and increasing student involvement in research and development activities [21]. Over the last two years, student involvement in the MBKM program at Institut Teknologi Telkom Purwokerto (ITTP) has increased to thousands of students, so helping and assisting students who take part in the program has become an essential part of the process.
Currently, academic services for MBKM still rely on a hotline and in-person visits. This process needs to be improved with a service that students can access anytime and anywhere to ask about the MBKM program.

B. Chatbot
A Chatbot is computer software created to mimic text-based and audio-based intelligent conversations with one or more humans. A chatbot can also be interpreted as a computer system that allows humans to interact with computers using natural human language [22]. The word Chatbot comes from two words, namely "chat" and "bot". Chat can be likened to written communication in the computer world. A bot is a computer program with a body of data that, when given an input, produces an output in response. The advantages of using a Chatbot are increased efficiency in answering questions and time savings. Chatbots can also help improve the customer experience by providing quick and accurate responses to requests and queries.

C. Long Short-Term Memory
Long Short-Term Memory (LSTM) is a recurrent neural network with state memory and multilayer cell structures [20]. Sepp Hochreiter and Jürgen Schmidhuber first proposed LSTM in 1997. It can be considered a development of the Recurrent Neural Network (RNN) algorithm. The RNN algorithm uses the result of the previous step as input for the current step. Its downside is that, when predicting a word, it cannot reliably use information that lies far back in the sequence, that is, in long-term memory. LSTM retains the strength of RNN algorithms, namely their ability to make more precise predictions based on the latest data.
LSTM algorithms were created to address this weakness. The LSTM architecture includes an input gate, a forget gate, and an output gate, where x_t refers to the current input, c_t and c_(t-1) indicate the new and previous cell states, respectively, and h_t and h_(t-1) are the current and previous outputs, respectively [23]. The internal structure of the LSTM is shown in Figure 1. Memory processing is performed by components known as gates, and the information collected by the LSTM is stored in the cell. Each gate is implemented using the sigmoid activation function. The "forget gate" determines which information from the previous time step should be discarded. The "input gate" specifies which information from the current time step should be passed on to the next time step, and the "output gate" specifies which information from the current time step should be used to calculate the final LSTM result.
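The gate mechanism described above can be made concrete with a single LSTM time step. The following is a minimal NumPy sketch of the standard LSTM formulation, not the implementation used in this study. The function name `lstm_step`, the stacked weight layout, and the toy sizes are assumptions chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function; squashes values into (0, 1) so they can act as gates."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (standard formulation; names are illustrative).

    W: (4*hidden, input), U: (4*hidden, hidden), b: (4*hidden,)
    Rows are stacked in the order: forget, input, candidate, output.
    """
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f_t = sigmoid(z[0 * hidden:1 * hidden])   # forget gate: what to drop from c_prev
    i_t = sigmoid(z[1 * hidden:2 * hidden])   # input gate: what new information to store
    g_t = np.tanh(z[2 * hidden:3 * hidden])   # candidate cell values
    o_t = sigmoid(z[3 * hidden:4 * hidden])   # output gate: what to expose as h_t
    c_t = f_t * c_prev + i_t * g_t            # new cell state
    h_t = o_t * np.tanh(c_t)                  # new output (hidden state)
    return h_t, c_t


rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4  # assumed toy sizes
W = rng.normal(size=(4 * hidden_size, input_size))
U = rng.normal(size=(4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x in rng.normal(size=(5, input_size)):  # a sequence of 5 input vectors
    h, c = lstm_step(x, h, c, W, U, b)
```

Because the cell state c_t is updated additively (old state scaled by the forget gate plus gated new information), information can persist over many steps, which is the property the text refers to as long-term memory.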
LSTM is often used for translation, modeling, and time series prediction because it can capture long-term dependencies in sequential data. It has also been used in various other applications, including speech recognition, natural language processing, and machine translation.

D. Methodology
The method for creating a Chatbot using the Long Short-Term Memory (LSTM) algorithm is shown in Figure 2.

Dataset
The dataset contains questions that are often asked of the dedicated academic services for the Merdeka Belajar Kampus Merdeka (MBKM) program, together with the FAQ of the Merdeka Campus website [21]. Table 1 shows the dataset used in building the Chatbot.

Library
The libraries needed to create a Chatbot with Long Short-Term Memory (LSTM) include NumPy for mathematical computing, Matplotlib for visualizing data and models, NLTK for text processing, Pandas for reading data, and TensorFlow for building the model with the LSTM algorithm.

Preprocessing
Data preprocessing manipulates or removes data before it is used. The stages of preprocessing are as follows.

a. Remove Punctuations
In the initial preprocessing stage for text data, special characters such as exclamation points (!), commas (,), periods (.), question marks (?), and others are removed [24]-[26]. This stage makes the data easier to process. Table 2 shows the differences in the data after the removal of punctuation.
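Punctuation removal can be sketched in a few lines of standard-library Python. This is an illustrative sketch rather than the code used in the study; the helper name `remove_punctuation` is an assumption, and only ASCII punctuation is handled.

```python
import string


def remove_punctuation(text):
    """Delete every ASCII punctuation character from the text."""
    return text.translate(str.maketrans("", "", string.punctuation))


print(remove_punctuation("Apa itu Kampus Merdeka?"))  # Apa itu Kampus Merdeka
```

Using `str.translate` with a deletion table avoids a character-by-character loop and leaves letters, digits, and whitespace untouched.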

b. Lemmatization
Lemmatization is the process of removing only the inflectional suffix from a word and restoring its dictionary form (the form present in the dictionary), known as the "lemma" [27]. Table 3 shows how words are turned into a base word.

Table 3. Data Differences After Lemmatization
Before: walk, walker, walked, walking
After: walk

c. Tokenization
Tokenization is the process of splitting the character string of a given document unit into pieces. It segments sentences into "tokens" and omits specific components, including punctuation [28]-[30]. Table 4 shows how to break a sentence into tokens.

Before After
Apa itu Kampus Merdeka?
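As a rough illustration of tokenization, the sketch below splits a sentence into lowercase word tokens and maps each token to an integer index. It is a simplified stand-in written with the standard library only, not the tokenizer used in the study. The function names, the lowercasing step, and the second example sentence are assumptions made for illustration.

```python
def tokenize(sentence):
    """Lowercase a sentence, drop non-alphanumeric characters, and split on whitespace."""
    cleaned = "".join(ch for ch in sentence.lower() if ch.isalnum() or ch.isspace())
    return cleaned.split()


def build_vocabulary(sentences):
    """Map each distinct token to an integer index starting at 1 (0 is kept free for padding)."""
    vocab = {}
    for sentence in sentences:
        for token in tokenize(sentence):
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


tokens = tokenize("Apa itu Kampus Merdeka?")
# The second sentence is a made-up example, not taken from the dataset.
vocab = build_vocabulary(["Apa itu Kampus Merdeka?", "Apa tujuan Kampus Merdeka?"])
sequence = [vocab[t] for t in tokens]
```

Reserving index 0 keeps it available as the fill value when sequences are later padded to a common length.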

d. Padding Text
Padding is used to give every text sequence the same length. Each sequence is made equal in length by adding the value 0 as a suffix or prefix until it reaches the maximum sequence length [31]. In addition, padding can shorten a sequence that exceeds the maximum length. Table 5 shows the differences after padding text.

e. Encoding
Encoding translates categorical data, such as characters or text, into numeric or integer data according to the applied data labels. In this operation, encoding converts the text in the tag column of the data into numeric data that the computer can process as binary values of 0 and 1 [31].
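Padding and label encoding can be illustrated with two small helpers. This is a simplified sketch with assumed names (`pad_sequence`, `encode_labels`) and made-up tag names; library utilities may differ in their defaults, for example in which end of an over-long sequence they cut.

```python
def pad_sequence(seq, max_len, padding="post", value=0):
    """Truncate or zero-pad a list of integers to exactly max_len items."""
    seq = seq[:max_len]                      # shorten sequences that are too long
    fill = [value] * (max_len - len(seq))    # zeros needed to reach max_len
    return seq + fill if padding == "post" else fill + seq


def encode_labels(labels):
    """Assign each distinct label an integer id, in sorted order."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels], classes


padded = [pad_sequence(s, max_len=4) for s in [[1, 2, 3, 4, 5], [1, 6]]]
# The tag names below are hypothetical examples, not the tags of the dataset.
encoded, classes = encode_labels(["tujuan", "definisi", "tujuan"])
```

Here `padded` becomes `[[1, 2, 3, 4], [1, 6, 0, 0]]`, and `encoded` becomes `[1, 0, 1]` with `classes` equal to `["definisi", "tujuan"]`, so the integer ids can be mapped back to tag names after prediction.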

Long Short-Term Memory
In this Chatbot, the neural network starts with an embedding layer, one of the most powerful tools in natural language processing (NLP) [32]. The output of the embedding layer is the input of the recurrent layer, which is a Long Short-Term Memory (LSTM) layer. The output of the LSTM layer is then flattened and passed to a dense layer with the Softmax activation function, because the data labels in this Chatbot belong to more than two classes [19], [20], [33].
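The layer stack described above can be sketched with the Keras API in TensorFlow. This is a hedged sketch rather than the exact model of the study: the vocabulary size, sequence length, embedding size, and number of LSTM units are assumed values, while the 53 output classes and the compile settings follow the values reported in the results.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 500   # assumed number of distinct token ids, including padding index 0
MAX_LEN = 10       # assumed padded sequence length
NUM_CLASSES = 53   # number of classes reported in the results

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,)),
    tf.keras.layers.Embedding(input_dim=VOCAB_SIZE, output_dim=16),  # token ids -> dense vectors
    tf.keras.layers.LSTM(16, return_sequences=True),                 # one output per time step
    tf.keras.layers.Flatten(),                                       # (MAX_LEN, 16) -> (MAX_LEN * 16,)
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),        # class probabilities
])
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"],
)

# Forward pass on random token ids, only to check the output shape.
dummy = np.random.randint(0, VOCAB_SIZE, size=(2, MAX_LEN))
probs = model.predict(dummy, verbose=0)
```

Setting `return_sequences=True` keeps the LSTM output for every time step, which is what makes the subsequent flatten step meaningful. The softmax layer then yields one probability per class for each input sentence.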

Model Analysis
After the model has been trained with the Long Short-Term Memory (LSTM) algorithm and the accuracy at the final step is known, the next stage is to analyze the model by plotting accuracy and loss to inspect the training results of the LSTM model [34].

Testing
Once the accuracy and loss of the LSTM model are known, the next stage is to test the previously trained chatbot model using a confusion matrix [35] on the training data. A confusion matrix is used to evaluate the performance of machine learning classification in which the output may be two or more classes.
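To show how a confusion matrix leads to precision, recall, and F1-score, the sketch below computes them with NumPy for a small three-class example. The labels are made up for illustration and are unrelated to the results of this study; the function names are assumptions.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def per_class_scores(cm):
    """Precision, recall, and F1 for each class; 0 where a denominator is 0."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)   # column sums: how often each class was predicted
    actual = cm.sum(axis=1)      # row sums: support of each class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1, actual


# Made-up labels for three classes, for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
precision, recall, f1, support = per_class_scores(cm)
weighted_f1 = np.average(f1, weights=support)
```

The weighted average weights each class score by its support, so classes with more examples contribute more to the summary value.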

Save Model
After testing, the Chatbot has been adjusted to the sentences and answers. The chatbot model can then be saved in .h5 or .pkl (pickle) format for use in AI Chatbot applications on websites or Android systems [36]. The model file can be stored directly in temporary storage or in Google Drive.

RESULT AND DISCUSSION
The dataset used in the training process with the Long Short-Term Memory algorithm has 53 classes. The model is compiled with the sparse_categorical_crossentropy loss, the adam optimizer, and the accuracy metric. The model was then trained for 400 epochs, and the training results are visualized in Figure 4. The evaluation of the trained model, based on Figure 4 and Table 6, shows that the accuracy after 400 epochs was 100% and the loss was 0.121%. The next stage is to test the model using a confusion matrix. The classification report is shown in Figure 6. These results indicate that the trained model is good because there is no overfitting or underfitting. Furthermore, the model was evaluated on the training data to obtain precision, recall, and F1-score values, as shown in Figure 5. In addition, the confusion matrix results for the Long Short-Term Memory model, with the predicted values for each category/label, can be seen in Figures 5 and 6. The weighted-average precision, recall, and F1-score are 1.00, 1.00, and 1.00, respectively.
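An accuracy of 100% together with a small non-zero loss is what the definitions of the two quantities allow. The sketch below computes sparse categorical cross-entropy and accuracy with NumPy for made-up softmax outputs; it illustrates the definitions only and does not reproduce the reported numbers. The function names are assumptions.

```python
import numpy as np


def sparse_categorical_crossentropy(y_true, probs):
    """Mean negative log-probability assigned to the true class of each sample."""
    probs = np.asarray(probs, dtype=float)
    picked = probs[np.arange(len(y_true)), y_true]  # probability of the correct class
    return float(-np.mean(np.log(picked)))


def accuracy(y_true, probs):
    """Fraction of samples whose most probable class equals the true class."""
    return float(np.mean(np.argmax(probs, axis=1) == np.asarray(y_true)))


# Made-up softmax outputs for 3 samples and 3 classes.
probs = [[0.9, 0.05, 0.05],
         [0.1, 0.8, 0.1],
         [0.2, 0.2, 0.6]]
y_true = [0, 1, 2]
loss = sparse_categorical_crossentropy(y_true, probs)
acc = accuracy(y_true, probs)
```

In this example every sample is classified correctly, so the accuracy is 1.0, yet the loss stays above zero because the model does not assign a probability of exactly 1 to the correct class.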

Figure 7. Validation Chatbot
The final stage is model testing, as shown in Figure 7. The chatbot can answer several questions about the explanation, purpose, and programs of the independent campus (Kampus Merdeka). It can also answer questions related to the Merdeka Belajar Kampus Merdeka program.

CONCLUSION
The Long Short-Term Memory (LSTM) algorithm has been successfully implemented in an academic service chatbot, specifically for the Merdeka Belajar Kampus Merdeka (MBKM) program, through several stages: collecting the dataset, preprocessing the data, building the model, analyzing the model, and testing the model. The LSTM algorithm performs the classification with an accuracy of 100% and a loss of 0.121%. Further evaluation using the weighted-average precision, recall, and F1-score gives 100%, 100%, and 100%, respectively.
For future research, we suggest extending the Chatbot beyond the MBKM program to other academic services by adding more datasets. In addition, the MBKM Chatbot could be developed on top of a knowledge base and use relationships between words when answering a question.

ACKNOWLEDGMENT
Researchers would like to thank the Informatics Engineering study program, Institut Teknologi Telkom Purwokerto, for their support in completing this research.