APPLICATION OF LUNG DISEASES DETECTION BASED ON CSLNet

Lung diseases caused by fungal or bacterial infections can lead to inflammation in the lungs and even death when not detected early. A standard method for diagnosing lung diseases is the chest X-ray, which requires careful examination of the images by a radiology expert. Therefore, this study proposes a new architecture model, CSLNet, to classify chest X-ray images and diagnose whether patients suffer from COVID-19, viral pneumonia, bacterial pneumonia, or tuberculosis, or are normal (healthy). The experimental results show that the model achieves an average Accuracy of 0.99, with Precision, Recall, and F1-score of 0.98. Meanwhile, the Receiver Operating Characteristic (ROC) values for bacterial pneumonia, COVID-19, normal, tuberculosis, and viral pneumonia are 0.97, 0.99, 0.99, 0.94, and 0.97, respectively. This study is based on deep learning with a new model, CSLNet, which works well on the chest X-ray dataset used for diagnosing lung diseases.


INTRODUCTION
Diseases affecting human organs, especially the lungs, are usually caused by smoking, air pollution, or bacterial infections that can attack the respiratory system, leading to serious health issues. In 2019, lung diseases accounted for 212.3 million cases globally with 74.4 million reported deaths [1]. Previous investigations have established that lung diseases cause natural weakness in the respiratory system, requiring patients to be careful in alleviating these issues [2]. The risk of lung diseases such as tuberculosis, pneumonia, and COVID-19 is high, especially in developing and low-income countries, where millions of people face poverty and unhealthy air pollution [3]. Tuberculosis is an infectious disease that is globally considered a significant source of mortality, ranking among the top 10 causes of death [4][5]. Pneumonia caused over 1.1 million hospitalizations and 50,000 deaths in 2010, with the majority of pneumonia-related deaths occurring in patients above 65 years [6]. Meanwhile, COVID-19 is an infectious disease caused by an unprecedented virus [7]. As of September 22, 2022, the number of cases had increased to more than 600 million, and deaths had exceeded 6.5 million. Due to its rapid global spread and respiratory infection, COVID-19 has been declared a pandemic by the World Health Organization (WHO) [8].
Chest X-rays are a fundamental indicator for monitoring and examining severe lung diseases such as tuberculosis, pneumonia, and COVID-19 [9]. They can be used to detect airspace opacity, a common finding on chest X-rays that indicates severe lung infection [3][4][5][6][7][8][9][10]. Therefore, early detection of lung diseases has become more important and can be improved with machine learning and deep learning. Several studies have used deep learning to detect lung diseases. These include a proposed CNN [11], which is used to accurately detect and classify COVID-19 from normal (healthy) and pneumonia patients. The proposed work is to create a model and tools for lung disease detection from radiology data patterns using a combination of a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). A previous study [11] implemented binary and multiclass classification using a CNN, where the binary classification yielded the best results. Narula and Kumar also presented a CNN-LSTM technique trained on X-ray data and showed that this model provided the best results in classifying COVID-19 [12]. The use of transfer learning was investigated for classifying pneumonia in chest X-ray images [13], and the results showed that transfer learning can provide performance benefits from initial training with a little fine-tuning. Furthermore, three models, namely ResNet50, InceptionV3, and DenseNet121, were separately trained through transfer learning and from scratch. The testing results achieved a 4.1% to 52.5% larger Area Under the Curve (AUC) compared to the previously obtained values. A deep learning-based COVID-19 detection system was also proposed by [14] using X-ray images, with a dataset of 76 COVID-19, 4290 pneumonia, and 1583 normal case images. This scheme achieved an Accuracy of 98.3% for COVID-19 cases. The use of deep learning in medical image processing and analysis is a challenging topic in the field of AI [15].
One study [16] proposed a CNN model for pneumonia detection, while another [17] proposed vessel extraction from fundus images. An expert system was also proposed for the detection of brain tumors in high-resolution brain magnetic resonance images [18]. That study used a specially designed deep learning network called SqueezeNet, first proposed in a previous investigation [19]. The proposed deep learning model for COVID-19 diagnosis was based on the SqueezeNet architecture, due to its smaller structure compared to well-known pre-trained network designs [20,21]. Wang and Wong [22] proposed a deep learning model for automated COVID-19 diagnosis and achieved a classification Accuracy of 92.4% in detecting 3 classes labeled normal, pneumonia, and COVID-19. Furthermore, Ioannis et al. [23] improved deep learning using 224 approved CT images as input to achieve 98.75% and 93.48% Accuracy scores for 2 and 3 classes, respectively. Narin et al. [24] used the ResNet50 model to identify COVID-19 from chest X-ray images and achieved 98% classification Accuracy; the best classification score with a structure involving the ResNet50 model was 95.38%. Pathak et al. [25] used CT images and transfer learning techniques in a CNN model for automated COVID-19 diagnosis and achieved high classification performance.
The remainder of this study is organized as follows: Section 2 presents the architectural novelties. Section 3 presents the study framework and methods used. Section 4 presents the experiments and results, while the conclusion is presented in Section 5.

Problem Statement
Recently, X-ray dataset collections have become available in the Kaggle repository [26][27]. In this study, the dataset was processed using deep learning by combining data augmentation, a CNN, a Spatial Transformer Network (STN), and LSTM, referred to as CNN STN LSTM (CSLNet).
This study applied a new algorithm, CSLNet, to analyze a large dataset of lung diseases for predicting and detecting lung diseases in patients. The dataset was large, which made data processing challenging, and it also contained a lot of noise. Moreover, there was insufficient information to predict diseases easily, which made dataset processing a challenging task.
The patients were classified by applying CNN deep learning to their X-ray images. The CNN VGG Data STN (VSDNet) [3] can be considered one of the most promising existing algorithms, so the performance of the new CSLNet was compared with the VSDNet method. Therefore, the main contribution of this study is the development of a new CSLNet algorithm that can predict and detect lung diseases in X-ray images with greater Accuracy than existing methods.

MATERIAL AND METHOD
Figure 1 shows the proposed structural design for this research, where X-ray images of lung diseases are used as input to classify and detect lung diseases.

Dataset Collection
This study used a lung disease X-ray dataset [26][27], which currently stores a total of 9039 lung X-ray images across 5 classes. The dataset was accessed through the Kaggle repository and includes X-ray images of patients diagnosed with bacterial pneumonia, COVID-19, tuberculosis, and viral pneumonia, as well as normal cases. The collected dataset is summarized in Table 1. Images were taken from different online sources; hence, the dimensions may vary from one source to another. Medical images also usually contain noise that degrades image quality.
In the process of classification in machine learning or deep learning, an imbalanced class distribution in a dataset can significantly affect the performance of the model [14]. In this study, a dataset of lung diseases was used to train all tested neural networks. As a preprocessing step, the X-ray images were shuffled to reduce variance and aid faster model convergence. The top 5% of each chest X-ray image was cropped before training to remove unnecessary space. At this stage, resizing was crucial because images were obtained from different online sources with varying dimensions. The final step was normalization, where the numpy array used to read the images was normalized by dividing the image matrix by 255. Training the tested deep neural network architectures required data augmentation with the following configuration: rescale (first scale down all chest X-ray images to reduce their size), shear range (±20°), zoom range (±15%), and fill mode.
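The cropping and normalization steps described above can be sketched as follows. This is a minimal illustration under the stated assumptions (top 5% crop, division by 255); the function name and toy image are hypothetical, and resizing and augmentation would in practice be handled by a library such as Keras.

```python
import numpy as np

def preprocess_xray(img: np.ndarray, crop_top_frac: float = 0.05) -> np.ndarray:
    """Crop the top fraction of a chest X-ray and normalize pixel values to [0, 1]."""
    h = img.shape[0]
    cropped = img[int(h * crop_top_frac):, :]      # drop the top 5% of rows
    return cropped.astype(np.float32) / 255.0      # scale 0-255 -> 0-1

# toy 100x80 grayscale "image"
img = np.full((100, 80), 255, dtype=np.uint8)
out = preprocess_xray(img)
print(out.shape)   # (95, 80)
print(out.max())   # 1.0
```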

Sampling
In this study, the sample data was partitioned into three parts, namely training (80%), validation (10%), and testing (10%) data, as presented in Table 2. CNNs represent a very good class of models but are still limited by their inability to be spatially invariant to input data in a computationally and parameter-efficient manner [28]. The STN mechanism is divided into three parts, as shown in Figure 2. Sequentially, the localization network first takes the input feature map and, after passing it through several hidden layers, generates the spatial transformation parameters that should be applied to the feature map, resulting in an input-dependent transformation. Subsequently, the predicted transformation parameters are used to create a sampling grid, which is a set of points where the input map is sampled using a grid generator to produce the transformed output. Finally, the feature map and sampling grid are taken as inputs to the sampler, producing an output feature map sampled from the input at the grid points.
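The three-part mechanism above (localization network, grid generator, sampler) can be sketched in plain NumPy. This is an illustrative sketch, not the paper's implementation: the transformation parameters are assumed to be a 2x3 affine matrix theta (as produced by a localization network), and nearest-neighbour sampling stands in for the bilinear sampler usually used.

```python
import numpy as np

def affine_grid(theta, H, W):
    """Grid generator: map each output pixel to normalized input coordinates T_theta(G)."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, H), np.linspace(-1, 1, W), indexing="ij")
    g = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)])  # homogeneous coords, 3 x HW
    src = theta @ g                                         # 2 x HW source (x, y) coords
    return src[0].reshape(H, W), src[1].reshape(H, W)

def sample_nearest(U, theta):
    """Sampler: read the input feature map U at the grid points (nearest neighbour)."""
    H, W = U.shape
    xs, ys = affine_grid(theta, H, W)
    ix = np.clip(np.round((xs + 1) * (W - 1) / 2).astype(int), 0, W - 1)
    iy = np.clip(np.round((ys + 1) * (H - 1) / 2).astype(int), 0, H - 1)
    return U[iy, ix]

U = np.arange(16, dtype=float).reshape(4, 4)
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
V = sample_nearest(U, identity)
print(np.array_equal(U, V))   # True: the identity transform reproduces the input
```

In a real STN, theta would be regressed by the localization network per input, and bilinear sampling would keep the module differentiable end to end.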

Long Short-Term Memory
LSTM is an improvement on Recurrent Neural Networks (RNNs) that uses memory blocks instead of conventional RNN units to solve the problems of vanishing and exploding gradients [29]. Furthermore, it incorporates a cell state to store long-term memory, which is the main difference from RNNs. LSTM networks can also remember and connect previous information with the current data [30]. The LSTM combines three gates, namely the input, forget, and output gates, where x_t refers to the current input, C_t and C_{t-1} indicate the current and previous cell states, while h_t and h_{t-1} are the current and previous outputs, respectively. The internal structure of the LSTM is shown in Figure 3.

The principle of the LSTM input gate is given by:

i_t = σ(W_i · [h_{t-1}, x_t] + b_i)    (1)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)    (2)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ C̃_t    (3)

where (1) passes h_{t-1} and x_t through the sigmoid layer to determine which pieces of information should be added, and (2) obtains the new candidate information after h_{t-1} and x_t are passed through the tanh layer. The current-moment information C̃_t and the long-term memory C_{t-1} are combined into C_t in (3), where i_t is the sigmoid output and C̃_t is the tanh output. Moreover, W_i represents the weight matrix and b_i is the bias of the LSTM input gate. The LSTM forget gate selects information paths using the sigmoid layer and the dot product. The decision to forget information from the previous cell state with a certain probability is carried out using (4), where W_f refers to the weight matrix, b_f is the bias, and σ is the sigmoid function:

f_t = σ(W_f · [h_{t-1}, x_t] + b_f)    (4)

The LSTM output gate determines the state required for continuation with inputs h_{t-1} and x_t following (5) and (6). The final output h_t is obtained by multiplying the state decision vector o_t by the new cell state C_t passed through the tanh layer:

o_t = σ(W_o · [h_{t-1}, x_t] + b_o)    (5)
h_t = o_t ⊙ tanh(C_t)    (6)

where W_o and b_o are the weight matrix and bias of the LSTM output gate, respectively.
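The gate equations can be checked with a small NumPy forward pass of one LSTM step. This is a minimal sketch, not the paper's code: the weight layout (one matrix per gate acting on the concatenation of the previous output and current input) and all names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM time step; W and b hold parameters for the four gates i, c, f, o."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate
    C_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate cell state
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate
    C_t = f_t * C_prev + i_t * C_tilde       # new cell state
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate
    h_t = o_t * np.tanh(C_t)                 # hidden output
    return h_t, C_t

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = {k: rng.standard_normal((n_hid, n_in + n_hid)) * 0.1 for k in "icfo"}
b = {k: np.zeros(n_hid) for k in "icfo"}
h, C = np.zeros(n_hid), np.zeros(n_hid)
h, C = lstm_step(rng.standard_normal(n_in), h, C, W, b)
print(h.shape, C.shape)   # (4,) (4,)
```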

A Combination of CNN and LSTM
Figure 4 shows the proposed model architecture for detecting lung diseases, which consists of 5 convolutional blocks: blocks 1 to 3 each contain 2 convolutional layers and 1 pooling layer, while blocks 4 and 5 each contain 3 convolutional layers and 1 pooling layer, and all blocks use a 20% dropout rate. This makes a total of 12 convolutional and 5 pooling layers. A kernel size of (3,3) is used with 64, 128, 256, 512, and 1024 filters; as the model goes deeper, the number of filters increases to capture more complex features. ReLU (Rectified Linear Unit) is used as the activation function because it introduces non-linearity into the CNN and, in terms of performance, provides better Accuracy than the Sigmoid or Softmax functions [31]. Same padding is used in every convolutional layer, with a stride of (1,1).
A pooling layer is used in the CNN architecture to prevent information loss as the image shrinks after each convolution operation. The max-pooling layer is preferred because it selects the maximum value from the part of the image covered by the kernel, which helps extract the most important features from the input matrix. It also suppresses noise in the activations while reducing dimensionality. A kernel size of (2,2) is used for the max-pooling layers. The features extracted from the CNN layers are fed into a sequence of LSTM layers using the reshape method, with the LSTM layer input size being (49, 1024). The 'tanh' activation function is used in each LSTM cell, with a 20% dropout rate, similar to the CNN layers. The proposed architecture summary is shown in Table 2. After analyzing the temporal characteristics, the architecture passes the X-ray features through fully connected layers to predict whether each image belongs to one of the 5 categories, namely bacterial pneumonia, COVID-19, viral pneumonia, normal, and tuberculosis.
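The (49, 1024) LSTM input follows from simple shape arithmetic: with same padding and stride (1,1), only the five 2x2 max-pools change the spatial size. Note that the 224x224 input resolution below is an assumption inferred from the (49, 1024) reshape; the paper does not state it at this point.

```python
# Track feature-map shapes through the 5 convolutional blocks described above.
# 'same' padding with stride (1,1) preserves spatial size; each 2x2 max-pool halves it.
# ASSUMPTION: a 224x224 input, inferred from the (49, 1024) LSTM input size.
filters = [64, 128, 256, 512, 1024]
h = w = 224
for f in filters:
    h, w = h // 2, w // 2          # one 2x2 max-pool per block
print((h, w, filters[-1]))         # (7, 7, 1024)
print((h * w, filters[-1]))        # reshaped LSTM input: (49, 1024)
```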

Implementation of CSLNet
The algorithms in this study were implemented using Jupyter Notebook, TensorFlow, and Keras. The following section provides a detailed description of the implementation process. This represents the fundamental framework of the paper and can be executed in Jupyter Notebook as CSLNet for the entire dataset. The comprehensive architecture of CSLNet is depicted in Figure 5.
The structure contains three key layers, in the following order:

Evaluation Metrics
To evaluate the performance of the proposed method quantitatively, evaluation metrics such as Accuracy (Acc), Precision (Pre), Recall (Rec), and F1-score are calculated statistically from the confusion matrix, where TP, TN, FP, and FN denote True Positives, True Negatives, False Positives, and False Negatives. Accuracy measures how often the classifier predicts correctly, while Precision describes the proportion of cases predicted as positive that are truly positive; Precision is useful when False Positives matter more than False Negatives. Recall describes how many of the actual positive cases the model can predict and is useful when False Negatives matter more than False Positives. The F1-score combines Precision and Recall and is maximal when Precision equals Recall. The selected evaluation metrics are defined in equations (7)-(10):

Acc = (TP + TN) / (TP + TN + FP + FN)    (7)
Pre = TP / (TP + FP)    (8)
Rec = TP / (TP + FN)    (9)
F1 = 2 · Pre · Rec / (Pre + Rec)    (10)

These metrics show the quality of the classification results and expose the weaknesses of the model for certain categories.
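Equations (7)-(10) can be computed directly from the confusion-matrix counts; the sketch below uses hypothetical toy counts for a single class, not values from this study.

```python
def metrics(tp, tn, fp, fn):
    """Compute Acc, Pre, Rec, and F1 from confusion-matrix counts, per equations (7)-(10)."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * pre * rec / (pre + rec)
    return acc, pre, rec, f1

# toy counts for one class (illustrative only)
acc, pre, rec, f1 = metrics(tp=90, tn=95, fp=5, fn=10)
print(round(acc, 3), round(pre, 3), round(rec, 3), round(f1, 3))  # 0.925 0.947 0.9 0.923
```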

Experimental Setup
In the experiment, the dataset was divided into 80%, 10%, and 10% for training, validation, and testing, respectively. The results were obtained using a 5-fold cross-validation technique, and the proposed network consisted of 12 convolutional layers, as shown in Table 3. The learning rate was 0.001 and the maximum number of epochs was 25, as determined experimentally.
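The 5-fold cross-validation split can be sketched in plain Python as follows. The function name and seed are illustrative; in practice a library routine such as scikit-learn's KFold would typically be used.

```python
import random

def kfold_indices(n, k=5, seed=42):
    """Split n sample indices into k roughly equal folds after shuffling (a minimal sketch)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    # each fold serves once as the test set; the rest form the training set
    return [(sorted(set(idx) - set(f)), sorted(f)) for f in folds]

splits = kfold_indices(100, k=5)
print(len(splits))                             # 5 folds
print(len(splits[0][0]), len(splits[0][1]))    # 80 train / 20 test per fold
```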
The CNN, CNN-LSTM, and CSLNet networks were implemented using Jupyter Notebook, Python, and TensorFlow on an Intel® Core™ i7 2.8 GHz processor. Furthermore, tests were carried out using an NVIDIA GTX 1060 graphics processing unit (GPU) with 6 GB of GPU memory and 16 GB of system RAM.

Model Training and Validation
The proposed model was trained and evaluated using augmented X-ray data. To demonstrate the model's convergence when the test dataset was limited, the proposed CSLNet was evaluated with k-fold cross-validation (k=5 folds). From all the samples collected in Table 1, the dataset distribution is shown in Table 2, and the model was evaluated on various ML classification metrics, including Accuracy, Precision, Recall, F1-score, and ROC. The experimental results comparing the CNN, CNN+LSTM, and CSLNet models are shown in Figure 7 and Table 4. Figure 6 shows the Accuracy and loss curves for the proposed CNN, CNN+LSTM, and CSLNet models. The 5 folds were run to train the models, and the Accuracy as well as the loss for the training and validation datasets were recorded simultaneously. To avoid overfitting on the training dataset, the models were trained for only up to 25 epochs per fold. Training and validation Accuracy of 0.98 and 0.99 were recorded after 5 folds, with loss values of 0.41 and 0.36, respectively. The learning curves show that the best Accuracy and loss values were achieved by the CSLNet model.

Evaluation Metrics
The performance of the proposed network was evaluated on independent test sets. The Accuracy, Precision, Recall, and F1-score for each class were calculated and are summarized in Table 5.
The results showed significantly high Accuracy on the independent test set for all classes, with a mean Accuracy of 0.99, Precision of 0.98, Recall of 0.98, and F1-score of 0.98. The Receiver Operating Characteristic (ROC) values for bacterial pneumonia, COVID-19, normal, tuberculosis, and viral pneumonia were 0.97, 0.99, 0.99, 0.94, and 0.97, respectively, as shown in Figure 7a; Figure 7b shows the confusion matrix.

Lung Diseases Detection Results
After the created model successfully classified lung diseases by class, it also localized the diseased regions inside the lungs, as presented in Figure 8.

Discussion
This study designed and implemented a machine learning model for diagnosing lung diseases from X-ray images. The high performance, as shown in Table 6, suggests that the model can assist radiologists in their diagnoses. This provides a solution that reduces the time and effort needed to diagnose lung diseases and distinguish healthy patients. The primary motivation of this study was to ensure easy implementation and use by radiologists when patients suffer from lung disorders caused by disease. To obtain a comprehensive dataset, a large number of chest X-rays, 9039 in total across the disease classes, were collected from the Kaggle repository. Number-based labeling was used due to its effectiveness in terms of cost and time.

Implementation of Lung Diseases Detection Tools
All coding was carried out using the Python programming language in Visual Studio Code, with MySQL as the database. The login page, as presented in Figure 9, was specifically designed for specialist doctors to access the system. The radiologist must enter a username and password before accessing the lung diagnostic system. When the radiologist enters the username and password incorrectly, the system rejects the login request; when they are entered correctly, the system grants the radiologist access. Figure 10 shows the home page after the doctor has successfully logged in.
Radiologists must also enter patient data before diagnosing a patient's disease. The patient data to be entered by the radiologist includes the name, age, gender, contact, and address of the patient. Figure 11 shows the patient data entered by the radiologist.
When a radiologist enters a chest X-ray image, the patient's disease can be predicted immediately by pressing the prediction button. Furthermore, the radiologist can save the entered diagnosis report to the database and print it by pressing the submit button. Figure 12 shows the disease diagnosis input performed by the radiologist.
Diagnostic reports made by radiologists are stored in the database and become output reports that are given to patients. Figure 13 shows the output report of a lung disease diagnosis.
The objectives of this study include:
- To classify and accurately detect lung diseases such as tuberculosis, bacterial pneumonia, viral pneumonia, and COVID-19.
- To check the generalization of the method using a larger X-ray image dataset.
- To achieve higher performance metrics by comparing the STN-CNN-LSTM, CNN-LSTM, and CNN combinations.
- To calculate all performance metrics and compare different parameters of the proposed technique with current techniques.
- To create an STN-CNN-LSTM-based model and tools for lung disease detection that can be used by medical professionals and the public as a reference for determining the type of lung disease.

Figure 2 .
Figure 2. Spatial Transformer Architecture. The input feature map U is passed through a localization network that regresses the transformation parameters θ. The regular spatial grid G over V is transformed to the sampling grid Tθ(G), which is applied to U, resulting in the output feature map V. The combination of the localization network and the sampling mechanism defines the STN.

Figure 3 .
Figure 3. Long Short-Term Memory Structure. The LSTM gates follow equations (1)-(6):

i_t = σ(W_i · [h_{t-1}, x_t] + b_i)    (1)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)    (2)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ C̃_t    (3)
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)    (4)
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)    (5)
h_t = o_t ⊙ tanh(C_t)    (6)

where W and b denote the weight matrix and bias of each gate, σ is the sigmoid function, x_t is the current input, C_t and C_{t-1} are the current and previous cell states, and h_t and h_{t-1} are the current and previous outputs.

Figure 4 .
Figure 4. A Combination of CNN and LSTM Networks


Spatial transformer layers
(i) There are five layers.
(ii) The first part is a lambda (λ) layer to transfer the default range [-0.5:1.0], indicating that the lung X-ray image features are normalized.
(iii) The second part is batch normalization.
(iv) The third part is the localization network.
(v) The fourth part is the grid generator.
(vi) The fifth layer is the optimal input I.

Feature extraction layers
(i) The CNN model has been pre-trained.
(ii) The CNN architecture has 12 convolutional layers, 5 pooling layers, and 1 LSTM layer.

Classification layers
(i) The initial layer is the flatten layer, derived from the output of the CNN. This output covers the five classes: 'Bacterial pneumonia', 'COVID-19', 'Normal', 'Tuberculosis', and 'Viral pneumonia'. These classes also play a role in determining the sorting, as observed in the simulation, and are incorporated into the subsequent layers, forming the dropout layer.
(ii) The final layer consists of dense and dropout layers, gradually decreasing in depth.

Figure 5 shows the flowchart of the CSLNet implementation for detecting lung diseases.

ISSN 2089-8673 (Print) | ISSN 2548-4265 (Online), Volume 12, Issue 3, December 2023, Jurnal Nasional Pendidikan Teknik Informatika (JANAPATI), p. 443

Figure 11 .
Figure 11. Patient's Data Page

Table 6 .
Comparison of Architectural Models Used in Related Studies and This Study