Quantile Regression Analysis for Students’ Difficulty Level in Learning Statistics Online


ABSTRACT
Learning statistics through distance education makes students uneasy because of difficulties in the learning environment and processes. This article analyzes the level of difficulty faced by students in statistics distance education and identifies the factors affecting it. Cross-sectional secondary data from a recent study in the literature were used. The dependent variable was the level of difficulty in learning statistics in the online setup, measured on a scale of 1 to 10. The data were summarized with standard descriptive measures, and quantile regression models were constructed. Results showed that students found learning statistics during distance education "difficult". The quantile regression revealed that the learning environment, residence in less accessible places (rural areas), and a larger number of family members are the statistically significant factors that influence the difficulty level in learning statistics online. This implies that, because of a distracting place of learning, students cannot focus on and absorb their lessons. In addition, students struggled with communication technology and internet access, which are vital for classroom engagement and the acquisition of learning resources. Hence, the study suggests that the government support students' learning needs and that teachers promote a positive and interesting educational environment. Furthermore, teachers should undergo training and workshops to be better equipped to teach statistics online.

INTRODUCTION
Distance education was implemented to curb the spread of COVID-19 and was delivered through modular and online learning (Almarashdi & Jarrah, 2021; Buba et al., 2020; Castroverde & Acala, 2021). During this time, teachers and students were unable to communicate well because of shortcomings in real-time classes. Teachers did not have enough opportunity to monitor students' actual academic progress, since outside factors were beyond their control (Fatimah & Santiana, 2017; E.-J. Kim et al., 2021). In addition, teachers faced limitations in presenting their lessons because they lacked training and seminars on distance education. Teachers also experienced stress, burnout, and anxiety due to the difficulties and barriers brought on by the COVID-19 pandemic (Abuhmaid, 2022; Bravo et al., 2021). As a result, students' learning was adversely affected, particularly in lesson presentation and lecture-discussion. Students also experienced problems with the misuse of technology and with internet connectivity, which hindered their learning (Casinillo, 2022a; Dontre, 2021). As a consequence, students had difficulty understanding the lessons and were unable to grasp vital concepts in their course of study. In fact, in distance learning, students were uninterested in learning because of the boring setup of the lecture-discussion, and they were distracted by other factors in their homes (S.H. Kim & Park, 2021; Onyema et al., 2020).
Statistics is one of the more difficult subjects to learn, since it requires logical thinking, focus, and an analytical mind. In the time of distance education, teaching statistics is a very challenging endeavor because of limitations in communication and in the interaction between teachers and students (Casinillo, 2022b; Miñoza & Casinillo, 2022). Hence, students in statistics courses have a hard time understanding the core concepts of the subject. According to a previous study, a statistics course needs thorough guidance from the teacher, and a good strategy is a must so that students can grasp the logical principles of the course (Casinillo et al., 2022). In fact, statistics at the college level requires analytical thinking that must be guided closely by the teacher because of its technicality (Nurlina et al., 2022; Wu & Nian, 2021). The impact of COVID-19 on the educational setup made it more difficult for teachers and students to communicate and interact because of boundaries and barriers. In addition, students experienced delays in submitting their outputs, or submitted incomplete learning tasks, because they did not have enough knowledge to answer or solve the statistics problems (Apriliana, 2021; Efriana, 2021). This difficult situation greatly affected students' academic performance, as the quality of their learning diminished. In view of that, it is vital to investigate the factors that influence their level of difficulty in learning statistics in order to find remedies and improve their learning ability amid distance education. Moreover, many studies in the literature have shown that students' logical comprehension of their lessons is affected by their socioeconomic situation, which is itself adversely impacted by health crises (Almarashdi & Jarrah, 2021; Casinillo, 2022b).
Students' families were also experiencing economic crises and obstacles in meeting their basic needs and school requirements. As a result, students' education, particularly in statistics courses, was adversely affected, making it difficult for them to perform well and even to learn important topics (Sefriani et al., 2021; Shazia & Khan, 2015). Hence, research on the level of difficulty in learning statistics in distance education is crucial for improving students' ability to attain good academic achievement and understanding. Although students' difficulty level in learning statistics during the pandemic is well studied, the investigation of its influencing factors using regression modeling is scarce. In addition, the different difficulty levels in learning statistics online have not yet been examined using quantile regression (Ariani, 2020; Hasanah, 2022). Thus, this research article focuses on college students taking statistics. Generally, this study elucidates the level of learning difficulty in statistics during distance education and determines its governing determinants during the global health crisis (Baticulon et al., 2021; Fabito et al., 2020). Specifically, the article pursued the following objectives: (i) to describe the socio-demographic and learning profile of statistics students; (ii) to categorize the level of difficulty in learning statistics online; and (iii) to capture the factors affecting the difficulty level of learning statistics with the aid of quantile regression modeling (Bravo et al., 2021; Buba et al., 2020).
The purpose of this study is to draw inferences about students' ability to learn statistics amid the global health crisis that might be used as a basis for distance education policy. Moreover, the findings might help students overcome difficulty in learning statistics whenever they encounter similar situations in the future. Furthermore, the results may inform statistics educators in enhancing their teaching strategies and may help improve statistics education in the country and throughout the globe.

METHOD
This study utilized a complex-correlational research design to capture and explain the difficulty level in learning statistics and its governing predictors during online learning. The article used descriptive measures to summarize the collected data and employed regression analysis to estimate the relationship between the dependent and independent variables. Secondary, cross-sectional data from a previous paper were employed. That paper dealt with engineering students who took the "Engineering Data Analysis" course offered by the Department of Statistics at Visayas State University, Baybay City, Philippines (Casinillo, 2022a). The published study investigated the determinants of challenge in statistical learning in online education; however, it did not focus on how difficult learning statistics online is or on its predictors at different levels of difficulty.
Thus, this article is a follow-up study that elucidates students' difficulty level in learning statistics during the distance education setup, which could be useful in improving statistics education in the Philippines and beyond. The study utilized the level of difficulty in learning statistics during the online setup as the dependent variable, measured on a scale of 1 to 10, where 1 means not difficult and 10 means extremely difficult. In addition, the difficulty level can be categorized as follows: {1, 2} - not difficult, {3, 4} - slightly difficult, {5, 6} - moderately difficult, {7, 8} - difficult, and {9, 10} - very difficult. Moreover, the study selected the following variables as possible regressors of difficulty in learning: (1) estimated age of students in years, (2) gender (0=female, 1=male), (3) place of residence (0=rural, 1=urban), (4) number of family members (count), (5) estimated household assets in Philippine pesos (PHP), (6) estimated monthly family income (PHP), and students' ratings (scale of 1 to 10) of the following: (7) learning environment, (8) internet signal for learning, (9) health, and (10) leisure time. After the data needed for this article were selected, they were evaluated and refined by removing extreme responses or outliers in Microsoft Excel. The data were then formatted for analysis in the statistical software STATA.
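The five-band categorization of the 1 to 10 difficulty rating described above can be sketched in a few lines of Python. This is only an illustration of the stated bands; the function name `difficulty_category` is hypothetical and is not part of the study, which carried out its analysis in Excel and STATA.

```python
# Hypothetical helper (not from the study): maps an integer rating on the
# 1-10 scale to the five difficulty labels listed in the text.
LABELS = [
    "not difficult",         # ratings 1-2
    "slightly difficult",    # ratings 3-4
    "moderately difficult",  # ratings 5-6
    "difficult",             # ratings 7-8
    "very difficult",        # ratings 9-10
]


def difficulty_category(score: int) -> str:
    """Return the difficulty label for an integer rating from 1 to 10."""
    if not isinstance(score, int) or not 1 <= score <= 10:
        raise ValueError("score must be an integer from 1 to 10")
    # Each band covers two consecutive ratings, so integer division picks the band.
    return LABELS[(score - 1) // 2]


if __name__ == "__main__":
    for rating in (2, 5, 8, 10):
        print(rating, "->", difficulty_category(rating))
```

The helper accepts only whole-number ratings, because the text defines the bands for the integers 1 to 10 and does not state cut-offs for fractional values such as a mean score.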
The study used standard descriptive measures such as the mean (M), frequency (n), percentage (%), standard deviation (SD), the coefficient of variation (CV) as a consistency test, and the Chi-square test for a uniform distribution. Moreover, to capture the determinants of the difficulty level in learning statistics in the online setup, quantile regression models were constructed. The dependent variable was examined at three levels: low-level difficulty (25th quantile), moderate-level difficulty (50th quantile), and high-level difficulty (75th quantile). An ordinary least squares (OLS) regression was also run and served as baseline information, since it represents the mean difficulty level in the model. Furthermore, post-estimation techniques for the OLS regression were employed to ensure the validity of the model parameters, and statistical results were tested at the standard levels of significance (Mátyás & Sevestre, 2013).

Result
As gleaned from Table 1, only 0.81% of the students were 18 years old, 33.87% were 19, 36.29% were 20, 25% were 21, 2.42% were 22, and 1.61% were 23. On average, the students' age was close to 19.99 years (SD=0.93). About 38.21% of the students were male and 61.79% were female. Only 23.58% lived in urban areas, while the majority (76.42%) lived in rural areas. About 15.45% of the students rated their internet signal in the interval 1-3 on the 10-point scale, 67.48% in the interval 4-7, and 17.07% in the interval 8-10. On average, their rating for internet signal was close to 5.44 (SD=1.97) out of 10. The learning environment was rated 1-3 out of 10 by 21.95% of the students, 4-7 by 56.10%, and 8-10 by 21.95%. The students' average rating for their learning environment was close to 5.44 (SD=1.97).
About 21.14% of the students rated their health status during the pandemic 1-3 out of 10, 66.67% rated it 4-7, and 12.20% rated it 8-10. On average, health status was rated 5.46 (SD=2.08) out of 10. About 13.01% of the students rated their leisure time during the pandemic 1-3 out of 10, 46.34% rated it 4-7, and 40.65% rated it 8-10. On average, students' rating of their leisure activities was close to 6.61 (SD=2.51). A majority (55.28%) of the students have 1-5 family members, 43.09% have 6-10 members, and only 1.63% have 11 or more. On average, the estimated household assets and monthly family income in Philippine pesos (PHP) were close to 177,903.3 (SD=333,895.2) and 20,572.36 (SD=15,191.12), respectively. The difficulty level in learning statistics in distance education is shown in Table 2. Table 2 shows that the distribution of the level of difficulty in learning statistics online is not uniform based on the Chi-square test (X²=93.54, p-value<0.001). This implies that the students are not spread evenly across the difficulty categories: the number who found learning statistics online difficult differs significantly from the number who found it not difficult.
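The Chi-square test for a uniform distribution reported above can be retraced with a short sketch. The category counts below are an assumption: they are back-calculated from the percentages the text reports for Table 2 with n=123, since the table itself is not reproduced here. The code uses only the Python standard library rather than the STATA routine the authors used.

```python
import math

# Assumed counts, back-calculated from the reported Table 2 percentages (n = 123).
observed = {
    "not difficult": 2,          # 1.63%
    "slightly difficult": 9,     # 7.32%
    "moderately difficult": 15,  # 12.20%
    "difficult": 36,             # 29.27%
    "very difficult": 61,        # 49.59%
}

n = sum(observed.values())
expected = n / len(observed)  # equal count per category under a uniform distribution
chi_square = sum((count - expected) ** 2 / expected for count in observed.values())
df = len(observed) - 1

# With 4 degrees of freedom, the chi-square upper-tail probability has the
# closed form exp(-x/2) * (1 + x/2), so no external library is needed.
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)

print(f"n = {n}, chi-square = {chi_square:.2f}, df = {df}, p < 0.001: {p_value < 0.001}")
```

Under these assumed counts the statistic comes out at about 93.54 on 4 degrees of freedom, which agrees with the value reported in the text.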
Table 2 also shows that only 1.63% of the students said that learning statistics at a distance is "not difficult", 7.32% said it is "slightly difficult", 12.20% said it is "moderately difficult", 29.27% said it is "difficult", and the largest share (49.59%) said it is "very difficult". On average, students' difficulty perception score was close to 7.91 (SD=2.21), which can be interpreted as "difficult". The coefficient of variation (CV>20%) revealed that the difficulty perception score is not consistent. Regression models for the difficulty level in learning statistics online are shown in Table 3. Table 3 presents the different regression models that capture influencing factors of the difficulty level in learning statistics online. Model I is the OLS model, which serves as baseline information for the main models, namely the 25th, 50th, and 75th quantile regressions (Models II, III, and IV). The Breusch-Pagan test (X²=3.36, p-value=0.067) indicated that the OLS model is not heteroscedastic, that is, it has constant variance. The Ramsey RESET test (X²=0.67, p-value=0.57) indicated that no omitted variable bias exists in the OLS model. In addition, no multicollinearity exists among the predictor variables based on the variance inflation factor (VIF), since VIF<10 (Allison, 2012). However, the Shapiro-Wilk test (Z=4.38, p-value<0.001) showed that the residuals of the OLS model are not normally distributed. The OLS model is not significant (F=1.12, p-value=0.35), and the coefficient of determination is only 0.091.
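The consistency check on the difficulty perception score can be retraced from the reported summary statistics. The sketch below assumes the usual definition of the coefficient of variation, CV = SD / M × 100, and takes the 20% cut-off from the text's "CV>20%" criterion; the variable names are illustrative only.

```python
# Summary statistics reported in the text for the difficulty perception score.
mean_score = 7.91
sd_score = 2.21

# Coefficient of variation as a percentage of the mean (assumed standard definition).
cv_percent = sd_score / mean_score * 100

# The 20% threshold follows the "CV>20%" criterion quoted in the text.
is_consistent = cv_percent <= 20

print(f"CV = {cv_percent:.1f}%, consistent: {is_consistent}")
```

With these inputs the CV is roughly 27.9%, above the 20% threshold, which matches the text's reading that the scores are not consistent.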
This indicates that, based on the OLS results, there are essentially no significant factors that affect the difficulty in learning statistics. In fact, the learning environment is the only significant factor based on the individual t-tests, and it is significant only at the 10% level. In the 25th quantile regression (Model II: lower level of difficulty), the only significant factor for the difficulty in learning statistics is the students' "learning environment" (significant at 10%). Additionally, in the 50th quantile regression (Model III: middle level of difficulty), the significant factor for the difficulty in learning statistics online is the students' "family members" (significant at 10%). Furthermore, in the 75th quantile regression (Model IV: high level of difficulty), three statistically significant factors influence the difficulty in learning statistics in distance education, namely the students' "residence" (significant at 10%), "learning environment" (significant at 10%), and "family members" (significant at 10%).
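To make concrete what the 25th, 50th, and 75th quantile models target, the sketch below illustrates the check (pinball) loss that quantile regression minimizes. It is a simplified illustration, not the authors' STATA procedure: it fits only a constant (no predictors), in which case the minimizer is a sample quantile, and the ratings are hypothetical values rather than the study's data. The function names `pinball_loss` and `best_constant` are also hypothetical.

```python
def pinball_loss(tau, values, q):
    """Average check (pinball) loss of predicting q for every value at quantile level tau."""
    total = 0.0
    for y in values:
        residual = y - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1) * residual
    return total / len(values)


def best_constant(tau, values):
    """Observed value that minimizes the pinball loss at level tau."""
    candidates = sorted(set(values))
    return min(candidates, key=lambda q: pinball_loss(tau, values, q))


# Hypothetical difficulty ratings on the 1-10 scale (not the study's data).
scores = [3, 5, 6, 7, 8, 8, 9, 9, 10, 10, 10]

for tau in (0.25, 0.50, 0.75):
    print(f"tau = {tau:.2f}: fitted constant = {best_constant(tau, scores)}")
```

For these hypothetical ratings the fitted constants are 6, 8, and 10 for the 25th, 50th, and 75th levels. A full quantile regression replaces the constant with a linear combination of the regressors and minimizes the same loss, which is why a factor can be significant at one quantile and not at another.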

Discussion
Based on the descriptive statistics, students' profiles were adversely affected by the COVID-19 pandemic. Students experienced internet connection problems, especially since most of them live in rural or remote areas. Internet connectivity is vital for students so that they can attend classes and do their research online. However, because of financial and internet signal problems, their studies in statistics were negatively influenced, and their academic performance was affected as well (Irfan et al., 2020; Reed et al., 2002). The learning environment at the students' homes is neither convenient nor conducive, owing to distractions and other factors that disturb their focus on learning statistics. According to a previous study, students are depressed in their studies because of the boring learning environment and the limitations of online classes (Islam et al., 2020). Because of the lockdown during the pandemic, students' mental and physical health was negatively affected, and some physical activities were prohibited, which caused them anxiety and boredom. In addition, leisure activities outside the home were also banned, which left students stressed because they could not unwind and find relief from the stressful nature of online learning.
According to a previous study, because of the anxiety caused by the pandemic, students' health and well-being were disturbed, which correlates with their learning outcomes and performance (S.H. Kim & Park, 2021). In addition, the results showed that, on average, students found learning statistics amid distance education "difficult". This means that challenges and limitations affect their learning ability and study habits, which hinders their understanding of statistics lessons. Another study reported that students struggle to learn effectively without proper tools during online education amid the pandemic (Sefriani et al., 2021). Moreover, another study found that students' study habits were adversely affected by various distractions in their surroundings and by barriers in the teaching-learning process during distance education (Casinillo, 2022b). However, their experience of difficulty is not consistent, which indicates that it varies depending on current circumstances and the learning environment. Other work has shown that, with the right educational technology, pedagogy, and teaching strategies, students can still learn effectively in statistics courses (Yates et al., 2021).
The quantile regression model revealed that the students' learning environment governs the level of difficulty in learning statistics at a distance. This indicates that it is not easy for students to study at home, since there are things that draw their attention away from their classes and studies. Additionally, they are uncomfortable learning statistics in an unconducive environment where they cannot directly ask their instructors or professors for help. A previous study noted that the new, online learning environment reduces students' interaction and engagement in class discussions and activities (Salta et al., 2022). Likewise, another study found that most students do not acquire the appropriate knowledge in their subjects and are dissatisfied with the unprecedented online learning environment (Torres Martín et al., 2021).
At the same time, the quantile regression model showed that students living in rural areas are more likely to experience difficulty in learning statistics than students residing in urban areas. In rural areas, internet connection is relatively problematic, especially in remote areas where students cannot find a signal and urgent information about classes is hard to access. Studies have shown that students in rural communities are at a disadvantage with regard to internet connections and other opportunities compared with students in urban places (Barrot et al., 2021; Casinillo et al., 2022). A previous study also reported that the problems of students living in remote areas are communication gadgets, a stable network, and the cost of the internet, which hinder their studies (Ujianti, 2021).
Furthermore, the quantile regression model revealed that students with more family members face more difficulty in learning statistics. This implies that students in large households experience distractions in the learning process. In fact, students with more distractions lack a good place for learning where they can focus and reflect on their lessons. According to a previous study, students with more family members experienced financial problems, mental health problems, and misunderstandings during the lockdown, which discouraged them from their learning activities and caused them stress and depression (Al Mamun et al., 2021).
Hence, it is suggested that teachers give students printed and interesting activities so that those in remote areas can engage in the learning process. Moreover, it is recommended that the government support students' learning needs, namely gadgets and internet access suitable for online education. Also, college teachers should undergo training and workshops on teaching statistics in distance education to improve their teaching strategies in online classes. Furthermore, the study recommends that further research include empirical variables such as students' performance (grades) and teachers' perceptions during online learning, whose absence is a possible weakness of the current article.

CONCLUSION
In conclusion, learning statistics online during the pandemic is a difficult educational process for college students. Students struggled to understand the technical nature of the subject and had difficulty doing their learning tasks because of the limitations and challenges of distance education. Additionally, students were distracted at home, their learning environment, because of large family sizes, and could not focus on their classes and lesson activities. Students also struggled with internet accessibility and communication because of a lack of technology for learning.

Table 1. Frequency Distribution for Socio-Demographic and Work Profile (n=123)

Table 2. Difficulty Level in Learning Statistics at a Distance Education (n=123)

Table 3. Regression Models for Difficulty Level in Statistics Online Learning (n=123)