Introduction

With the increasingly fierce educational competition led by the college entrance examination, high school students are facing escalating academic and exam pressures, which in turn leads to more common issues related to academic emotions. Academic emotions refer to the various emotions students experience during the learning process, directly related to school activities, classroom teaching, and learning outcomes1. As a very important non-cognitive factor in the learning process, academic emotions not only affect students’ academic performance but also their psychological health development.

The concept of academic emotions was first introduced by Pekrun, referring to emotions directly related to academic learning, classroom teaching, and academic achievement1. Research on academic emotions has primarily focused on test anxiety, while other emotions, aside from anxiety, have received less attention2. However, academic emotions encompass a wide range of emotional experiences, including pleasure, anxiety, boredom, frustration, and shame, all of which students may experience during the learning process. The Control-Value Theory, proposed by Pekrun, is a crucial theoretical framework in the study of academic emotions3. According to the Control-Value Theory, control appraisal and value appraisal are the key determinants of academic emotions4. Control appraisal refers to learners’ perception of the degree of control they have over the learning activity and its outcomes, which can be divided into causal expectations and causal attributions5. For example, when students believe they have made an effort and hope to achieve a good exam result, this represents causal expectation; when students attribute their good exam results to their ability, this represents causal attribution. Value appraisal refers to students’ self-perception of the value of the learning task, which can be divided into intrinsic value appraisal (the learner’s interest in the learning task) and extrinsic value appraisal (the learner’s belief that the task is beneficial for their future development). Therefore, learners’ academic emotions are influenced by their perceptions of control and value regarding academic activities and outcomes. Academic emotions have a significant impact on students’ learning attitudes, academic performance, and the development of their physical and mental health.

Pekrun classified academic emotions based on valence (positive or negative), arousal level (high or low), focus (process-oriented or outcome-oriented), and temporal reference (activity-related, prospective, or retrospective)3,6. This study employs a two-dimensional valence framework, categorizing academic emotions into positive and negative academic emotions. Positive academic emotions can enhance students’ engagement in learning activities, leading to more active participation. Although some studies have suggested that students with moderate levels of anxiety tend to achieve better academic outcomes, as fear of failure and an appropriate level of anxiety can enhance self-motivation and lead to improved performance across various academic tasks7,8,9, the long-term effects of dominant negative academic emotions may be detrimental. From a developmental perspective, when negative emotions prevail, they can diminish students’ enthusiasm for learning, weaken their motivation and interest, and ultimately exert a negative impact on their academic achievement1.

Because negative academic emotions are so common, they often fail to attract the attention of educators. However, research indicates that the negative emotions students generate during the learning process hinder their cognitive development10, negatively affect their allocation of attention resources11, and limit their thinking activities and creativity12, thereby reducing their engagement in learning and leading to poor academic performance and a series of psychological problems. Furthermore, negative emotions have a strong correlation with suicidal tendencies, suggesting that experiencing such emotions during adolescence may elevate the risk of developing psychosomatic disorders and facing challenges in academic adjustment later in adulthood13,14. Therefore, attending to and studying the negative emotions of high school students is of considerable practical significance.

Pekrun’s social-cognitive model emphasizes the significant impact of an individual’s cognitive appraisal and social environment on academic emotions from a cognitive perspective. Based on Pekrun’s theoretical framework, this study considers individual cognitive variables as predictors of negative academic emotions, while treating environmental influences as important contextual factors. The social-cognitive model provides a clear structural pathway for analyzing negative academic emotions by exploring how students cognitively process their learning environment to understand their emotional experiences. According to developmental contextualism, the development of an individual’s emotions is the result of continuous interactions between the individual and their environment15. In contrast to Pekrun’s model, which emphasizes subjective cognitive appraisal, developmental contextualism offers a broader developmental-systems perspective. It highlights that high school students’ negative academic emotions result from the joint influence of individual characteristics and contextual factors. While emphasizing the role of the environment, this theory also underscores the importance of stable individual traits. It provides a new perspective for the present study by suggesting that, in addition to cognitive factors, stable tendencies exhibited by individuals during the learning process should also be considered.

These two theoretical frameworks are complementary, as both highlight the interactive influence of internal and external factors on emotional development. However, they differ in their emphasis on internal factors. Social-cognitive theory focuses on the role of individuals’ cognitive processes in shaping emotional experiences during interactions with the environment, emphasizing constructs such as self-efficacy and attributional style. In contrast, developmental contextualism incorporates relatively stable individual traits, such as psychological resilience, which interact dynamically with contextual factors over time. Based on these two theories, this study groups the factors leading to negative emotions in high school students into two categories: individual factors and contextual factors. The individual factors examined are psychological resilience, attribution style, and academic self-efficacy. The primary contextual factor is the school environment, with a focus on teachers’ disciplinary styles.

School is the primary environment for students’ learning and life, and it is also a crucial microsystem among environmental factors that significantly affects individuals. Research on negative academic emotions of high school students should be conducted within the school context. Teachers are the primary guides for students’ learning and life in school. Previous research indicates that different discipline styles can affect students’ psychology and behavior16,17. Authoritative and fair discipline styles tend to evoke positive emotions in students, leading to more positive behaviors, while indifferent discipline styles may generate negative emotions and trigger more negative behaviors18.

Previous studies have indicated that an individual’s capacity for adaptation, especially as manifested through psychological resilience, plays a crucial role in mitigating negative academic emotions in the face of setbacks and challenges19. Psychological resilience refers to an individual’s positive adaptation in the face of adversity, trauma, tragedy, threats, or other significant sources of life stress20. It reflects one’s capacity to recover from pressure and setbacks. Students with low psychological resilience often exhibit poor emotional regulation when confronted with difficulties or failures21. They tend to lack meaningful interpersonal relationships, struggle to seek help or express their emotions, and have difficulty effectively coping with challenges encountered in life or learning. As a result, they are more vulnerable to the disruptive effects of negative academic emotions22.

In addition to psychological resilience, an individual’s attribution style is another important factor influencing cognitive appraisal. Attribution refers to people’s explanations and evaluations of the causes of events, and attribution style refers to the stable tendency that forms after individuals make attributions repeatedly23. Attribution theory points out that the way individuals attribute the outcomes of events significantly influences their emotions24. For students, attribution can effectively predict academic emotions. An attribution style that attributes success internally and failure externally (attributing success to oneself and failure to external factors) can generate positive academic emotions25, while attributing success externally and failure internally (attributing success to external factors and failure to oneself) can lead to negative academic emotions26. Research has shown that attribution training improved students’ academic emotions, further supporting the predictive role of attribution in academic emotions27.

Additionally, research has confirmed the correlation between academic self-efficacy and academic emotions. Academic self-efficacy refers to an individual’s judgment and evaluation of their ability to complete academic tasks28. Surveys have found that students with lower levels of academic self-efficacy are more likely to experience negative academic emotions in their studies29,30,31,32. A study discussing the relationship between mathematics learning self-efficacy and mathematics academic emotions found that mathematics learning self-efficacy can significantly predict mathematics academic emotions33, with similar results obtained in the context of English learning34. With the development of computer networks, many scholars have studied the relationship between academic self-efficacy and academic emotions in online learning, finding that online learning self-efficacy still plays a significant role in predicting academic emotions35,36.

Previous research has identified numerous factors related to negative academic emotions in students, such as parental involvement37, teacher support38, and students’ psychological well-being39. However, traditional correlational studies often fail to provide early prediction models40. Additionally, the lack of a comprehensive framework that includes different psychological factors has limited the practical application of research findings. Machine learning, a method based on algorithms and statistical models used for detecting hidden patterns in data, can optimize analysis results through cross-validation and is less affected by outliers41. It has become one of the most popular tools in educational data mining, showing great potential in areas such as student screening and intervention42. It has achieved positive results in predicting students’ suicidal ideation and self-harm behaviors43, but to date, no studies have used machine learning to predict academic emotions. Therefore, this study aims to: (1) establish an effective prediction model for negative academic emotions among high school students using machine learning algorithms; (2) identify and analyze key factors influencing these emotions; and (3) facilitate early identification and intervention for students experiencing high levels of negative academic emotions, ultimately contributing to their psychological well-being and academic success.

Method

Participants and procedures

This study recruited first- and second-year classes from several ordinary high schools in Hebei Province. Paper questionnaires were distributed and completed anonymously by students during regular class sessions. A total of 1708 questionnaires were distributed and collected. All participants met the Chinese age requirements for enrollment, which is between 15 and 17 years old. Among them, there were 749 male students (43.9%) and 959 female students (56.1%); in terms of grade, there were 745 first-year high school students (43.6%) and 963 second-year high school students (56.4%); regarding family structure, there were 984 only children (57.6%) and 724 non-only children (42.4%). After removing subjects with missing values, a total of 1696 valid samples remained, a valid response rate of 99.3%. Before the official survey, this study obtained permission from the Scientific Research Ethics Committee. All participants signed informed consent forms, and consent was also obtained from their guardians, ensuring their understanding of the investigation’s purpose. The survey instructions were clearly explained, any ambiguous items were clarified, and the voluntary nature and confidentiality of the survey were emphasized.

Measures

Teacher discipline style

Teacher discipline style was assessed with the teacher discipline style scale44. This questionnaire is divided into two subscales, teacher response and teacher request, with a total of 17 items. The teacher response subscale contains 9 items (e.g., “When I perform poorly, my teacher comforts me”), and the teacher request subscale contains 8 items (e.g., “When I perform poorly on exams, my teacher scolds me in front of the class”). Items are rated on a 5-point scale (from 1 “Never” to 5 “Always”). Higher scores indicate a greater tendency towards that discipline style by the teacher. The Cronbach’s alpha coefficients for the two subscales are 0.92 and 0.88, respectively.

Resilience

To measure resilience, the psychological resilience scale was employed45, which consists of 27 items rated on a 5-point scale (from 1 “Never” to 5 “Almost Always”). It includes two dimensions: individual power and supportive power. Individual power is further divided into three factors: goal planning (5 items, e.g., “I have clear goals in my life”), affect control (5 items, e.g., “It is hard for me to control unpleasant emotions”), and positive thinking (4 items, e.g., “I believe adversity can be motivating”), while supportive power is divided into two factors: family support (6 items, e.g., “My parents respect my opinions”) and help-seeking (6 items, e.g., “I talk to others when I encounter difficulties”). A higher score indicates a higher level of psychological resilience, with 12 items requiring reverse scoring. In this study, the scale’s Cronbach’s alpha coefficient was 0.90.

Attributional style

Attribution style was measured using the Academic Achievement Subscale of the Multidimensional Multiple Attributional Causality Scale (MMCS)46. The subscale contains 24 items rated on a 5-point scale (from 1 “Completely Disagree” to 5 “Completely Agree”). It includes four dimensions: attribution to ability (6 items, e.g., “The most important factor in achieving good grades is my learning ability”), attribution to effort (6 items, e.g., “Poor grades indicate that I didn’t work hard enough”), attribution to background (6 items, e.g., “Sometimes I get good grades just because the subject is easy to learn”), and attribution to luck (6 items, e.g., “Sometimes I succeed in exams because of a bit of luck”). The Cronbach’s alpha coefficient for this scale was 0.75.

Academic self-efficacy

Academic self-efficacy was assessed with the Academic Self-Efficacy Questionnaire47. This questionnaire consists of 22 items rated on a 5-point scale (from 1 “Completely Disagree” to 5 “Completely Agree”). It includes two dimensions: self-efficacy for learning ability (11 items, e.g., “I believe I have the ability to achieve good results in learning”) and self-efficacy for learning behavior (11 items, e.g., “I always highlight key parts in textbooks or notebooks to help with learning”). The Cronbach’s alpha coefficient for this scale is 0.83.

Negative academic emotion

Negative Academic Emotion was examined with the shortened version of Academic Emotions Questionnaire (S-EES)48. The questionnaire contains seven items: surprise, curiosity, excitement, anxiety, confusion, frustration, and boredom, rated on a 5-point scale (from 1 “Not at all” to 5 “Very strongly”). Negative academic emotions include anxiety, confusion, frustration, and boredom. In research on mathematics learning emotions, it was found that “boredom” had the same loading on two factors, which did not meet the measurement criteria, and therefore was not included in the negative emotions49.

Given that the research by Zhang focused solely on mathematics, its findings may not be applicable to this study. Hence, this study conducted a confirmatory factor analysis on the questionnaire. The results showed that the two-factor model fit well (χ2/df = 3.34, SRMR = 0.04, GFI = 0.97, CFI = 0.99, IFI = 0.99, TLI = 0.98). The factor loading for “boredom” in negative academic emotions (0.78) was higher than in positive academic emotions (0.41), thus, in this study, “boredom” was attributed to negative academic emotions. The internal consistency of the questionnaire was 0.93.

Data preprocessing

(1) Data Transformation. To avoid prediction errors due to large numerical differences between different types of variables, the data were standardized50 to have a mean of 0 and a standard deviation of 1. (2) Dataset Division. Based on their standardized negative academic emotion scores, subjects with Z > 1 were assigned to the high-level group and those with Z ≤ 1 to the middle-low level group51. Ten-fold cross-validation was employed, meaning the original dataset was divided into 10 approximately equal subsets. In each iteration, 9 subsets were used as training data and the remaining one as validation data. This process was repeated 10 times, so that each subset served as validation data exactly once. This validation method provides a reliable estimate of the model’s generalization ability and offers better operational efficiency and stability52. (3) Imbalanced Data Handling. Because the two groups were imbalanced, which can bias machine learning predictions towards the majority class, the Synthetic Minority Over-sampling Technique (SMOTE) was employed for oversampling53. This technique helps to balance the dataset by generating synthetic samples for the minority class.
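The three preprocessing steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the scores are made up, the population standard deviation is assumed for the Z transformation, and `smote` is a simplified, hypothetical stand-in for SMOTE (it interpolates between a minority sample and one of its k nearest minority neighbours) rather than a library implementation.

```python
import math
import random
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def label_high(z_values, cutoff=1.0):
    """Return 1 for the high group (Z > cutoff) and 0 for the middle-low group."""
    return [1 if z > cutoff else 0 for z in z_values]


def smote(minority, n_new, k=5, seed=0):
    """Simplified SMOTE sketch: create n_new points, each lying on the line
    segment between a minority sample and one of its k nearest minority
    neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical negative academic emotion scores for ten students.
scores = [8, 10, 11, 12, 13, 13, 14, 15, 16, 19]
z = zscores(scores)
labels = label_high(z)  # only the student scoring 19 has Z > 1

# Hypothetical two-feature minority-class samples to oversample.
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
synthetic = smote(minority_points, n_new=4, k=2)
```

A common precaution, which the source does not describe, is to oversample only the training folds so that synthetic samples never enter the validation data.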

Statistical methods

Conventional statistics

Data were analyzed using SPSS 22.0 for descriptive statistics, statistical inference, and tests for common method bias. The Harman single-factor test method was used for the common method bias test. The criterion for passing this test is having at least two factors with eigenvalues greater than 1 and the largest factor explaining less than 40% of the variance.
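The decision rule above can be written as a small check. The sketch below assumes that the full list of eigenvalues from an unrotated factor solution is already available (for example, exported from SPSS) and that a factor's share of variance is its eigenvalue divided by the sum of all eigenvalues; the function name and the example eigenvalues are hypothetical.

```python
def passes_harman(eigenvalues, max_share=0.40):
    """Apply the two-part criterion: at least two factors with eigenvalue > 1,
    and the largest factor explaining less than max_share of the variance."""
    n_large = sum(1 for e in eigenvalues if e > 1)
    largest_share = max(eigenvalues) / sum(eigenvalues)
    return n_large >= 2 and largest_share < max_share


# Hypothetical eigenvalues: three factors above 1, largest share 3.0 / 8.0.
acceptable = passes_harman([3.0, 2.0, 1.5, 0.8, 0.4, 0.3])
# Hypothetical eigenvalues dominated by a single factor.
dominated = passes_harman([5.0, 0.9, 0.6, 0.5])
```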

Group differences were examined using a Bayesian linear mixed-effects model. This modeling approach has gained increasing attention due to its advantages in Bayesian statistical inference—namely, the ability to combine prior information with model likelihood to estimate posterior distributions—and its capacity to handle complex hierarchical structures54,55. The analysis was conducted in R using the “brms” package. The model specified a “student()” distribution for the response variable to improve robustness against heavy-tailed data. Parameter estimation was performed using four Markov chains, each with 4,000 iterations, of which the first 1,000 iterations were used for warm-up. This resulted in a total of 12,000 posterior samples. Summary statistics were extracted for regression coefficients (Estimates), 95% Bayesian credible intervals (CIs), degrees of freedom (ν), and the Markov Chain Monte Carlo (MCMC) convergence diagnostic (Rhat). To further examine group differences, Bayes factors were calculated using the “hypothesis()” function provided by the “brms” package56.
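The sampler settings and the convergence diagnostic can be illustrated with a short Python sketch. Two caveats apply. First, the analysis itself was run in R with brms, so this is an illustration of the arithmetic rather than the authors' code. Second, `classic_rhat` implements the original Gelman-Rubin formula on simulated chains, which is an assumption made for brevity; current Stan-based software reports a rank-normalized split version of Rhat.

```python
import random
from statistics import mean, variance


def posterior_draws(chains, iterations, warmup):
    """Number of retained draws after discarding warm-up in every chain."""
    return chains * (iterations - warmup)


def classic_rhat(chains):
    """Gelman-Rubin potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])         # W: mean within-chain variance
    between = n * variance([mean(c) for c in chains])    # B: between-chain variance
    pooled = (n - 1) / n * within + between / n           # pooled variance estimate
    return (pooled / within) ** 0.5


total = posterior_draws(chains=4, iterations=4000, warmup=1000)

# Simulated chains: four that sample the same distribution...
rng = random.Random(0)
mixed = [[rng.gauss(0, 1) for _ in range(500)] for _ in range(4)]
# ...and a set in which one chain is stuck around a different value.
stuck = mixed[:3] + [[x + 5 for x in mixed[3]]]

rhat_mixed = classic_rhat(mixed)
rhat_stuck = classic_rhat(stuck)
```

Values close to 1 indicate that the chains agree with one another, which is the sense in which the threshold of 1.01 is used for the convergence check.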

Machine learning model construction

Models were constructed in Python 3.8 with scikit-learn version 1.1.0. We employed seven algorithms: Logistic Regression (LR), Naive Bayes Classifier (NBC), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Adaptive Boosting (AdaBoost). The effectiveness of these models in handling classification tasks has been widely validated. We compared the performance of the seven models, and the one demonstrating the best overall performance was selected for further analysis and interpretation. This model served as the basis for predicting negative academic emotions and exploring their underlying influencing factors.
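A comparison of this kind might be set up in scikit-learn roughly as follows. The sketch rests on several assumptions that are not in the source: the data are synthetic (`make_classification` with 13 features and an imbalanced outcome), all hyperparameters are library defaults apart from `max_iter` and `n_estimators`, `GaussianNB` and `SVC` stand in for the unspecified Naive Bayes and SVM variants, and AUC alone is used for the comparison.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the questionnaire data: 13 predictors, about 14% positives.
X, y = make_classification(
    n_samples=400, n_features=13, n_informative=6,
    weights=[0.86, 0.14], random_state=0,
)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "NBC": GaussianNB(),
    "SVM": SVC(),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "GBDT": GradientBoostingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}

# 10-fold cross-validation; stratification keeps the class ratio similar in each fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
mean_auc = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}
best_model = max(mean_auc, key=mean_auc.get)
```

On real data the ranking would depend on tuning and on how oversampling is combined with cross-validation, so the result on synthetic data says nothing about which model performs best in the study.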

The performance of the model was evaluated using four commonly adopted metrics: Accuracy, Precision, Recall, and the Area Under the Receiver Operating Characteristic Curve (AUC). Accuracy refers to the percentage of correctly predicted outcomes out of the total number of samples, providing an overall measure of the model’s predictive correctness. Precision refers to the proportion of students who are truly in the high negative academic emotions group among all those predicted by the model to be in that group. High precision means that most students predicted to be at high risk indeed face serious negative academic emotions issues. Recall indicates the proportion of students truly in the high negative academic emotions group whom the model correctly identifies. High recall means the model identifies most of the students who are truly at high risk of negative academic emotions. AUC is a widely used evaluation metric in machine learning, especially in binary classification tasks. It evaluates the model’s performance in distinguishing between positive and negative samples based on predicted probabilities or scores, with a higher AUC value indicating better discriminative ability. In addition, to evaluate the model’s performance in terms of both precision and recall, this study reports the F1 score. The F1 score provides a single metric that balances the trade-off between the two, which is particularly useful where both false positives and false negatives are critical to consider.
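These definitions can be made concrete with a small worked example. The sketch below computes the metrics from made-up labels and scores using the standard formulas; the function names are hypothetical, class 1 denotes the high negative academic emotions group, and AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties count as one half).

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = high group)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc_from_scores(y_true, scores):
    """AUC as the share of positive-negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 3 students truly in the high group, 7 in the middle-low group.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.1, 0.2, 0.3]

metrics = classification_metrics(y_true, y_pred)
auc = auc_from_scores(y_true, scores)
```

Here two of the three high-group students are flagged and one middle-low student is flagged by mistake, so accuracy is 0.8 while precision, recall, and F1 are each 2/3. This illustrates why accuracy alone can look favourable on imbalanced data.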

Since this study employs a 10-fold cross-validation method to train the models, ten precision, recall, and AUC values will be obtained, with the final results being the average of these ten values. This approach ensures a more reliable and generalized evaluation of the model’s performance across different subsets of the data.
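The fold structure and the averaging step can be sketched without any modelling library. The helper names below are hypothetical, the shuffling seed is arbitrary, and the two per-fold results at the end are made-up numbers used only to show the averaging.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists; each sample is validated once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal subsets
    for i, validation in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


def average_fold_metrics(fold_metrics):
    """Average each metric over the per-fold result dictionaries."""
    return {key: mean([m[key] for m in fold_metrics])
            for key in fold_metrics[0]}


# Fold sizes for the 1696 valid samples: six folds of 170 and four of 169.
splits = list(k_fold_indices(1696, k=10))
fold_sizes = sorted(len(validation) for _, validation in splits)

# Made-up per-fold results, only to show how the final value is formed.
averaged = average_fold_metrics([{"auc": 0.9}, {"auc": 1.0}])
```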

Results

Test for common method bias

The Harman single-factor analysis method was employed to test for common method bias in the data. The results indicated that there are 30 factors with eigenvalues greater than 1, and the largest factor accounted for 16.31% of the variance (less than 40%). Therefore, this study does not suffer from severe common method bias.

Descriptive statistics of variables

As indicated in Table 1, the average score for negative academic emotions is 12.99, which is slightly higher than the theoretical median, suggesting that high school students’ negative academic emotions are at an above-average level. In terms of teacher discipline, the average score for teacher response is much higher than the theoretical median, while the average score for teacher demands is significantly lower than the theoretical median. This implies that teachers tend to prefer communication over control when managing students. The average score for high school students’ psychological resilience is 93.56, indicating a high level of psychological resilience. The average score for academic self-efficacy is 73.22, showing that the level of self-efficacy among high school students is above average.

Table 1 Descriptive statistics for variables and dimensions.

Bayesian univariate analysis of negative academic emotions among high school students

The middle-low group included a total of 1460 individuals, while the high group comprised 236 individuals. The results of the Bayesian mixed-effects modeling are presented in Table 2. The results revealed that the 95% credible intervals (CIs) for psychological resilience, attributions to uncontrollable factors (ability, background, and luck), academic self-efficacy, and teacher discipline styles did not include zero, and their corresponding Bayes factors (BF > 3) provided strong evidence in support of significant differences between the two groups. In contrast, the 95% CI for effort attribution included zero, and the Bayes factor was less than 1, suggesting no meaningful difference between the groups, with moderate evidence supporting this null effect. All Rhat values were below 1.01, indicating good convergence of the model. Additionally, the degrees of freedom (ν) for many variables were below 30, supporting the appropriateness of specifying a Student’s t distribution for the response variable.

Table 2 Comparison of academic emotion groups across various variable dimensions.

Prediction performance of machine learning models

Using 13 variables including psychological resilience, attribution style, academic self-efficacy, and teacher discipline styles as independent variables, and the classification of high school students’ negative academic emotions as the dependent variable, prediction models for the level of negative academic emotions among high school students were constructed using logistic regression (LR), naive Bayes classifier (NBC), support vector machine (SVM), decision tree (DT), random forest (RF), gradient boosting decision tree (GBDT), and AdaBoost. These seven methods were capable of predicting the level of negative academic emotions in high school students to a certain extent. Upon comprehensive comparison of precision, recall, F1 score, and AUC, the random forest algorithm exhibited the best performance across all metrics. See Table 3 for details.

Table 3 Prediction performance of different models for academic emotion levels (N = 1696).

Importance ranking of factors influencing negative academic emotions

The “feature_importances_” attribute of the random forest model was used to analyze the importance of variables. According to Fig. 1, affect control (emotional control) contributes the most in predicting negative academic emotions. The top five factors, ranked from highest to lowest importance, are affect control, learning ability self-efficacy, attribution to luck, attribution to background, and learning behavior self-efficacy.
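An importance ranking of this kind can be obtained from a fitted scikit-learn random forest as sketched below. The data are synthetic and the feature names are placeholders, so the resulting order is illustrative only. In scikit-learn, `feature_importances_` holds impurity-based importances that are normalized to sum to 1.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder names for 13 predictors; the real variable names are not used here.
feature_names = [f"feature_{i}" for i in range(13)]

# Synthetic data standing in for the questionnaire scores.
X, y = make_classification(
    n_samples=400, n_features=13, n_informative=5,
    n_redundant=0, random_state=0,
)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Pair each name with its importance and sort from most to least important.
ranking = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
top_five = ranking[:5]
```

Impurity-based importances describe how much each variable contributed to the splits in this particular model, so they are best read as a ranking of predictive contribution rather than as evidence of causal influence.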

Fig. 1 Importance ranking.

Discussion and conclusion

The aim of this study was to establish an efficient predictive model using various machine learning algorithms, thereby assisting educators in timely identifying and intervening with high school students who have high levels of negative academic emotions. The results show that all the machine learning models selected for this study demonstrated a certain degree of effectiveness in predicting negative academic emotions among high school students, with all models achieving an AUC value greater than 0.7. This outcome further illustrates the potential application of machine learning technology in the field of educational psychology.

Among all models, the random forest model performed best in this study, with an AUC value of 0.96 and precision exceeding 80%, far surpassing the other machine learning models. This advantage may stem from how the model is constructed. Random forest is an ensemble learning method that improves the accuracy and stability of predictions by building multiple decision trees and aggregating their predictions through voting. When dealing with complex, high-dimensional educational data, this aggregation makes random forest relatively robust to noise and less prone to overfitting than a single decision tree, which may help it capture the factors that affect students’ negative academic emotions more accurately. It also gives the model strong robustness when facing large-scale and diverse datasets.

The development of negative academic emotions is a complex psychological process, not caused by a single factor, but rather the result of the interplay of multiple factors. This study, utilizing machine learning methods, especially the random forest algorithm, effectively analyzed the complex relationships among these factors, providing a more comprehensive perspective for understanding the intricate mechanisms behind negative academic emotions. By taking into account variables across multiple dimensions, including individual psychological resilience, attribution style, and academic self-efficacy, this study has revealed the core factors influencing negative academic emotions in high school students.

In RF, emotional control, self-efficacy in learning ability, attribution to luck, attribution to background, and self-efficacy in learning behavior were identified as the top five key factors for predicting the level of negative academic emotions. These variables are distributed among the three factors previously selected for the study: psychological resilience, attribution style, and academic self-efficacy. They reflect how students perceive and respond to learning challenges and stress, as well as how they evaluate their own learning capabilities and achievements.

In this study, all dimensions of psychological resilience played a certain predictive role in forecasting negative academic emotions among high school students, with emotional control identified as the most significant contributing factor among all predictive variables. This result is consistent with previous research. Students with higher levels of psychological resilience are more likely to experience positive academic emotions, such as happiness, whereas those with lower levels of psychological resilience are more prone to negative emotions, such as anxiety52. This confirms that psychological resilience is an important factor in academic emotions. Building on previous research, this study further verifies the critical role of emotional control in negative academic emotions. Emotional control refers to students’ ability to adjust and control pessimistic emotions in difficult situations45. Whether students are adept at controlling their emotions predicts negative academic emotions. Academic emotions, as a sub-concept of emotions, represent a specific domain of emotional response, reflecting students’ emotional experiences related to the learning process and outcomes. If high school students cannot effectively control and regulate their emotions, they are more likely to experience negative emotions such as frustration, boredom, and despair when encountering setbacks and difficulties in learning. These negative academic emotions can not only affect students’ learning motivation and academic achievement but may also have adverse effects on their long-term psychological health.

This study examined the impact of attribution style on negative academic emotions among high school students, highlighting the significant role of background and luck attributions in predicting negative academic emotions. Weiner’s theory of motivation and attribution suggests that individuals’ attributions for success or failure directly affect their emotional responses. Studies have shown that there is a significant positive correlation between extrinsic learning motivation and negative academic emotions. Individuals with strong extrinsic motivation experience greater learning pressure, leading to psychological fragility and high levels of stress, which in turn generate negative emotions57. Building on this, the results of the current study further emphasize the critical role of uncontrollable attributions in predicting negative academic emotions. Attributions to ability, background, and luck all have significant predictive effects on negative academic emotions, whereas the predictive effect of effort attribution is not as apparent. This reflects the unique role of effort attribution—directly affecting academic achievement, while other attribution tendencies impact academic achievement through academic emotions as a mediator58.

This study found academic self-efficacy to be a critical predictive factor, in agreement with earlier research findings that highlight its important role in shaping students’ academic self-concept59. Academic self-efficacy encompasses two dimensions: self-efficacy in learning ability and self-efficacy in learning behavior, reflecting students’ confidence in their learning capabilities and their sense of control over learning outcomes, respectively.

According to self-worth theory, negative self-assessments of one’s abilities can lead to a compromised sense of self-worth, subsequently triggering feelings of shame60. In an academic context, students who lack confidence in their learning abilities might feel inadequate to meet learning challenges. This perception can not only impact their academic performance but also lead to negative emotional experiences. Furthermore, control-value theory suggests that when individuals feel they have lost control over learning outcomes, their expectations for the future diminish, potentially giving rise to feelings of boredom. Self-efficacy in learning behavior reflects students’ beliefs in their control over the learning process and outcomes59,61. A high level of self-efficacy in learning behavior helps students maintain a positive attitude towards learning activities and stay highly engaged. Conversely, when students feel they cannot effectively control the outcomes of their learning, they may experience feelings of helplessness and frustration, leading to disinterest and negative emotions towards learning39,62.

This research contributes to understanding how individual and environmental factors jointly influence negative academic emotions in high school students, and it demonstrates the feasibility and potential significance of predicting academic emotions with machine learning algorithms. In educational practice, teachers and schools can draw on these findings to implement more personalized interventions aimed at helping students cope with academic stress. Guiding students to develop emotional regulation skills and adopt more adaptive attribution styles may mitigate the adverse effects of negative academic emotions on academic performance and mental health. Moreover, machine learning techniques offer educators a valuable tool for predicting students’ emotional states, enabling early warning and timely intervention. This has the potential to enhance students’ academic performance and psychological adjustment.

This study has several limitations. First, it relied on self-report questionnaires, which may be subject to social desirability bias because participants could respond in a manner they believe is socially expected. Second, the cross-sectional design makes it difficult to track how the influence of various factors on negative academic emotions changes over time. Third, the selection of factors is not comprehensive and does not cover all possible elements that could affect students’ academic emotions. Fourth, the sample was limited to first- and second-year high school students in Hebei Province, China. Given that educational policies in China are uniformly guided by the Ministry of Education, the findings may be generalizable to other regions within the country. However, their applicability to different educational stages and cultural contexts warrants further investigation.

Based on these limitations, future research can be expanded in several directions. First, future studies may consider incorporating a wider range of factors that potentially influence students’ academic emotions and use longitudinal data to explore the long-term trends of emotional changes caused by these factors. Second, further research is needed to validate the adaptability of machine learning models across different cultural contexts and educational systems. Third, by integrating specific educational intervention strategies, future studies could examine whether emotion management approaches based on predictive analytics can effectively reduce negative academic emotions in practice. Research may also explore the integration of machine learning with traditional educational psychology methods to identify more efficient intervention techniques. This approach would make it easier to design targeted interventions for mitigating students’ negative academic emotions.

In conclusion, this study is the first to develop a predictive model of negative academic emotions among high school students using machine learning algorithms. Among the various algorithms tested, the random forest model exhibited superior predictive accuracy. The variable importance analysis indicated that emotional control and attributional style significantly contributed to predicting students’ negative academic emotions. These findings provide actionable insights for educators and policymakers. For example, educators could effectively reduce students’ negative academic emotions by cultivating positive attributional patterns during learning activities. Additionally, schools may implement targeted intervention programs aimed at enhancing students’ psychological resilience and improving their ability to cope with setbacks and difficulties, thus promoting overall psychological well-being and academic achievement.