1. Introduction
Artificial intelligence (AI) has become a cornerstone in modern industrial and economic advancements, driving innovations across various sectors. AI’s capacity to analyze large datasets, identify patterns, and optimize complex processes has made it invaluable in financial forecasting, manufacturing, and sustainability [1,2]. As industries evolve towards more sustainable and resilient practices, particularly through the transition to Industry 5.0, AI is poised to play a pivotal role in shaping the future of the circular economy. Industry 5.0 envisions a hyperconnected, data-driven industrial landscape that integrates human-centric approaches with cutting-edge AI technologies, fostering economic growth and long-term environmental and societal well-being [3]. This shift presents an opportunity to leverage AI beyond traditional applications, applying it to optimize resource efficiency, reduce waste, and create sustainable business models. In this context, while focused on optimizing IPO predictions, the current study also aligns with broader industry trends by showcasing how AI can support the principles of the circular economy and contribute to the sustainable transformation of industrial practices.
This study aims to explore several key challenges in the domain of IPO prediction and its broader implications for sustainability. Specifically, it seeks to address the following questions:
How can deep learning-based models be optimized to enhance IPO prediction accuracy in volatile and dynamic financial markets?
How can machine learning techniques address the issue of class imbalance while adapting to changing market conditions for more reliable IPO forecasts?
In what ways can the proposed ensemble framework be extended to contribute to sustainability, particularly in circular economy applications like resource recovery and waste reduction?
The initial public offering (IPO) market is the critical juncture at which private companies go public, and investors, underwriters, and other stakeholders need accurate predictions to guide their investment decisions. Recent research suggests that approximately 60 percent of IPOs underperform their benchmark indices within the first three years of trading, indicating that IPO investments are inherently high-risk [4,5]. The combination of high dimensionality, temporal dependency, and severe class imbalance poses further challenges to predicting IPO performance. Traditional statistical approaches fail to capture the non-linear relationships and complex patterns affecting IPO outcomes, and therefore advanced machine learning has been proposed [6]. Additionally, financial markets are dynamic: market conditions and the regulatory environment change over time, necessitating adaptive prediction models that can incorporate both historical patterns and current market indicators while remaining robust to data imbalance [7,8].
Therefore, the objective of this study is to create an ensemble framework based on a deep learning approach that offers highly reliable IPO performance forecasts in periods of high volatility. Specifically, this work seeks to overcome the class imbalance problem in IPO prediction by combining hyperparameter tuning, dynamic metric adaptation (DMA), and SMOTE. This study also aims to apply these methods to circular economy models and thereby advance the theory and practice of sustainable development. This dual aim links financial prediction models to critical sustainability and Industry 5.0 objectives, making the framework usable not only for IPO forecasting but also for achieving other sustainability objectives in different fields.
In contrast to prior studies, this research proposes a new method for predicting IPO outcomes using a selection of supervised machine learning algorithms designed to handle imbalanced data and the non-linear relationships between input variables and market volatility. The research focuses on improving the efficiency of hyperparameter tuning and on incorporating the DMA algorithm, which allows the model to adapt to incoming market signals and to different investors’ preferences. Using SMOTE to balance the minority class, together with ensemble heuristics, improves the modularity and predictive performance of the model beyond standard procedures. In addition, the proposed model goes beyond the conventional use of machine learning in financial forecasting by extending to circular economy applications, thus revealing the capability of AI in promoting sustainable industrial strategies.
Increasing market volatility has further highlighted the need for sophisticated prediction mechanisms, especially in the IPO sector, where information asymmetry and market sentiment matter. Recent technological progress has allowed market data to be collected and analyzed at larger scales than before, enabling more sophisticated prediction models while exacerbating challenges with data processing and feature selection [9]. Beyond social media sentiment, regulatory filings, and macroeconomic indicators, the feature space for predicting IPO performance has expanded further through the integration of alternative data sources [10], calling for more robust and adaptive machine learning frameworks. Additionally, market dynamics have been disrupted in the post-pandemic period, emphasizing the need to build prediction models that are resilient to unprecedented market conditions and maintain accuracy across multiple economic cycles [11].
The Role of AI in IPO Prediction
The increasing complexity and volatility of global financial markets, especially in emerging economies, have underscored the limitations of traditional econometric models in predicting IPO underperformance. Machine learning (ML) models offer significant advantages over conventional financial prediction techniques by capturing non-linear relationships and adapting to the dynamic nature of market conditions. AI models, such as ensemble methods, can leverage large datasets and uncover hidden patterns that would be challenging to identify using traditional methods, such as linear regression or logit/probit models [12].
In the context of IPO underperformance, machine learning methods excel by processing a diverse set of variables, both structured (financial metrics, market conditions) and unstructured (sentiment analysis, news articles), that are pivotal in determining the likelihood of an IPO’s success or failure. Unlike traditional models, which often rely on static assumptions about market behavior, AI techniques, such as XGBoost and Bagging, can continuously adapt to evolving market trends, leading to more accurate predictions. Recent studies have demonstrated the effectiveness of machine learning in other financial domains, such as stock market prediction and credit risk assessment, where AI models have outperformed traditional econometric models in terms of accuracy, adaptability, and robustness [13]. This highlights the untapped potential of AI in improving IPO prediction models, especially in rapidly changing markets.
Machine learning and optimization algorithms are at the forefront of financial engineering, revolutionizing the field by replacing increasingly costly algorithms with more affordable ones and applying advanced machine intelligence to generate practical market solutions. The proposed framework addresses the critical engineering challenge of designing robust prediction schemes that operate over complex, imbalanced data and adhere to domain-specific constraints and optimization objectives. Recent developments in algorithmic trading and automated financial analysis, which leverage machine learning and AI tools, exemplify their transformative power in financial engineering by streamlining and enhancing predictive capabilities [14]. This research furthers this transformation by presenting a new approach to tackle class imbalance and hyperparameter optimization tailored to IPO prediction, contributing significantly to computational finance and engineering optimization [15].
Machine learning and financial engineering have become critical areas of innovation, particularly in developing sophisticated prediction models capable of handling the complexities of today’s financial markets. This research contributes to this growing field by addressing fundamental challenges in data processing, model optimization, and risk management, in line with this Special Issue’s emphasis on practical applications of advanced computational techniques. It proposes a paradigm combining machine learning and feature engineering to address these problems, and its insights extend well beyond IPO prediction to quantitative finance and risk assessment in general [16].
Over the past decade, the evolution of machine learning approaches in IPO performance prediction has been remarkable. Traditional tree-based models, such as random forests and decision trees, have been widely used for their interpretability and ability to capture non-linear relationships in financial data [17]. However, recent studies have demonstrated the superior performance of ensemble methods, particularly gradient-boosting frameworks like XGBoost and LightGBM, in terms of capturing the complex dynamics of the market and improving prediction accuracy [18]. These approaches are particularly well-suited to IPO data with high-dimensional feature spaces encompassing various disparate factors, including financial metrics and market sentiment indicators.
The introduction of deep learning architectures has opened up the domain of IPO prediction, with neural networks demonstrating unique pattern-recognition capabilities on complex financial datasets [19]. Despite these advances, these complex models struggle with interpretability and computational tractability, both of which are essential for real-life financial applications. The development of hybrid methods combining traditional statistical algorithms with modern machine learning approaches, however, holds promise for striking a balance between accuracy and interpretability [20]. Further, by integrating natural language processing techniques, textual data from prospectuses, news articles, and social media have been incorporated into the feature space for prediction models [21].
Handling class imbalance in financial prediction tasks has been studied extensively, and many approaches have been proposed. However, traditional methods such as random under-sampling and SMOTE have not performed well on IPO prediction and tend to degrade important information from the minority class [22]. Adaptive synthetic sampling approaches and ensemble-based resampling methods show improved performance, but often at the expense of increased computational complexity [23]. These challenges are acute in IPO prediction, where the rarity of some outcome classes has a significant impact on performance.
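To make the oversampling idea concrete, the following is a minimal Python sketch of SMOTE-style interpolation: each synthetic point is placed on the line segment between a minority sample and one of its nearest minority neighbours. It is an illustration under stated assumptions rather than the implementation used in this study; the function name `smote_sketch`, the parameter `k`, and the two-feature sample values are all hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority point and one of its k nearest minority
    neighbours. Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the remaining minority points.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority class with two features (illustrative values only).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=6)
```

Because every synthetic point lies between two existing minority samples, the new points stay inside the region the minority class already occupies; this is also why the technique can blur class boundaries when the minority samples are noisy.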
Another critical challenge is the optimization of model hyperparameters, especially for ensemble methods and deep learning architectures. Given the complexity of financial prediction models, traditional approaches to hyperparameter tuning, including grid search and random search, are often inadequate [24]. Bayesian optimization and evolutionary algorithms enhance prediction by systematically tuning model parameters and identifying optimal solutions in complex data environments. However, their effectiveness comes with trade-offs, as these methods require careful consideration of computational resources and time constraints. Furthermore, hyperparameter optimization interacts intricately with imbalance mitigation strategies. For instance, the choice of sampling or weighting strategy significantly influences which parameter settings are optimal and how effectively class imbalance is addressed [25].
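One way to respect the interaction between resampling and model settings is to search over both in a single loop instead of fixing one and tuning the other. The Python sketch below illustrates this with plain random search, chosen only because it is the shortest to show; a Bayesian or evolutionary optimizer would replace the sampling step. The search space, the parameter names, and `placeholder_objective` (a stand-in for a cross-validated score) are assumptions made for illustration and are not taken from the study.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample n_trials configurations from `space` and keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Joint space: the resampling setting is tuned together with model settings.
space = {
    "oversample_ratio": [0.5, 0.75, 1.0],   # hypothetical minority/majority ratio
    "max_depth": [2, 4, 6, 8],              # hypothetical tree depth
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
}


def placeholder_objective(params):
    """Stand-in for a validation score; a real objective would resample the
    training folds, fit the model, and return a cross-validated metric."""
    return -(
        abs(params["oversample_ratio"] - 1.0)
        + abs(params["max_depth"] - 4) / 4
        + abs(params["learning_rate"] - 0.1)
    )


best_params, best_score = random_search(placeholder_objective, space, n_trials=200)
```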
Despite notable improvements in class imbalance mitigation and hyperparameter optimization, current solutions approach these problems in isolation, which can lead to poor results in practice. The flexibility to adapt to varying risk preferences and market conditions, an essential requirement for practical IPO prediction, is often missing from existing frameworks. Integrated approaches that robustly combine imbalance handling with adaptive hyperparameter optimization are lacking [26]. This gap is especially pronounced in high-stakes financial prediction problems, where model performance must be optimized both for accuracy and for the specific risk–return trade-offs preferred by different investors [27].
Efficient prediction of IPO underperformance is a highly complex optimization problem at the intersection of machine learning and financial engineering. Traditional predictive models fall short because IPO outcomes inherently follow a highly skewed distribution, in which successful offerings far outnumber unsuccessful ones. Consequently, conventional machine learning approaches perform poorly due to the high variance and temporal dependency of IPO performance data and its fundamental class imbalance. The complexity is further amplified by the need to optimize multiple competing objectives: maximizing prediction accuracy, minimizing false positives and missed investment opportunities, and maintaining model robustness under different market conditions [28].
Owing to limited available information and uncertain market dynamics, IPO outcomes have historically been difficult to predict. The focus of this study is to develop a deep learning-based ensemble framework that provides the best possible prediction of IPO performance in the stock market. In this paper, we leverage state-of-the-art techniques, such as hyperparameter optimization, DMA, and the synthetic minority oversampling technique (SMOTE), to deal with key issues in machine learning, such as improving risk-adjusted metrics, handling class imbalance, and building model ensembles. Here, ‘ensemble heuristics’ refers to combining these techniques, and the predictions of multiple models, into a single robust and accurate predictive model. Experimental results show that the proposed ensemble heuristics consistently outperform traditional approaches in accuracy and robustness across various models [29].
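As a simple illustration of combining predictions from multiple models, the Python sketch below averages per-model probabilities (soft voting) and thresholds the result. This section does not specify the exact combination rule used in the framework, so soft voting is an assumption here, and the model names and probability values are hypothetical.

```python
def soft_vote(prob_lists, weights=None):
    """Weighted average of per-model probabilities for each sample.

    prob_lists: one list of positive-class probabilities per model,
    all of the same length (one entry per sample).
    """
    n_models = len(prob_lists)
    weights = weights or [1.0] * n_models
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, sample)) / total
        for sample in zip(*prob_lists)
    ]


# Hypothetical probabilities of "underperforming" from three models
# for the same three IPOs (illustrative values only).
model_a = [0.9, 0.2, 0.6]
model_b = [0.7, 0.4, 0.5]
model_c = [0.8, 0.3, 0.7]

averaged = soft_vote([model_a, model_b, model_c])
labels = [int(p >= 0.5) for p in averaged]  # 1 = predicted underperforming
```

Passing `weights` lets stronger base models count for more; with equal weights the ensemble reduces to a plain average.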
While the current research primarily focuses on financial forecasting, it is crucial to recognize AI techniques’ transformative potential, such as deep learning and ensemble methods. These techniques extend beyond financial markets and are pivotal in shaping sustainability, resilience, and human-centeredness (Industry 5.0). In the future, sustainability, ethics, and AI-driven innovations will be the cornerstones of responsible growth. By transitioning industries from linear to circular economy models, AI can drive resource efficiency, waste reduction, and long-term sustainability, ushering in a more optimistic and forward-thinking industrial ecosystem [
30].
Although this study is focused on financial prediction, the AI models created here could also be applied to the circular economy. The circular economy, which focuses on resource efficiency, waste reduction, and long-term sustainability, shares similarities with financial markets regarding the need for predictive models. For example, the same ensemble learning techniques for predicting IPO performance could be used to improve the performance of circular business models, predict the success of recycling initiatives, or even assess the viability of sustainable production systems [
31]. Combining AI with the circular economy paradigm enables industries to make data-driven decisions that optimize their financial performance and achieve sustainable development. As a result, this sets the foundation for proving how finance prediction is the forerunner even further to reach sustainability and Industry 5.0’s vision for a more resilient and sustainable industrial ecosystem [
32].
Applying machine learning techniques in finance, including deep learning, ensemble learning, and hyperparameter optimization, demonstrates the transformative potential of artificial intelligence (AI) in a major function of finance. AI has recently been applied to generate insights for IPO performance prediction, addressing issues related to high dimensionality, temporal dependence, and class imbalance. These techniques have proved especially useful in ensemble models, where they provide robust solutions to problems for which statistical methods are not suitable and demonstrate the capability of AI to optimize complex financial systems. But its reach goes far beyond financial markets. As Industry 5.0 continues to evolve, more businesses are utilizing AI to support sustainable practices, increase operational resilience, and develop human-centric solutions across industries. As the circular economy pushes firms to reinvent themselves, AI technologies are essential tools that optimize resource management, reduce waste, and increase supply chain efficiency [33].
The circular economy is a concept that stresses the creation of sustainable closed-loop production systems that minimize waste, maximize resource reuse, and reduce the environmental footprint of industrial processes. These efforts can be furthered by using AI to make better resource allocations, predict material recovery outcomes, and improve the recycling processes. Similarly, machine learning models, like those used in this study to predict IPOs, can be incorporated to predict the likelihood of success of circular business models to maximize recycling rates, optimize material recovery processes, and even foresee waste-to-energy successes. For example, if predictive models are used to predict the lifecycle of products, it can help businesses design recyclability or determine the most efficient material recovery paths. During the age of Industry 5.0, which is focused on sustainability and pro-human approaches, the integration of AI with circular economy strategies helps companies to make evidence-based and data-driven decisions that help to grow economies and contribute to the overall achievement of SDGs. This study shows how AI technologies that currently predict financial returns can also change the trajectory of sustainability applications and inspire and motivate toward more sustainable and resilient industrial practices [
34].
However, current financial forecasting and industrial optimization methodologies are not equipped to deal with these challenges holistically, especially when coping with sustainability issues and adapting to changing market dynamics. Establishing a new framework that balances the sometimes conflicting requirements of resource efficiency, risk tolerance, and sustainability goals while maintaining computational efficiency and interpretability is not just a theoretical exercise but a potential game-changer. Given the complexity of industrial and economic systems in the context of Industry 5.0 and the circular economy, such a framework needs not only traditional data inputs but also a variety of other data sources, such as market sentiment indicators, macroeconomic factors [35], and sustainability metrics. With the shift towards Industry 5.0 and the integration of AI to support human-centric approaches and green ways of doing business, there is a need for frameworks that enhance economic performance while promoting environmental responsibility through informed decision-making and sustainable investment strategies. Such a framework must be adaptive and capable of real-time adjustments to fluctuating market conditions and changing regulatory environments, operating consistently and efficiently across economic cycles and sustainability paradigms [36]. This research proposes an integrated AI solution that combines cutting-edge imbalance mitigation techniques with adaptive hyperparameter optimization based on users’ risk preferences, while considering sustainability outcomes and circular economy principles. The model can help improve financial forecasting and also supports broader goals of Industry 5.0, such as optimizing resource use and promoting long-term sustainable development [37].
The goal is to develop a framework, based on an optimized machine learning model, that predicts IPO underperformance while taking sustainability into account. Advanced hyperparameter optimization will be used to increase model accuracy, address data imbalance, and accommodate varying investor risk tolerances. The specific objectives are as follows:
To implement the synthetic minority oversampling technique (SMOTE) to tackle data imbalance, ensuring balanced representation in IPO predictions.
To integrate hyperparameter optimization to refine model performance, mainly focusing on investor-specific risk tolerances.
To evaluate the model’s adaptability to investor preferences through a dynamic metric adaptation approach.
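This section does not spell out how the dynamic metric adaptation works. One plausible reading, sketched below in Python, is that the evaluation metric itself is weighted by the investor's risk tolerance: an F-beta score with beta > 1 rewards catching underperforming IPOs (recall), while beta < 1 rewards avoiding false alarms (precision). The mapping from risk tolerance to beta, the function names, and the scores and labels are all assumptions for illustration, not the method of this study.

```python
def fbeta(tp, fp, fn, beta):
    """F-beta score; beta > 1 favours recall, beta < 1 favours precision."""
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return (1 + beta**2) * tp / denom if denom else 0.0


def best_threshold(scores, labels, beta):
    """Return the score cut-off that maximises F-beta for label 1."""
    best_score, best_t = 0.0, None
    for t in sorted(set(scores)):
        tp = sum(s >= t and y == 1 for s, y in zip(scores, labels))
        fp = sum(s >= t and y == 0 for s, y in zip(scores, labels))
        fn = sum(s < t and y == 1 for s, y in zip(scores, labels))
        score = fbeta(tp, fp, fn, beta)
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Hypothetical model scores; label 1 = underperforming IPO (illustrative only).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0]

risk_averse = best_threshold(scores, labels, beta=2.0)    # 0.3
risk_tolerant = best_threshold(scores, labels, beta=0.5)  # 0.8
```

On this toy data the risk-averse setting chooses a lower cut-off, flagging more IPOs as potential underperformers, whereas the risk-tolerant setting flags only the highest-scoring cases.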
This research contributes to both the theory and practice of financial engineering and machine learning and provides several key innovations. The proposed risk-optimized framework advances the field by jointly addressing the class imbalance problem in IPO prediction and the critical hyperparameter optimization problem. From a practical point of view, it provides investors and financial institutions with a more powerful instrument for evaluating IPO opportunities, thereby reducing investment risks and enhancing portfolio returns. Its adaptability to different risk preferences makes the methodology useful for various investment strategies and market conditions. Additionally, the framework is designed modularly, enabling its use beyond IPO markets in other financial prediction problems with class imbalance, such as credit risk assessment and market anomaly detection, and in the circular economy, including sustainability risk assessment and resource optimization. In the broader context, this research also contributes to understanding how machine learning approaches in general can be adapted to overcome problems in financial markets, and it provides evidence that AI-driven solutions can address economic obstacles while promoting sustainable development and the move toward Industry 5.0.
3. Results
This section presents the results of the proposed framework, comparing its performance against benchmark models in predicting IPO outcomes. This research showed that the proposed framework effectively applies tree-based ensemble models with SMOTE for class balancing, feature selection, and hyperparameter optimization to outperform traditional models in all performance metrics. Accuracy, precision, recall rate, F1-score, and Area Under Curve (AUC) were employed to compare the outcome of these approaches and evaluate the techniques’ effectiveness.
Section 3 of this current study equips readers with knowledge of how the Baum–Washburn method, together with the risk-specific metric adaptation, enhances the accuracy of the predictive model and eventually facilitates the formulation of tailored investment programs consistent with the investors’ risk tolerance levels.
Figure 3 shows the confusion matrix for classifying IPOs as underperforming or successful using the proposed predictive model. The matrix breaks the model’s results down into true positive (TP), false positive (FP), true negative (TN), and false negative (FN) counts. According to the detailed results, the model correctly identifies seven underperforming IPOs (TP) and ten non-underperforming IPOs (TN). This performance highlights the model’s strength in identifying IPO outcomes and its overall effectiveness in classification. By reliably capturing trends in IPO performance, the model offers valuable insights for IPO analysis. The confusion matrix is a critical evaluation instrument for understanding how well the model identifies both poorly performing IPOs and those that performed very well. The findings indicate moderate misclassification, consistent with a moderately imbalanced classification problem. The color scale highlights the distribution and density of prediction counts, providing insight into the general pattern of accurate and inaccurate predictions.
Figure 4 offers a comparative analysis of the performance of ensemble models through two subplots.
Figure 4a presents the ROC curves for random forest, gradient boosting, AdaBoost, and XGBoost, demonstrating the trade-off between the true positive rate (sensitivity) and false positive rate (1-specificity) across various thresholds. XGBoost achieves the highest AUC of 0.85, reflecting its superior classification capability, followed closely by the gradient boosting, random forest, and AdaBoost, each achieving an AUC of 0.84. The diagonal line represents random guessing, effectively highlighting the models’ ability to outperform this baseline.
Figure 4b displays confusion matrices for the ensemble models, with distinct color schemes for enhanced visualization. These matrices illustrate the classification outcomes, including true positives, true negatives, false positives, and false negatives. XGBoost demonstrates the best balance with minimal misclassifications, while gradient boosting also performs consistently. AdaBoost shows slightly more false negatives, indicating occasional challenges in predicting positive cases. Overall, Figure 4 highlights XGBoost’s superior performance in both sensitivity and classification accuracy, validating its robustness for complex datasets. The combined analysis provides insights into the models’ strengths and areas for improvement, offering guidance in selecting the most effective classifier for real-world applications.
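The AUC values quoted above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with 0.5 corresponding to the random-guessing diagonal. The Python sketch below computes AUC directly from that pairwise definition. The scores and labels are invented for illustration and are not the study's data.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample is scored higher; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores and true labels (illustrative only).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0]

auc = roc_auc(scores, labels)                         # 0.75
baseline = roc_auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])  # 0.5, uninformative scores
```

This pairwise form is quadratic in the number of samples, which is fine for small test sets; library implementations typically use a sort-based equivalent.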
Figure 5 compares the performance of five machine learning classification models (random forest, gradient boosting, Bagging, Extra Trees, and XGBoost) using confusion matrices (top row) and Receiver Operating Characteristic (ROC) curves (bottom row). The analysis highlights the models’ accuracy and ability to distinguish between classes.
The confusion matrices summarize predictions into four categories: true negatives, false positives, false negatives, and true positives. Among the models, Bagging (Matrix C) and XGBoost (Matrix E) show superior performance. Bagging achieves 135 true negatives and 126 true positives, with only 10 false positives and 29 false negatives. XGBoost outperforms others, achieving 137 true negatives and 132 true positives, with the lowest false positives (8). In contrast, Extra Trees (Matrix D) has a higher number of false negatives (28) despite reasonable accuracy. Random forest (Matrix A) and gradient boosting (Matrix B) demonstrate balanced performance but with slightly higher false positives and false negatives compared to XGBoost and Bagging.
The ROC curves depict the models’ ability to balance true positive rates and false positive rates. Models with higher Area Under the Curve (AUC) values perform better. XGBoost and Bagging lead with AUC scores of 0.94, followed closely by random forest and Extra Trees (0.93). Gradient boosting achieves an AUC of 0.92, slightly lagging behind. Overall, XGBoost and Bagging demonstrate the highest classification accuracy and reliability.
Figure 6 evaluates four machine learning models—random forest, gradient boosting, AdaBoost, and XGBoost—using ROC curves, confusion matrices, and performance metrics.
Figure 6a highlights XGBoost’s superior classification ability with the highest AUC of 0.94.
Figure 6b,c focus on confusion matrices, where XGBoost demonstrates better accuracy and lower misclassification compared to gradient boosting.
Figure 6d provides a comprehensive comparison of all models, confirming XGBoost’s dominance in all key metrics, followed by random forest and gradient boosting. AdaBoost lags slightly in performance.
Figure 6a: ROC curves: This subplot compares the ROC curves for random forest, gradient boosting, AdaBoost, and XGBoost. XGBoost achieves the highest AUC (0.94), demonstrating its superior ability to distinguish between classes. Random forest and gradient boosting have an AUC of 0.92, while AdaBoost scores slightly lower with 0.91.
Figure 6b: Confusion matrix for gradient boosting: This confusion matrix shows Gradient Boosting’s classification performance. It achieves 128 true negatives and 130 true positives, but there are 18 false positives and 23 false negatives, indicating slightly lower precision.
Figure 6c: Confusion matrix for XGBoost: XGBoost outperforms the other models, with 138 true negatives and 139 true positives. It records only 10 false positives, alongside 26 false negatives, showcasing excellent reliability.
Figure 6d: Bar chart of performance metrics: This bar chart compares the models based on accuracy, precision, recall, F1 score, and AUC. XGBoost leads in most metrics, followed by random forest and gradient boosting, while AdaBoost performs slightly lower across all metrics.
Figure 7 presents a comparative analysis of the accuracy and ROC-AUC scores of four ensemble models. The models are tree-based: random forest, gradient boosting, AdaBoost, and XGBoost. The left panel in Figure 7 shows that AdaBoost and XGBoost are the most accurate models for predicting IPO outcomes. Random forest follows closely with slightly lower accuracy, and gradient boosting is slightly less accurate again. The right panel depicts ROC-AUC, a standardized measure of a model’s capacity to discriminate between underperforming and outperforming stocks. AdaBoost performs best in terms of ROC-AUC, and thus in its ability to distinguish between classes, followed by random forest and gradient boosting with similar results. XGBoost’s accuracy is high, but its ROC-AUC is comparatively low, indicating that its predictive strength could be lower at times. Overall, the figure shows that AdaBoost combines high accuracy with balanced classification performance.
Figure 8 highlights the XGBoost model’s effectiveness in classifying IPO performance, achieving an accuracy of 76% and a moderate ROC-AUC of 0.7115. The model demonstrates stronger precision (83%) and F1-scores (0.80) for successful IPOs, indicating better reliability in identifying profitable firms. This suggests the model is highly effective in pinpointing IPOs that yield significant returns, providing confidence for stakeholders seeking to invest in high-performing opportunities. However, slightly lower recall (75%) for underperforming IPOs indicates potential risks in misclassifying these cases, which may require careful consideration when analyzing IPOs with borderline performance characteristics. These findings emphasize the model’s potential to support decision-making by accurately identifying promising IPOs while mitigating risks. Contributing to sustainability, this ensures resource allocation aligns with high-performing opportunities, minimizing financial inefficiencies and supporting long-term economic growth in green, sustainable investment ventures.
Figure 9 presents the classification report and confusion matrix for the AdaBoost model, highlighting its performance across two classes: underperforming IPOs (label 0) and successful IPOs (label 1). The precision for label 0 is 0.62, with perfect recall at 1.00, meaning that all actual underperforming IPOs were correctly identified. Conversely, label 1 has a precision of 1.00 but a recall of 0.62, indicating that while every IPO the model predicted as successful was indeed successful, it missed some actual successful instances. The F1 scores for labels 0 and 1 are 0.76 each, contributing to an overall accuracy of 0.76. Macro and weighted averages hover at 0.76–0.85, implying balanced performance across classes. The confusion matrix reveals that eight underperforming IPOs were correctly predicted, while some successful IPOs were misclassified as underperforming. The high ROC-AUC score of 0.9038 showcases the AdaBoost model’s strong discriminatory capability, denoting robust predictive power. However, the recall for successful IPOs (0.62) could be improved, suggesting an opportunity to reduce the number of successful IPOs flagged as underperforming while maintaining the model’s precision for successful IPOs.
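The reported AdaBoost figures can be reproduced from a small confusion matrix, which helps show how precision, recall, F1, and accuracy relate to the raw counts. In the Python sketch below, only the count of eight correctly predicted underperforming IPOs comes from the text; the other three cells (0, 5, and 8) are inferred as one set of counts consistent with the reported precision and recall, and should be treated as an assumption, not as the published matrix.

```python
def per_class_metrics(matrix):
    """matrix[i][j] = number of samples with actual label i predicted as j.
    Returns per-class precision/recall/F1 and overall accuracy."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    report = {}
    for c in range(n):
        correct = matrix[c][c]
        predicted = sum(matrix[r][c] for r in range(n))  # column sum
        actual = sum(matrix[c])                          # row sum
        precision = correct / predicted if predicted else 0.0
        recall = correct / actual if actual else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(matrix[c][c] for c in range(n)) / total
    return report, accuracy


# Label 0 = underperforming, label 1 = successful.
# Assumed counts: 8 from the text, the remaining cells inferred.
assumed_matrix = [
    [8, 0],  # actual underperforming: 8 predicted 0, 0 predicted 1
    [5, 8],  # actual successful:      5 predicted 0, 8 predicted 1
]
report, accuracy = per_class_metrics(assumed_matrix)

rounded = {
    c: {name: round(value, 2) for name, value in metrics.items()}
    for c, metrics in report.items()
}
# rounded[0] -> precision 0.62, recall 1.0, f1 0.76
# rounded[1] -> precision 1.0, recall 0.62, f1 0.76
```

Under these assumed counts the overall accuracy is 16/21, which rounds to the reported 0.76.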
Figure 10 compares the performance of the random forest and gradient boosting models for predicting IPO outcomes. The random forest model demonstrates a balanced performance, achieving an accuracy of 71% and a moderate ROC-AUC of 0.7548. It excels in recall for underperforming IPOs (75%) and precision for successful IPOs (82%), reflecting its strength in identifying both underperforming and successful IPOs. However, it misclassifies two underperforming IPOs and four successful IPOs. Gradient boosting achieves a slightly higher ROC-AUC of 0.7692, indicating better discriminatory power, particularly for handling class imbalance. Despite this, its overall accuracy is lower (62%), and recall for successful IPOs is limited (54%). Both models highlight trade-offs in precision, recall, and class-specific performance. These findings are critical for sustainable investment strategies, enabling precise identification of high-performing IPOs, minimizing financial misallocation, and promoting resource efficiency, which supports long-term economic and environmental sustainability goals.
The performance of the ensemble models, including random forest, gradient boosting, and XGBoost, was evaluated using key metrics, such as precision, recall, accuracy, and ROC-AUC. These metrics highlight the models’ ability to correctly classify IPOs as underperforming or successful. While all models demonstrate competitive performance, XGBoost emerges as the most reliable with the highest accuracy and precision, followed by random forest. Gradient boosting, despite slightly lower accuracy, achieves a higher ROC-AUC, showcasing its strength in managing class imbalances.
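A model can rank IPOs well (high ROC-AUC) and still show lower accuracy, because accuracy depends on the decision threshold while ROC-AUC does not. The sketch below illustrates this with the rank-based definition of AUC (the probability that a randomly chosen successful IPO receives a higher score than a randomly chosen underperforming one). The labels and scores are hypothetical and are not taken from the study’s data.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Accuracy after converting scores to class labels at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# HYPOTHETICAL scores: every successful IPO (1) outranks every
# underperforming IPO (0), but most scores fall below 0.5.
labels = [0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.90]

auc = roc_auc(labels, scores)
acc_default = accuracy(labels, scores, threshold=0.5)
acc_tuned = accuracy(labels, scores, threshold=0.33)
print(auc, round(acc_default, 3), acc_tuned)
```

In this toy case the ranking is perfect, yet the default threshold misses three of the four successful IPOs. Lowering the threshold recovers them without changing the AUC, which is one reason to report both metrics.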
Table 1 provides a concise summary of the models’ comparative performance metrics.
Figure 11 depicts the results of the circular economy model, which predicts outcomes including resource recovery rate, waste reduction, and sustainability impact. Both the resource recovery rate and waste reduction metrics demonstrate strong performance, each with a value of 75.7. This indicates that the model performs well in identifying and categorizing positive sustainability outcomes, such as resource recovery and waste minimization. It also reflects the model’s ability to monetize circular economy processes that support recycling, resource reuse, and waste reduction, all of which are important parts of sustainable business models.
The mean absolute error (MAE) for the sustainability impact is low, at just under 0.087. This indicates that the model’s predictions are very close to the actual values, with only a minor error, so its forecasts of sustainability results can be treated as reliable. These results underscore the potential of AI models in advancing sustainability within the circular economy. Such models can play a significant role in optimizing resource use and minimizing waste, thereby contributing to sustainability goals and aligning with the principles of Industry 5.0 in industrial practices. This outlook is encouraging for future work on sustainability in this field.
The bar chart in Figure 12 illustrates the performance of the circular economy prediction model on five metrics: precision, accuracy, recall, F1-score, and AUC. The model shows moderate performance, with precision and F1-score the highest (above 0.6), meaning that it predicts positive cases correctly reasonably often while keeping a good balance between precision and recall. Accuracy and recall are slightly lower, suggesting a trade-off in classifying certain kinds of cases. AUC, which indicates the model’s overall ability to separate the classes, is the lowest score.
These findings are significant because they point to areas where the model excels and where improvements are necessary. The comparatively high F1-score indicates the potential to support circular economy practices like optimizing resource use and reducing waste. Improving AUC would sharpen decision-making and support sustainability by more accurately identifying key components in circular economy adoption.
The radar chart in Figure 13 displays the AI models’ balanced performance in predicting financial and circular economy outcomes. Both models yield identical results on the key metrics, resource recovery rate and waste reduction, with a score of 0.76. This parity suggests that the AI models are equally proficient in predicting success in financial markets and in sustainability projects. The mean absolute error (MAE) for both models is low at 0.11, indicating accurate predictions with small errors. This result further supports AI’s reliability and its potential to enhance decision-making in finance and sustainability, including resource recovery and waste reduction. AI’s ability to connect circular economy practices with financial markets can help balance economic growth with environmental stewardship, in line with Industry 5.0’s aspirations.
Figure 14 visually represents the confusion matrix for the aggregated model performance. The matrix illustrates the true positive, true negative, false positive, and false negative predictions across the dataset. In this figure, class 0 (underperforming IPOs) and class 1 (successful IPOs) display the distribution of correctly and incorrectly classified instances. The top-left cell indicates the count of true negatives (underperforming IPOs correctly predicted as such), while the bottom-right cell shows true positives (successful IPOs correctly classified). The top-right cell represents false positives (underperforming IPOs incorrectly classified as successful), and the bottom-left cell represents false negatives (successful IPOs incorrectly classified as underperforming). The matrix’s color gradient accentuates the intensity of correct classifications, with a deeper blue indicating a higher count. This visualization highlights the model’s balanced performance, with low false positives and false negatives, making it a reliable predictor. Moreover, it provides insights into the types of errors that occur, enabling targeted improvements to enhance overall prediction accuracy and practical applicability.
Table 2 compares the proposed framework against baseline models (logistic regression, decision trees, SVM) and advanced neural network-based techniques (FNN, CNN, RNN). Baseline models like logistic regression and decision trees are interpretable and computationally efficient but struggle with complex data patterns. Neural networks achieve higher accuracy (up to 94%) but at the cost of reduced interpretability and scalability and high computational requirements. The proposed framework balances high accuracy (92%), scalability, and interpretability, making it well suited to practical applications like IPO performance prediction. It outperforms baseline models and provides a more practical alternative to neural networks for resource-efficient, real-world deployments.
Practical Implications of the Proposed Framework
The results of the proposed framework hold significant potential for practical applications in various real-world scenarios:
1. Investment Decision-Making: The framework provides accurate predictions of IPO performance, enabling investors to make informed decisions and reduce the risk of financial losses. By identifying underperforming IPOs with high precision and recall, it can guide investment strategies for both risk-averse and risk-tolerant investors.
2. Financial Institutions: Banks and investment firms can integrate the model into their advisory services to offer better predictions for IPO success, improving portfolio management and optimizing client returns. Additionally, it can support underwriters in assessing IPO risks.
3. Market Regulation: Regulators can use the framework to monitor market stability, detect potential fraudulent activities, and ensure fair market practices by identifying high-risk IPOs before they disrupt the market.
4. Circular Economy Applications: Beyond financial markets, the framework can predict outcomes like resource recovery and waste reduction in circular economy initiatives. For instance, it can assess the economic and environmental feasibility of large-scale recycling or waste-to-energy programs.
5. Scalability Across Markets: The framework’s adaptability allows it to be applied across different international markets and economic conditions, making it suitable for emerging and volatile markets.
These practical applications underscore the framework’s ability to enhance decision-making, optimize resource allocation, and promote sustainable economic development.
4. Discussion
This research aimed to determine whether machine learning methods, particularly complex tree-based models, could be used to predict IPO underperformance, especially in volatile, data-scarce markets. Conventional prediction techniques can be problematic because of class imbalance and non-linear relationships in IPO datasets, particularly those of emerging economies. The findings confirm that ensemble models like Bagging and XGBoost offer superior performance in handling imbalanced IPO data compared to traditional techniques, showcasing advancements in predictive modeling within financial markets. The proposed framework helps investors overcome such shortcomings by combining the synthetic minority oversampling technique (SMOTE), dynamic metric adaptation (DMA), and hyperparameter optimization, leading to improved predictive accuracy and modeling of investor risk types.
Based on the AUC, precision, recall, and F1-score of the proposed ensemble models, Bagging and XGBoost perform better than the other models on high-dimensional financial data. The high AUC scores suggest strong learning capability, especially for the Bagging model, which can effectively distinguish between successful and poorly performing IPO stocks, as required in investment decision-making. These findings align with prior research but extend their application to circular economy metrics, bridging financial and sustainability objectives. Class imbalance was addressed using SMOTE, resulting in better outcomes for underperforming IPOs, as reflected in recall. This is especially relevant for emerging markets, where data are often scarce or imbalanced, a serious obstacle for financial modeling.
Dynamic metric adaptation (DMA) also added flexibility, allowing the model to customize the evaluation based on investor preferences. Risk-averse investors are more concerned about missing IPOs that are prone to underperform, whereas risk-tolerant investors are more concerned about misidentifying IPOs that are likely to outperform. This flexibility improves the model’s usefulness by giving users an evaluation tailored to their investment type.
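One way to picture an investor-dependent evaluation is the F-beta score, which shifts weight between recall and precision. This section does not specify how DMA is computed, so the sketch below is an illustrative stand-in rather than the authors’ method. The profile names, the beta values, and the confusion counts are all hypothetical, and underperforming IPOs are treated as the positive class.

```python
def f_beta(tp, fp, fn, beta):
    """F-beta score: beta > 1 favours recall, beta < 1 favours precision."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# HYPOTHETICAL mapping from investor profile to beta (not from the paper).
# Risk-averse: missing an underperforming IPO is costly, so weight recall.
# Risk-tolerant: wrongly flagging a good IPO is costly, so weight precision.
PROFILE_BETA = {"risk_averse": 2.0, "neutral": 1.0, "risk_tolerant": 0.5}


def profile_score(profile, tp, fp, fn):
    """Score the same predictions under a given investor profile."""
    return f_beta(tp, fp, fn, PROFILE_BETA[profile])


# HYPOTHETICAL counts with "underperforming" as the positive class:
# 8 flagged correctly, 5 good IPOs flagged wrongly, none missed.
counts = {"tp": 8, "fp": 5, "fn": 0}
scores = {p: round(profile_score(p, **counts), 3) for p in PROFILE_BETA}
print(scores)
```

The same predictions score higher for the risk-averse profile than for the risk-tolerant one, because the hypothetical model misses no underperforming IPOs but raises several false alarms.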
The proposed framework was further validated using 10-fold cross-validation, which revealed good accuracy and precision across the folds. Such consistency implies that the model generalizes the learned patterns to unseen data, reducing the probability of overfitting and making it suitable for real-life financial markets where conditions constantly change. The detailed analysis of high-impact predictors and the optimization of model parameters contribute to a more precise predictive tool, thus improving IPO investment decision-making. This study also highlights the practical applications of these findings, providing investors and market regulators with robust tools for informed decision-making in unpredictable market scenarios. The proposed framework’s ability to analyze imbalanced IPO datasets has implications beyond financial markets. For example, in circular economy initiatives, these models could assess the financial feasibility of large-scale resource recovery projects, estimate the profitability of waste-to-energy programs, or optimize recycling efficiency across urban centers. Regulatory bodies could utilize such predictions to incentivize businesses to adopt sustainable practices by linking subsidies to proven economic benefits.
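The 10-fold procedure can be sketched with a stratified split, which keeps the class ratio similar in every fold and therefore matters for imbalanced IPO data. This is a minimal standard-library illustration, not the study’s implementation: whether the authors stratified their folds is not stated, and the label vector and function names are hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Assign sample indices to k folds so each class is spread evenly."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    slot = 0  # running counter keeps fold sizes balanced across classes
    for label in sorted(by_class):
        for idx in by_class[label]:
            folds[slot % k].append(idx)
            slot += 1
    return folds


def train_test_splits(labels, k=10):
    """Yield (train_indices, test_indices) for each of the k folds."""
    folds = stratified_folds(labels, k)
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        # Any oversampling (e.g. SMOTE) would be applied to train_idx only,
        # so synthetic rows never leak into the held-out fold. This is a
        # common precaution, not a step described in the text.
        yield train_idx, test_idx


# HYPOTHETICAL imbalanced labels: 40 underperforming (0), 60 successful (1).
labels = [0] * 40 + [1] * 60
splits = list(train_test_splits(labels, k=10))
print(len(splits), [sum(labels[i] for i in test) for _, test in splits])
```

Each held-out fold here contains four underperforming and six successful IPOs, mirroring the overall 40/60 ratio, so per-fold accuracy and precision are computed on comparable class mixes.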
However, the reliance on static datasets limits the framework’s adaptability to dynamic market conditions: a model built on a static dataset may adapt to market trends in subsequent periods less effectively than it initially appears. Future studies could incorporate dynamic time-series data to assess model performance under various economic conditions, enhancing the model’s flexibility when applied to actual markets. Further, models that capture circumstances and characteristics that are difficult for traditional models to express, including LSTM and Transformer-based models, may effectively examine temporal structures within financial data and could enhance predictive accuracy and robustness. It would also be essential to test this framework in different international markets to determine how well it transfers across the range of economic environments in the world.
This study’s findings are novel in applying AI ensemble models to circular economy processes, building on previous research in financial forecasting. Previous studies have focused on AI’s role in financial prediction, demonstrating its effectiveness in overcoming dimensionality and class imbalance. These studies have shown that ensemble methods, such as random forest and gradient boosting, can significantly improve prediction accuracy in financial markets. This study takes a further step by showing that these models can be equally successful in predicting sustainability-related outcomes in the circular economy context.
By comparing this study with the literature on circular economy AI applications, it is evident that there has been some discussion about the use of AI for resource recovery, waste reduction, and optimizing recycling programs, but less so within financial modeling frameworks. Studies on circular business models in recent years suggest that AI can predict the economic impact of such practices, for example, the profitability of recycling initiatives or the efficiency of resource reuse. This study bridges this gap by applying AI-powered financial models to evaluate both economic and environmental sustainability metrics, contributing to a growing body of work at the intersection of finance and sustainability.
This research aligns with the emerging trend in AI-driven frameworks within Industry 5.0, emphasizing the potential to achieve sustainability alongside economic growth. By integrating AI into circular economy frameworks, this study supports the prediction of sustainable practices’ viability, resource optimization, and long-term environmental resilience, paving the way for a new era of data-driven sustainability.
Limitations and Future Directions
This research has several limitations and assumptions that should be acknowledged. Although SMOTE was applied to balance the classes in the dataset, oversampling alone may not be sufficient, and it may directly influence the feature selection performed at the ANOVA step. This limitation may affect the stability of the selected features and of the model.
While effective in addressing class imbalance, SMOTE introduces the risk of altering the feature distribution, potentially skewing feature importance rankings derived through methods like ANOVA. Future studies should prioritize feature selection methods that are more compatible with resampled datasets, such as tree-based importance measures or RFE with cross-validation, to ensure the robustness of selected predictors.
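The interpolation step at the core of SMOTE shows why the feature distribution can shift: each synthetic row is placed on the line segment between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified standard-library illustration of that step, not the resampling code used in the study. The function name, the neighbour count, and the two-feature minority rows are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# HYPOTHETICAL minority-class rows with two features each.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
synthetic = smote_sketch(minority, n_new=6)
print(len(synthetic))
```

Because every synthetic row lies between existing minority rows, the resampled class becomes denser without gaining new information. This is one way statistics computed after oversampling, such as ANOVA scores, can differ from those computed on the original data.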
Regularization-based selection methods are another option that could improve feature stability and relevance on resampled data. Future work could also explore advanced ensemble techniques, such as model stacking, that integrate machine learning with neural network models to enhance the framework’s predictive performance and flexibility. Neural network components may better characterize non-linear patterns in financial data, thereby improving the generalization and predictive power of the model when applied to comprehensive and perhaps skewed datasets.
Furthermore, applying the framework to dynamic, real-world time-series data would extend its applicability to performance evaluation across different phases of the economic cycle and in volatile financial environments.
The proposed framework could also be tested on IPO datasets from other countries and investor groups to understand IPOs in various economic environments. A further promising avenue involves integrating federated learning approaches to train models across decentralized IPO datasets while preserving data privacy, a growing concern in international financial markets. Additionally, applying transfer learning could enhance the model’s performance in new markets with limited labeled data, improving its cross-sectional transportability and scalability.