1) An improved Group Teaching Optimization Algorithm based on local search and chaotic map for Feature Selection in High-Dimensional Data
Abstract:
The current study proposes a novel binary group teaching optimization algorithm with local search and chaos mapping (BGTOALC) as a wrapper-based feature selection method to solve high-dimensional feature selection problems. The local search and chaos mapping enhance the performance of the proposed algorithm. In addition, two novel binary operators called Binary Teacher Phase Good Group (BTPGG) and Binary Teacher Phase Bad Group (BTPBG) are applied to the teacher phase to increase the exploration and exploitation of the algorithm. Moreover, a new Binary Student Opposition-Based Learning (BSOBL) operator is introduced for the student phase, using an opposition-based strategy to achieve better exploitation. Finally, the teacher allocation phase is designed in a binary manner using the new Mean Binary Select (MBS) operator to increase the algorithm’s convergence rate. Subsequently, two other binary group teaching optimization algorithms, named BGTOAV and BGTOAS, are developed using the V-shaped and S-shaped transfer functions, respectively, to compare their performance with the BGTOALC algorithm. The proposed approaches are compared to other state-of-the-art binary algorithms on 30 datasets with different dimensions. The experiments show that the BGTOALC method outperforms the previous methods in terms of reducing the number of selected features and increasing the accuracy of the machine learning algorithm. Statistical analyses further indicate the superiority of the BGTOALC method in terms of efficiency and convergence rate against other binary meta-heuristic algorithms.
Keywords: Feature Selection, Binary Group Teaching Optimization Algorithm, Local Search, Chaos Mapping, S-shaped and V-shaped transfer functions.
DOI: https://doi.org/10.1016/j.eswa.2022.117493
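The S-shaped and V-shaped transfer functions named in the abstract above are commonly used to turn a continuous search position into a bit vector of selected features. The following is a minimal sketch of that general idea, not the paper's BGTOAV or BGTOAS implementation. The sigmoid and |tanh| forms are common choices assumed here, and the helper names (`s_shaped`, `v_shaped`, `binarize_s`, `binarize_v`) are hypothetical.

```python
import math
import random


def s_shaped(x: float) -> float:
    """Sigmoid transfer function (assumed form), computed in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def v_shaped(x: float) -> float:
    """V-shaped transfer function |tanh(x)| (assumed form)."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: set each bit to 1 with probability S(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule: flip each current bit with probability V(x)."""
    return [
        1 - bit if rng.random() < v_shaped(x) else bit
        for x, bit in zip(position, current_bits)
    ]


if __name__ == "__main__":
    rng = random.Random(0)              # fixed seed so runs are repeatable
    position = [-2.0, 0.0, 0.5, 3.0]    # illustrative continuous values
    current = [1, 0, 1, 0]              # illustrative current feature mask
    print(binarize_s(position, rng))
    print(binarize_v(position, current, rng))
```

In this sketch, a bit equal to 1 means the corresponding feature is selected. The S-shaped rule ignores the previous mask, whereas the V-shaped rule keeps the previous bit unless the magnitude of the continuous value is large.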
2) A Data-Driven Sequential Learning Framework to Accelerate and Optimize Multi-Objective Manufacturing Decisions
Abstract:
Manufacturing advanced materials and products with a specific property or combination of properties is often warranted. To achieve this, it is crucial to find the optimum recipe or processing conditions that can generate the ideal combination of these properties. Most of the time, a sufficient number of experiments are needed to generate a Pareto front. However, manufacturing experiments are usually costly, and even conducting a single experiment can be a time-consuming process. It is therefore critical to determine the optimal location for data collection to gain the most comprehensive understanding of the process. Sequential learning is a promising approach to actively learn from the ongoing experiments, iteratively update the underlying optimization routine, and adapt the data collection process on the go. This paper presents a novel data-driven Bayesian optimization framework that utilizes sequential learning to efficiently optimize complex systems with multiple conflicting objectives. Additionally, this paper proposes a novel metric for evaluating multi-objective data-driven optimization approaches. This metric considers both the quality of the Pareto front and the amount of data used to generate it. The proposed framework is particularly beneficial in practical applications where acquiring data can be expensive and resource intensive. To demonstrate the effectiveness of the proposed algorithm and metric, the algorithm is evaluated on a manufacturing dataset. The results indicate that the proposed algorithm can achieve the actual Pareto front while processing significantly less data. This implies that the proposed data-driven framework can lead to similar manufacturing decisions with reduced costs and time.
Keywords: Multi-Objective Bayesian Optimization, Data-driven Decisions, Gaussian Process, Sequential Learning, Smart Manufacturing
DOI: https://doi.org/10.48550/arXiv.2304.09278
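The abstract above relies on the notion of a Pareto front for conflicting objectives. As a hedged illustration of that notion only, and not of the paper's Bayesian optimization framework or its proposed metric, the sketch below filters a set of candidate outcomes down to the non-dominated ones. It assumes every objective is to be minimized. The function names (`dominates`, `pareto_front`) and the sample points are hypothetical.

```python
def dominates(a, b):
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Illustrative (objective_1, objective_2) pairs, not data from the paper.
    candidates = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
    print(pareto_front(candidates))
```

This brute-force filter compares every pair of points, which is adequate for the small experiment counts typical of costly manufacturing trials. A sequential learning loop would typically recompute such a front after each new observation.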
3) Identification of the Factors Affecting the Reduction of Energy Consumption and Cost in Buildings Using Data Mining Techniques
Abstract:
Optimizing energy consumption and coordination of utility systems have long been a concern of the building industry. Buildings are one of the largest energy consumers in the world, making their energy efficiency crucial for preventing waste and reducing costs. Additionally, buildings generate substantial amounts of raw data, which can be used to understand energy consumption patterns and assist in developing optimization strategies. Using a real-world dataset, this research aims to identify the factors that influence building cost reduction and energy consumption. To achieve this, we utilize three regression models (Lasso Regression, Decision Tree, and Random Forest) to predict primary fuel usage, electrical energy consumption, and cost savings in buildings. An analysis of the factors influencing energy consumption and cost reduction is conducted, and the decision tree algorithm is optimized using metaheuristics. By employing metaheuristic techniques, we fine-tune the decision tree algorithm's parameters and improve its accuracy. Finally, we review the most practical features of potential and nonpotential buildings that can reduce primary fuel usage, electrical energy consumption, and costs.
Keywords: Building Energy Consumption Optimization, Consumption Pattern Identification, Machine Learning Algorithms, Feature Selection, Pattern Recognition
DOI: https://doi.org/10.48550/arXiv.2305.08886
4) Chatbots and ChatGPT: A Bibliometric Analysis and Systematic Review of Publications in Web of Science and Scopus Databases
Abstract:
This paper presents a bibliometric analysis of the scientific literature related to chatbots, focusing specifically on ChatGPT. Chatbots have gained increasing attention recently, with annual growth rates of 19.16% and 27.19% on the Web of Science (WoS) and Scopus, respectively. In this study, we have explored the structure, conceptual evolution, and trends in this field by analyzing data from both Scopus and WoS databases. The research consists of two study phases: (i) an analysis of chatbot literature and (ii) a comprehensive review of scientific documents on ChatGPT. In the first phase, a bibliometric analysis is conducted on all published literature, including articles, book chapters, conference papers, and reviews on chatbots from both Scopus (5839) and WoS (2531) databases covering the period from 1998 to 2023. An in-depth analysis focusing on sources, countries, authors' impact, and keywords has revealed that ChatGPT is the latest trend in the chatbot field. Consequently, in the second phase, bibliometric analysis has been carried out on ChatGPT publications, and 45 published studies have been analyzed thoroughly based on their methods, novelty, and conclusions. The key areas of interest identified from the study can be classified into three groups: artificial intelligence and related technologies, design and evaluation of conversational agents, and digital technologies and mental health. Overall, the study aims to provide guidelines for researchers to conduct their research more effectively in the field of chatbots and specifically highlight significant areas for future investigation into ChatGPT.
Keywords: Chatbot, ChatGPT, Bibliometrics, Artificial Intelligence, Natural Language Processing, Generative Artificial Intelligence
DOI: https://doi.org/10.48550/arXiv.2304.05436