Enhancing Predictive Analytics Effectiveness with Evolutionary AutoML Pipelines
AUTHORS
Florin Stoica, Department of Mathematics and Informatics, Faculty of Sciences, “Lucian Blaga” University, Sibiu, Romania
Laura Florentina Stoica, Department of Mathematics and Informatics, Faculty of Sciences, “Lucian Blaga” University, Sibiu, Romania
ABSTRACT
Machine learning (ML) has achieved considerable success in recent years, and an ever-growing number of disciplines rely on it. With Automated Machine Learning (AutoML) tools, organizations can unlock valuable new business insights, embed advanced AI capabilities in applications, and empower both data scientists and nontechnical experts to build predictive models rapidly. AutoML tools are available within broader MLOps (Machine Learning Operations) platforms, such as Oracle AutoML (OML4Py), or as pure Python frameworks, such as FEDOT. We have built a simplified AutoML pipeline that focuses on hyperparameter optimization and is based on the Optimal Multiple Kernel-Support Vector Machine (OMK-SVM) method. A benchmarking experiment was conducted to identify customers with a higher likelihood of switching from one movie streaming service to another provider. The results revealed that our approach delivers the best performance within its model class (SVM), and that our evolutionary approach to hyperparameter optimization provides results comparable to those of the FEDOT framework.
KEYWORDS
SVM, AutoML, FEDOT, Customer churn