Fintech Credit Scoring Techniques for Evaluating P2P Loan Applications – A Python Machine Learning Ensemble Approach

AUTHORS

Rekha Ramesh Shenoy, Department of Computer Science, Lakehead University, ON, Canada
Sabah Mohammed, Department of Computer Science, Lakehead University, ON, Canada
Jinan Fiaidhi, Department of Computer Science, Lakehead University, ON, Canada

ABSTRACT

Financial technology (fintech) is widely recognized as one of the most important innovations in the financial industry, and it is evolving very rapidly. It holds the promise of reshaping the industry by creating a diverse financial landscape that provides stability, improves quality and, most importantly, reduces costs. One such fintech tool is peer-to-peer lending (also known as "P2P lending"), which refers to companies that match lenders and borrowers without the use of traditional banking systems. These intermediaries are usually online investment platforms that offer identity verification, proprietary credit models, loan approval, loan servicing, and legal and compliance services. P2P lending can be an attractive alternative for borrowers because loans can be applied for online, anonymously, and in a timely fashion. It also benefits borrowers who have no previous credit history to show. Fintech platforms build credit scoring models based on credit risk evaluation. These models rely on online data sources, alternative credit models, and a variety of machine learning and data analytics techniques to estimate the risks involved in the lending process and to minimize operating costs. In this paper, we propose a stacking ensemble of machine learning classifiers that combines data preprocessing with different learning algorithms. We then compare the results of the bare-bones classifiers with those of our stacking ensemble classifier. The ensemble model performs better than each of the single classifiers used in the credit scoring process.
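
The comparison described in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' pipeline: the synthetic dataset, the choice of base learners (logistic regression, Gaussian naive Bayes, random forest), the logistic-regression meta-learner, and all parameter values are assumptions made only to show how a stacking ensemble is set up and scored next to its individual classifiers.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a synthetic, imbalanced binary dataset stands in for real
# P2P loan records (class 1 plays the role of "default").
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumption: three illustrative base learners; scaling is the only
# preprocessing step shown here.
base_learners = [
    ("logreg", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("nb", GaussianNB()),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# The meta-learner is trained on cross-validated predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Fit each single classifier and the stacked ensemble, then record test accuracy.
scores = {}
for name, model in base_learners + [("stack", stack)]:
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Whether the stacked model actually beats every base learner depends on the data and the learners chosen; on imbalanced credit data, a metric such as ROC AUC or F1 is often more informative than plain accuracy.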

 

KEYWORDS

fintech tools, credit scoring, machine learning algorithms, feature reduction, outliers, scikit-learn, regression, clustering, Bayesian, neural networks, forests, ensembles, bagging, boosting, stacking.

CITATION

  • APA:
    Shenoy, R. R., Mohammed, S., & Fiaidhi, J. (2018). Fintech Credit Scoring Techniques for Evaluating P2P Loan Applications – A Python Machine Learning Ensemble Approach. International Journal of Smart Business and Technology, 6(1), 19-38. doi:10.21742/IJSBT.2018.6.1.04
  • Harvard:
    Shenoy, R.R., Mohammed, S. and Fiaidhi, J. (2018). "Fintech Credit Scoring Techniques for Evaluating P2P Loan Applications – A Python Machine Learning Ensemble Approach". International Journal of Smart Business and Technology, 6(1), pp. 19-38. doi:10.21742/IJSBT.2018.6.1.04
  • IEEE:
    [1] R. R. Shenoy, S. Mohammed, and J. Fiaidhi, "Fintech Credit Scoring Techniques for Evaluating P2P Loan Applications – A Python Machine Learning Ensemble Approach," International Journal of Smart Business and Technology, vol. 6, no. 1, pp. 19-38, Jun. 2018.
  • MLA:
    Shenoy, Rekha Ramesh, Sabah Mohammed, and Jinan Fiaidhi. "Fintech Credit Scoring Techniques for Evaluating P2P Loan Applications – A Python Machine Learning Ensemble Approach." International Journal of Smart Business and Technology, vol. 6, no. 1, Jun. 2018, pp. 19-38, doi:10.21742/IJSBT.2018.6.1.04

ISSUE INFO

  • Volume 6, No. 1, 2018
  • ISSN (print): 2288-8969
  • ISSN (electronic): 2207-516X
  • Published: Jun. 2018
