CAPSTONE PROJECT
Topic: Predictive Modelling for Loan Approval
Overview of the Project:
Thera Bank has a growing customer base. The bank wants to expand its base of borrowers (asset
customers) to bring in more loan business and earn more through the interest on loans.
To do so, the bank wants to convert its liability customers into personal loan customers.
A campaign that the bank ran last year for liability customers showed a healthy conversion
rate of over 9%. The retail marketing department is developing better-targeted campaigns
to increase the success rate on a minimal budget.
Problem statement:
The classification goal is to predict the likelihood that a liability customer will buy a personal loan.
In this project, I built a predictive model to predict whether a customer will buy a personal loan or not.
Source: Kaggle
How my model will help:
The bank wants to set up a new marketing campaign, so it needs information about
the relationships between the different attributes in the data.
I implemented four classification models; their test accuracies ranged from
about 87% to about 98%.
Models:
The models used are as follows:
• Logistic Regression Classifier
• Complement Naïve Bayes
• K-Nearest Neighbors (KNN)
• AdaBoost (Adaptive Boosting)
I used confusion matrices to understand and compare the models' performance.
Steps to follow:
• Loading, Understanding and Cleaning the Data
• EDA (Exploratory Data Analysis)
• Preprocessing
• Train Test Split
• Algorithm
• Evaluation
Loading, Understanding and Cleaning the data
1. Libraries imported:
• NumPy
• pandas
• Matplotlib
• Seaborn
• Plotly Express
2. Loaded the Data:
3. Understanding the data:
df.shape = There are 5000 rows and 14 columns.
df.describe().transpose() = Describes the statistical details.
df.duplicated().sum() = There are no duplicate rows.
df.isnull().sum() = There are no null values.
negative_counts = (df < 0).sum() = There are 52 negative values in the Experience column, so they are converted to
positive values.
df.info() = Gives information about the data.
df.drop() = ID and ZIP Code are redundant columns that hold no predictive power, so they
can be dropped.
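The checks above can be sketched on a tiny stand-in table. This is a minimal illustration, not the author's notebook: the rows are made up, and only the column names ID, ZIP Code, Experience, Age and Personal Loan come from the slides. Note that `shape` is an attribute and the duplicate check is `duplicated()`.

```python
import pandas as pd

# Hypothetical miniature of the dataset (the real one has 5000 rows, 14 columns).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Age": [25, 45, 39, 35],
    "Experience": [-1, 19, 15, -3],
    "ZIP Code": [91107, 90089, 94720, 94112],
    "Personal Loan": [0, 0, 1, 0],
})

print(df.shape)                 # (rows, columns); an attribute, not a method
print(df.duplicated().sum())    # number of fully duplicated rows
print(df.isnull().sum().sum())  # total number of missing values

# Count negative entries per column, then make Experience non-negative.
negative_counts = (df < 0).sum()
df["Experience"] = df["Experience"].abs()

# Drop identifier-like columns that carry no predictive signal.
df = df.drop(columns=["ID", "ZIP Code"])
print(df.columns.tolist())
```

Taking the absolute value is one reading of "converted to positive"; replacing the negatives with a median would be another option.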
• EDA (Exploratory Data Analysis):
Detecting Outliers:
The boxplots show some outliers in the features Income, CCAvg and Mortgage.
However, we have no strong reason to exclude them from the dataset,
so we do not remove them.
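To make the boxplot reading concrete, the sketch below counts points outside the 1.5 × IQR whiskers, which is the default whisker rule in Matplotlib and Seaborn boxplots. The helper name `iqr_outlier_count` and the sample values are hypothetical.

```python
import pandas as pd

def iqr_outlier_count(series: pd.Series) -> int:
    """Count values outside the 1.5 * IQR whiskers of a boxplot."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return int(((series < lower) | (series > upper)).sum())

# Made-up income values with one extreme entry.
income = pd.Series([40, 45, 50, 55, 60, 65, 70, 300])
n_outliers = iqr_outlier_count(income)
print(n_outliers)
```

Counting the flagged points per feature is a quick way to judge how much data would be lost if they were removed.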
• Family:
  The count of Family per personal loan shows that families with 4 people have the
  highest frequency of accepting the personal loan and families with 2 people have the
  lowest.
• Education:
  The count of Education per personal loan shows that customers
  with Advanced/Professional education levels have the highest frequency of
  accepting personal loans and customers with Undergrad education levels
  have the lowest.
• Securities Accounts:
  The count of Securities Accounts per personal loan shows that customers
  who don't have a securities account with the bank accept personal loans
  more frequently than those who do.
• CD Accounts:
  The count of CD Accounts per personal loan shows that customers who don't have a
  certificate of deposit (CD) account with the bank accept personal loans
  more frequently than those who do.
• Online:
  The count of Online per personal loan shows that customers who use internet
  banking facilities accept personal loans more frequently than
  those who don't.
• Credit card:
  The count of Credit card per personal loan shows that customers who don't use a
  credit card issued by the bank accept personal loans more frequently
  than those who do.
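The counts described above can be reproduced with a cross-tabulation. This is a sketch on made-up rows; only the column names Family and Personal Loan are taken from the slides.

```python
import pandas as pd

# Hypothetical rows; 1 in "Personal Loan" means the customer accepted the loan.
df = pd.DataFrame({
    "Family": [1, 2, 3, 4, 4, 2, 1, 3],
    "Personal Loan": [0, 0, 1, 1, 1, 0, 0, 0],
})

# Raw counts of each family size per loan outcome.
counts = pd.crosstab(df["Family"], df["Personal Loan"])

# Share of each family size that accepted (each row sums to 1).
rates = pd.crosstab(df["Family"], df["Personal Loan"], normalize="index")

print(counts)
print(rates)
```

Raw counts also reflect how many customers fall in each group, so the row-normalized table is a useful companion when the groups differ a lot in size.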
An approximately normal distribution (with no noticeable skewness) is observed
for Age and Experience, both for customers who accept and for customers who
do not accept the personal loan.
The distribution of Income is positively skewed for customers who do not
accept the personal loan and approximately normal (no skewness) for
customers who accept it.
A positive skewness is observed in the distribution of Mortgage for
both customers who accept and do not accept the personal loan.
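Skewness can also be checked numerically per group rather than read off a plot. The sketch below uses made-up Income values chosen so that one group is right-skewed and the other symmetric.

```python
import pandas as pd

# Hypothetical incomes: group 0 has a long right tail, group 1 is symmetric.
df = pd.DataFrame({
    "Income": [10, 12, 15, 20, 90, 40, 50, 60, 70, 80],
    "Personal Loan": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
})

# Sample skewness of Income within each loan outcome.
skew_by_group = df.groupby("Personal Loan")["Income"].apply(lambda s: s.skew())
print(skew_by_group)
```

A value clearly above zero indicates a right (positive) skew, and a value near zero indicates a roughly symmetric distribution.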
Multivariate Analysis: To check the correlations of the attributes with the target variable
The target, Personal Loan, is positively correlated with Income,
CCAvg and CD Account; Experience and Age are the exceptions.
There is a moderate correlation between CCAvg and
Income.
There is a strong correlation between Age and
Experience.
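A correlation matrix of this kind can be computed with `DataFrame.corr()`. The sketch below uses invented rows in which Age and Experience move together and Income separates the two loan outcomes; it does not reproduce the deck's actual coefficients.

```python
import pandas as pd

# Hypothetical rows using column names from the slides.
df = pd.DataFrame({
    "Age": [25, 35, 45, 55, 65, 30],
    "Experience": [1, 10, 20, 30, 40, 5],
    "Income": [40, 120, 60, 150, 50, 130],
    "Personal Loan": [0, 1, 0, 1, 0, 1],
})

# Pearson correlation between every pair of columns.
corr = df.corr()

# Correlation of each feature with the target, strongest first.
target_corr = (
    corr["Personal Loan"].drop("Personal Loan").sort_values(ascending=False)
)
print(target_corr)
print(corr.loc["Age", "Experience"])
```

The same `corr` table can be passed to a Seaborn heatmap to get the usual visual summary.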
Model Building
Libraries imported for building models:
• Separating the data into X and y.
• Divided the data into X_train, X_test, Y_train and Y_test to fit the model on the training data and test it on the
test data.
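The split can be sketched as follows. The 70/30 ratio, the random seed and the use of `stratify` are assumptions, since the slides do not state them; stratifying keeps the share of loan acceptors the same in both parts, which matters for imbalanced data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 customers, 10 of whom accepted the loan.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Income": rng.integers(10, 200, size=100),
    "CCAvg": rng.uniform(0, 10, size=100),
    "Personal Loan": [1] * 10 + [0] * 90,
})

# Separate the features (X) from the target (y).
X = df.drop(columns=["Personal Loan"])
y = df["Personal Loan"]

# Assumed settings: 30% test data, fixed seed, stratified on the target.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```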
Logistic Regression
Data standardization using StandardScaler and MinMaxScaler.
Training the Logistic Regression model.
Made predictions on the test data and calculated the model
accuracy score, which is 95%.
The confusion matrix provided insights into the model's performance and
helped me evaluate metrics such as accuracy, precision, recall and
F1 score, which are given below.
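A minimal sketch of this workflow on synthetic data is shown here. It is not the author's code: `make_classification` stands in for the bank data, only StandardScaler is used, and the hyperparameters are assumptions, so the printed scores will differ from the 95% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the bank data (about 10% positives).
X, y = make_classification(n_samples=500, n_features=6,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The scaler is fitted on the training data only, inside the pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, Y_train)
Y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(Y_test, Y_pred)
tn, fp, fn, tp = cm.ravel()

accuracy = accuracy_score(Y_test, Y_pred)
precision = precision_score(Y_test, Y_pred, zero_division=0)
recall = recall_score(Y_test, Y_pred, zero_division=0)
f1 = f1_score(Y_test, Y_pred, zero_division=0)
print(cm)
print(accuracy, precision, recall, f1)
```

On imbalanced data, recall and precision for the positive class are usually more informative than accuracy, because predicting "no loan" for everyone already scores about 90%.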
Complement Naïve Bayes
• Because the data is imbalanced, I used the Complement Naïve Bayes
classifier, with the following preprocessing steps:
• Transformed continuous features into categorical features.
• Converted int64 columns to the categorical type.
• Used dummy encoding to convert categorical variables into binary variables.
• Separating the data into X and y.
• Divided the data into X_train, X_test, Y_train and Y_test to fit the model on the training data and
test it on the test data.
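The preprocessing steps above might look like the following sketch. The bin edges, the bin labels and the choice of Income and Education as example columns are assumptions for illustration; the slides do not say how the continuous features were binned.

```python
import pandas as pd

# Hypothetical rows; Income is continuous, Education is an int64 code.
df = pd.DataFrame({
    "Income": [20, 60, 110, 180],
    "Education": [1, 2, 3, 1],
})

# Continuous -> categorical: bin Income into assumed ranges.
df["Income_bin"] = pd.cut(
    df["Income"], bins=[0, 50, 100, 250], labels=["low", "mid", "high"]
)

# int64 -> categorical type.
df["Education"] = df["Education"].astype("category")

# Dummy encoding: k - 1 binary columns per feature (first level dropped).
encoded = pd.get_dummies(df[["Income_bin", "Education"]], drop_first=True)
print(encoded)
```

Binary columns are non-negative, which suits Complement Naïve Bayes because it expects non-negative feature values.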
Feature scaling using MinMaxScaler.
Training the Complement Naïve Bayes model.
Made predictions on the test data and calculated the model accuracy
score, which is around 87%.
The confusion matrix provided insights into the model's performance and helped
me evaluate metrics such as accuracy, precision, recall and F1 score,
which are given below.
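As a rough sketch of this step, the block below fits scikit-learn's `ComplementNB` on made-up binary features (standing in for the dummy-encoded columns) after MinMaxScaler. The data and the rule that generates the labels are invented, so the score is not comparable with the 87% quoted above.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Hypothetical binary features; about a quarter of the labels are positive.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(300, 5)).astype(float)
y = ((X[:, 0] + X[:, 1]) >= 2).astype(int)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# MinMaxScaler keeps features in [0, 1]; ComplementNB needs non-negative input.
model = make_pipeline(MinMaxScaler(), ComplementNB())
model.fit(X_train, Y_train)
Y_pred = model.predict(X_test)

cm = confusion_matrix(Y_test, Y_pred, labels=[0, 1])
accuracy = accuracy_score(Y_test, Y_pred)
print(cm)
print(accuracy)
```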
K-Nearest Neighbors (KNN)
• Separating the data into X and y.
• Divided the data into X_train, X_test, Y_train and Y_test to fit the model on the training data and test it on the test data.
Visualizing the training and test accuracy in a plot to find the best k for the KNN classifier.
Maximum number of neighbors = 50
Best accuracy is 0.93 (93%) at the best k.
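The search for the best k can be sketched as a loop over candidate values. The data here is synthetic and the range starting at k = 1 is an assumption; only the upper limit of 50 comes from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data.
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# KNN is distance based, so features are standardized first.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

max_neighbors = 50
train_acc, test_acc = {}, {}
for k in range(1, max_neighbors + 1):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train_s, Y_train)
    train_acc[k] = knn.score(X_train_s, Y_train)
    test_acc[k] = knn.score(X_test_s, Y_test)

# k with the highest test accuracy (smallest such k on ties).
best_k = max(test_acc, key=test_acc.get)
print(best_k, test_acc[best_k])
```

Plotting `train_acc` and `test_acc` against k gives the kind of curve described above. Choosing k on the test set makes the reported test accuracy optimistic, so a separate validation split or cross-validation would be a cleaner way to pick it.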
Data standardization using StandardScaler.
Training the K-Nearest Neighbors (KNN) model.
Made predictions on the test data and calculated the model accuracy
score, which is around 95%.
The confusion matrix provided insights into the model's performance and
helped me evaluate metrics such as accuracy, precision, recall and
F1 score, which are given below.
AdaBoost (Adaptive Boosting)
• Created a new dataframe.
• Separating the data into X and y.
• Divided the data into X_train, X_test, Y_train and Y_test to fit the model on the training data and test it on the
test data.
Feature scaling using MinMaxScaler.
Training the AdaBoost model.
Made predictions on the test data and calculated the model accuracy
score, which is around 98%.
The confusion matrix provided insights into the model's performance and
helped me evaluate metrics such as accuracy, precision, recall and
F1 score, which are given below.
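A comparable sketch for AdaBoost is given below, again on synthetic data with assumed settings (50 estimators, the default base learner, a fixed seed). It illustrates the scaling, training and evaluation steps only and will not reproduce the 98% figure.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic, imbalanced stand-in for the bank data.
X, y = make_classification(n_samples=500, n_features=6,
                           weights=[0.9, 0.1], random_state=1)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# Scale to [0, 1], then boost the default weak learners.
model = make_pipeline(
    MinMaxScaler(),
    AdaBoostClassifier(n_estimators=50, random_state=1),
)
model.fit(X_train, Y_train)
Y_pred = model.predict(X_test)

cm = confusion_matrix(Y_test, Y_pred, labels=[0, 1])
accuracy = accuracy_score(Y_test, Y_pred)
f1 = f1_score(Y_test, Y_pred, zero_division=0)
print(cm)
print(accuracy, f1)
```

Tree-based boosting does not depend on feature scale, so the scaler mainly keeps the preprocessing consistent with the other models.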
Conclusion
Comparing and concatenating the performance metrics of the different classification models. The result is shown above in
tabular format.
After analysing and calculating the performance of the different classification models, it was observed that
the AdaBoost classifier gives the best result. Hence, we can use the AdaBoost classifier as our model for
determining the likelihood of a liability customer buying a personal loan from the bank.
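Concatenating per-model results into one comparison table can be sketched as below. Only the accuracy figures quoted in the slides are used (all approximate); the other metrics are left out because their values are not given in the text.

```python
import pandas as pd

# One small frame per model, using the approximate accuracies from the slides.
results = [
    pd.DataFrame({"Model": ["Logistic Regression"], "Accuracy": [0.95]}),
    pd.DataFrame({"Model": ["Complement Naive Bayes"], "Accuracy": [0.87]}),
    pd.DataFrame({"Model": ["KNN"], "Accuracy": [0.95]}),
    pd.DataFrame({"Model": ["AdaBoost"], "Accuracy": [0.98]}),
]

# Stack the frames and rank the models by accuracy.
comparison = (
    pd.concat(results, ignore_index=True)
    .sort_values("Accuracy", ascending=False)
    .reset_index(drop=True)
)
best_model = comparison.loc[0, "Model"]
print(comparison)
print(best_model)
```

Adding precision, recall and F1 columns to each frame would extend the same table to the full set of metrics.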
Thank you!

Decoding Loan Approval: Predictive Modeling in Action
