Loan Eligibility Prediction

This project aims to predict the eligibility of loan applicants based on various demographic and financial attributes. The dataset for this project is sourced from the Analytics Vidhya Loan Prediction competition on Kaggle.

Project Overview

The goal is to build a predictive model that classifies whether an applicant will be eligible for a loan based on features such as gender, marital status, income, and credit history.

Data Source

The dataset used in this project can be found on Kaggle: Loan Prediction Data.

Repository Structure

data/: Contains the dataset used for training the model.
notebooks/: Jupyter notebooks with detailed exploratory data analysis (EDA), feature engineering, and model training.
src/: Python scripts for data preprocessing, model training, and evaluation.
README.md: Project overview and instructions.

Data Preprocessing

Handling Missing Values: Missing values were imputed using the mode for categorical variables and the mean for numerical variables.
Outlier Removal: Outliers were detected and removed using the Interquartile Range (IQR) method.
Skewed Distribution: Features like ApplicantIncome, CoapplicantIncome, and LoanAmount were transformed using the square root transformation to handle skewed distributions.
One-Hot Encoding: Categorical variables were converted into numerical format using one-hot encoding, and unnecessary columns were dropped.

Model

The model used in this project is a Random Forest Classifier with hyperparameters tuned for optimal performance.
The dataset had an imbalanced class distribution, and the Synthetic Minority Over-sampling Technique (SMOTE) was used to balance the classes.
The model was evaluated using accuracy, ROC-AUC score, and a detailed classification report.

Model Performance

Accuracy: 83.72%
ROC-AUC: 94.20%

Feature Importance

The Random Forest model provided insights into the most important features influencing loan eligibility. A bar chart of feature importances was generated to visualize the contributions of different features.

Installation and Usage

To run this project locally:

Clone the repository:

git clone https://github.com/Ismat-Samadov/Loan_Eligiblity.git

Install the required dependencies:
```
pip install -r requirements.txt
```

Run the notebook or script to train the model:

jupyter notebook notebooks/Loan_Eligibility_Prediction.ipynb

or

python src/train_model.py

Dependencies

Python 3.7+
pandas
numpy
scikit-learn
imbalanced-learn
seaborn
matplotlib

Contributing

Feel free to open an issue or submit a pull request if you would like to contribute to this project.

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
backend		backend
model		model
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Loan Eligibility Prediction

Project Overview

Data Source

Repository Structure

Data Preprocessing

Model

Model Performance

Feature Importance

Installation and Usage

Dependencies

Contributing

About

Releases

Packages

Languages

Ismat-Samadov/Loan_Eligiblity

Folders and files

Latest commit

History

Repository files navigation

Loan Eligibility Prediction

Project Overview

Data Source

Repository Structure

Data Preprocessing

Model

Model Performance

Feature Importance

Installation and Usage

Dependencies

Contributing

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages