Clustering of distribution feeders with Interactive Visualizations

k-means clustering algorithm implementation: to identify 7-10 representative feeders from 700+ distribution feeders
The algorithm tests k=7, k=8, k=9 etc.. and depending on the Silhouette values for given k, the best number of representative feeders is chosen.
The code includes the Principal Component Analysis (PCA) of the dataset as well as 3-D visualizations of the data. Towards the end, scatter matrix plots are also demonstrated.

How to Run

Prerequisites

You will need a python 3 interpreter with the following packages:

jupyter notebook
scikit learn
scipy
matplotlib
pandas (optional - if you want to run 3-D visualization and interactive plot part)
chart_studio/plotly
cufflinks

Running the code

Clone (or download) the repository:

git clone https://github.com/sakshi-mishra/clustering-distribution-feeders.git

Open/Run the jupyter notebook Clustering_of _distribution_feeders_with_Interactive_Visualizations.ipynb
Open/Run the .py file Clustering_with_UserDefinedOptions_csvResults.py

User Instructions (for running Clustering_with_UserDefinedOptions_csvResults.py file):

NOTE: Check that raw values make it into the .csv (not standardardized/normalized values)

Input to the program:

1. names of the csv files (more than 1 file/dataset) can be processed at a time.
    Note: files names shouldn't have spaces in between them, you may fill the spaces with underscores.
2. user-defined transformation method: choose from normalization, standardization and minmax-scaling (based on dataset)
3. number of expected cluster: a range of k values (number of clusters to form) can be given as input. If just one value of k to be tested, then enter a = b

Output:

1. csv file named "input_file_name+transformation+date+data" : contains the labels assinged to all the samples
2. csv file named "input_file_name+transformation+date+metric" : contains various evaluation metircs
3. individual scatter plots of the features (for each cluster)
4. silhouette value plot

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
.gitignore		.gitignore
Clustering_of_distribution_feeders_with_Interactive_Visualizations.ipynb		Clustering_of_distribution_feeders_with_Interactive_Visualizations.ipynb
Clustering_with_UserDefinedOptions_csvResults.py		Clustering_with_UserDefinedOptions_csvResults.py
README.md		README.md
sample_feeder_data.csv		sample_feeder_data.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Clustering of distribution feeders with Interactive Visualizations

How to Run

Prerequisites

Running the code

User Instructions (for running Clustering_with_UserDefinedOptions_csvResults.py file):

NOTE: Check that raw values make it into the .csv (not standardardized/normalized values)

Input to the program:

Output:

About

Releases

Packages

Languages

sakshi-mishra/clustering-distribution-feeders

Folders and files

Latest commit

History

Repository files navigation

Clustering of distribution feeders with Interactive Visualizations

How to Run

Prerequisites

Running the code

User Instructions (for running Clustering_with_UserDefinedOptions_csvResults.py file):

NOTE: Check that raw values make it into the .csv (not standardardized/normalized values)

Input to the program:

Output:

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages