MvMukesh/Utilizing-Natural-Language-Processing-to-Detect-Abusive-Language-on-Social-Media

Utilizing-Natural-Language-Processing-to-Detect-Abusive-Language-on-Social-Media

Harnessing the Power of NLP to Cleanse the Social Media Landscape

Install the required Google Cloud CLI before running the project.

  1. Initialize the CLI first with this command: gcloud init. Provide your GCP credentials here (this can also be done while completing the GCP CLI setup).
  2. Provide the project name and select the compute region.
  3. Create a bucket with a unique name (to store the data).
  4. Upload the data to the bucket (simply drag and drop the CSV file from your local machine into the new bucket).
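
For anyone who prefers scripting the steps above instead of using the console, here is a minimal sketch that assembles the matching gcloud commands. The bucket name, region, and CSV path are placeholders (assumptions, not values from this repository). By default the commands are only printed, not executed. Step 2 is not scripted because gcloud init prompts for the project and compute region interactively.

```python
import shlex
import subprocess


def build_gcloud_commands(bucket, region, csv_path):
    """Return the gcloud invocations for steps 1, 3 and 4 as argument lists.

    bucket, region and csv_path are placeholders supplied by the caller.
    """
    bucket_url = f"gs://{bucket}"
    return [
        # Step 1: interactive login and project/region selection.
        ["gcloud", "init"],
        # Step 3: create a bucket (the name must be globally unique).
        ["gcloud", "storage", "buckets", "create", bucket_url, f"--location={region}"],
        # Step 4: upload the local CSV file into the bucket.
        ["gcloud", "storage", "cp", csv_path, f"{bucket_url}/"],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    cmds = build_gcloud_commands("my-unique-bucket-name", "us-central1", "data.csv")
    run_commands(cmds, dry_run=True)
```

Set dry_run=False only after the Google Cloud CLI is installed and you are happy with the printed commands.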

gcp.png

NOTE:

  • There is a plethora of tutorials on how to set up the GCP CLI
  • Don't forget to add artifacts to .gitignore
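
One way to make sure the ignore entry is in place is a small helper like the sketch below. It assumes the generated files live in a directory called artifacts/ at the repository root. That directory name is an assumption, so adjust the entry to match your layout.

```python
from pathlib import Path


def ensure_ignored(entry, gitignore=".gitignore"):
    """Append entry to the gitignore file unless it is already listed.

    Returns True if the file was changed, False if the entry was present.
    """
    path = Path(gitignore)
    lines = path.read_text().splitlines() if path.exists() else []
    if entry in (line.strip() for line in lines):
        return False
    lines.append(entry)
    path.write_text("\n".join(lines) + "\n")
    return True


if __name__ == "__main__":
    # "artifacts/" is an assumed directory name.
    changed = ensure_ignored("artifacts/")
    print("updated .gitignore" if changed else "already ignored")
```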

Data Ingestion

data ingestion.png

Imagine you're a chef preparing a delicious dish. This data pipeline is like your kitchen helpers and recipe steps, transforming raw ingredients (data) into a tasty meal (insights)!
  1. Gathering Ingredients:
  • We have various suppliers (data sources) like text files, databases, or even zipped bags (zip files) full of data.
  • We first unpack any zipped bags in the "unzip_and_clean" step to make sure everything is accessible.
  • Then, we set up our kitchen workstations (new directories) to keep things organized.
  2. Preparing the Feast:
  • This is where the real cooking starts! We clean and chop the ingredients (data preparation).
  • This might involve removing unwanted parts, cutting them into smaller pieces (e.g., words from sentences), or even bringing in special seasoning (pretrained model weights, if needed).
  • We follow a specific recipe (data ingestion) that involves tasks like transforming the data into formats our model understands and maybe even cooking with pre-trained flavors (model loading).
  3. Serving the Delicious Outcome:
  • Finally, the finished dish (processed data)! This could be predictions from our model, like predicting sentiment, or neatly prepared data files ready for further analysis.
  • We serve this tasty output to hungry customers (models or analysts) who can use it to make informed decisions or create even more insights.
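
The unpacking part of the first step can be sketched roughly as follows. Only the step name unzip_and_clean comes from the description above. The function signature, the directory layout, and the reading of "clean" as "optionally delete the archive after extraction" are assumptions, not the repository's actual implementation.

```python
import zipfile
from pathlib import Path


def unzip_and_clean(zip_path, extract_dir, remove_archive=False):
    """Extract zip_path into extract_dir and return the extracted files.

    extract_dir is created if it does not exist (the "kitchen workstation").
    remove_archive is an assumed interpretation of the "clean" part.
    """
    zip_path = Path(zip_path)
    target = Path(extract_dir)
    target.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)

    if remove_archive:
        zip_path.unlink()

    # Return every extracted file, sorted for a stable order.
    return sorted(p for p in target.rglob("*") if p.is_file())
```

A call such as unzip_and_clean("dataset.zip", "artifacts/data_ingestion") would leave the archive in place and return the paths of the unpacked files. Both names in that call are placeholders.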

Remember:

  • The order of some steps might be flexible depending on our recipe (implementation).
  • Some details, such as specific cleaning techniques or seasoning ingredients, might be hidden, but the overall flow remains the same.

Data Transformation

data transformation.png
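
The "clean and chop" idea mentioned under Data Ingestion (removing unwanted parts and cutting sentences into words) could look something like the sketch below. The specific rules here (lowercasing, dropping URLs, keeping only letters) are illustrative assumptions. The README does not state which cleaning steps the project actually applies.

```python
import re


def clean_text(text):
    """Lowercase the text and strip URLs, digits and punctuation."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # keep ASCII letters only
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace


def tokenize(text):
    """Split cleaned text into a list of words."""
    return clean_text(text).split()


if __name__ == "__main__":
    print(tokenize("Check THIS out: https://example.com !!!"))
```

Keeping only ASCII letters is a deliberately crude choice. Real social media text often needs emoji, non-English characters, and user mentions handled explicitly.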

Model Training

model training.png

Remaining Pipelines to Add: Model Evaluation and Model Push Back to GCP
