Natural Language Processing is widely used in almost every domain, and some in-depth knowledge is required before going hands-on. This repo goes through the fundamentals with code.

Manikanta-Munnangi/Natural-Language-Processing-with-spaCy

Natural-Language-Processing-with-spaCy

Natural Language Processing is widely used in almost every domain, and some in-depth knowledge is required before going hands-on. This repo provides the fundamentals of NLP using spaCy.

You can find a detailed explanation of each task in my Medium blog here.

spaCy_poster

NLP Tasks:

| Task | Description | Applications | Documentation link |
| --- | --- | --- | --- |
| 1. Tokenization | The process of splitting sentences or phrases into tokens (words, punctuation, special characters, etc.) | First step in all NLP projects | https://spacy.io/usage/linguistic-features |
| 2. Stop words | The process of removing very common words such as "a", "the", "is", "but", etc., because they add little meaning to the context | Efficient searches in search engines | https://spacy.io/usage/linguistic-features |
| 3. Lemmatization | Reducing words that belong to the same family to their root or base form | Text normalization | https://spacy.io/usage/linguistic-features |
| 4. Part-of-speech tagging | Assigning each individual word its part of speech and related grammatical properties; useful for grammar correction | Information filtering | https://spacy.io/usage/linguistic-features |
| 5. Dependency parsing | Parsing the syntactic dependencies between words, such as subject, object, verb, etc. | Sentence boundary detection and phrase chunking | https://spacy.io/usage/linguistic-features |
| 6. Chunking | Follows part-of-speech tagging and adds more structure to the sentence | Noun phrase extraction | https://spacy.io/usage/linguistic-features |
| 7. Named entity recognition | Grouping real-world objects into predefined classes such as company names, places, etc. | Text classification | https://spacy.io/usage/training#ner |
| 8. Word vector similarity | Finding the similarity of words by converting them to vector representations | Recommendation systems | https://spacy.io/usage/vectors-similarity |
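
As a minimal sketch of the first two tasks in the table, the snippet below tokenizes a sentence and then filters out stop words and punctuation. It assumes spaCy v3 is installed and uses `spacy.blank("en")`, which needs no model download. The example sentence and the helper names `tokenize` and `content_words` are illustrative and not taken from this repo.

```python
import spacy

# A blank English pipeline contains only the rule-based tokenizer,
# so no trained model has to be downloaded for this example.
nlp = spacy.blank("en")


def tokenize(text):
    """Return the text of every token, punctuation included."""
    return [token.text for token in nlp(text)]


def content_words(text):
    """Return tokens that are neither stop words nor punctuation."""
    return [
        token.text
        for token in nlp(text)
        if not token.is_stop and not token.is_punct
    ]


if __name__ == "__main__":
    sentence = "The cat is small but the dog is big."
    print(tokenize(sentence))
    print(content_words(sentence))
```

Stop-word filtering here relies on the default English stop list that ships with spaCy, so words like "the", "is", and "but" are dropped while "cat" and "dog" are kept.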

Prerequisites:

=> Python 3.
=> Latest version of spaCy.
=> All English statistical models from spaCy.
=> Knowledge of linguistics.
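
With these prerequisites in place, the model-dependent tasks from the table (lemmatization, part-of-speech tagging, dependency parsing, chunking, and named entity recognition) can be sketched as below. This is an illustration rather than the repo's own code: it assumes spaCy v3 and the small English model `en_core_web_sm`, and the helper names `load_model` and `analyze` are hypothetical. Predictions depend on the model version, so no expected output is shown.

```python
import spacy


def load_model(name="en_core_web_sm"):
    """Load a trained pipeline, or return None if it is not installed."""
    try:
        return spacy.load(name)
    except OSError:
        # spacy.load raises OSError when the model package cannot be found
        print(f"Model '{name}' not found. Try: python -m spacy download {name}")
        return None


def analyze(nlp, text):
    """Collect per-token annotations, noun chunks, and named entities."""
    doc = nlp(text)
    return {
        # (text, lemma, part of speech, dependency label, head word)
        "tokens": [
            (t.text, t.lemma_, t.pos_, t.dep_, t.head.text) for t in doc
        ],
        # Noun chunks are derived from the tagger and parser annotations
        "noun_chunks": [chunk.text for chunk in doc.noun_chunks],
        # Named entities with their predicted labels
        "entities": [(ent.text, ent.label_) for ent in doc.ents],
    }


if __name__ == "__main__":
    nlp = load_model()
    if nlp is not None:
        text = "Apple is looking at buying a U.K. startup for $1 billion."
        result = analyze(nlp, text)
        for row in result["tokens"]:
            print(row)
        print(result["noun_chunks"])
        print(result["entities"])
```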

Introduction:

1 billion gigabytes of data is generated per day. You may be thinking that is a big number. It is, because data comes from everywhere: sensors used to gather shopper information, posts to social media sites, digital pictures and videos, purchase transactions, cell phone GPS signals, and the list goes on. The question is how we process and analyze this data and convert it into something meaningful that helps us address our own needs and solve new technical problems.

Natural Language Processing helps analyze text-related problems.

1. What is Natural Language Processing?

With the help of a set of algorithms and rules, a computer is able to understand and communicate with humans across a wide range of human languages, and to scale other language-related tasks. With NLP, tasks like automated speech and automated text writing can be performed in less time. As the volume of text data keeps growing, why not use computers, which have high computing power, can work all day, and can run several algorithms to perform these tasks quickly?

NLP Pipeline:

pipeline
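
The pipeline idea can also be illustrated with spaCy itself, where a pipeline is a sequence of components that each add annotations to a `Doc` after tokenization. The sketch below assumes spaCy v3 and builds a small rule-based pipeline on top of a blank English model, so it runs without a trained model. The entity pattern and the example text are made up for illustration.

```python
import spacy

# Start from a tokenizer-only pipeline and add components one by one.
nlp = spacy.blank("en")

# Rule-based sentence splitter (splits on sentence-final punctuation).
nlp.add_pipe("sentencizer")

# Rule-based entity matcher; the single pattern below is a made-up example.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([{"label": "GPE", "pattern": "London"}])

doc = nlp("I moved to London. The city is large.")

# Components run in the order in which they were added.
print(nlp.pipe_names)
print([sent.text for sent in doc.sents])
print([(ent.text, ent.label_) for ent in doc.ents])
```

A trained model such as `en_core_web_sm` follows the same pattern, with statistical components (tagger, parser, NER) in place of the rule-based ones used here.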

Conclusions:

  1. Natural Language Processing is a subset of AI that deals with teaching a computer to understand human-level language and act upon it.
  2. It still faces challenges, both in processing and in performance. NLP comprises two subsets, Natural Language Understanding and Natural Language Generation, which help take text processing to the next level.
  3. It involves a lot of study and understanding of linguistics.
  4. A vast number of applications, such as sentiment analysis, chatbots, and virtual assistants, benefit us through NLP.
  5. By leveraging the power of data and high-end computing hardware, we can train models quickly.
