Natural Language Processing (NLP) is widely used in almost every domain, and some in-depth knowledge is required before getting hands-on.
This repo provides the fundamentals of NLP using spaCy.
You can find a detailed explanation of each task in my Medium blog here.
No. | Task | Description | Applications | Documentation link |
---|---|---|---|---|
1 | Tokenization | Converting sentences or phrases into tokens (words), including punctuation, special characters, etc. | First step in all NLP projects | https://spacy.io/usage/linguistic-features |
2 | Stop words | Removing the most commonly used words, such as "a", "the", "is", "but", etc., because they add little meaning to the context | Efficient searches in search engines | https://spacy.io/usage/linguistic-features |
3 | Lemmatization | Reducing the different forms of a word to its root or base form | Text normalization | https://spacy.io/usage/linguistic-features |
4 | Part-of-speech tagging | Assigning a part-of-speech tag to each individual word; also useful for grammatically correcting sentences | Information filtering | https://spacy.io/usage/linguistic-features |
5 | Dependency parsing | Analyzing the syntactic dependencies between words, such as subject, object, verb, etc. | Sentence boundary detection and phrase chunking | https://spacy.io/usage/linguistic-features |
6 | Chunking | Follows part-of-speech tagging and adds more structure to the sentence | Noun phrase extraction | https://spacy.io/usage/linguistic-features |
7 | Named Entity Recognition | Grouping real-world objects into predefined classes such as company names, places, etc. | Text classification | https://spacy.io/usage/training#ner |
8 | Word vector similarity | Finding the similarity of words by converting them to vector representations | Recommendation systems | https://spacy.io/usage/vectors-similarity |
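
To make the last row of the table more concrete, here is a minimal sketch of word vector similarity in plain Python. It is not code from this repo: the three-dimensional vectors are invented for illustration, whereas real word vectors come from a trained model and have many more dimensions. The measure used is cosine similarity, a common choice for comparing word vectors. The names `cosine_similarity` and `toy_vectors` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: an all-zero vector is
        # treated as having no similarity to anything.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical vectors, invented for this example only.
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(toy_vectors["cat"], toy_vectors["dog"])
cat_car = cosine_similarity(toy_vectors["cat"], toy_vectors["car"])

print(f"cat vs dog: {cat_dog:.3f}")
print(f"cat vs car: {cat_car:.3f}")
```

With these toy values, "cat" scores closer to "dog" than to "car". In spaCy, the comparable step would go through methods such as `Token.similarity`, which need a model that ships with word vectors; see the documentation link in the table for details.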
=> Python 3.
=> Latest version of spaCy.
=> All English statistical models from spaCy.
=> Knowledge of linguistics.
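
The software items in the list above can be checked from Python before running anything else. The sketch below uses only the standard library, so it runs even when spaCy is not installed yet. The model name `en_core_web_sm` is an assumption: it is one of spaCy's English models, and the list above asks for all of them, so pass other model names as needed. The helper `check_prerequisites` is a hypothetical name, not part of spaCy.

```python
import importlib.util
import sys


def check_prerequisites(model_name="en_core_web_sm"):
    """Map each requirement to True (found) or False (missing).

    spaCy models are installed as regular Python packages, so looking
    up the package name is enough to tell whether one is present.
    `model_name` defaults to an assumed example model.
    """
    return {
        "python3": sys.version_info.major >= 3,
        "spacy": importlib.util.find_spec("spacy") is not None,
        model_name: importlib.util.find_spec(model_name) is not None,
    }


status = check_prerequisites()
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

If spaCy or the model is reported as missing, `pip install -U spacy` and `python -m spacy download en_core_web_sm` are the usual commands to install them.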
1 billion gigabytes of data is generated per day. That may sound like a big number, but data comes from everywhere: sensors used to gather shopper information, posts to social media sites, digital pictures and videos, purchase transactions, cell phone GPS signals, and the list goes on. The question is how we process, analyze, and convert this data into something meaningful that addresses our own needs and helps solve new technical problems.
Natural Language Processing helps us analyze text-related problems.
With the help of a set of algorithms and rules, a computer is able to understand and communicate with humans across a wide range of human languages and to scale other language-related tasks. With NLP, tasks such as automated speech and automated text writing can be performed in less time. As the volume of text data keeps growing, why not use computers, which have high computing power, can work all day, and can run several algorithms to complete tasks quickly?
- Natural Language Processing is a subset of AI that deals with teaching a computer to understand human language and act upon it.
- It still faces challenges, both in how text is processed and in performance. NLP comprises two subsets, Natural Language Understanding and Natural Language Generation, which help take text processing to the next level.
- It involves a lot of study and understanding of linguistics.
- A vast number of applications that benefit us, such as sentiment analysis, chatbots, and virtual assistants, are built on NLP.
- By leveraging large amounts of data and high-performance computing hardware, we can train models much faster.