# NLP (MOC)
## 📓Notes
NLP, or "natural language processing", is the set of methods by which we "teach" a computer to read text. Since computers don't understand words, we have to convert text into numbers. The challenge is to convert it in a way that captures not only the meaning of each word, but also its role within the sentence and its connection to the sentence's overall meaning.
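A common way to do that is [[TF-IDF]], a method of converting text into a vector by weighting each word by how often it appears in a document and how rare it is across all documents. Those vectors (or plain word counts) can then be fed to a model such as a [[Naive Bayes classifier]], which is a classifier rather than a way of converting text.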
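A minimal sketch of the textbook TF-IDF weighting, using only the standard library. The function names (`tokenize`, `tf_idf`) and the sample documents are made up for illustration, and the whitespace tokenizer is a simplifying assumption. Libraries such as scikit-learn use smoothed variants of the formula, so their numbers will differ.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()


def tf_idf(docs):
    """Return (vocab, vectors) with one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter(t for doc in tokenized for t in set(doc))
    # Unsmoothed inverse document frequency: log(N / df).
    idf = {t: math.log(n_docs / df[t]) for t in vocab}

    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc)
        # Term frequency (share of the document) times IDF.
        vectors.append(
            [counts[t] / total * idf[t] if total else 0.0 for t in vocab]
        )
    return vocab, vectors


# Hypothetical sample documents.
docs = ["the cat sat", "the dog barked", "the cat purred"]
vocab, vectors = tf_idf(docs)
for doc, vec in zip(docs, vectors):
    print(doc, [round(x, 3) for x in vec])
```

A word that appears in every document (here "the") gets a weight of zero, while a word unique to one document gets the highest weight in that document's vector.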
### Types of Analysis
With NLP, you can:
1. [[Text Classification]] - For example, to categorize a message as "spam" or "not spam"
2. [[Text Generation]] - The basis for chat AI models, which generate text in response to a prompt
3. [[Sources/References/Sentiment Analysis]] - To analyze whether a text (perhaps a review) is positive, negative, or neutral
4. [[Topic Modeling]] - To group text by topic, for example news articles into politics, economics, etc.
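As an illustration of text classification, here is a rough sketch of a multinomial Naive Bayes spam filter with add-one (Laplace) smoothing, written with the standard library only. The helper names (`train_naive_bayes`, `predict`) and the four training messages are invented for this example. A real project would more likely use a library implementation and far more data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """samples: list of (text, label) pairs."""
    label_counts = Counter()            # documents per label
    word_counts = defaultdict(Counter)  # word counts per label
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(text, model):
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen word/label pairs above zero.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set.
samples = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday", "not spam"),
]
model = train_naive_bayes(samples)
print(predict("claim free money", model))
print(predict("agenda for lunch", model))
```

Working in log space avoids multiplying many small probabilities together, which would otherwise underflow on longer texts.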
### Techniques
Common techniques in NLP:
1. [[Named Entity Recognition]] - To detect named entities in the text, such as people, companies, and places
2. [[Regex]] - To search for matches within the text based on a special pattern. Also see [[pattern matching]]
3. [[Tokenization]] - To break down a sentence into its base components (which can then be converted into a vector)
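Tokenization and regex matching can both be sketched with Python's built-in `re` module. The function names and patterns below are illustrative assumptions: the tokenizer only keeps lowercase letters, digits, and apostrophes, which is much cruder than what NLP libraries do.

```python
import re


def tokenize(sentence):
    # Lowercase the text, then keep runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def find_years(text):
    # Match standalone four-digit numbers, e.g. years.
    return re.findall(r"\b\d{4}\b", text)


tokens = tokenize("Don't break down, please!")
years = find_years("Released in 2019, updated in 2023.")
print(tokens)  # ["don't", 'break', 'down', 'please']
print(years)   # ['2019', '2023']
```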
### 📥Unsorted Notes
```dataview
LIST FROM [[NLP (MOC)]] AND -outgoing([[NLP (MOC)]])
AND !#Type/MOC
sort file.name asc
```
## 📧Sources
### Courses
[[natural language processing course]]
[[NLP with python]]
### Websites
## 🌐Other MOC
### Overview