Text normalization is the process of preparing text for an [[natural language processing|NLP]] task. It typically includes [[sentence segmentation]], [[tokenization]], and normalizing word formats (by [[lemmatization]] or [[stemming]]).
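
The three steps named above can be illustrated with a minimal Python sketch that uses only the standard library. The function names (<code>split_sentences</code>, <code>tokenize</code>, <code>crude_stem</code>, <code>normalize</code>) are hypothetical, and the rules are deliberately naive: the sentence splitter breaks on any terminal punctuation followed by whitespace (so it would mishandle abbreviations such as "Dr."), and the suffix stripper is a toy stand-in for a real stemmer or lemmatizer, not an implementation of any published algorithm.

```python
import re


def split_sentences(text):
    # Naive sentence segmentation: break after '.', '!' or '?'
    # when followed by whitespace. Abbreviations are not handled.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    # Lowercase the sentence and keep runs of ASCII letters,
    # digits, and apostrophes; punctuation is discarded.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def crude_stem(token):
    # Toy suffix stripping (assumption: illustrative only).
    # A suffix is removed only if at least three characters remain.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalize(text):
    # Segment into sentences, tokenize each one, then stem each token.
    return [
        [crude_stem(tok) for tok in tokenize(sent)]
        for sent in split_sentences(text)
    ]


if __name__ == "__main__":
    print(normalize("The cats walked home. Dogs are barking!"))
```

In practice each step is usually delegated to a dedicated NLP library, since real text requires handling of abbreviations, clitics, irregular word forms, and language-specific rules that this sketch ignores.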