Padhai Time

Phases in Natural Language Processing

NLP is usually described as a sequence of five phases that are performed in order to extract meaningful information from a text corpus. Once these phases are complete, the refined text can be passed to a machine learning model to make predictions.

1) Lexical Analysis: In this phase, the text is broken down into paragraphs, sentences and words, and the structure of individual words is identified and described. It includes the following techniques:

  • Stop word removal (removing ‘and’, ‘of’, ‘the’ etc. from text)
  • Tokenization (breaking the text into sentences or words)
    • Word tokenizer
    • Sentence tokenizer
    • Tweet tokenizer
  • Stemming (stripping suffixes such as ‘ing’, ‘es’, ‘s’ from the end of words)
  • Lemmatization (converting words to their dictionary base forms, e.g. ‘ate’ to ‘eat’)
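
The techniques listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a production approach: the stop-word list, the suffix list and the function names (`sentence_tokenize`, `word_tokenize`, `remove_stop_words`, `naive_stem`) are assumptions made for this example, and a real project would more likely use a library such as NLTK or spaCy.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"and", "of", "the", "a", "an", "is", "are", "to", "in"}

# Suffixes stripped by the naive stemmer, longest first.
SUFFIXES = ("ing", "es", "s")


def sentence_tokenize(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def word_tokenize(sentence):
    """Return lowercase alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def remove_stop_words(tokens):
    """Drop every token that appears in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip one known suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


text = "The trucks are carrying boxes of oranges. The driver is resting."
for sentence in sentence_tokenize(text):
    tokens = remove_stop_words(word_tokenize(sentence))
    print([naive_stem(t) for t in tokens])
```

Note that this crude stemmer turns "oranges" into "orang", which is not a real word. That is the usual trade-off with stemming, and it is the reason lemmatization relies on a dictionary of base forms instead of suffix rules.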

2) Syntactic Analysis: This phase checks the grammar, the arrangement of words, and the relationships between words.

Example: This sentence does not make sense: “Truck is eating Oranges”

Hence there is a need to analyze the role of each word in a sentence. Some of the techniques used in this phase are:

  • Dependency Parsing
  • Parts of Speech (POS) tagging
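
As a rough illustration of these two techniques, the sketch below tags words by looking them up in a toy lexicon and then links the verb to its subject and object. The lexicon, the function names (`pos_tag`, `simple_dependencies`) and the nearest-noun heuristic are all assumptions made for this example; the relation labels `nsubj` and `obj` are borrowed from the Universal Dependencies scheme. Real taggers and parsers are statistical models, not lookup tables.

```python
# Toy lexicon mapping words to part-of-speech tags (an assumption for this sketch).
LEXICON = {
    "truck": "NOUN", "oranges": "NOUN", "john": "NOUN",
    "is": "AUX", "eating": "VERB", "the": "DET",
}


def pos_tag(tokens):
    """Tag each token by lexicon lookup; unknown words get the tag 'X'."""
    return [(tok, LEXICON.get(tok.lower(), "X")) for tok in tokens]


def simple_dependencies(tagged):
    """Return (head, relation, dependent) triples for the first verb.

    The nearest noun before the verb is treated as its subject and the
    first noun after the verb as its object.
    """
    verb_idx = next((i for i, (_, tag) in enumerate(tagged) if tag == "VERB"), None)
    if verb_idx is None:
        return []
    verb = tagged[verb_idx][0]
    deps = []
    for word, tag in reversed(tagged[:verb_idx]):
        if tag == "NOUN":
            deps.append((verb, "nsubj", word))
            break
    for word, tag in tagged[verb_idx + 1:]:
        if tag == "NOUN":
            deps.append((verb, "obj", word))
            break
    return deps


tagged = pos_tag("Truck is eating Oranges".split())
print(tagged)
print(simple_dependencies(tagged))
```

The output identifies "Truck" as the subject of "eating" and "Oranges" as its object, which is the structural information the later phases build on.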

3) Semantic Analysis: Once the POS tags and word dependencies have been analyzed, semantic analysis extracts the meaningful information from the text and rejects or ignores sentences that do not make sense.

Example: “Truck is eating Oranges” will be excluded from the information summary.
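
One simple way to picture this filtering is a rule stating that certain verbs need an animate subject. The sketch below is only an illustration under that assumption: the word sets `ANIMATE` and `NEEDS_ANIMATE_SUBJECT` and the function `is_meaningful` are hypothetical, and real systems derive this kind of knowledge from lexical resources or learned representations.

```python
# Hypothetical knowledge base: which nouns are animate, and which verbs
# require an animate subject.
ANIMATE = {"john", "dog", "driver"}
NEEDS_ANIMATE_SUBJECT = {"eating", "sleeping"}


def is_meaningful(subject, verb):
    """Return False when the verb needs an animate subject and the subject is not one."""
    if verb.lower() in NEEDS_ANIMATE_SUBJECT:
        return subject.lower() in ANIMATE
    return True


# (subject, verb, object) triples, e.g. taken from a dependency parse.
sentences = [
    ("Truck", "eating", "Oranges"),
    ("John", "eating", "Oranges"),
]
kept = [s for s in sentences if is_meaningful(s[0], s[1])]
print(kept)
```

Only the triple with "John" as the subject is kept; the "Truck" triple is dropped because a truck cannot eat.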

4) Discourse Integration: Its scope is not limited to a single word or sentence. Discourse integration interprets each sentence in the context of the sentences around it, so that the text is understood as a whole.

Example: "John got ready at 9 AM. Later he took the train to California"

Here, the machine should understand that the word “he” in the second sentence refers to “John”.
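
A very naive version of this pronoun resolution can be sketched as follows. It links each pronoun to the most recently mentioned name it is compatible with. The name table `KNOWN_NAMES` and the function `resolve_pronouns` are hypothetical and exist only for this example; real coreference resolution uses far richer context.

```python
import re

# Hypothetical table of person names and the pronouns that may refer to them.
KNOWN_NAMES = {
    "John": {"he", "him", "his"},
    "Mary": {"she", "her", "hers"},
}


def resolve_pronouns(text):
    """Return (pronoun, antecedent) pairs, linking each pronoun to the
    most recently mentioned compatible name."""
    resolved = []
    mentioned = []  # names seen so far, in order of appearance
    for token in re.findall(r"[A-Za-z]+", text):
        if token in KNOWN_NAMES:
            mentioned.append(token)
            continue
        pronoun = token.lower()
        for name in reversed(mentioned):
            if pronoun in KNOWN_NAMES[name]:
                resolved.append((token, name))
                break
    return resolved


text = "John got ready at 9 AM. Later he took the train to California"
print(resolve_pronouns(text))
```

For the example sentences this links "he" to "John". A pronoun that appears before any compatible name has been mentioned is left unresolved.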

5) Pragmatic Analysis: This is a complex phase in which the machine needs knowledge not only of the provided text but also of the real world. There are many scenarios where the intent of a sentence can be misunderstood if the machine lacks real-world knowledge.

Examples:

  • "Thank you for coming so late, we have wrapped up the meeting" (contains sarcasm)
  • "Can you share your screen?" (the context is sharing a computer screen during a remote meeting)
