In machine learning, Part of Speech Tagging, or POS Tagging, is a natural language processing task in which each word in a text is assigned a tag, such as noun, verb or adverb, based on its context. It helps in understanding the syntactic components of a text, which many other natural language processing tasks rely on. If you have never used Part of Speech Tagging, then this article is for you. In this article, I will take you through an introduction to Part of Speech Tagging and its implementation using Python.

Part of Speech Tagging

Part of Speech Tagging, or POS Tagging, means assigning a tag to every word of a text. The tag is chosen from the context of the word, that is, from the words around it, so the same word can receive different tags in different sentences. In the POS tagging process, a piece of text is first divided into tokens, and then a label is assigned to each token based on its context and its similarity to other tokens.
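To make the two steps concrete, here is a toy sketch in plain Python. It is only an illustration of "tokenize, then label each token using context", not how a trained tagger works: the LEXICON table, the tokenize and tag functions, and the single context rule are all hypothetical names and choices made for this example.

```python
import re

# Toy lexicon (hypothetical): each known word maps to the tags it can take.
LEXICON = {
    "i": ["PRP"],
    "will": ["MD"],
    "to": ["TO"],
    "the": ["DT"],
    "a": ["DT"],
    "read": ["VB"],
    "flight": ["NN"],
    "book": ["NN", "VB"],  # ambiguous: noun or verb
}


def tokenize(text):
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def tag(tokens):
    """Assign one tag per token, using the previous tag to resolve ambiguity."""
    tagged = []
    prev_tag = None
    for token in tokens:
        if re.fullmatch(r"[^\w\s]", token):
            choice = "."  # treat any punctuation mark as sentence punctuation
        else:
            # Unknown words default to noun in this toy setup.
            candidates = LEXICON.get(token.lower(), ["NN"])
            if "VB" in candidates and prev_tag in ("MD", "TO"):
                choice = "VB"  # after a modal or "to", prefer the verb reading
            else:
                choice = candidates[0]
        tagged.append((token, choice))
        prev_tag = choice
    return tagged


print(tag(tokenize("I will book a flight.")))
print(tag(tokenize("I read the book")))
```

The word "book" comes out as a verb in the first sentence and as a noun in the second, because the tag of the preceding token differs. Real taggers learn such patterns from annotated text instead of relying on hand-written rules.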

The NLTK library in Python includes a pre-trained tagger for English that labels tokens with the Penn Treebank tagset, a standard vocabulary of English POS tags. I hope you now understand what POS tagging in machine learning is. In the section below, I will walk you through its implementation using the Python programming language.

Part of Speech Tagging using Python

To implement Part of Speech Tagging using the Python programming language, you need to install the NLTK library in your Python virtual environment. If you've never used it before, you can easily install it using the pip command:

  • pip install nltk

Here is how POS tagging a sample sentence looks in Python:
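A minimal sketch of the tagging step, using NLTK's word_tokenize and pos_tag functions. The helper name pos_tag_text and its download parameter are hypothetical, and the sample sentence in the note below is reconstructed from the tokens in the output. The resource names that NLTK needs differ between releases, so this sketch requests both the older and the newer names, on the assumption that a name unknown to the installed release is reported as a failed download rather than raising an error.

```python
def pos_tag_text(text, download=True):
    """Tokenize text and return a list of (token, tag) pairs."""
    import nltk  # imported here so the function can be defined before NLTK is installed

    if download:
        # Tokenizer and tagger data; older and newer NLTK releases use
        # different resource names, so both sets are requested.
        for resource in (
            "punkt",
            "punkt_tab",
            "averaged_perceptron_tagger",
            "averaged_perceptron_tagger_eng",
        ):
            nltk.download(resource, quiet=True)

    tokens = nltk.word_tokenize(text)  # step 1: split the text into tokens
    return nltk.pos_tag(tokens)        # step 2: label each token with a POS tag
```

Calling print(pos_tag_text("I will move to Himachal Pradesh forever!")) prints a list of (word, tag) pairs. The output reported for this sentence is: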

[('I', 'PRP'), ('will', 'MD'), ('move', 'VB'), ('to', 'TO'), ('Himachal', 'NNP'), ('Pradesh', 'NNP'), ('forever', 'RB'), ('!', '.')]

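In the above output, you can see that each token of the sentence, including the punctuation mark, is assigned a tag. Here PRP means Personal Pronoun, MD means Modal, VB means Verb (base form), TO marks the word "to", NNP means Proper Noun, RB means Adverb, and "." marks sentence-final punctuation. All of these tags come from the Penn Treebank tagset, which lists every tag the tagger can assign.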
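When reading tagger output, it can help to keep the tag meanings in a small lookup table. The sketch below covers only the tags that appear in the example output. TAG_DESCRIPTIONS and describe are hypothetical names chosen for this illustration, and the descriptions are short paraphrases of the Penn Treebank tag meanings.

```python
# Penn Treebank tags that appear in the example output, with short descriptions.
TAG_DESCRIPTIONS = {
    "PRP": "personal pronoun",
    "MD": "modal",
    "VB": "verb, base form",
    "TO": "to",
    "NNP": "proper noun, singular",
    "RB": "adverb",
    ".": "sentence-final punctuation",
}


def describe(tagged):
    """Attach a readable description to each (word, tag) pair."""
    return [
        (word, tag, TAG_DESCRIPTIONS.get(tag, "unknown tag"))
        for word, tag in tagged
    ]


# The tagged sentence from the output above.
tagged = [('I', 'PRP'), ('will', 'MD'), ('move', 'VB'), ('to', 'TO'),
          ('Himachal', 'NNP'), ('Pradesh', 'NNP'), ('forever', 'RB'), ('!', '.')]

for word, tag, meaning in describe(tagged):
    print(f"{word:10} {tag:4} {meaning}")
```

NLTK can also print tag descriptions itself through nltk.help.upenn_tagset(), which needs its tagset resource to be downloaded first.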

Summary

Part of Speech Tagging, or POS Tagging, means assigning a tag to every word of a text, based on the context of the word and the words associated with it. I hope you liked this article on POS Tagging in machine learning and its implementation using Python. Feel free to ask your questions in the comments section below.