Why does NLP need POS tagging?
In computational linguistics, "tagging" generally means annotating corpora with linguistic information. In a narrower sense, it means automatic part-of-speech tagging, in which a computer program assigns each word of a corpus its part of speech. For example, the German sentence "Er liest das Buch, das sie ihm empfohlen hat." ("He is reading the book that she recommended to him.") is annotated as follows:
- Er/PPER liest/VVFIN das/ART Buch/NN ,/$, das/PRELS sie/PPER ihm/PPER empfohlen/VVPP hat/VAFIN ./$.
The inventory of part-of-speech labels used is called a "tagset". Depending on how fine-grained it is and which morphosyntactic information (number, gender, case, tense, etc.) it encodes, a tagset can contain anywhere from about 15 to over a thousand part-of-speech tags. The example above uses the STTS tagset (Stuttgart-Tübingen-Tagset).
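The word/TAG notation is easy to process by program. The following is a minimal sketch, assuming a whitespace-separated line in which the tag follows the last slash of each item; the function name `parse_tagged` and the STTS-tagged German sentence are illustrative assumptions, not part of any particular tagger.

```python
from collections import Counter


def parse_tagged(line: str) -> list[tuple[str, str]]:
    """Split 'word/TAG word/TAG ...' into (word, tag) pairs."""
    pairs = []
    for item in line.split():
        # Split on the LAST slash so that a token containing '/' stays intact.
        word, tag = item.rsplit("/", 1)
        pairs.append((word, tag))
    return pairs


# Illustrative STTS-tagged sentence (assumed example data).
tagged = ("Er/PPER liest/VVFIN das/ART Buch/NN ,/$, das/PRELS "
          "sie/PPER ihm/PPER empfohlen/VVPP hat/VAFIN ./$.")

pairs = parse_tagged(tagged)
tag_counts = Counter(tag for _, tag in pairs)  # how often each tag occurs
tagset = sorted(tag_counts)                    # distinct tags seen in this line

print(pairs)
print(tagset)
```

The tags observed in one sentence are only a small part of the full tagset; collecting them over a whole annotated corpus gives the inventory actually in use.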
Part-of-speech tagging is important for many applications, including information extraction, speech synthesis, machine translation, and parsing.
Part-of-speech taggers can be classified as follows:
- rule-based taggers
  - manually created rules (Constraint Grammar)
  - automatically learned rules (Brill Tagger)
- statistical taggers
  - based on Hidden Markov models (TnT, TreeTagger, HunPos)
  - based on support vector machines (SVMTool)
  - based on maximum entropy models (MXPOST, Stanford Tagger)
  - based on neural networks (Morce)
All systems except those based on manually created rules require a training corpus that has been manually annotated with parts of speech. The main difficulties in tagging are disambiguating words that can belong to more than one part of speech and assigning tags to unknown words, that is, words that do not occur in the training corpus.
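To make the role of the annotated training corpus concrete, here is a minimal sketch of a bigram Hidden Markov model tagger with Viterbi decoding. It is not the implementation of TnT or any other tagger named above. The four-sentence toy corpus, the coarse tag names, and the add-one smoothing (which gives unknown words a small probability under every tag) are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy hand-annotated training corpus (assumed data, coarse tags).
CORPUS = [
    [("the", "DET"), ("book", "NOUN"), ("is", "VERB"), ("good", "ADJ")],
    [("they", "PRON"), ("book", "VERB"), ("the", "DET"), ("flight", "NOUN")],
    [("she", "PRON"), ("reads", "VERB"), ("the", "DET"), ("book", "NOUN")],
    [("they", "PRON"), ("read", "VERB"), ("a", "DET"), ("book", "NOUN")],
]


def train(corpus):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)  # trans[previous_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    vocab = set()
    for sentence in corpus:
        prev = "<s>"  # sentence-start pseudo tag
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word.lower()] += 1
            vocab.add(word.lower())
            prev = tag
    return trans, emit, sorted(emit), vocab


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability of `key` under `counter`."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(words, model):
    """Return the most probable tag sequence for `words`."""
    trans, emit, tags, vocab = model
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra outcome reserved for unknown words
    words = [w.lower() for w in words]

    scores = {t: log_prob(trans["<s>"], t, n_tags)
                 + log_prob(emit[t], words[0], n_words) for t in tags}
    backpointers = []
    for word in words[1:]:
        new_scores, pointer = {}, {}
        for t in tags:
            # Best previous tag for reaching tag t at this position.
            best_prev = max(
                tags, key=lambda p: scores[p] + log_prob(trans[p], t, n_tags))
            new_scores[t] = (scores[best_prev]
                             + log_prob(trans[best_prev], t, n_tags)
                             + log_prob(emit[t], word, n_words))
            pointer[t] = best_prev
        scores = new_scores
        backpointers.append(pointer)

    # Follow the backpointers from the best final tag.
    path = [max(tags, key=scores.get)]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    return path[::-1]


model = train(CORPUS)
print(viterbi("they book the hotel".split(), model))
print(viterbi("the book is good".split(), model))
```

On this toy corpus the sketch tags "book" as VERB after a pronoun and as NOUN after a determiner, and it tags the unseen word "hotel" as NOUN because a noun is the most likely continuation after a determiner. Real taggers use much larger corpora and better unknown-word models, for example ones based on word suffixes.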
Some part-of-speech taggers themselves split the input text into individual words, punctuation marks, brackets, and so on. This step is called "tokenization". Other taggers expect input that has already been tokenized. Some taggers (such as the TreeTagger) output the lemma of each word in addition to its part of speech.
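As a rough illustration of tokenization, the sketch below splits text into word tokens and single punctuation or bracket tokens with one regular expression. The pattern and the name `tokenize` are assumptions for this example; the tokenizers shipped with real taggers also handle abbreviations, numbers, URLs, and language-specific cases.

```python
import re

# A word (optionally with internal hyphens or apostrophes),
# or any single character that is neither a word character nor whitespace.
TOKEN_RE = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]")


def tokenize(text: str) -> list[str]:
    """Split raw text into word, punctuation, and bracket tokens."""
    return TOKEN_RE.findall(text)


print(tokenize("Er liest das Buch, das sie ihm empfohlen hat."))
print(tokenize("He's reading the book (the one she recommended)."))
```

The resulting token list, one token per position, is the kind of input that a tagger expecting pre-tokenized text would take.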
Brants, Thorsten. 2000. TnT - A Statistical Part-of-Speech Tagger. Proceedings of the 6th Applied Natural Language Processing Conference.
Giménez, J., and Márquez, L. 2004. SVMTool: A general POS tagger generator based on Support Vector Machines. Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC'04). Lisbon, Portugal.
Ratnaparkhi, Adwait. 1996. A Maximum Entropy Model for Part-Of-Speech Tagging. Proceedings of the Empirical Methods in Natural Language Processing Conference (EMNLP), University of Pennsylvania.
Toutanova, K., Klein, D., Manning, C.D., and Singer, Y. 2003. Feature-rich part-of-speech tagging with a cyclic dependency network. Proceedings of HLT-NAACL 2003, pages 252-259.
Spoustová, Drahomíra "Johanka", Jan Hajič, Jan Raab, and Miroslav Spousta. 2009. Semi-supervised Training for the Averaged Perceptron POS Tagger. Proceedings of the 12th Conference of the EACL, pages 763-771.
Manning, Christopher D. 2011. Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics? In Alexander Gelbukh (ed.), Computational Linguistics and Intelligent Text Processing, 12th International Conference, CICLing 2011, Proceedings, Part I. Lecture Notes in Computer Science 6608, pages 171-189. Springer.
Supervised by: Helmut Schmid, IMS, Uni Stuttgart