Pravin Paratey

Natural Language Processing, Data mining and Information Extraction consultant based in London.

Aug 06 2007

Document Tagger

DocTagger lets you automatically classify text documents. Use this as a starting point to write apps that can sort through volumes of unorganized data.

Try it!

Enter some text (300 words max) about any topic and hit Analyze to watch the tagger in action.


How it works

In short, the tagger works in five steps:

  1. POS-tag the document.
  2. Remove stopwords.
  3. Construct a synset map.
  4. Analyze hypernymy relations.
  5. Output the synsets with the highest score(s).
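
The five steps above can be sketched roughly as follows. This is a toy illustration, not DocTagger's actual code: the small noun lexicon stands in for a real POS tagger, the hand-written HYPERNYMS table stands in for WordNet, and the decay-based scoring (along with the names tag_document, hypernym_chain and decay) is an assumption made only for this example.

```python
import re
from collections import Counter

# Hypothetical toy data standing in for a real POS tagger and WordNet.
STOPWORDS = {"the", "a", "an", "and", "is", "are", "of", "in", "on", "my"}
NOUNS = {"dog", "cat", "sparrow", "car", "truck", "park"}
HYPERNYMS = {
    "dog": "mammal", "cat": "mammal", "mammal": "animal",
    "sparrow": "bird", "bird": "animal",
    "car": "vehicle", "truck": "vehicle",
}


def hypernym_chain(word):
    """Return the ancestors of `word`, nearest first."""
    chain = []
    while word in HYPERNYMS:
        word = HYPERNYMS[word]
        chain.append(word)
    return chain


def tag_document(text, top_n=1, decay=0.8):
    """Return the `top_n` highest-scoring concepts for `text`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # Steps 1-2: keep only known nouns (toy "POS tagging"), drop stopwords.
    nouns = [t for t in tokens if t in NOUNS and t not in STOPWORDS]
    # Step 3: map each concept to the number of times it occurs.
    counts = Counter(nouns)
    # Step 4: push each count up the hypernym chain, shrinking it by
    # `decay` per level (an assumed scoring scheme), so concepts shared
    # by several words accumulate score.
    scores = Counter()
    for word, n in counts.items():
        scores[word] += n
        weight = float(n)
        for ancestor in hypernym_chain(word):
            weight *= decay
            scores[ancestor] += weight
    # Step 5: output the concepts with the highest scores.
    return [concept for concept, _ in scores.most_common(top_n)]


print(tag_document("The dog and the cat chased a sparrow in the park"))
# prints ['animal']
```

In this sketch "animal" wins because three of the nouns share it as an ancestor, even though the word itself never appears in the text. A real implementation would also have to deal with words that belong to several synsets, which this toy table ignores.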

To learn more, you can read my presentation titled Text Classification using Wordnet.

Troubleshooting

If you encounter any issues or would like to give me feedback, email me at pravinp -at- gmail -dot- com.
