Aug 05, 2007

Document Tagger

DocTagger lets you automatically classify text documents. Use this as a starting point to write apps that can sort through volumes of unorganized data.

Try it!

Enter some text (300 words max) about any topic and hit Analyze to watch the tagger in action.


How it works

In short, the tagger works in five steps:

  1. POS-tag the document.
  2. Remove stopwords.
  3. Construct a synset map.
  4. Analyze hypernymy relations.
  5. Output the synsets with the highest score(s).
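
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not DocTagger's actual code: the tiny noun lexicon, stopword list, synset table and hypernym table are hypothetical stand-ins for a real POS tagger and WordNet, and the scoring rule (each synset passes a decayed share of its count up to its hypernyms) is my assumption, since the steps above do not say how scores are computed.

```python
from collections import Counter

# Toy resources standing in for a real POS tagger, stopword list and WordNet.
# Every entry here is an illustrative assumption, not data from DocTagger.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "on", "is", "are", "to"}
NOUNS = {"dog", "cat", "sparrow", "car", "truck"}
SYNSETS = {  # word -> synset id
    "dog": "dog.n.01", "cat": "cat.n.01", "sparrow": "sparrow.n.01",
    "car": "car.n.01", "truck": "truck.n.01",
}
HYPERNYMS = {  # synset id -> parent synset id
    "dog.n.01": "mammal.n.01", "cat.n.01": "mammal.n.01",
    "sparrow.n.01": "bird.n.01",
    "mammal.n.01": "animal.n.01", "bird.n.01": "animal.n.01",
    "car.n.01": "vehicle.n.01", "truck.n.01": "vehicle.n.01",
}


def tag_document(text, top_n=1, decay=0.75):
    """Return the top_n (synset, score) pairs for a piece of text."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    tokens = [t for t in tokens if t]

    # 1. POS-tag the document (toy version: a lexicon lookup).
    tagged = [(t, "NOUN" if t in NOUNS else "OTHER") for t in tokens]

    # 2. Remove stopwords.
    kept = [(t, tag) for t, tag in tagged if t not in STOPWORDS]

    # 3. Construct the synset map: synset id -> number of occurrences.
    synset_counts = Counter(SYNSETS[t] for t, tag in kept if tag == "NOUN")

    # 4. Analyze hypernymy: walk up each hypernym chain, adding a decayed
    #    share of the count to every ancestor (assumed scoring rule).
    scores = Counter()
    for synset, count in synset_counts.items():
        node, weight = synset, float(count)
        while node is not None:
            scores[node] += weight
            node = HYPERNYMS.get(node)
            weight *= decay

    # 5. Output the synsets with the highest score(s).
    return scores.most_common(top_n)


print(tag_document("The dog chased the cat."))
```

With this toy scoring, a text that mentions both a dog and a cat is tagged with their shared hypernym rather than with either word alone. A real version would swap the lookup tables for a proper POS tagger and a WordNet interface.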

To learn more, read my presentation, "Text Classification using Wordnet".

Troubleshooting

If you encounter any issues or would like to give me feedback, email me at
pravinp -at- gmail -dot- com

One Response

#1

diego

February 1st, 2010 at 3:23 pm

Hi, where can I download the code for Document Tagger? Or PHP classes for Natural Language Processing?
Thanks

This page and its contents are copyright © 2010, Pravin Paratey.