Ben Chuanlong Du's Blog

It is never too late to learn.

Natural Language Processing Using NLTK

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

nltk.util.ngrams, nltk.bigrams, nltk.PorterStemmer

from nltk.util import ngrams
sentence = 'this is a foo bar sentences and i want to ngramize it'
n = 6
sixgrams = ngrams(sentence.split …
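
To make the idea behind the excerpt above concrete, here is a minimal standard-library sketch of n-gram extraction. It is not NLTK's implementation; the helper name `simple_ngrams` is made up for illustration, and it assumes whitespace tokenization with no padding, which should roughly match what `nltk.util.ngrams` yields with its default arguments.

```python
def simple_ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples (no padding).

    Hypothetical helper for illustration only; not part of NLTK.
    """
    tokens = list(tokens)
    # Zip the token list against n shifted copies of itself;
    # zip stops at the shortest copy, so only complete n-grams survive.
    return list(zip(*(tokens[i:] for i in range(n))))


sentence = 'this is a foo bar sentences and i want to ngramize it'
tokens = sentence.split()

sixgrams = simple_ngrams(tokens, 6)
bigrams = simple_ngrams(tokens, 2)

for gram in sixgrams:
    print(gram)
```

A sentence with k tokens produces k - n + 1 n-grams, and none at all when k is smaller than n.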

Extracting Keywords from Text

Word Stemming

  1. Use an existing stemmer, such as NLTK's PorterStemmer.

  2. Expand contractions and abbreviations before stemming: didn't -> did not, there's -> there is, etc.; Mr. -> Mister, Mrs. -> ..., Ms. -> ...
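
The expansion step in item 2 could be sketched as a lookup table applied with a regular expression. This is only a rough illustration: the names `EXPANSIONS` and `expand` are hypothetical, the table holds just the three pairs spelled out above, and matching is case-sensitive. Some contractions are ambiguous ("there's" can also mean "there has"), so a real table would need more care.

```python
import re

# Assumed mapping, limited to the pairs listed in the notes above.
EXPANSIONS = {
    "didn't": "did not",
    "there's": "there is",
    "Mr.": "Mister",
}

# Try longer keys first and require that a match is not embedded
# in a longer word on either side.
_PATTERN = re.compile(
    r"(?<!\w)(?:"
    + "|".join(re.escape(key) for key in sorted(EXPANSIONS, key=len, reverse=True))
    + r")(?!\w)"
)


def expand(text):
    """Replace every known contraction or abbreviation in text."""
    return _PATTERN.sub(lambda match: EXPANSIONS[match.group(0)], text)


example = expand("Mr. Smith didn't say there's a problem")
print(example)
```

Running the expansion before stemming means the stemmer sees ordinary words rather than tokens containing apostrophes or periods.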

Other things

  1. it seems that it is hard to get …