Ben Chuanlong Du's Blog

It is never too late to learn.

Natural Language Processing Using NLTK

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

nltk.util.ngrams, nltk.bigrams, nltk.PorterStemmer

from nltk.util import ngrams
sentence = 'this is a foo bar sentences and i want to ngramize it'
n = 6
sixgrams = ngrams(sentence.split …
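To make the idea concrete without depending on NLTK, here is a minimal pure-Python sketch of n-gram extraction. The helper name `simple_ngrams` is hypothetical and is not part of NLTK; it only mimics the sliding-window behaviour one would expect from `nltk.util.ngrams` and returns a list of tuples.

```python
def simple_ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple (hypothetical helper)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = 'this is a foo bar sentences and i want to ngramize it'
tokens = sentence.split()  # whitespace tokenization, 12 tokens

sixgrams = simple_ngrams(tokens, 6)
bigrams = simple_ngrams(tokens, 2)

for gram in sixgrams:
    print(gram)
```

A sequence of k tokens yields k - n + 1 n-grams, so asking for n larger than the number of tokens simply gives an empty list.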

Time Series Analysis

In statistics, a unit root test checks whether a time series variable is non-stationary (that is, has a unit root) using an autoregressive model. A well-known test that is valid in large samples is the augmented Dickey …
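As a rough illustration of how such a test works, the sketch below computes the statistic of the plain (non-augmented) Dickey-Fuller regression dy_t = alpha + gamma * y_{t-1} + e_t in pure Python. This is a simplification: it has no lagged difference terms, and the function names and simulated series are assumptions made for the example. In practice one would use a library implementation rather than this code.

```python
import math
import random


def dickey_fuller_stat(y):
    """t-statistic of gamma in dy_t = alpha + gamma * y_{t-1} + e_t (OLS)."""
    x = y[:-1]                                    # lagged levels y_{t-1}
    d = [b - a for a, b in zip(y[:-1], y[1:])]    # first differences dy_t
    n = len(d)
    mean_x = sum(x) / n
    mean_d = sum(d) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxd = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    gamma = sxd / sxx
    alpha = mean_d - gamma * mean_x
    ssr = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, d))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / std_err


def simulate_ar1(phi, n, rng):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal noise."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


rng = random.Random(0)
random_walk = simulate_ar1(1.0, 500, rng)   # has a unit root
stationary = simulate_ar1(0.5, 500, rng)    # no unit root

rw_stat = dickey_fuller_stat(random_walk)
ar_stat = dickey_fuller_stat(stationary)
print("random walk:", rw_stat)
print("stationary AR(1):", ar_stat)
```

The statistic does not follow the usual t distribution under the null hypothesis of a unit root, so it has to be compared with Dickey-Fuller critical values (roughly -2.86 at the 5% level for the regression with a constant). A value far below that, as expected for the stationary series, rejects the unit root.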

Keywords Extracting from Text

Word Stemming

  1. Use an existing stemmer such as nltk.PorterStemmer.

  2. Expand contractions and abbreviations: didn't -> did not, there's -> there is, Mr. -> Mister, Mrs. -> ..., Ms. -> ..., etc.
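The expansion step in the second item can be sketched with a small lookup table and one regular expression. The table below contains only the mappings spelled out above; the names `EXPANSIONS` and `expand` are hypothetical, and a real table would need many more entries and some thought about case handling.

```python
import re

# Only the mappings given in the notes; extend as needed.
EXPANSIONS = {
    "didn't": "did not",
    "there's": "there is",
    "Mr.": "Mister",
}

# Longest keys first so that a longer key wins over a shorter one sharing
# the same prefix; the lookbehind avoids matching in the middle of a word.
_PATTERN = re.compile(
    r"(?<!\w)(?:"
    + "|".join(re.escape(key) for key in sorted(EXPANSIONS, key=len, reverse=True))
    + r")"
)


def expand(text):
    """Replace every known contraction or abbreviation with its expansion."""
    return _PATTERN.sub(lambda match: EXPANSIONS[match.group(0)], text)


expanded = expand("Mr. Smith didn't say there's a problem.")
print(expanded)
```

Running the expansion before stemming keeps the stemmer from seeing fragments such as "didn" and "t" as separate tokens.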

Other things

  1. it seems that it is hard to get …

Make Your Model Training Reproducible in PyTorch

The Reproducibility page of the PyTorch documentation gives detailed instructions on how to make model training reproducible. Basically, you need the following code.

torch.manual_seed(args.seed)
np.random.seed(args.seed …
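For reference, the seeding calls are often collected into a single helper. The sketch below is one possible arrangement, not the code from the PyTorch documentation: the name `seed_everything` is hypothetical, and NumPy and PyTorch are seeded only if they are installed so that the snippet also runs without them.

```python
import random


def seed_everything(seed):
    """Seed the random number generators commonly involved in training."""
    random.seed(seed)                      # Python's built-in RNG
    try:
        import numpy as np
        np.random.seed(seed)               # NumPy's global RNG
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)            # PyTorch RNGs
        # Trade speed for determinism in cuDNN convolution algorithms.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
print(first == second)
```

Seeding alone may not be enough for bit-identical results across different hardware or library versions, so treat this as a starting point.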

Save and Load PyTorch Models

  1. PyTorch uses pickle to serialize and deserialize objects.

  2. The PyTorch convention is to use the file extension .pt or .pth for saving a model (or its parameters) and use the file extension …
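Since the first point says serialization goes through pickle, a standard-library round trip shows the mechanism without needing PyTorch. The `state` dictionary below is a made-up stand-in for a model's parameters (a real state_dict maps names to tensors), and the in-memory buffer replaces a .pt file on disk.

```python
import io
import pickle

# Hypothetical stand-in for model parameters; real values would be tensors.
state = {
    "linear.weight": [[0.1, -0.2], [0.3, 0.4]],
    "linear.bias": [0.0, 0.5],
}

buffer = io.BytesIO()          # plays the role of a file such as model.pt
pickle.dump(state, buffer)     # serialize, analogous to torch.save(obj, f)
buffer.seek(0)
restored = pickle.load(buffer) # deserialize, analogous to torch.load(f)

print(restored == state)
```

With PyTorch itself, the usual pattern is torch.save(model.state_dict(), path) followed by model.load_state_dict(torch.load(path)). Because unpickling can execute arbitrary code, only load files from sources you trust.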

Tips on Deep Graph Learning

https://github.com/dmlc/dgl