Ben Chuanlong Du's Blog

It is never too late to learn.

Understand Attention in NLP

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/

https://medium.com/@joealato/attention-in-nlp-734c6fa9d983
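
As a minimal sketch of the core idea behind these articles (not code taken from them), scaled dot-product attention computes a weighted average of value vectors, where the weights come from query-key similarity:

```python
import torch
import torch.nn.functional as F


def dot_product_attention(query, key, value):
    """Scaled dot-product attention (a minimal sketch).

    query: (batch, n_q, d); key and value: (batch, n_k, d).
    """
    d = query.size(-1)
    # Similarity between each query and each key, scaled by sqrt(d).
    scores = query @ key.transpose(-2, -1) / d**0.5
    # Normalize the scores into attention weights over the keys.
    weights = F.softmax(scores, dim=-1)
    # Each output is a weighted average of the values.
    return weights @ value


# Toy usage with random tensors.
q = torch.randn(1, 3, 8)
k = torch.randn(1, 5, 8)
v = torch.randn(1, 5, 8)
print(dot_product_attention(q, k, v).shape)  # torch.Size([1, 3, 8])
```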

Tips on Word2Vec

Word2Vec

https://code.google.com/archive/p/word2vec/

Hierarchical Softmax

Negative Sampling

The Google word2vec documentation claims that hierarchical softmax is better for infrequent words, while negative sampling is better for frequent words …
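
As an illustration, gensim's Word2Vec (one common implementation; the `hs` and `negative` parameters below are gensim's, assuming gensim >= 4, not options of Google's original C tool) lets you choose either training objective:

```python
from gensim.models import Word2Vec

sentences = [
    ["the", "quick", "brown", "fox"],
    ["jumps", "over", "the", "lazy", "dog"],
]

# Hierarchical softmax: hs=1, and negative=0 disables negative sampling.
model_hs = Word2Vec(sentences, vector_size=10, min_count=1, hs=1, negative=0)

# Negative sampling: hs=0 (the default) with 5 noise words per example.
model_ns = Word2Vec(sentences, vector_size=10, min_count=1, hs=0, negative=5)

print(model_hs.wv["fox"].shape)  # (10,)
```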

Compression of Deep Learning Models

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

MobileNet

1. Network Pruning

Network pruning: applied when the network weights are …
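
As a concrete sketch of the pruning idea (using PyTorch's torch.nn.utils.prune as one possible implementation, not the exact method of the Deep Compression paper), the connections with the smallest weights can be zeroed out:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(100, 50)

# Zero out the 30% of weights with the smallest absolute value (L1 pruning).
prune.l1_unstructured(layer, name="weight", amount=0.3)

# The mask is applied on the fly; about 30% of the weights are now zero.
sparsity = (layer.weight == 0).float().mean().item()
print(f"sparsity: {sparsity:.2f}")

# Make the pruning permanent by removing the reparametrization.
prune.remove(layer, "weight")
```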

Difference Between torch.nn.Module and torch.nn.functional

Modules in torch.nn are implemented internally on top of torch.nn.functional. Modules in torch.nn are easier to use, while torch.nn.functional is more flexible. It is recommended to …
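
A minimal sketch of the two styles (the shapes and layer choices below are arbitrary):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(2, 8)

# Module style: the layer owns and registers its parameters.
linear = nn.Linear(8, 4)
y1 = F.relu(linear(x))

# Functional style: you manage the parameters yourself.
weight = torch.randn(4, 8, requires_grad=True)
bias = torch.zeros(4, requires_grad=True)
y2 = F.relu(F.linear(x, weight, bias))

print(y1.shape, y2.shape)  # torch.Size([2, 4]) torch.Size([2, 4])
```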

Log Softmax vs Softmax

The difference between Log Softmax and Softmax should be understood together with the loss function.
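
For example, in PyTorch, applying log softmax followed by the negative log-likelihood loss is equivalent to applying the cross-entropy loss directly to the raw logits (a small sketch with random data):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(3, 5)        # raw scores for 3 samples, 5 classes
target = torch.tensor([1, 0, 4])  # true class indices

# log_softmax + NLLLoss ...
loss1 = F.nll_loss(F.log_softmax(logits, dim=1), target)

# ... equals CrossEntropyLoss on the raw logits.
loss2 = F.cross_entropy(logits, target)

print(torch.allclose(loss1, loss2))  # True
```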

References

https://discuss.pytorch.org/t/what-is-the-difference-between-log-softmax-and-softmax/11801

https://discuss.pytorch.org/t/logsoftmax-vs-softmax/21386

https …