Ben Chuanlong Du's Blog

It is never too late to learn.

Tips on Recommendation Systems

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

User-based Collaborative Filtering (memory-based)

Item-based Collaborative Filtering (also memory-based; not to be confused with content-based filtering, which recommends items using item features rather than user-item interactions)

Non-negative Matrix Factorization

Neural Matrix Factorization

Variational Autoencoder

Hybrid

Newer methods such as autoencoders (AE), variational autoencoders (VAE), and deep collaborative filtering outperform classical methods like NMF on …
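As a concrete illustration of the classical baseline, below is a minimal sketch of non-negative matrix factorization on a toy user-item ratings matrix, using the Lee-Seung multiplicative update rules with NumPy. The ratings matrix and rank are made up for illustration; a real system would mask unobserved entries rather than treat them as zeros.

```python
import numpy as np

def nmf(R, k=2, n_iter=200, eps=1e-9):
    """Factor a non-negative matrix R ~ W @ H using Lee-Seung
    multiplicative updates for the Frobenius-norm loss."""
    rng = np.random.default_rng(0)
    m, n = R.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ R) / (W.T @ W @ H + eps)
        W *= (R @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy user x item ratings (zeros treated as observed for simplicity).
R = np.array([[5.0, 4.0, 0.0, 1.0],
              [4.0, 5.0, 1.0, 0.0],
              [1.0, 0.0, 5.0, 4.0],
              [0.0, 1.0, 4.0, 5.0]])
W, H = nmf(R, k=2)
R_hat = W @ H  # reconstructed scores, usable as rating predictions
```

The learned rows of `W` act as latent user preferences and the columns of `H` as latent item profiles; their product scores every user-item pair, including unobserved ones.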

Activation Functions in Neural Network


GELU

GELU is currently among the best-performing activation functions (at least in NLP, where it is used in models such as BERT and GPT).

$$ \operatorname{GELU}(x) = x \Phi(x), $$

where \(\Phi(x)\) is the cumulative distribution function of the standard normal distribution.
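The exact form above can be computed directly from the error function, since \(\Phi(x) = \tfrac{1}{2}(1 + \operatorname{erf}(x/\sqrt{2}))\). A minimal sketch:

```python
import math

def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF,
    expressed via the error function erf."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
```

For small negative inputs GELU is slightly negative (unlike ReLU), and for large positive inputs it approaches the identity.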

ReLU …

Tips on Transformer in NLP


http://nlp.seas.harvard.edu/2018/04/03/attention.html

https://blog.floydhub.com/the-transformer-in-pytorch/

http://jalammar.github.io/illustrated-transformer/

https://towardsdatascience.com/transformers-141e32e69591

Understand Attention in NLP


http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/

https://medium.com/@joealato/attention-in-nlp-734c6fa9d983
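The core operation the linked articles explain is scaled dot-product attention, \(\operatorname{softmax}(QK^\top/\sqrt{d_k})V\). A minimal NumPy sketch (single head, no masking):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Return softmax(Q K^T / sqrt(d_k)) V and the attention weights."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)  # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ V, weights
```

Each output row is a convex combination of the value vectors, weighted by how well the corresponding query matches each key.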

Tips on Word2Vec


Word2Vec

https://code.google.com/archive/p/word2vec/

Hierarchical Softmax

Negative Sampling

The Google Word2Vec documentation claims that hierarchical softmax works better for infrequent words, while negative sampling works better for frequent words …