Ben Chuanlong Du's Blog

It is never too late to learn.

Subword Algorithms for NLP

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Classic word representations cannot handle unseen or rare words well. Character embeddings are one solution to the out-of-vocabulary (OOV) problem. However, they may be too fine-grained and miss some …
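
To make the OOV problem concrete, here is a minimal sketch in plain Python. The toy corpus, the `<unk>` token, and the helper names `build_vocab` and `encode` are all made up for illustration and do not come from any particular library. It compares a word-level vocabulary with a character-level one on a word that never appeared in the training text.

```python
# Toy illustration only: the corpus, the <unk> token and the helper names
# are assumptions made for this sketch, not part of any real tokenizer.
UNK = "<unk>"


def build_vocab(tokens):
    """Map each distinct token to an integer id, reserving id 0 for unknowns."""
    vocab = {UNK: 0}
    for tok in tokens:
        # setdefault only assigns a new id the first time a token is seen
        vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Look up each token, falling back to the <unk> id when it is missing."""
    return [vocab.get(tok, vocab[UNK]) for tok in tokens]


train_words = "the cat sat on the mat".split()
word_vocab = build_vocab(train_words)
char_vocab = build_vocab(ch for w in train_words for ch in w)

unseen = "hats"  # never appears as a whole word in the training text
word_ids = encode([unseen], word_vocab)       # the whole word maps to <unk>
char_ids = encode(list(unseen), char_vocab)   # each character was seen before

print("word-level:", word_ids)
print("char-level:", char_ids)
```

In this sketch the word-level encoding collapses "hats" into the single unknown id, while the character-level encoding covers it completely, at the cost of one id per character. That longer, less meaningful sequence is one way to read the "too fine-grained" concern above.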