Ben Chuanlong Du's Blog

It is never too late to learn.

Terminologies and Concepts in NLP

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Word Embedding, Character Embedding, Subword Embedding, Tokenization

General Language Understanding Evaluation (GLUE)

Natural Language Generation (NLG): Natural Language Generation, as defined by Artificial Intelligence: Natural Language Processing Fundamentals, is the “process …

Use LightGBM With Spark


https://github.com/Azure/mmlspark/blob/master/docs/lightgbm.md

MMLSpark seems to be the best option for training LightGBM models on a Spark cluster. Note that MMLSpark requires …