Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Libraries
SentencePiece
SentencePiece is an unsupervised text tokenizer for Neural Network-based text generation.
Classic word representations cannot handle unseen or rare words well. Character embeddings are one solution to the out-of-vocabulary (OOV) problem. However, they may be too fine-grained and miss some …
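To make the middle ground between word-level and character-level units concrete, here is a minimal sketch of greedy longest-match subword segmentation. It is an illustration only: the function name `segment` and the toy vocabulary are hypothetical, and this is not the algorithm SentencePiece itself uses.

```python
def segment(word, vocab):
    """Greedy longest-match segmentation of a word into subword pieces."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest candidate first, shrinking until one is in the vocabulary.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                pieces.append(piece)
                start = end
                break
        else:
            # No known piece starts here: fall back to a single character.
            pieces.append(word[start])
            start += 1
    return pieces


# Hypothetical toy vocabulary (an assumption for illustration).
vocab = {"token", "ization", "un", "seen", "s"}

print(segment("tokenization", vocab))  # ['token', 'ization']
print(segment("unseen", vocab))        # ['un', 'seen']
print(segment("zq", vocab))            # ['z', 'q']
```

A word that is missing from the vocabulary as a whole still gets a representation built from known pieces, and only truly unknown spans fall back to single characters.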
Word Embedding, Character Embedding, Subword Embedding, Tokenization
General Language Understanding Evaluation (GLUE)
Natural Language Generation (NLG)
Natural Language Generation, as defined by Artificial Intelligence: Natural Language Processing Fundamentals, is the “process …
The method Module.__call__ runs all registered hooks and calls the method Module.forward. In short, when you train the model you should use the method forward, while when you test the …
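The difference between calling a module and calling its forward method directly can be sketched with a toy class. This is a simplified stand-in written in plain Python, not PyTorch's actual implementation: `ToyModule` is hypothetical, and the real hook signature and call order in PyTorch are richer than what is shown here.

```python
class ToyModule:
    """Toy stand-in that mimics how __call__ wraps forward with hooks."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        # Store the hook; it only fires when the module itself is called.
        self._forward_hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        out = self.forward(x)
        # Run every registered hook after the forward computation.
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out


calls = []
m = ToyModule()
m.register_forward_hook(lambda mod, inp, out: calls.append((inp, out)))

m(3)          # goes through __call__, so the hook fires
m.forward(4)  # bypasses __call__, so the hook is skipped

print(calls)  # [(3, 6)]
```

Only the first invocation is recorded, which is why code that relies on hooks needs to call the module object rather than forward.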
https://github.com/onnx
https://github.com/onnx/onnxmltools
https://github.com/jpmml/sklearn2pmml
PMML4S is a PMML (Predictive Model Markup …
https://github.com/Azure/mmlspark/blob/master/docs/lightgbm.md
MMLSpark seems to be the best option to train models using LightGBM on a Spark cluster. Note that MMLSpark requires …