Ben Chuanlong Du's Blog

It is never too late to learn.

Tips on Delta Lake

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Delta Lake

Delta Table

convert to delta [db_name.]table_name [partitioned by ...]

vacuum [db_name.]table_name [retain number hours]

describe history db_name.table_name
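A minimal sketch of the three commands above, written as small Python helpers that only build the SQL text. The helper names, the `dt STRING` partition column and the 168-hour retention are made up for illustration. Running the statements needs a Delta-enabled Spark session, which is assumed here and not created.

```python
from typing import Optional


def convert_to_delta_sql(table: str, partition_spec: Optional[str] = None) -> str:
    """Build a CONVERT TO DELTA statement for an existing Parquet table."""
    sql = f"CONVERT TO DELTA {table}"
    if partition_spec:
        # partition_spec is a "column type" list, e.g. "dt STRING"
        sql += f" PARTITIONED BY ({partition_spec})"
    return sql


def vacuum_sql(table: str, retain_hours: Optional[int] = None) -> str:
    """Build a VACUUM statement, optionally with an explicit retention window."""
    sql = f"VACUUM {table}"
    if retain_hours is not None:
        sql += f" RETAIN {retain_hours} HOURS"
    return sql


def describe_history_sql(table: str) -> str:
    """Build a DESCRIBE HISTORY statement listing the table's commits."""
    return f"DESCRIBE HISTORY {table}"


statements = [
    convert_to_delta_sql("db_name.table_name", "dt STRING"),  # hypothetical partition column
    vacuum_sql("db_name.table_name", retain_hours=168),       # 168 hours = 7 days, illustrative
    describe_history_sql("db_name.table_name"),
]
for stmt in statements:
    # With a Delta-enabled SparkSession (assumed): spark.sql(stmt)
    print(stmt)
```

The helpers interpolate table names directly, so they should only be fed trusted identifiers.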

You can select from a historical snapshot. You can also roll back to a historical snapshot: rollback …
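A rough sketch of what querying and restoring a snapshot can look like in Delta Lake SQL, again as Python helpers that only build the statement text. The helper names, the version number and the timestamp are illustrative. `RESTORE TABLE` is the rollback statement I know of in recent Delta Lake versions; it may not be the exact command the truncated note refers to.

```python
def select_version_sql(table: str, version: int) -> str:
    """Query the table as of a given commit version (time travel)."""
    return f"SELECT * FROM {table} VERSION AS OF {version}"


def select_timestamp_sql(table: str, timestamp: str) -> str:
    """Query the table as of a given timestamp (time travel)."""
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'"


def restore_version_sql(table: str, version: int) -> str:
    """Roll the table back to a given commit version."""
    return f"RESTORE TABLE {table} TO VERSION AS OF {version}"


# Version 3 and the timestamp are made-up values; real ones would come from
# the output of `describe history db_name.table_name`.
for stmt in (
    select_version_sql("db_name.table_name", 3),
    select_timestamp_sql("db_name.table_name", "2021-01-01 00:00:00"),
    restore_version_sql("db_name.table_name", 3),
):
    # With a Delta-enabled SparkSession (assumed): spark.sql(stmt)
    print(stmt)
```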

Data Engineering Tools

https://github.com/linkedin/datahub

https://engineering.linkedin.com/blog/2019/data-hub DataHub: A generalized metadata search & discovery tool

GPU Related Issues and Solutions

Tips

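  1. Training a model requires significantly more CPU/GPU memory than running inference with it.

  2. torch.cuda.empty_cache() doesn't help if there is simply not enough memory. It only releases unoccupied cached memory held by PyTorch's caching allocator; it cannot free memory that live tensors still occupy.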

  3. It is suggested that you …
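To make tip 1 a bit more concrete, here is a back-of-the-envelope sketch in plain Python. It assumes fp32 values (4 bytes each) and an Adam-style optimizer that keeps two extra buffers per parameter. The 100M-parameter model is made up, and activations are ignored entirely, so real numbers will be higher on both sides.

```python
def inference_memory_bytes(num_params: int, bytes_per_value: int = 4) -> int:
    """Memory for the weights alone, which is the floor for inference."""
    return num_params * bytes_per_value


def adam_training_memory_bytes(num_params: int, bytes_per_value: int = 4) -> int:
    """Weights + gradients + two Adam moment buffers (activations excluded)."""
    weights = num_params * bytes_per_value
    gradients = weights        # one gradient value per parameter
    adam_states = 2 * weights  # first and second moment estimates
    return weights + gradients + adam_states


num_params = 100_000_000  # hypothetical model size
inference = inference_memory_bytes(num_params)
training = adam_training_memory_bytes(num_params)
print(f"inference: {inference / 1e9:.1f} GB, training: {training / 1e9:.1f} GB")
# prints "inference: 0.4 GB, training: 1.6 GB"
```

Under these assumptions training needs about 4x the memory of the weights alone, before counting activations saved for the backward pass.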