Ben Chuanlong Du's Blog

It is never too late to learn.

Hands on the Rust Crate Parquet

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Comments

  1. Notice that a cell in a Parquet table has the type Field, which is an enum over the possible value types.

Read and Write Parquet Files in Rust

Several Rust crates can read and write Parquet files; among them, the author considers Polars the best. As a matter of fact, polars is a DataFrame …

Handling Complicated Data Types in Python and PySpark

Tips and Traps

  1. An element in a pandas DataFrame can be any (complicated) type in Python. To save a pandas DataFrame with arbitrary (complicated) types as it is, you have to use the pickle module. The method pandas.DataFrame.to_pickle (which is simply a wrapper over pickle.dump) serializes the DataFrame to a pickle file while the method pandas.read_pickle …
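A minimal sketch of the round trip described above, assuming pandas is installed. The column names, the sample values, and the temporary file name `df.pkl` are made up for illustration and do not come from the original post.

```python
import os
import tempfile
from decimal import Decimal

import pandas as pd

# Illustrative data: the "tags" and "meta" cells hold Python objects
# (sets, dicts containing Decimal) rather than plain scalars.
df = pd.DataFrame(
    {
        "id": [1, 2],
        "tags": [{"a", "b"}, {"c"}],
        "meta": [{"price": Decimal("1.50")}, {"price": Decimal("2.25")}],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "df.pkl")  # hypothetical file name
    df.to_pickle(path)               # serialize the DataFrame as it is
    restored = pd.read_pickle(path)  # load it back into memory

# The cells come back as the same Python types they were saved with.
first_tags = restored.loc[0, "tags"]
second_price = restored.loc[1, "meta"]["price"]
print(type(first_tags).__name__, type(second_price).__name__)
```

Since unpickling can execute arbitrary code, this approach is best kept to files you produced yourself or otherwise trust.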

Tips on Apache Arrow

Feather vs Parquet: https://github.com/wesm/feather/issues/188

References

https://github.com/wesm/feather