Hands on the Rust Crate Parquet

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!


  1. Notice that a cell in a Parquet table has the type Field, which is an enum over the supported value types.

Read and Write Parquet Files in Rust

Several Rust crates can read and write Parquet files, among which Polars is the best one. As a matter of fact, Polars is a DataFrame …

Handling Complicated Data Types in Python and PySpark

Tips and Traps

  1. An element in a pandas DataFrame can be of any (complicated) Python type. To save a pandas DataFrame with arbitrary (complicated) types as it is, you have to use the pickle module. The method pandas.DataFrame.to_pickle (which is simply a wrapper over pickle.dump) serializes the DataFrame to a pickle file while the method pandas.read_pickle …
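
A minimal sketch of the round trip described in the tip above, assuming pandas is installed. The column names, the sample values, and the file name df.pkl are made up for illustration; only pandas.DataFrame.to_pickle, pandas.read_pickle and pickle.load are real API calls.

```python
import os
import pickle
import tempfile

import pandas as pd

# Illustrative DataFrame whose cells hold nested Python objects
# (a dict containing a set and a tuple, and a list containing a complex number).
df = pd.DataFrame(
    {
        "id": [1, 2],
        "payload": [
            {"tags": {"a", "b"}, "pos": (1, 2)},
            [complex(1, 2), None],
        ],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "df.pkl")  # hypothetical file name

    # Serialize the DataFrame, Python objects included, to a pickle file.
    df.to_pickle(path)

    # Read it back with pandas.
    restored = pd.read_pickle(path)

    # With no compression, the file is an ordinary pickle,
    # so the standard library can load it too.
    with open(path, "rb") as f:
        via_pickle = pickle.load(f)

# The nested objects come back with their original Python types.
print(type(restored["payload"].iloc[0]["tags"]))
```

As with any pickle file, load it only if it comes from a source you trust, since unpickling can execute arbitrary code.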

Tips on Apache Arrow

[Feather vs Parquet]
