Ben Chuanlong Du's Blog

It is never too late to learn.

Read CSV Files Using Polars in Rust

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Tips and Traps

  1. LazyCsvReader is more limited than CsvReader. CsvReader supports specifying a schema while LazyCsvReader does not.

  2. An empty field is parsed as null instead of an empty string by default, and there is currently no way to change this behavior. Please refer to the related issue for details.

Read and Write CSV Files in Rust

Tips and Traps

  1. By default, csv::Reader treats the first row as a header row. Use csv::ReaderBuilder with has_headers(false) for files without headers.

  2. When the csv crate is used together with the serde crate for deserialization, the CSV files to be parsed have to be strictly well formatted. For example, the headers in a CSV file have to match the field names defined in the serde struct. Otherwise, deserialization fails with a "missing field" error, which becomes a panic if the result is unwrapped.

Read/Write CSV in PySpark

Load Data in CSV Format

  1. .load is a general method for reading data in different formats; you specify the format of the data via the .format method. .csv (for both CSV and TSV), .json and .parquet are specializations of .load, so .format is unnecessary when you use one of these format-specific loading functions.