Ben Chuanlong Du's Blog

Read CSV Using Polars in Python

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

In [1]:
!pip3 install --user polars
Collecting polars
  Downloading polars-0.15.10-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (14.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.6/14.6 MB 38.2 MB/s eta 0:00:00
Installing collected packages: polars
Successfully installed polars-0.15.10
In [4]:
import polars as pl
In [5]:
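# The file has no header row, so Polars names the columns column_1, column_2, ...
df = pl.read_csv(
    "rank53_j0_j0.csv",
    has_header=False,
    # Override the inferred types with compact integer types and a string column.
    dtypes={
        "column_1": pl.UInt8,
        "column_2": pl.UInt8,
        "column_3": pl.UInt8,
        "column_4": pl.UInt16,
        "column_5": pl.Utf8,
    },
    # No extra strings are treated as null, so a literal "NA" stays a string.
    null_values=[],
)
# Replace the auto-generated column names with meaningful ones.
df.columns = ["i0", "i1", "i2", "i", "ranks"]
df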
Out[5]:
shape: (10, 5)
i0  i1  i2  i    ranks
u8  u8  u8  u16  str
0   1   2   0    "56229711839232...
0   1   2   1    "57324928499712...
0   1   2   2    "37744977903616...
0   1   2   3    "NA"
0   1   2   4    "37882416857088...
0   1   2   5    "37951136333824...
0   1   2   6    "38019855810560...
0   1   2   7    null
0   1   2   8    "38157294764032...
0   1   2   9    "38226014240768...
In [6]:
df["ranks"].is_null().sum()
Out[6]:
1
In [8]:
(df["ranks"] == "NA").sum()
Out[8]:
1
In [9]:
(df["ranks"] == "").sum()
Out[9]:
0