Ben Chuanlong Du's Blog

It is never too late to learn.

Cut and qcut in pandas DataFrame

In [7]:
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {"x": [3, 3, 1, 10, 1, 10], "y": [1, 2, 3, 4, 5, 60], "z": [6, 5, 4, 3, 2, 1]}
)

df
Out[7]:
    x   y  z
0   3   1  6
1   3   2  5
2   1   3  4
3  10   4  3
4   1   5  2
5  10  60  1
In [8]:
pd.cut(df.y, 3)
Out[8]:
0    (0.941, 20.667]
1    (0.941, 20.667]
2    (0.941, 20.667]
3    (0.941, 20.667]
4    (0.941, 20.667]
5     (40.333, 60.0]
Name: y, dtype: category
Categories (3, interval[float64]): [(0.941, 20.667] < (20.667, 40.333] < (40.333, 60.0]]
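With an integer `bins` argument, `pd.cut` splits the range of `y` into equal-width intervals, so the single outlier (60) stretches the range and leaves five of the six values in the first bin. The title also mentions `qcut`, which bins by quantiles instead. The sketch below contrasts the two on the same `y` values; the variable names are illustrative only, and `retbins=True` is used to also return the computed edges.

```python
import pandas as pd

y = pd.Series([1, 2, 3, 4, 5, 60], name="y")

# Equal-width bins: retbins=True also returns the computed bin edges
by_width, width_edges = pd.cut(y, 3, retbins=True)

# Equal-frequency bins: edges are the 0, 1/3, 2/3 and 1 quantiles of y
by_count, count_edges = pd.qcut(y, 3, retbins=True)

# Number of values per bin, listed in bin order
width_counts = by_width.value_counts().sort_index().tolist()
count_counts = by_count.value_counts().sort_index().tolist()

print(width_edges)   # lowest edge sits slightly below min(y) so that 1 is included
print(width_counts)  # [5, 0, 1]
print(count_edges)
print(count_counts)  # [2, 2, 2]
```

When the data is skewed like this, `qcut` tends to be the better choice if each bin should hold roughly the same number of rows, while `cut` keeps the bin widths comparable.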
In [9]:
pd.cut(df.y, [0, 1.5, 4.5, 100])
Out[9]:
0      (0.0, 1.5]
1      (1.5, 4.5]
2      (1.5, 4.5]
3      (1.5, 4.5]
4    (4.5, 100.0]
5    (4.5, 100.0]
Name: y, dtype: category
Categories (3, interval[float64]): [(0.0, 1.5] < (1.5, 4.5] < (4.5, 100.0]]
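Explicit edges produce interval labels such as `(1.5, 4.5]`. If readable names are preferred, `pd.cut` accepts a `labels` argument, and `labels=False` returns integer bin codes. A minimal sketch follows; the names "low", "mid" and "high" are made up for illustration and are not part of the original example.

```python
import pandas as pd

y = pd.Series([1, 2, 3, 4, 5, 60], name="y")
edges = [0, 1.5, 4.5, 100]

# One label per bin, i.e. len(labels) == len(edges) - 1 (names are hypothetical)
named = pd.cut(y, edges, labels=["low", "mid", "high"])

# labels=False returns the integer position of each value's bin
codes = pd.cut(y, edges, labels=False)

print(named.tolist())  # ['low', 'mid', 'mid', 'mid', 'high', 'high']
print(codes.tolist())  # [0, 1, 1, 1, 2, 2]
```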
In [10]:
pd.cut(df.y, [1.5, 4.5, 10])
Out[10]:
0            NaN
1     (1.5, 4.5]
2     (1.5, 4.5]
3     (1.5, 4.5]
4    (4.5, 10.0]
5            NaN
Name: y, dtype: category
Categories (2, interval[float64]): [(1.5, 4.5] < (4.5, 10.0]]
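Values that fall outside every bin come back as `NaN`: here 1 lies below the lowest edge (1.5) and 60 lies above the highest (10). One way to avoid that, sketched below as an assumption about what is wanted rather than as the only option, is to add open-ended outer bins with `-np.inf` and `np.inf`.

```python
import numpy as np
import pandas as pd

y = pd.Series([1, 2, 3, 4, 5, 60], name="y")

# Same edges as above: 1 and 60 are outside (1.5, 10] and become NaN
narrow = pd.cut(y, [1.5, 4.5, 10])

# Open-ended outer bins catch everything at or below 1.5 and above 10
wide = pd.cut(y, [-np.inf, 1.5, 4.5, 10, np.inf])

print(narrow.isna().tolist())  # [True, False, False, False, False, True]
print(wide.isna().any())       # False
print(wide.cat.categories)
```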