“Pd.cut in Pandas” Ответ

Разрешение DataFrame на основе диапазона

test = pd.DataFrame({'days': [0,20,30,31,45,60]})

test['range1'] = pd.cut(test.days, [0,30,60], include_lowest=True)
#30 value is in [30, 60) group
test['range2'] = pd.cut(test.days, [0,30,60], right=False)
#30 value is in (0, 30] group
test['range3'] = pd.cut(test.days, [0,30,60])
print (test)
   days          range1    range2    range3
0     0  (-0.001, 30.0]   [0, 30)       NaN
1    20  (-0.001, 30.0]   [0, 30)   (0, 30]
2    30  (-0.001, 30.0]  [30, 60)   (0, 30]
3    31    (30.0, 60.0]  [30, 60)  (30, 60]
4    45    (30.0, 60.0]  [30, 60)  (30, 60]
5    60    (30.0, 60.0]       NaN  (30, 60]
Gifted Gecko

Pd.cut in Pandas

>>> pd.qcut(range(5), 3, labels=["good", "medium", "bad"])
... 
[good, good, medium, bad, bad]
Categories (3, object): [good < medium < bad]
Famous Flamingo

DataFrame Cut

import pandas as pd

#设置切分区域
listBins = [0, 10, 20, 30, 40, 50, 60, 1000000]

#设置切分后对应标签
listLabels = ['0_10','11_20','21_30','31_40','41_50','51_60','61及以上']

#利用pd.cut进行数据离散化切分
"""
pandas.cut(x,bins,right=True,labels=None,retbins=False,precision=3,include_lowest=False)
x:需要切分的数据
bins:切分区域
right : 是否包含右端点默认True,包含
labels:对应标签,用标记来代替返回的bins,若不在该序列中,则返回NaN
retbins:是否返回间距bins
precision:精度
include_lowest:是否包含左端点,默认False,不包含
"""
df['fenzu'] = pd.cut(df['data'], bins=listBins, labels=listLabels, include_lowest=True)
Red Team

Ответы похожие на “Pd.cut in Pandas”

Вопросы похожие на “Pd.cut in Pandas”

Больше похожих ответов на “Pd.cut in Pandas” по Python

Смотреть популярные ответы по языку

Смотреть другие языки программирования