I think you can use mask, and pass skipna=True to mean instead of calling dropna (skipna=True is already the default, so this only makes it explicit). You also need to change the condition: use data.artist_hotness == 0 if you need to replace 0 values, or data.artist_hotness.isnull() if you need to replace NaN values. The mask version comes first, followed by an alternative with loc:
import pandas as pd
import numpy as np
data = pd.DataFrame({'artist_hotness': [0,1,5,np.nan]})
print (data)
artist_hotness
0 0.0
1 1.0
2 5.0
3 NaN
mean_artist_hotness = data['artist_hotness'].mean(skipna=True)
print (mean_artist_hotness)
2.0
data['artist_hotness'] = data.artist_hotness.mask(data.artist_hotness == 0, mean_artist_hotness)
print (data)
artist_hotness
0 2.0
1 1.0
2 5.0
3 NaN
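The other condition mentioned above, isnull(), can be sketched the same way. This is a minimal example on the same toy data, not part of the original output; the name filled is only illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'artist_hotness': [0, 1, 5, np.nan]})

# mean() skips NaN by default; skipna=True only makes that explicit
mean_artist_hotness = data['artist_hotness'].mean(skipna=True)

# Replace only the missing values; the 0 in row 0 is left untouched
filled = data['artist_hotness'].mask(data['artist_hotness'].isnull(),
                                     mean_artist_hotness)
print(filled.tolist())  # [0.0, 1.0, 5.0, 2.0]
```

With this condition the NaN in the last row becomes the mean, while the 0 stays as it is.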
Alternatively, use loc, but omit the column name before .loc, i.e. call it on the DataFrame:
data.loc[data.artist_hotness == 0, 'artist_hotness'] = mean_artist_hotness
print (data)
artist_hotness
0 2.0
1 1.0
2 5.0
3 NaN
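Calling loc on the column itself and passing the column name again fails, because a Series has only one axis:
data.artist_hotness.loc[data.artist_hotness == 0, 'artist_hotness'] = mean_artist_hotness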
print (data)
IndexingError: (0 True 1 False 2 False 3 False Name: artist_hotness, dtype: bool, 'artist_hotness')
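To see both behaviours side by side, here is a small sketch on the same toy data. The exception is caught broadly and only its type name is printed, since the exact message can differ between pandas versions; is_zero and error_name are illustrative names.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'artist_hotness': [0, 1, 5, np.nan]})
mean_artist_hotness = data['artist_hotness'].mean()
is_zero = data['artist_hotness'] == 0

# A Series has one axis, so (row mask, column label) is one indexer too many
error_name = None
try:
    data['artist_hotness'].loc[is_zero, 'artist_hotness']
except Exception as exc:  # caught broadly on purpose for this demonstration
    error_name = type(exc).__name__
print(error_name)

# On the DataFrame the same pair means (row indexer, column indexer) and works
data.loc[is_zero, 'artist_hotness'] = mean_artist_hotness
print(data['artist_hotness'].tolist())
```

The design point is that the row mask and the column label go together in one data.loc[...] call, so the assignment writes into the DataFrame itself.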
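Another solution is DataFrame.replace, specifying the columns (the output includes a second column aa, defined as in the next example, to show that only artist_hotness is changed):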
data=data.replace({'artist_hotness': {0: mean_artist_hotness}})
print (data)
aa artist_hotness
0 0.0 2.0
1 1.0 1.0
2 5.0 5.0
3 NaN NaN
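As a self-contained check of the column-specific form, the following sketch builds the two-column frame first and then confirms that only artist_hotness is touched. The name replaced is illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'artist_hotness': [0, 1, 5, np.nan],
                     'aa': [0, 1, 5, np.nan]})
mean_artist_hotness = data['artist_hotness'].mean()

# Nested dict: {column: {old_value: new_value}} limits the replacement
# to the named column
replaced = data.replace({'artist_hotness': {0: mean_artist_hotness}})

print(replaced['artist_hotness'].tolist())  # 0 replaced by the mean
print(replaced['aa'].tolist())              # 0 kept as-is
```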
Or, if you need to replace all 0 values in all columns:
import pandas as pd
import numpy as np
data = pd.DataFrame({'artist_hotness': [0,1,5,np.nan], 'aa': [0,1,5,np.nan]})
print (data)
aa artist_hotness
0 0.0 0.0
1 1.0 1.0
2 5.0 5.0
3 NaN NaN
mean_artist_hotness = data['artist_hotness'].mean(skipna=True)
print (mean_artist_hotness)
2.0
data=data.replace(0,mean_artist_hotness)
print (data)
aa artist_hotness
0 2.0 2.0
1 1.0 1.0
2 5.0 5.0
3 NaN NaN
If you need to replace NaN in all columns, use DataFrame.fillna; if only in some columns, use Series.fillna:
data=data.fillna(mean_artist_hotness)
print (data)
aa artist_hotness
0 0.0 0.0
1 1.0 1.0
2 5.0 5.0
3 2.0 2.0
But, to fill only one column:
data['artist_hotness'] = data.artist_hotness.fillna(mean_artist_hotness)
print (data)
aa artist_hotness
0 0.0 0.0
1 1.0 1.0
2 5.0 5.0
3 NaN 2.0
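A possible variation, not part of the answer above: DataFrame.fillna also accepts a dict that maps column names to fill values, which gives the same per-column result without reassigning the column. A minimal sketch on the same toy data, with filled as an illustrative name:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'artist_hotness': [0, 1, 5, np.nan],
                     'aa': [0, 1, 5, np.nan]})
mean_artist_hotness = data['artist_hotness'].mean()

# Only the columns named in the dict are filled; 'aa' keeps its NaN
filled = data.fillna({'artist_hotness': mean_artist_hotness})

print(filled['artist_hotness'].tolist())  # [0.0, 1.0, 5.0, 2.0]
print(filled['aa'].tolist())
```

This form is convenient when several columns each need their own fill value, since one dict can list them all.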