pandas DataFrame in Python: Data Alignment and Handling Missing Data

February 19, 2023, 12:59:58

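Contents
1. Basic operations
Adding DataFrames
Filling missing values with fillna
Dropping missing values with dropna

2. Advanced operations
Dropping missing values with dropna(how='any' or 'all')
Drop a row when all of its values are missing
Drop a row when any of its values is missing
Drop a column when all of its values are missing
Drop a column when any of its values is missing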

Code implementation
1. Basic operations

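import pandas as pd
import numpy as np

# df1 and df2 use the same index labels, but in a different order
df1 = pd.DataFrame({'one':[1,2,3,4],'two':[5,6,7,8]},index=['a','b','c','d'])
df2 = pd.DataFrame({'one':[1,2,3,np.nan],'two':[5,6,7,8]},index=['d','c','b','a'])
df_add = df1 + df2 # rows are aligned by index label before adding; NaN propagates to the result
df_fill = df2.fillna(0) # replace missing values with 0
df_drop = df2.dropna() # by default, a row is dropped if it contains at least one missing value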

print('df1=',df1,'\n')
print('df2=',df2,'\n')
print('df_add=',df_add,'\n')
print('df_fill=',df_fill,'\n')
print('df_drop=',df_drop,'\n')

df1=    one  two
a    1    5
b    2    6
c    3    7
d    4    8 

df2=    one  two
d  1.0    5
c  2.0    6
b  3.0    7
a  NaN    8 

df_add=    one  two
a  NaN   13
b  5.0   13
c  5.0   13
d  5.0   13 

df_fill=    one  two
d  1.0    5
c  2.0    6
b  3.0    7
a  0.0    8 

df_drop=    one  two
d  1.0    5
c  2.0    6
b  3.0    7 
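
The addition above leaves NaN at row 'a', column 'one', because df2 has no value there. As a minimal sketch of one way to avoid that, the add method accepts a fill_value argument that stands in for a value missing on only one side. The name df_add_filled is introduced here for illustration and is not part of the original code.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'one': [1, 2, 3, 4], 'two': [5, 6, 7, 8]},
                   index=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame({'one': [1, 2, 3, np.nan], 'two': [5, 6, 7, 8]},
                   index=['d', 'c', 'b', 'a'])

# Rows are matched by index label, not by position.
# fill_value=0 substitutes 0 for a value that is missing on one side only,
# so row 'a', column 'one' becomes 1 + 0 instead of NaN.
df_add_filled = df1.add(df2, fill_value=0)  # hypothetical name, for illustration
print(df_add_filled)
```

If a value is missing in both frames at the same position, the result at that position is still NaN.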

2. Advanced operations

# Advanced options for dropping missing values
df3 = pd.DataFrame({'one':[1,2,3,np.nan],'two':[5,6,np.nan,np.nan]},index=['d','c','b','a'])
df3_drop = df3.dropna(how='all') # drop a row only when all of its values are missing
df3_drop2 = df3.dropna(how='any') # drop a row when any of its values is missing
df4 = pd.DataFrame({'one':[1,2,3,4],'two':[5,6,np.nan,np.nan],'three':[np.nan,np.nan,np.nan,np.nan]},index=['d','c','b','a'])
df4_drop = df4.dropna(how='any',axis=1) # drop a column when any of its values is missing
df4_drop2 = df4.dropna(how='all',axis=1) # drop a column only when all of its values are missing

print('df3=',df3,'\n')
print('df3_drop=',df3_drop,'\n')
print('df3_drop2=',df3_drop2,'\n')
print('df4=',df4,'\n')
print('df4_drop=',df4_drop,'\n')
print('df4_drop2=',df4_drop2,'\n')

df3=    one  two
d  1.0  5.0
c  2.0  6.0
b  3.0  NaN
a  NaN  NaN 

df3_drop=    one  two
d  1.0  5.0
c  2.0  6.0
b  3.0  NaN 

df3_drop2=    one  two
d  1.0  5.0
c  2.0  6.0 

df4=    one  two  three
d    1  5.0    NaN
c    2  6.0    NaN
b    3  NaN    NaN
a    4  NaN    NaN 

df4_drop=    one
d    1
c    2
b    3
a    4 

df4_drop2=    one  two
d    1  5.0
c    2  6.0
b    3  NaN
a    4  NaN 
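
Besides how, dropna also takes thresh and subset, which may be useful when 'any' is too strict and 'all' is too lenient. The sketch below reuses df3 from above. The names df3_thresh and df3_subset are introduced here for illustration and do not appear in the original code.

```python
import numpy as np
import pandas as pd

df3 = pd.DataFrame({'one': [1, 2, 3, np.nan], 'two': [5, 6, np.nan, np.nan]},
                   index=['d', 'c', 'b', 'a'])

# thresh=2: keep only rows that have at least 2 non-missing values
df3_thresh = df3.dropna(thresh=2)      # hypothetical name, for illustration

# subset=['one']: only column 'one' is checked when deciding which rows to drop
df3_subset = df3.dropna(subset=['one'])  # hypothetical name, for illustration

print(df3_thresh)
print(df3_subset)
```

With this data, thresh=2 keeps rows 'd' and 'c', while subset=['one'] also keeps row 'b', since its only missing value is in column 'two'.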

  • Author: 在夏天冬眠啦
  • Original link: https://blog.csdn.net/qq_43165890/article/details/128117769
    Last updated: February 19, 2023, 12:59:58; 1,661 characters in total.