pandas.DataFrame: building a two-dimensional, size-mutable tabular data structure

Published: 2023-04-25 13:30:10 · Author: yayagogogo

Syntax

pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=None)

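The most commonly used parameters:

  • data: the input data; several types are accepted, for example a list, a dict, or a NumPy ndarray;
  • index: the row labels; defaults to RangeIndex(0, 1, 2, …, n) when none are given;
  • columns: the column labels; defaults to RangeIndex(0, 1, 2, …, n) when none are given;
  • dtype: the data type to force on the data (a single dtype for the whole frame);
  • copy: a bool or None, controlling whether the input data is copied.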
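
As a minimal sketch of how these parameters behave together (the array values and the labels "r1", "r2", "a", "b", "c" are made up for illustration and are not part of the examples in this post):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a 2x3 integer array.
raw = np.array([[1, 2, 3], [4, 5, 6]])

# No index/columns given: both fall back to a RangeIndex starting at 0.
default_df = pd.DataFrame(raw)

# Explicit row labels, column labels, and one dtype applied to all columns.
labeled_df = pd.DataFrame(
    raw,
    index=["r1", "r2"],
    columns=["a", "b", "c"],
    dtype=np.float32,
)

# copy=True requests an independent copy, so editing the frame
# leaves the original array untouched.
copied_df = pd.DataFrame(raw, copy=True)
copied_df.iloc[0, 0] = 99

print(default_df)
print(labeled_df.dtypes)
print(raw[0, 0])  # still 1
```

Leaving copy at its default of None lets pandas decide based on the input type, so passing copy=True explicitly is the safer choice when the source array must not change.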

Code examples

import pandas as pd
import numpy as np

# Create a DataFrame from a list of lists
d1 = [[3,"negative",2],[4,"negative",6],[11,"positive",0],[12,"positive",2]]
df1 = pd.DataFrame(d1, columns=["xuhao","result","value"])
print(df1)
print(df1.dtypes)

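# Create a DataFrame from a dict
d2 = {"xuhao": [3, 4, 11, 12],
      "result": ["negative", "negative", "positive", "positive"],
      "value": [2, 6, 0, 2]}
# dtype= applies to every column, and newer pandas versions raise when the
# string column cannot be cast to int8, so cast only the numeric columns.
df2 = pd.DataFrame(d2).astype({"xuhao": np.int8, "value": np.int8})
print(df2)
print(df2.dtypes)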

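# Create a DataFrame from a dict that contains a Series
# (the Series is aligned on its index labels; rows without a match become NaN)
d3 = {"xuhao": [3, 4, 11, 12],
      "result": ["negative", "negative", "positive", "positive"],
      "value": pd.Series([2, 3], index=[2, 3])}
df3 = pd.DataFrame(d3, index=[0, 1, 2, 3])
print(df3)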

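# Create a DataFrame from a NumPy ndarray
# (NumPy coerces the mixed int/str rows to strings, so every column holds strings)
df4 = pd.DataFrame(
    np.array([[3, "negative", 2], [4, "negative", 6],
              [11, "positive", 0], [12, "positive", 2]]),
    columns=["xuhao", "result", "value"])
print(df4)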

# Create a DataFrame from a structured NumPy ndarray with named fields
d5 = np.array([(1,3,2),(2,4,6),(3,1,0),(4,3,2)],
    dtype=[("xuhao", "i4"), ("result", "i4"), ("value", "i4")])
df5 = pd.DataFrame(d5)
df6 = pd.DataFrame(d5, columns=["result","value"])
print(df5)
print(df6)

# Create a DataFrame from a list of dataclass instances
from dataclasses import make_dataclass
mydata = make_dataclass("mydata", [("result", str), ("value", int)])
df7 = pd.DataFrame([mydata("positive", 0), mydata("negative", 3), mydata("positive", 3)])
print(df7)

Running the code produces the following output:

#df1
   xuhao    result  value
0      3  negative      2
1      4  negative      6
2     11  positive      0
3     12  positive      2
xuhao      int64
result    object
value      int64
dtype: object
#df2
   xuhao    result  value
0      3  negative      2
1      4  negative      6
2     11  positive      0
3     12  positive      2
xuhao       int8
result    object
value       int8
dtype: object
#df3
   xuhao    result  value
0      3  negative    NaN
1      4  negative    NaN
2     11  positive    2.0
3     12  positive    3.0
#df4
  xuhao    result value
0     3  negative     2
1     4  negative     6
2    11  positive     0
3    12  positive     2
#df5
   xuhao  result  value
0      1       3      2
1      2       4      6
2      3       1      0
3      4       3      2
#df6
   result  value
0       3      2
1       4      6
2       1      0
3       3      2
#df7
     result  value
0  positive      0
1  negative      3
2  positive      3
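
Two details in the output above are easy to miss. In df3 the value column is floating point, because the Series only supplies labels 2 and 3 and the unmatched rows are filled with NaN. In df4 every column holds strings, because NumPy coerces the mixed int/str rows to a single string type. The sketch below checks both on shortened copies of the data; the pd.to_numeric conversion at the end is a suggested follow-up, not something the original examples do:

```python
import numpy as np
import pandas as pd

# Series values are aligned on index labels; rows 0 and 1 have no match.
s = pd.Series([2, 3], index=[2, 3])
df3 = pd.DataFrame({"xuhao": [3, 4, 11, 12], "value": s}, index=[0, 1, 2, 3])
print(df3["value"].dtype)  # float64, since NaN cannot be stored in int64

# A mixed int/str nested list becomes a NumPy unicode-string array.
arr = np.array([[3, "negative", 2], [4, "negative", 6]])
print(arr.dtype.kind)      # "U"

df4 = pd.DataFrame(arr, columns=["xuhao", "result", "value"])
# Convert the numeric-looking columns back to numbers.
df4["xuhao"] = pd.to_numeric(df4["xuhao"])
df4["value"] = pd.to_numeric(df4["value"])
print(df4.dtypes)
```

When the columns have different types to begin with, building the frame from a list of lists or a dict (as in df1 and df2) avoids the string coercion altogether.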