melt() is a data-reshaping function that unpivots a DataFrame from wide format to long format, optionally leaving identifier columns in place.
1. Basic syntax
pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)
- frame : The DataFrame to reshape.
- id_vars : Column(s) to use as identifier variables. These are kept as they are and are not unpivoted.
- value_vars : Column(s) to unpivot. If not specified, all columns not listed in id_vars are used.
- var_name : Name to use for the 'variable' column in the result. If None, frame.columns.name is used, or 'variable' if that is not set.
- value_name : Name to use for the 'value' column in the result.
- col_level : If the columns are a MultiIndex, the level to melt on.
- ignore_index : If True, original index is ignored. If False, the original index is retained. Index labels will be repeated as necessary.
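The last two parameters are not used in the examples of these notes, so here is a minimal sketch of how they behave. The DataFrames `df` and `multi` and their contents are made up for illustration only, and `ignore_index` assumes pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical data with a labelled index, for illustration only.
df = pd.DataFrame(
    {"name": ["a", "b"], "x": [1, 2], "y": [3, 4]},
    index=["r1", "r2"],
)

# Default (ignore_index=True): the result gets a fresh integer index.
reset = pd.melt(df, id_vars=["name"])

# ignore_index=False: the original labels are kept and repeated
# once for every unpivoted column.
kept = pd.melt(df, id_vars=["name"], ignore_index=False)

print(list(reset.index))  # [0, 1, 2, 3]
print(list(kept.index))   # ['r1', 'r2', 'r1', 'r2']

# col_level: with MultiIndex columns, choose the level whose labels
# are used to look up id_vars and value_vars.
multi = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 3, 5], "C": [2, 4, 6]})
multi.columns = pd.MultiIndex.from_arrays([list("ABC"), list("DEF")])
level0 = pd.melt(multi, col_level=0, id_vars=["A"], value_vars=["B"])

print(list(level0.columns))  # ['A', 'variable', 'value']
```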
2. Examples
Create a simple DataFrame
import pandas as pd

X1 = pd.DataFrame(
    dict(
        Person=["Bob", "Alice", "Steve"],
        Age=[32, 24, 64],
        Weight=[128, 86, 95],
        Height=[180, 175, 165],
    )
)
| | Person | Age | Weight | Height |
|---|---|---|---|---|
| 0 | Bob | 32 | 128 | 180 |
| 1 | Alice | 24 | 86 | 175 |
| 2 | Steve | 64 | 95 | 165 |
1) Keeping columns with id_vars
When only id_vars is set, every column not listed in id_vars is unpivoted.
X1.melt(id_vars=["Person"])
| | Person | variable | value |
|---|---|---|---|
| 0 | Bob | Age | 32 |
| 1 | Alice | Age | 24 |
| 2 | Steve | Age | 64 |
| 3 | Bob | Weight | 128 |
| 4 | Alice | Weight | 86 |
| 5 | Steve | Weight | 95 |
| 6 | Bob | Height | 180 |
| 7 | Alice | Height | 175 |
| 8 | Steve | Height | 165 |
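As a quick check of the default described above, the sketch below compares the call without value_vars to one that lists every non-identifier column explicitly. The variable names `implicit`, `explicit`, and `same` are chosen here for illustration.

```python
import pandas as pd

X1 = pd.DataFrame(
    dict(
        Person=["Bob", "Alice", "Steve"],
        Age=[32, 24, 64],
        Weight=[128, 86, 95],
        Height=[180, 175, 165],
    )
)

# value_vars omitted: all columns outside id_vars are unpivoted.
implicit = X1.melt(id_vars=["Person"])

# The same request with the columns spelled out.
explicit = X1.melt(id_vars=["Person"], value_vars=["Age", "Weight", "Height"])

# Compare the two results row by row.
same = implicit.values.tolist() == explicit.values.tolist()
print(same)            # True
print(implicit.shape)  # (9, 3): 3 people x 3 unpivoted columns
```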
2) Selecting the columns to unpivot with value_vars
When both id_vars and value_vars are set, columns that appear in neither argument are left out of the resulting DataFrame.
X1.melt(id_vars=["Person"], value_vars=["Weight", "Height"], var_name="Type", value_name="value")
| | Person | Type | value |
|---|---|---|---|
| 0 | Bob | Weight | 128 |
| 1 | Alice | Weight | 86 |
| 2 | Steve | Weight | 95 |
| 3 | Bob | Height | 180 |
| 4 | Alice | Height | 175 |
| 5 | Steve | Height | 165 |
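To confirm that Age really is left out, the result can be inspected directly. This is only a small sanity check on the example above; `long_df` and `types` are names introduced here.

```python
import pandas as pd

X1 = pd.DataFrame(
    dict(
        Person=["Bob", "Alice", "Steve"],
        Age=[32, 24, 64],
        Weight=[128, 86, 95],
        Height=[180, 175, 165],
    )
)

long_df = X1.melt(
    id_vars=["Person"],
    value_vars=["Weight", "Height"],
    var_name="Type",
    value_name="value",
)

# Distinct labels in the renamed 'variable' column.
types = set(long_df["Type"])

print("Age" in types)         # False: Age was in neither argument
print("Age" in long_df.columns)  # False: it is not kept as a column either
print(long_df.shape)          # (6, 3)
```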
cf : TP01 Q7, TP02 Q17