Merge two Pandas DataFrames by matching ID numbers

In this article, we will see how to merge two dataframes on a matching ID column.

Approach

Create the first dataframe
Create the second dataframe
Select the column to match on
Merge using the merge function

Syntax : DataFrame.merge(parameters)


Display the result
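The syntax line above names the method form, DataFrame.merge, while the examples in this article call the function form, pd.merge. As a minimal sketch using the article's own data, the two forms should give the same result when the calling dataframe is the left one:

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8],
                    'Name': ['Sam', 'John', 'Bridge',
                             'Edge', 'Joe', 'Hope']})

df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})

# function form, as used in the examples of this article
via_function = pd.merge(df1, df2, on="ID")

# method form, as named in the syntax line; df1 acts as the left dataframe
via_method = df1.merge(df2, on="ID")

# both calls perform the same default (inner) merge
print(via_function.equals(via_method))  # True
```

Which form to use is mostly a matter of style; the method form chains more naturally with other DataFrame calls.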

Given below are implementations that produce the required results by passing appropriate values for the relevant parameters.

Example 1:

Python3

# import pandas as pd
import pandas as pd
  
# creating dataframes as df1 and df2
df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8], 
                    'Name': ['Sam', 'John', 'Bridge',
                             'Edge', 'Joe', 'Hope']})
  
df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})
  
# merge df1 and df2 on ID; with the default how="inner",
# only rows whose ID appears in both dataframes are kept,
# i.e. IDs {1, 2, 5, 8}
df = pd.merge(df1, df2, on="ID")
print(df)

Output:

(Image: merged dataframe)
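Matching by ID implicitly assumes that each ID occurs once per dataframe. As a hedged sketch, the validate parameter of pd.merge can check that assumption; the frame df2_dup below is hypothetical and exists only to show the check failing:

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8],
                    'Name': ['Sam', 'John', 'Bridge',
                             'Edge', 'Joe', 'Hope']})

df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})

# validate="one_to_one" asks pandas to verify that ID is unique
# in both dataframes before merging
df = pd.merge(df1, df2, on="ID", validate="one_to_one")
print(df["ID"].tolist())  # [1, 2, 5, 8]

# hypothetical dataframe with a repeated ID (not from the article)
df2_dup = pd.DataFrame({'ID': [1, 1, 2], 'Marks': [67, 70, 92]})

try:
    pd.merge(df1, df2_dup, on="ID", validate="one_to_one")
    duplicate_detected = False
except pd.errors.MergeError:
    # raised because ID 1 appears twice in df2_dup
    duplicate_detected = True

print(duplicate_detected)  # True
```

Without validate, a repeated ID would not raise; it would silently produce one output row per matching pair.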

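Merging two dataframes on the ID column while keeping all the IDs of the left dataframe, i.e. the first argument of the merge function. For IDs that are not present in df2, the df2 columns of that row get NaN values.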

Example 2:

Python3

# import pandas as pd
import pandas as pd
  
# creating dataframes as df1 and df2
df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8], 
                    'Name': ['Sam', 'John', 'Bridge',
                             'Edge', 'Joe', 'Hope']})
  
df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})
  
# merge df1 and df2 on ID, keeping every ID of the
# left dataframe (df1); the df2 columns are NaN
# where an ID has no match in df2
df = pd.merge(df1, df2, on="ID", how="left")
print(df)

Output:

(Image: merged dataframe)
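To make the NaN behaviour of the left merge concrete, the following sketch lists the IDs of df1 that found no match in df2. It also shows a side effect worth knowing: because NaN is a floating-point value, the integer Marks column is converted to float64 in the merged result.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8],
                    'Name': ['Sam', 'John', 'Bridge',
                             'Edge', 'Joe', 'Hope']})

df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})

df = pd.merge(df1, df2, on="ID", how="left")

# rows where Marks is NaN are the df1 IDs with no match in df2
unmatched_ids = df.loc[df["Marks"].isna(), "ID"].tolist()
print(unmatched_ids)  # [3, 7]

# Marks was int64 in df2 but becomes float64 once NaN is introduced
print(df["Marks"].dtype)  # float64
```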

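Merging two dataframes on the ID column while keeping all the IDs of the right dataframe, i.e. the second argument of the merge function. For IDs that have no match in df1, the df1 columns get NaN values.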

Example 3:

Python3

# import pandas as pd
import pandas as pd
  
# creating dataframes as df1 and df2
df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8], 
                    'Name': ['Sam', 'John', 'Bridge', 
                             'Edge', 'Joe', 'Hope']})
  
df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})
  
# merge df1 and df2 on ID, keeping every ID of the
# right dataframe (df2); the df1 columns are NaN
# where an ID has no match in df1
df = pd.merge(df1, df2, on="ID", how="right")
print(df)

Output:

(Image: merged dataframe)
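A right merge can be read as a left merge with the two arguments swapped. The sketch below checks this on the article's data; only the column order differs, so the columns are reordered before comparing.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8],
                    'Name': ['Sam', 'John', 'Bridge',
                             'Edge', 'Joe', 'Hope']})

df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})

# right merge: keep every ID of df2
right = pd.merge(df1, df2, on="ID", how="right")

# left merge with the arguments swapped: also keeps every ID of df2
swapped = pd.merge(df2, df1, on="ID", how="left")

# reorder the columns of the swapped result to ID, Name, Marks;
# assert_frame_equal raises an AssertionError if the frames differ
pd.testing.assert_frame_equal(right, swapped[right.columns])

print(right["ID"].tolist())  # [1, 2, 4, 5, 6, 8, 9]
```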

Merging two dataframes on the ID column while keeping only the IDs that match in both dataframes.

Example 4:

Python3

# import pandas as pd
import pandas as pd
  
# creating dataframes as df1 and df2
df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8],
                    'Name': ['Sam', 'John', 'Bridge',
                             'Edge', 'Joe', 'Hope']})
  
df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})
  
# merge df1 and df2 on ID, keeping only the IDs
# that are present in both dataframes
df = pd.merge(df1, df2, on="ID", how="inner")
print(df)

Output:

(Image: merged dataframe)
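Since "inner" is the default value of how, passing it explicitly should not change the result. A short sketch confirming this, and showing that the surviving IDs are the set intersection of the two ID columns:

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8],
                    'Name': ['Sam', 'John', 'Bridge',
                             'Edge', 'Joe', 'Hope']})

df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})

explicit = pd.merge(df1, df2, on="ID", how="inner")
default = pd.merge(df1, df2, on="ID")

# how="inner" is the default, so both calls give the same dataframe
print(explicit.equals(default))  # True

# the kept IDs are exactly those present in both ID columns
common_ids = sorted(set(df1["ID"]) & set(df2["ID"]))
print(common_ids)  # [1, 2, 5, 8]
```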

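Merging two dataframes on the ID column while keeping all the IDs of both dataframes, with NaN values in the columns where an ID is not found in the other dataframe.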

Example 5:

Python3

# import pandas as pd
import pandas as pd
  
# creating dataframes as df1 and df2
df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8],
                    'Name': ['Sam', 'John', 'Bridge',
                             'Edge', 'Joe', 'Hope']})
  
df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})
  
# merge df1 and df2 on ID, keeping every ID from
# both dataframes; columns are NaN where an ID
# is missing from the other dataframe
df = pd.merge(df1, df2, on="ID", how="outer")
print(df)

Output:

(Image: merged dataframe)
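With an outer merge it is often useful to know which dataframe each row came from. As a sketch, the indicator parameter of pd.merge adds a _merge column with the values left_only, right_only, or both:

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 5, 7, 8],
                    'Name': ['Sam', 'John', 'Bridge',
                             'Edge', 'Joe', 'Hope']})

df2 = pd.DataFrame({'ID': [1, 2, 4, 5, 6, 8, 9],
                    'Marks': [67, 92, 75, 83, 69, 56, 81]})

# indicator=True adds a "_merge" column describing the origin of each row
df = pd.merge(df1, df2, on="ID", how="outer", indicator=True)

# number of rows per origin
counts = df["_merge"].value_counts().to_dict()
print(counts["both"], counts["left_only"], counts["right_only"])  # 4 2 3

# IDs found only in df1 and only in df2, sorted so the
# result does not depend on the row order of the merge
only_in_df1 = sorted(df.loc[df["_merge"] == "left_only", "ID"].tolist())
only_in_df2 = sorted(df.loc[df["_merge"] == "right_only", "ID"].tolist())
print(only_in_df1)  # [3, 7]
print(only_in_df2)  # [4, 6, 9]
```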