
📅  Last modified: 2022-05-13 01:55:32.941000             🧑  Author: Mango

Find Rows Not in Another Data Frame in R

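Finding the rows that are present in one data frame but absent from another is called a set difference. In this article, we look at different ways to perform this operation.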
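As a quick illustration of the idea on plain vectors rather than data frames, the sketch below uses base R's setdiff(). The variable names are chosen only for this example. It also shows that the operation is asymmetric: swapping the arguments changes the result.

```r
# Set difference on plain integer vectors (illustrative values)
x <- 1:5
y <- 1:3

# Elements of x that do not occur in y
only_in_x <- setdiff(x, y)
print(only_in_x)
# [1] 4 5

# Swapping the arguments gives a different answer: nothing in y is missing from x
only_in_y <- setdiff(y, x)
print(only_in_y)
# integer(0)
```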

Method 1: Using sqldf()

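In this method, we simply pass sqldf() an SQL query that computes the set difference.

Syntax:

Our query is sqldf('SELECT * FROM df1 EXCEPT SELECT * FROM df2'). It excludes every row of df1 that also exists in df2 and returns only the rows found exclusively in df1.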
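Two properties of EXCEPT are worth seeing on a small case: it compares entire rows, so a row counts as present in df2 only if every column matches, and it returns distinct rows, so duplicates in df1 appear once. The following is a minimal sketch with made-up data, assuming the sqldf package is installed and uses its default SQLite backend.

```r
library(sqldf)

# Illustrative data: df1 contains the row (4, "d") twice
df1 <- data.frame(a = c(1, 2, 4, 4), b = c("a", "b", "d", "d"))
# df2 shares the row (1, "a"); its second row matches df1 only in column a
df2 <- data.frame(a = c(1, 2), b = c("a", "x"))

# (1, "a") is dropped, (2, "b") is kept because b differs,
# and the duplicated (4, "d") is returned a single time
res <- sqldf("SELECT * FROM df1 EXCEPT SELECT * FROM df2")
print(res)
```

If the duplicates in df1 matter for your analysis, keep in mind that this query will not preserve them.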



Example 1:

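R
# Load sqldf, which runs SQL queries against data frames
library(sqldf)

# df1 has five rows; df2 repeats the first three of them
df1 <- data.frame(a = 1:5, b = letters[1:5])
df2 <- data.frame(a = 1:3, b = letters[1:3])

print("df1 is ")
print(df1)

print("df2 is ")
print(df2)

# EXCEPT keeps the rows of df1 that do not appear in df2
res <- sqldf('SELECT * FROM df1 EXCEPT SELECT * FROM df2')
print("rows from df1 which are not in df2")
print(res)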



Example 2:

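R

# Load sqldf, which runs SQL queries against data frames
library(sqldf)

# df2 contains only the first row of df1
df1 <- data.frame(name = c("kapil", "sachin", "rahul"), age = c(23, 22, 26))
df2 <- data.frame(name = c("kapil"), age = c(23))

print("df1 is ")
print(df1)

print("df2 is ")
print(df2)

# The second table must be df2; the name a2 does not exist and would raise an error
res <- sqldf('SELECT * FROM df1 EXCEPT SELECT * FROM df2')
print("rows from df1 which are not in df2")
print(res)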

Method 2: Using setdiff()

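setdiff() computes the set difference of two data frames. Note that the setdiff() built into base R is documented for vectors; the row-by-row comparison of data frames used here is provided by the setdiff() method in the dplyr package, so dplyr must be loaded first.

Syntax:

setdiff(df1, df2)

It returns the rows of df1 that are not present in df2.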
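Row-wise setdiff() on data frames comes from the dplyr package rather than base R. For readers who prefer not to add a dependency, the sketch below shows one possible package-free approximation. The helper rows_not_in() is hypothetical (it is not part of R or of any package), and it assumes both data frames have the same column names and that the separator character never occurs in the data.

```r
# rows_not_in() is a hypothetical helper written for this sketch
rows_not_in <- function(x, y) {
  # Both data frames must have the same columns in the same order
  stopifnot(identical(names(x), names(y)))
  # Build one string key per row; "\r" is assumed not to appear in the data
  key_x <- do.call(paste, c(x, sep = "\r"))
  key_y <- do.call(paste, c(y, sep = "\r"))
  # Keep rows of x whose key is absent from y, then drop duplicate rows
  unique(x[!(key_x %in% key_y), , drop = FALSE])
}

df1 <- data.frame(a = 1:5, b = letters[1:5])
df2 <- data.frame(a = 1:3, b = letters[1:3])

res <- rows_not_in(df1, df2)
print(res)
```

Because rows are compared through their text representation, this sketch is best suited to integer and character columns; floating-point columns may need rounding before comparison.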

Example 1:

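R

# dplyr's setdiff() compares data frames row by row
library(dplyr)

# Columns a and b match, but column c differs in every row
df1 <- data.frame(a = 1:5, b = letters[1:5], c = c(1, 3, 5, 7, 9))
df2 <- data.frame(a = 1:5, b = letters[1:5], c = c(2, 4, 6, 8, 10))

print("df1 is ")
print(df1)

print("df2 is ")
print(df2)

# No complete row of df1 also occurs in df2, so every row of df1 is returned
res <- setdiff(df1, df2)
print("rows from df1 which are not in df2")
print(res)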


Example 2:

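R

# dplyr's setdiff() compares data frames row by row
library(dplyr)

# Both data frames hold the same names, but two of the ages are swapped in df2
df1 <- data.frame(name = c("kapil", "sachin", "rahul"), age = c(23, 22, 26))
df2 <- data.frame(name = c("kapil", "rahul", "sachin"), age = c(23, 22, 26))

print("df1 is ")
print(df1)

print("df2 is ")
print(df2)

# Only ("kapil", 23) matches as a whole row; the sachin and rahul rows are returned
res <- setdiff(df1, df2)
print("rows from df1 which are not in df2")
print(res)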
