📜  How to find the difference between two data frames in R?

📅  Last modified: 2022-05-13 01:55:49.784000             🧑  Author: Mango

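In this article, we discuss how to find the differences between two data frames (or data sets) in the R programming language by comparing their column names.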

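Method 1: Using the intersect() function

The intersect() function in R returns the elements that two sets have in common. Applied to names(first) and names(second), it gives the column names that appear in both data frames.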

Syntax: intersect(x, y), where x and y are vectors such as names(first) and names(second)

Example:



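R
# Note: data.frame() rewrites the non-syntactic names "1", "2", "3"
# as X1, X2, X3 (check.names = TRUE is the default).
first <- data.frame(
  "1" = c('0.44', '0.554', '0.67', '0.64'),
  "2" = c('0.124', '0.22', '0.82', '0.994'),
  "3" = c('0.82', '1.22', '0.73', '1.23')
)

# second has the same three columns plus two extra ones, d and e
second <- data.frame(
  "1" = runif(4),
  "2" = runif(4),
  "3" = runif(4),
  "d" = runif(4),
  "e" = runif(4)
)

# Keep only the columns of second whose names also occur in first
second[intersect(names(first), names(second))]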


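Output: the X1, X2 and X3 columns of second, which are the columns it shares with first. The values differ on every run because runif() draws random numbers.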
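To make the behaviour of intersect() easier to see in isolation, here is a minimal sketch on two plain character vectors. The vectors cols_a and cols_b are hypothetical column names invented for this illustration; they are not part of the example above.

```r
# Hypothetical column-name vectors, used only for illustration
cols_a <- c("id", "name", "score")
cols_b <- c("score", "id", "city", "zip")

# intersect() keeps the elements that occur in both vectors
common <- intersect(cols_a, cols_b)
print(common)

# length(common) tells how many column names the two sets share
print(length(common))
```

Because intersect() works on any vector, the same call applies unchanged to names(first) and names(second).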

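Method 2: Using setdiff()

Unlike intersect(), setdiff() returns the elements of its first argument that are missing from its second. Called as setdiff(names(second), names(first)), it gives the columns of second that the first data frame does not have.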



Syntax: setdiff(x, y), where x and y are vectors such as names(second) and names(first)

Example:

R

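# data.frame() rewrites the names "1", "2", "3" as X1, X2, X3
first <- data.frame(
  "1" = c('0.44', '0.554', '0.67', '0.64'),
  "2" = c('0.124', '0.22', '0.82', '0.994'),
  "3" = c('0.82', '1.22', '0.73', '1.23')
)

# second has the same three columns plus two extra ones, d and e
second <- data.frame(
  "1" = runif(4),
  "2" = runif(4),
  "3" = runif(4),
  "d" = runif(4),
  "e" = runif(4)
)

# Keep only the columns of second whose names do not occur in first
second[setdiff(names(second), names(first))]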

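Output: the d and e columns of second, the two columns that first does not have. The values differ on every run because runif() draws random numbers.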
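One point worth illustrating is that setdiff() is not symmetric: swapping the arguments answers a different question. The sketch below uses hypothetical name vectors (first_cols has an invented column called extra) to show both directions.

```r
# Hypothetical column names; "extra" exists only for this illustration
first_cols  <- c("X1", "X2", "X3", "extra")
second_cols <- c("X1", "X2", "X3", "d", "e")

# Columns present in second but missing from first
only_in_second <- setdiff(second_cols, first_cols)

# Columns present in first but missing from second
only_in_first <- setdiff(first_cols, second_cols)

print(only_in_second)
print(only_in_first)

# Combining both directions lists every column that is not shared
not_shared <- union(only_in_second, only_in_first)
print(not_shared)
```

Running both directions is a simple way to check whether two data frames have exactly the same set of columns: both results are empty only when they do.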

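Method 3: Using colnames() and dplyr

Here colnames() and the %in% operator identify the columns of second whose names do not appear in first, and select() from dplyr keeps only those columns. The result is the column-wise difference between the two data frames.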

Example:

R

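library("dplyr")

# data.frame() rewrites the names "1", "2", "3" as X1, X2, X3
first <- data.frame(
  "1" = c('0.44', '0.554', '0.67', '0.64'),
  "2" = c('0.124', '0.22', '0.82', '0.994'),
  "3" = c('0.82', '1.22', '0.73', '1.23')
)

# second has the same three columns plus two extra ones, d and e
second <- data.frame(
  "1" = runif(4),
  "2" = runif(4),
  "3" = runif(4),
  "d" = runif(4),
  "e" = runif(4)
)

# which() turns the logical mask into column positions for select()
second %>% select(which(!(colnames(second) %in% colnames(first))))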

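Output: the d and e columns of second, the columns whose names do not appear in first. The values differ on every run because runif() draws random numbers.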
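The same %in% filter can also be sketched without dplyr, which may be convenient when the package is not installed. This is only an illustrative base R variant, not the article's method; the data frames below use fixed integer values (an assumption made here so the result is reproducible) instead of runif().

```r
# Fixed, hypothetical values so the result is the same on every run
first <- data.frame(X1 = 1:4, X2 = 5:8, X3 = 9:12)
second <- data.frame(X1 = 1:4, X2 = 5:8, X3 = 9:12,
                     d = 13:16, e = 17:20)

# TRUE for every column of second whose name is absent from first
keep <- !(colnames(second) %in% colnames(first))

# drop = FALSE keeps a data frame even if only one column remains
only_in_second <- second[, keep, drop = FALSE]
print(only_in_second)
```

The drop = FALSE argument matters when a single column survives the filter; without it, base R would return a plain vector rather than a data frame.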