Performs a modified dplyr::full_join() on x and y that returns all columns from x, the rows of x that have no match in y, and all rows from y. In effect, replace_join() replaces matching rows in x with the corresponding rows from y and appends the unmatched rows from y.
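The behavior described above can be approximated with a full join followed by a coalesce. The following is a minimal sketch under that assumption, not the package's implementation; replace_join_sketch() is a hypothetical name introduced only for illustration, and it ignores the copy, suffix, and ... arguments.

```r
library(dplyr)

# Hypothetical stand-in for replace_join(); not the package's actual code.
replace_join_sketch <- function(x, y, replace_col,
                                by = setdiff(intersect(names(x), names(y)), replace_col)) {
  col_x <- paste0(replace_col, ".x")
  col_y <- paste0(replace_col, ".y")
  # Keep every row from both sides; replace_col is split into .x and .y copies.
  joined <- full_join(x, y, by = by, suffix = c(".x", ".y"))
  # Prefer the value from y; fall back to the value from x when y has none.
  joined[[replace_col]] <- coalesce(joined[[col_y]], joined[[col_x]])
  # Drop the temporary suffixed columns.
  joined[, setdiff(names(joined), c(col_x, col_y)), drop = FALSE]
}

DFA <- data.frame(x = c(1, 2), y = c("A", "B"), stringsAsFactors = FALSE)
DFB <- data.frame(x = c(2, 3), y = c("C", "D"), stringsAsFactors = FALSE)
res <- replace_join_sketch(DFA, DFB, replace_col = "y")
res
```

For these inputs the sketch yields the rows (1, A), (2, C), and (3, D), which agrees with the first example in the Examples section.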

Usage

replace_join(
  x,
  y,
  replace_col,
  by = dplyr::intersect(names(x), names(y)) %>% dplyr::setdiff(replace_col),
  copy = FALSE,
  suffix = c(".x", ".y"),
  ...
)

Arguments

x

the data frame in which rows will be replaced by matching rows from y.

y

the data frame that supplies replacement rows where matches are found and whose unmatched rows are appended to the outgoing data frame.

replace_col

the string name of the column (common to both x and y) whose values in y will be inserted into x where row matches are found for the by columns. replace_col should not be in by. The default value of by ensures that replace_col is not in by.

by

the string names of columns (common to x and y) on which matching rows will be determined. Default is dplyr::intersect(names(x), names(y)) %>% dplyr::setdiff(replace_col). This default ensures that replace_col is not in by, as required.
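As an illustration of how the default by is derived, the same computation can be written in base R. The data frames and column names below (Country, Year, Value, Note) are hypothetical and chosen only for this sketch.

```r
# Hypothetical data frames; "Value" plays the role of replace_col.
x <- data.frame(Country = "GH", Year = 2000, Value = 1, Note = "old",
                stringsAsFactors = FALSE)
y <- data.frame(Country = "GH", Year = 2000, Value = 2,
                stringsAsFactors = FALSE)
replace_col <- "Value"

# Base R equivalent of the default: shared column names, minus replace_col.
by <- setdiff(intersect(names(x), names(y)), replace_col)
by
#> [1] "Country" "Year"
```

Note is excluded because it exists only in x, and Value is excluded because it is replace_col.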

copy

passed to dplyr::full_join(). Default value is FALSE.

suffix

appended to replace_col to form the names of the temporary columns created during the internal join operation. Default is c(".x", ".y"), the same as the default for dplyr::full_join().
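To show what the suffixed columns look like, here is a plain dplyr::full_join() on two small data frames. This is only a sketch of the intermediate state that suffix refers to; whether replace_join() builds it in exactly this way is an assumption.

```r
library(dplyr)

DFA <- data.frame(x = c(1, 2), y = c("A", "B"), stringsAsFactors = FALSE)
DFB <- data.frame(x = c(2, 3), y = c("C", "D"), stringsAsFactors = FALSE)

# Joining on "x" leaves two competing "y" columns, disambiguated by suffix.
joined <- full_join(DFA, DFB, by = "x", suffix = c(".x", ".y"))
names(joined)
#> [1] "x"   "y.x" "y.y"
```

Here y.x holds the values from the first data frame and y.y the values from the second, with NA where a row has no match.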

...

additional arguments passed to dplyr::full_join().

Value

a copy of x in which rows that match on the by columns are replaced by the corresponding rows from y, with the unmatched rows from y appended.

Details

If x contains multiple rows that match a row in y, the row from y is inserted into x at each matching location. If y contains multiple rows that match a row in x, all of them are inserted into x at each matching location. See examples.

Columns of x and y named in by and replace_col should not be factors.
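One way to satisfy the no-factors requirement is to convert factor columns to character before calling replace_join(). The following base R sketch uses a hypothetical data frame DFF.

```r
# Hypothetical data frame with a factor column.
DFF <- data.frame(x = c(1, 2), y = factor(c("A", "B")))

# Convert every factor column to character, leaving other columns untouched.
# Assigning into DFF[] keeps the result a data frame.
DFF[] <- lapply(DFF, function(col) if (is.factor(col)) as.character(col) else col)
sapply(DFF, class)
```

After the conversion, DFF can be used as x or y in the same way as the data frames in the Examples section.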

If replace_col is not present in both x and y, x is returned unmodified.

Examples

DFA <- data.frame(x = c(1, 2), y = c("A", "B"), stringsAsFactors = FALSE)
DFB <- data.frame(x = c(2, 3), y = c("C", "D"), stringsAsFactors = FALSE)
# The row with x = 2 matches, so its y is replaced by "C"; the unmatched row x = 3 is appended.
replace_join(DFA, DFB, replace_col = "y")
#>   x y
#> 1 1 A
#> 2 2 C
#> 3 3 D
# Reversing the arguments makes DFA the source of replacement rows.
replace_join(DFB, DFA, replace_col = "y")
#>   x y
#> 1 2 B
#> 2 3 D
#> 3 1 A
# y (here DFC) has two rows matching x = 2: both are inserted.
DFC <- data.frame(x = c(2, 2), y = c("M", "N"), stringsAsFactors = FALSE)
replace_join(DFA, DFC, replace_col = "y")
#>   x y
#> 1 1 A
#> 2 2 M
#> 3 2 N
# x (here DFC) has two rows matching x = 2: each is replaced by the matching row from DFA.
replace_join(DFC, DFA, replace_col = "y")
#>   x y
#> 1 2 B
#> 2 2 B
#> 3 1 A
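# Multiple matches on both sides: every matching row of DFD is inserted at each matching row of DFC.
DFD <- data.frame(x = c(2, 2), y = c("A", "B"), stringsAsFactors = FALSE)
replace_join(DFC, DFD, replace_col = "y")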
#>   x y
#> 1 2 A
#> 2 2 B
#> 3 2 A
#> 4 2 B