Skip to content

Non-equi join showing columns in a not so perfect style #5593

@hope-data-science

Description

@hope-data-science

In the latest version of dplyr, non-equi joins could be realized by:

library(dplyr)

transactions <- tibble(
  company = c("A", "A", "B", "B"),
  year = c(2019, 2020, 2021, 2023),
  revenue = c(50, 4, 10, 12)
)
transactions
#> # A tibble: 4 × 3
#>   company  year revenue
#>   <chr>   <dbl>   <dbl>
#> 1 A        2019      50
#> 2 A        2020       4
#> 3 B        2021      10
#> 4 B        2023      12

companies <- tibble(
  id = c("A", "B", "B"),
  since = c(1973, 2009, 2022),
  name = c("Patagonia", "RStudio", "Posit")
)

companies
#> # A tibble: 3 × 3
#>   id    since name     
#>   <chr> <dbl> <chr>    
#> 1 A      1973 Patagonia
#> 2 B      2009 RStudio  
#> 3 B      2022 Posit


transactions |>
  inner_join(companies, join_by(company == id, year >= since))
#> # A tibble: 5 × 5
#>   company  year revenue since name     
#>   <chr>   <dbl>   <dbl> <dbl> <chr>    
#> 1 A        2019      50  1973 Patagonia
#> 2 A        2020       4  1973 Patagonia
#> 3 B        2021      10  2009 RStudio  
#> 4 B        2023      12  2009 RStudio  
#> 5 B        2023      12  2022 Posit

Using data.table, I yield:

library(data.table)
#> 
#> Attaching package: 'data.table'
#> The following objects are masked from 'package:dplyr':
#> 
#>     between, first, last
setDT(transactions)
setDT(companies)
transactions[companies, on = .(company == id, year >= since)]
#>    company year revenue      name
#> 1:       A 1973      50 Patagonia
#> 2:       A 1973       4 Patagonia
#> 3:       B 2009      10   RStudio
#> 4:       B 2009      12   RStudio
#> 5:       B 2022      12     Posit

The operation is the same, but the results is not desirable as that of dplyr's, and dplyr also supports to keep all the columns using argument keep = TRUE. How can I get something similar in data.table?

Thanks.

Metadata

Metadata

Assignees

No one assigned

    Labels

    non-equi joinsrolling, overlapping, non-equi joins

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions