-
Notifications
You must be signed in to change notification settings - Fork 1k
Open
Labels
non-equi joinsrolling, overlapping, non-equi joinsrolling, overlapping, non-equi joins
Description
In the latest version of dplyr, non-equi joins could be realized by:
library(dplyr)
transactions <- tibble(
company = c("A", "A", "B", "B"),
year = c(2019, 2020, 2021, 2023),
revenue = c(50, 4, 10, 12)
)
transactions
#> # A tibble: 4 × 3
#> company year revenue
#> <chr> <dbl> <dbl>
#> 1 A 2019 50
#> 2 A 2020 4
#> 3 B 2021 10
#> 4 B 2023 12
companies <- tibble(
id = c("A", "B", "B"),
since = c(1973, 2009, 2022),
name = c("Patagonia", "RStudio", "Posit")
)
companies
#> # A tibble: 3 × 3
#> id since name
#> <chr> <dbl> <chr>
#> 1 A 1973 Patagonia
#> 2 B 2009 RStudio
#> 3 B 2022 Posit
transactions |>
inner_join(companies, join_by(company == id, year >= since))
#> # A tibble: 5 × 5
#> company year revenue since name
#> <chr> <dbl> <dbl> <dbl> <chr>
#> 1 A 2019 50 1973 Patagonia
#> 2 A 2020 4 1973 Patagonia
#> 3 B 2021 10 2009 RStudio
#> 4 B 2023 12 2009 RStudio
#> 5 B 2023 12 2022 Posit
Using data.table, I yield:
library(data.table)
#>
#> Attaching package: 'data.table'
#> The following objects are masked from 'package:dplyr':
#>
#> between, first, last
setDT(transactions)
setDT(companies)
transactions[companies, on = .(company == id, year >= since)]
#> company year revenue name
#> 1: A 1973 50 Patagonia
#> 2: A 1973 4 Patagonia
#> 3: B 2009 10 RStudio
#> 4: B 2009 12 RStudio
#> 5: B 2022 12 Posit
The operation is the same, but the results is not desirable as that of dplyr's, and dplyr also supports to keep all the columns using argument keep = TRUE. How can I get something similar in data.table?
Thanks.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
non-equi joinsrolling, overlapping, non-equi joinsrolling, overlapping, non-equi joins