-
Notifications
You must be signed in to change notification settings - Fork 1k
Open
Labels
non-equi joinsrolling, overlapping, non-equi joinsrolling, overlapping, non-equi joins
Description
Suppose we have the following data.tables:
library(data.table)
p1 <- data.table(
symbol = c("A", "A", "A", "B", "B", "B"),
date = c(1L, 2L, 3L, 1L, 2L, 3L)
)
p2 <- data.table(
symbol = c("A", "A", "B"),
start_date = c(1L, 2L, 1L),
end_date = c(1L, 3L, 2L),
id = 1:3
)> p1
symbol date
<char> <int>
1: A 1
2: A 2
3: A 3
4: B 1
5: B 2
6: B 3
> p2
symbol start_date end_date id
<char> <int> <int> <int>
1: A 1 1 1
2: A 2 3 2
3: B 1 2 3
In-place modification with non-equi-join works as expected:
p1[p2, id := id, on = .(symbol, date >= start_date, date <= end_date)]> p1
symbol date id
<char> <int> <int>
1: A 1 1
2: A 2 2
3: A 3 2
4: B 1 3
5: B 2 3
6: B 3 NA
However, using the non-equi-join syntax to subset p1 with p2 does not seem to work consistently:
p1[p2, .(symbol, date), on = .(symbol, date >= start_date, date <= end_date)] symbol date
<char> <int>
1: A 1
2: A 2
3: A 2
4: B 1
5: B 1
or
p1[p2, on = .(symbol, date >= start_date, date <= end_date)] symbol date id date.1 i.id
<char> <int> <int> <int> <int>
1: A 1 1 1 1
2: A 2 2 3 2
3: A 2 2 3 2
4: B 1 3 2 3
5: B 1 3 2 3
But I expect the result to be like
p2 <- p1[!is.na(id)]
p2 symbol date id
<char> <int> <int>
1: A 1 1
2: A 2 2
3: A 3 2
4: B 1 3
5: B 2 3
I wonder if this is a known behavior? Or is there an explanation why it is not the expected result like above? Am I missing something about the definition of the non-equi-join?
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
non-equi joinsrolling, overlapping, non-equi joinsrolling, overlapping, non-equi joins