fcase support scalar condition, vectorize default and lazy-eval default#4264
fcase support scalar condition, vectorize default and lazy-eval default#4264MichaelChirico merged 52 commits intomasterfrom
Conversation
|
would be nice to measure overhead as I see some of the constant values declaration to be moved into loops. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #4264 +/- ##
==========================================
- Coverage 97.53% 97.53% -0.01%
==========================================
Files 80 80
Lines 14915 14909 -6
==========================================
- Hits 14547 14541 -6
Misses 368 368 ☔ View full report in Codecov by Sentry. |
|
The impact should be minimal when comparing to the scalar default. library(data.table)
set.seed(100)
n <- 1e5
cond1 <- sample(c(T, F, NA), n, replace = TRUE)
cond2 <- sample(c(T, F, NA), n, replace = TRUE)
val1 <- rnorm(n)
val2 <- rnorm(n)
default <- rnorm(n)
val1_str <- as.character(val1)
val2_str <- as.character(val2)
val1_list <- as.list(val1)
val2_list <- as.list(val2)
microbenchmark::microbenchmark(
fcase(cond1, val1, cond2, val2, default = 0.0),
fcase(cond1, val1_str, cond2, val2_str, default = 'abc'),
fcase(cond1, val1_list, cond2, val2_list, default = list(1:2)),
times = 1e3
)Results (updated)Before the PRAfter the PR |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Not sure if it's desired since we've already had the |
|
@shrektan , yes to your first question. I can send you the code tonight or commit something if you want. I just need to write it but it is easy. Regarding your second question, yes it is for long type. Look at line 43 how it is done for fifelse. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Yes. There's an overhead of always copying the whole vector of the In addition, after adding the supports for the scalar condition, I just realized the I think we should just convert the R function's default into a
How do you think? UPDATE: I think it makes sense so I'm starting to implement this. |
|
DONE. The one thing I'm not satisfied with is kind of the lines below. I have to declare a pointer w/o initialized it. The code is OK but if there's a better way to avoid this, I'm happy to make the change. const int *restrict pouts;
if (!naout) pouts = LOGICAL(outs); // the content is not useful if out is NA_LOGICAL scalar |
|
I'll be honest I don't really like this scalar |
|
But I can’t see any drawback for providing this, either. We can still use default. Anyway, my point is less about the scalar condition, but by providing this, we can naturally have the lazy evaluated default, which I believe is a nice feature to have. |
|
Generated via commit a653106 Download link for the artifact containing the test results: ↓ atime-results.zip Time taken to finish the standard R installation steps: 11 minutes and 48 seconds Time taken to run |
|
@ben-schwen think you can take a pass at review here? |
Co-authored-by: Benjamin Schwendinger <52290390+ben-schwen@users.noreply.github.com>
|
Basically LGTM. However, there are many things I would want to rename/remodel for clarity purposes. Right now, I find this piece of our codebase quite hard to grasp in comparison to its complexity. |
I think the same can be said of a lot of our codebase TBH 😆 I think adding a lot of performance regression tests is the most important piece here, I always worry about accidentally introducing a slowdown when refactoring, even if our great test suite is passing. Filed #6322. |
|
Thanks @shrektan!! |

Closes #4258
Three features:
In short, the
defaultshould be regarded as just a special condition-value pair argument. So it should be vectorized and lazily evaluated.I believe with the three features the
fcase()interface will have a closer-natural-fits to what our users expect.