Refactor expression runner so it can be used via the C and JS APIs#2702
Refactor expression runner so it can be used via the C and JS APIs#2702kripken merged 30 commits intoWebAssembly:masterfrom
Conversation
|
Last commit adds a mechanism to support evaluating expressions like runner.runAndDispose(
module.i32.add(
module.block(null, [
module.local.set(0, module.i32.const(5)),
module.local.get(0, binaryen.i32)
], binaryen.i32),
module.i32.const(1)
)
);where a value is |
|
This now also implements what I mentioned earlier, in that constant local and global values known beforehand can be set explicitly. NFCI for existing code using the runner, but will open up new possibilities in that a generator does not have to inline these values manually, but can instead use locals as long as their constant value is known, e.g. Another interesting feature would be to utilize a sub-runner to traverse into lightweight function calls, so something like |
|
Last commit now also traverses into (simple) functions, which has some minor but beneficial effects on existing tests. However, for this to work I had to add a |
|
Hmm, appears this leads to non-determinism now, depending on the order that functions become optimized in, so sometimes a function is simple enough to evaluate and sometimes it's not (yet) :( |
tlively
left a comment
There was a problem hiding this comment.
Looks great! I just have a few questions and small suggestions, but otherwise LGTM.
src/binaryen-c.h
Outdated
There was a problem hiding this comment.
I would like to see more explanation of what preserving side effects means. Where/how are the side effects preserved?
There was a problem hiding this comment.
This now reads
// Be very careful to preserve any side effects. For example, if we are
// intending to replace the expression with a constant afterwards, even if we
// can technically evaluate down to a constant, we still cannot replace the
// expression if it also sets a local, which must be preserved in this scenario
// so subsequent code keeps functioning.
src/binaryen-c.h
Outdated
There was a problem hiding this comment.
| // might or might not have been optimized already to something we can traverse | |
| // successfully, in turn leading to non-deterministic behavior. | |
| // might be concurrently modified, leading to undefined behavior. |
The problem here is not so much that we don't know what state the other function is in, but that it's state could be changing and inconsistent when we try to traverse it. It would also be good to mention how this flag interacts with the PreserveSideEffects flag, if at all.
There was a problem hiding this comment.
Used your suggested change and mentioned that traversing another function uses this runner's flags, which implies PreserveSideEffects.
src/binaryen-c.h
Outdated
There was a problem hiding this comment.
Are the side effects of these expressions preserved even without the PreserveSideEffects flag?
There was a problem hiding this comment.
This now reads
// Sets a known local value to use. Order matters if expressions have side
// effects. For example, if the expression also sets a local, this side effect
// will also happen (not affected by any flags). Returns `true` if the
// expression actually evaluates to a constant.
| // Check if a constant value has been set in the context of this runner. | ||
| auto iter = localValues.find(curr->index); | ||
| if (iter != localValues.end()) { | ||
| return Flow(std::move(iter->second)); |
There was a problem hiding this comment.
Why is this a std::move? What if the same local is gotten twice?
There was a problem hiding this comment.
My understanding here is that creating the std::pair<const wasm::Index, wasm::Literals> will make a copy of the wasm::Literals value, so using a std::move here hints that we can move that volatile copy instead of copying twice and dumping one. Perfectly possible that I don't actually know what I'm doing. Please advise :)
| } | ||
| // Otherwise remember the constant value set, if any, for subsequent gets. | ||
| if (!setFlow.breaking()) { | ||
| setLocalValue(curr->index, setFlow.values); |
There was a problem hiding this comment.
Couldn't there be subsequent gets if this is a tee, too?
There was a problem hiding this comment.
Good catch, updated the code accordingly
| module.i32.const(1) | ||
| ) | ||
| ); | ||
| assert(expr === 0); |
There was a problem hiding this comment.
Would it be more idiomatic for the JS API to turn this into null or something like that?
There was a problem hiding this comment.
Figured that one would typically test this as !expr anyway, just doing an overly precise check here for testing purposes. Would imagine that not mixing 0 and null has benefits for the JIT.
There was a problem hiding this comment.
An alternative here is to return the unmodified original expression. Would that be an improvement?
test/binaryen.js/expressionrunner.js
Outdated
There was a problem hiding this comment.
I think it would make these tests easier to understand if the JSON were not hardcoded with the raw numbers for id and type. Would it be possible to explicitly construct the expected expressions instead?
There was a problem hiding this comment.
Used the respective constants now and added a little assertDeepEqual to make it more easily readable.
| ) | ||
| (func $reftype-test (; 18 ;) (result nullref) | ||
| (func $loop-precompute (; 18 ;) (result i32) | ||
| (i32.const 1) |
| } | ||
|
|
||
| BinaryenExpressionRef | ||
| ExpressionRunnerRunAndDispose(ExpressionRunnerRef runner, |
There was a problem hiding this comment.
If the runner fails, it seems like it would be useful to expose more information to the caller about why it failed. That way the user could choose to increase the depth or loop count, if applicable. What do you think?
There was a problem hiding this comment.
Hmm, good question. Seems like this might be a bit too much, considering how it complicates the API. For instance, on the AssemblyScript side I expect to always use a reasonable maxDepth (or none) and give up otherwise as there is no reason to make an exception using larger limits. Would have used that limit right away then.
src/binaryen-c.cpp
Outdated
There was a problem hiding this comment.
It's unfortunate that this will only work correctly if there is at most one Runner created at a time, but it's probably not worth fixing urgently. Perhaps you could at least leave a TODO about it?
There was a problem hiding this comment.
Implemented something working, but the code for it turned out to be a bit unattractive since, other than expressions etc., runners can be deleted leading to undefined behavior in tracing. Added comments.
|
Is this ready to land, or still waiting for review from @tlively ? |
|
I'll take a final look now. |
There was a problem hiding this comment.
This should probably be static, too. You could get extra fancy by making both of these helpers static variables inside of noteExpressionRunner to limit their scope, but I'll leave that up to you. OTOH, it would probably be better to just say we don't support making more than max size_t expression runners and get rid of all this logic, especially since it is literally impossible to have than many expression runners recorded in expressionRunners.
There was a problem hiding this comment.
Mostly thinking in terms of a very long lived process using Binaryen, let's say where modules are being created as-a-service. While we can't store max size_t in the structure, we might at some point overflow, where the likely scenario is that this is just fine, yet guarding for not reusing something left over (i.e. from a module created and never disposed) seems like a good precaution to have. Unlikely that someone will do this with tracing enabled, ofc.
There was a problem hiding this comment.
Aha, I had missed that the ExpressionRunners were removed from the expressionRunners map when they were destroyed 👍
test/binaryen.js/expressionrunner.js
Outdated
There was a problem hiding this comment.
Can we do var i here, too, or would that be unidiomatic or bad? Seeing the variable reused like this gives me the heebie jeebies.
|
On it, will fix these two real quick :) |
|
@dcodeIO How is this for a commit message? (This just copied from the opening description) Refactors most of the precompute pass's expression runner into its base class so it can also be used via the C and JS APIs. Also adds the option to populate the runner with known constant local and global values upfront, and remembers assigned intermediate values as well as traversing into functions if requested. C-API: JS-API: |
|
Looks good :) (have been trying for a while to keep the first post good for a commit message, sometimes divided by a horizontal line to indicate where additional comments start) |
|
@dcodeIO Looks like there is a merge conflict to resolve now :( |
|
Ok great, merging! Thanks for all the work here, and sorry it took this long, but sometimes more complex changes end up that way... |
|
I saw this now, and sorry for late questions. It looks this PR duplicates many of functionalities of |
|
Was under the impression that |
|
Thanks for the answer. Yes, I now think the functionalities duplicated are not in |
Tackles the concerns raised in #2797 directly related to #2702 by reverting merging all of `PrecomputeExpressionRunner` into the base `ExpressionRunner`, instead adding a common base for both the precompute pass and the new C-API to inherit. No functional changes. --- ### Current hierarchy after #2702 is ``` ExpressionRunner ├ [PrecomputeExpressionRunner] ├ [CExpressionRunner] ├ ConstantExpressionRunner └ RuntimeExpressionRunner ``` where `ExpressionRunner` contains functionality not utilized by `ConstantExpressionRunner` and `RuntimeExpressionRunner`. ### New hierarchy will be: ``` ExpressionRunner ├ ConstantExpressionRunner │ ├ [PrecomputeExpressionRunner] │ └ [CExpressionRunner] ├ InitializerExpressionRunner └ RuntimeExpressionRunner ``` with the precompute pass's and the C-API's shared functionality now moved out of `ExpressionRunner` into a new `ConstantExpressionRunner`. Also renames the previous `ConstantExpressionRunner` to `InitializerExpressionRunner` to [better represent its uses](https://webassembly.org/docs/modules/#initializer-expression) and to make its previous name usable for the new intermediate template, where it fits perfectly. Also adds a few comments answering some of the questions that came up recently. ### Old hierarchy before #2702 for comparison: ``` ExpressionRunner ├ [PrecomputeExpressionRunner] ├ ConstantExpressionRunner └ RuntimeExpressionRunner ```
Refactors most of the precompute pass's expression runner into its base class so it can also be used via the C and JS APIs. Also adds the option to populate the runner with known constant local and global values upfront, and remembers assigned intermediate values as well as traversing into functions if requested.
C-API:
JS-API: