[Strings] Add a string lowering pass using magic imports #6497
The latest idea for efficient string constants is to encode the constants in the import names of their globals and implement fast paths in the engines for materializing those constants at instantiation time without needing to parse anything in JS. This strategy only works for valid strings (i.e. strings without unpaired surrogates) because only valid strings can be used as import names in the WebAssembly syntax.

Add a new configuration of the StringLowering pass that encodes valid string contents in import names, falling back to the JSON custom section approach for invalid strings.

To test this change, update the printer to escape import and export names properly and update the legacy parser to parse escapes in import and export names properly. As a drive-by, remove the incorrect check in the parser that the import module and base names are non-empty.
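As a rough illustration of the split described above, the following sketch sorts string constants into those that can be carried in import names and those that need the JSON custom section fallback. It is not the pass's actual code: `isWellFormedUTF16`, `LoweringPlan`, and `planLowering` are hypothetical names, and strings are assumed to be held as UTF-16 code units.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `s` has no unpaired surrogates, i.e. it is
// well-formed UTF-16 and could be spelled as a WebAssembly import name.
bool isWellFormedUTF16(const std::u16string& s) {
  for (std::size_t i = 0; i < s.size(); ++i) {
    char16_t u = s[i];
    if (0xD800 <= u && u < 0xDC00) {
      // High surrogate: must be immediately followed by a low surrogate.
      if (i + 1 == s.size() ||
          !(0xDC00 <= s[i + 1] && s[i + 1] < 0xE000)) {
        return false;
      }
      ++i; // Skip the low half of the pair.
    } else if (0xDC00 <= u && u < 0xE000) {
      // Low surrogate with no preceding high surrogate.
      return false;
    }
  }
  return true;
}

// Hypothetical result type: which strings go where.
struct LoweringPlan {
  std::vector<std::u16string> asImportNames;  // valid strings
  std::vector<std::u16string> viaJsonSection; // strings with unpaired surrogates
};

// Assumed policy, following the description: valid strings are encoded in
// import names, everything else falls back to the JSON custom section.
LoweringPlan planLowering(const std::vector<std::u16string>& strings) {
  LoweringPlan plan;
  for (const auto& s : strings) {
    if (isWellFormedUTF16(s)) {
      plan.asImportNames.push_back(s);
    } else {
      plan.viaJsonSection.push_back(s);
    }
  }
  // Every input lands in exactly one bucket.
  assert(plan.asImportNames.size() + plan.viaJsonSection.size() ==
         strings.size());
  return plan;
}
```

Under this sketch, the empty string counts as valid and would be routed to the import-name bucket.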
Pass* createStringGatheringPass() { return new StringGathering(); }
Pass* createStringLoweringPass() { return new StringLowering(); }
Pass* createStringLoweringMagicImportPass() { return new StringLowering(true); }
- Pass* createStringLoweringMagicImportPass() { return new StringLowering(true); }
+ Pass* createMagicStringLoweringPass() { return new StringLowering(true); }
/jk
    }
  } else if (!allowWTF && 0xDC00 <= *u && *u < 0xE000) {
    // Unpaired low surrogate.
    return std::nullopt;
No, this is newly necessary to catch strings that are valid WTF-16 but not valid UTF-16.
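To make the distinction concrete, here is a minimal sketch of a UTF-16 to UTF-8 conversion with an `allowWTF` switch, modeled loosely on the fragment above. It is not Binaryen's implementation: the function name `convertUTF16ToUTF8` and its signature are assumptions for illustration. With `allowWTF` set, unpaired surrogates are passed through as three-byte sequences (WTF-8 style); without it, any unpaired surrogate makes the conversion fail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical converter. Returns std::nullopt when `allowWTF` is false and
// the input contains an unpaired surrogate.
std::optional<std::string> convertUTF16ToUTF8(const std::u16string& in,
                                              bool allowWTF) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    std::uint32_t cp = in[i];
    if (0xD800 <= cp && cp < 0xDC00) {
      if (i + 1 < in.size() && 0xDC00 <= in[i + 1] && in[i + 1] < 0xE000) {
        // Valid surrogate pair: combine into one code point.
        cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
        ++i;
        assert(cp <= 0x10FFFF);
      } else if (!allowWTF) {
        // Unpaired high surrogate.
        return std::nullopt;
      }
    } else if (!allowWTF && 0xDC00 <= cp && cp < 0xE000) {
      // Unpaired low surrogate.
      return std::nullopt;
    }
    // Standard UTF-8 encoding of `cp`; lone surrogates (allowWTF only) take
    // the three-byte branch.
    if (cp < 0x80) {
      out += char(cp);
    } else if (cp < 0x800) {
      out += char(0xC0 | (cp >> 6));
      out += char(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += char(0xE0 | (cp >> 12));
      out += char(0x80 | ((cp >> 6) & 0x3F));
      out += char(0x80 | (cp & 0x3F));
    } else {
      out += char(0xF0 | (cp >> 18));
      out += char(0x80 | ((cp >> 12) & 0x3F));
      out += char(0x80 | ((cp >> 6) & 0x3F));
      out += char(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

In this sketch a lone `0xDC00` converts successfully only when `allowWTF` is true, which is the case the low-surrogate check exists to reject otherwise.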
(assert_invalid
  (module (import "" "" (table 10 funcref)) (table 10 funcref))
  "multiple tables"
)
We of course have supported multiple tables for a long time, and this test should have been removed when that support was introduced. But it was not noticed and continued failing successfully until now because of the empty import names.
Are empty import names not valid? I thought they were.
They are valid, but the legacy text parser was incorrectly rejecting them, which caused this assert_invalid check to pass. Now that I've fixed the legacy parser to allow empty names, this test started failing.
