[browser][coreCLR] Loading WebCIL 1.0 #124268
pavelsavara wants to merge 20 commits into dotnet:main
Tagging subscribers to this area: @agocke, @jeffschwMSFT, @elinor-fung
Pull request overview
This PR adds support for loading WebCIL version 1.0 assemblies in CoreCLR on browser/WASM targets. WebCIL is an alternative container format for ECMA-335 assemblies that strips PE headers and packages assemblies as WebAssembly modules with a .wasm extension, helping avoid issues with firewalls/antivirus software that block .dll files.
Changes:
- Added `WebCILImageLayout` C++ class that extends `PEImageLayout` to handle WebCIL format, parsing WebCIL headers and mapping RVAs to file offsets
- Implemented `instantiateWebCILModule` in TypeScript that instantiates WebCIL .wasm modules and extracts assembly payloads into aligned memory
- Enabled WebCIL support for CoreCLR by removing `WasmEnableWebcil=false` restrictions and updating test configurations
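To make the first bullet concrete, here is a minimal sketch of RVA-to-file-offset mapping over a section table. It is written in TypeScript for brevity, whereas the PR implements this in C++. The `Section` shape and the `rvaToOffset` name are assumptions for illustration and are not taken from the PR.

```typescript
// Hypothetical section shape; the field names are illustrative, not the PR's.
interface Section {
    virtualAddress: number; // RVA at which the section starts
    virtualSize: number;    // size of the section in RVA space
    fileOffset: number;     // where the section data starts in the flat payload
}

// Returns the payload offset for an RVA, or undefined if no section covers it.
function rvaToOffset(sections: readonly Section[], rva: number): number | undefined {
    for (const s of sections) {
        const start = s.virtualAddress;
        const end = s.virtualAddress + s.virtualSize; // exclusive
        if (rva >= start && rva < end) {
            // Same distance from the section start, but measured in the file.
            return s.fileOffset + (rva - start);
        }
    }
    return undefined;
}
```

A linear scan is used because an assembly typically has only a handful of sections.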
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/native/libs/Common/JavaScript/types/public-api.ts | Added "webcil1" asset behavior type |
| src/native/libs/Common/JavaScript/types/exchange.ts | Added instantiateWebCILModule to BrowserHost exports |
| src/native/libs/Common/JavaScript/cross-module/index.ts | Updated export table to include instantiateWebCILModule |
| src/native/corehost/browserhost/loader/dotnet.d.ts | Added "webcil1" type definition |
| src/native/corehost/browserhost/loader/assets.ts | Added WebCIL detection and loading logic, mapping .wasm to .dll paths |
| src/native/corehost/browserhost/host/index.ts | Exported instantiateWebCILModule function |
| src/native/corehost/browserhost/host/host.ts | Updated assembly path mapping to replace .wasm with .dll |
| src/native/corehost/browserhost/host/assets.ts | Implemented instantiateWebCILModule to extract WebCIL payload and register assemblies |
| src/mono/wasm/Wasm.Build.Tests/Common/BuildEnvironment.cs | Removed CoreCLR restriction for WebCIL in tests |
| src/mono/browser/build/WasmApp.InTree.props | Removed WasmEnableWebcil=false for CoreCLR, updated TODO comment |
| src/coreclr/vm/webcildecoder.cpp | New file implementing WebCILImageLayout class for parsing and validating WebCIL format |
| src/coreclr/vm/peimagelayout.h | Added WebCIL header structures and WebCILImageLayout class declaration |
| src/coreclr/vm/peimagelayout.cpp | Added WebCIL format detection in LoadFlat |
| src/coreclr/vm/peassembly.cpp | Excluded 32-bit NT headers check for TARGET_BROWSER, added braces for consistency |
| src/coreclr/vm/coreassemblyspec.cpp | Added braces around single-statement if bodies |
| src/coreclr/vm/CMakeLists.txt | Added webcildecoder.cpp to browser build |
| src/coreclr/inc/pedecoder.h | Made specific PEDecoder methods virtual for browser target to enable WebCILImageLayout overrides |
| eng/testing/tests.browser.targets | Removed WasmEnableWebcil=false for CoreCLR, updated TODO comment |
Please have @elinor-fung look at this
5adef6d to 102345a
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
src/tasks/Microsoft.NET.WebAssembly.Webcil/WebcilWasmWrapper.cs
const assemblyPaths = loaderConfig.resources!.assembly.map(asset => asset.virtualPath);
const coreAssemblyPaths = loaderConfig.resources!.coreAssembly.map(asset => asset.virtualPath);
const assemblyPaths = loaderConfig.resources!.assembly.map(asset => asset.virtualPath.replace(/\.wasm$/, ".dll"));
Assemblies need to end in .dll for compatibility reasons?
I think there are many places in the codebase that assume .DLL or .EXE
runtime/src/coreclr/binder/utils.cpp
Lines 182 to 183 in f6777ec
runtime/src/coreclr/hosts/corerun/corerun.cpp
Lines 136 to 137 in f6777ec
Is it worth adding another extension?
That I'm not sure about. If there are a lot of places that make this assumption then it might not be so clean to add another extension. @jkotas do you have any thoughts on this?
Right now we do not support loading Webcil .wasm files from Virtual File System (VFS), they are all in-memory only.
Hmm I'm not super familiar with VFS. Can you explain a little more about what the loading workflow looks like right now? Say I've built the runtime + browser host, then I compile a hello-world.dll assembly with crossgen into hello-world.wasm, and I want to load it and run it. What would that look like?
That's a lecture request 🤣
For browserhost running in the browser, this is heavily async code that downloads all necessary assets and installs them.
You can't use WASM memory or the VFS until the emscripten POSIX emulator is up and running.
After we have the POSIX "process" and CoreLib in memory, we can start the CoreCLR VM.
After we have the rest of the assemblies, we can start the application (managed Main).
For corerun running in NodeJS, this is much simpler, but similar. It uses the real file system of the host to load DLLs. I didn't implement a WebCIL loader for corerun. corerun also lacks many other features.
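The ordering constraints described above can be sketched as a small async pipeline. Every name here (`BootSteps`, `boot`, and the individual step names) is a hypothetical placeholder rather than a browserhost API, and running the VM start concurrently with installing the remaining assemblies is an assumption; the comment only states what must finish before what.

```typescript
// Hypothetical step names; none of these are real browserhost functions.
interface BootSteps {
    startPosixEmulator(): Promise<void>;     // emscripten runtime: memory + VFS usable afterwards
    installCoreLib(): Promise<void>;         // CoreLib placed in WASM memory
    installOtherAssemblies(): Promise<void>; // everything except CoreLib
    startVm(): Promise<void>;                // CoreCLR VM start
    runMain(): Promise<void>;                // managed Main
}

async function boot(steps: BootSteps): Promise<void> {
    // WASM memory and the VFS are unusable until the emulator is running.
    await steps.startPosixEmulator();
    // The VM needs the POSIX "process" and CoreLib in memory.
    await steps.installCoreLib();
    // Assumption: the remaining assemblies may be installed while the VM starts.
    await Promise.all([steps.startVm(), steps.installOtherAssemblies()]);
    // Managed Main runs only after the VM and all assemblies are ready.
    await steps.runMain();
}
```

A fully sequential version would satisfy the same constraints; the `Promise.all` only illustrates where overlap is possible.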
> @jkotas do you have any thoughts on this?

> Right now we do not support loading Webcil .wasm files from Virtual File System (VFS), they are all in-memory only.
Do we simulate file paths in browser? For example, is Assembly.Location non-empty in browser?
If the assembly file path is user visible, it makes sense to keep using .dll as the suffix.
If the assembly file path is just an internal detail, it does not harm to use .dll as the suffix.
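For reference, the rewrite in the diff at the top of this thread only touches a trailing `.wasm`. A small sketch of that behaviour, where the `toDllVirtualPath` helper name and the sample file names are made up for illustration:

```typescript
// Same regular expression as in the diff; the helper name is hypothetical.
function toDllVirtualPath(virtualPath: string): string {
    // `$` anchors the match, so only a `.wasm` suffix is rewritten.
    return virtualPath.replace(/\.wasm$/, ".dll");
}

const samples = ["System.Runtime.wasm", "MyApp.dll", "pkg.wasm.map"];
for (const p of samples) {
    console.log(`${p} -> ${toDllVirtualPath(p)}`);
}
```

Paths that already end in `.dll`, or that merely contain `.wasm` somewhere in the middle, pass through unchanged.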
What would the RyuJIT team workflow look like for
I don't know if we're entirely sure of this yet. I think right now, our priority is whatever workflow is easiest to implement so that we can test codegen as soon as possible. What has your testing workflow been @pavelsavara? I'd be curious about @kg's perspective here as well.
I'm used to testing with the full WASM SDK + HTTP server + browser. The WASM SDK is the set of MSBuild scripts that we ship. It prepares all necessary assets into a form that can be hosted on an HTTP server. It also creates a manifest, the configuration for the browserhost loader, with the list of all assets to download. For developers outside of my team, this seems like too much. So I'm wondering: what's the current R2R inner dev loop for your team when you test R2R for Windows or iOS? I suggest that I could implement loading of R2R.wasm into corerun in the next PR, if corerun is the way ...
# Conflicts:
# src/native/corehost/browserhost/loader/assets.ts
# src/native/libs/System.Native.Browser/native/index.ts
byte[] ulebSectionSize = ULEB128Encode(dataSectionSize);
if (putativeULEBDataSectionSize.Length != ulebSectionSize.Length)
    throw new InvalidOperationException ("adding padding would cause data section's encoded length to chane"); // TODO: fixme: there's upto one extra byte to encode the section length - take away a padding byte.
Typo in error message: "chane" should be "change".
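Some background on the guard in this hunk: ULEB128 stores 7 payload bits per byte, so the encoded length grows by one byte whenever the value crosses a 7-bit boundary (127 to 128, 16383 to 16384, and so on). Padding the data section can push its size across such a boundary, which is the case the exception reports. A minimal TypeScript sketch of the encoding, assuming 32-bit unsigned sizes; `uleb128Encode` is an illustrative name and not the `ULEB128Encode` helper from the PR:

```typescript
// Encodes an unsigned 32-bit integer as ULEB128 (least significant group first).
function uleb128Encode(value: number): number[] {
    if (Math.floor(value) !== value || value < 0 || value > 0xffffffff) {
        throw new RangeError("value must be an unsigned 32-bit integer");
    }
    const bytes: number[] = [];
    do {
        let b = value & 0x7f;      // low 7 bits
        value >>>= 7;              // unsigned shift to the next group
        if (value !== 0) b |= 0x80; // continuation bit: more bytes follow
        bytes.push(b);
    } while (value !== 0);
    return bytes;
}

// One byte of padding is enough to change the encoded length at a boundary.
console.log(uleb128Encode(127).length, uleb128Encode(128).length);
```

Comparing the encoded length before and after padding, as the C# code does, is therefore the simplest way to detect the boundary case.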
I haven't done any testing of R2R so far for Windows or iOS, so I don't have as much context there. However, it sounds like automated tests are generally executed with If we implement support in corerun, would we then be able to do something like,
// No external dependencies besides corlib
class WasmTest
{
    public static void Main()
    {
        CodegenTestEntrypoint();
    }
}
Enable WebCIL support for CoreCLR on Browser/WASM
Fixes #120248
WebCIL format v1.0 with 16-byte section alignment and IMAGE_SECTION_HEADER format
- Bumps WebCIL version from 0.0 to 1.0.
- `WebcilConverter` now emits section data at 16-byte-aligned offsets, ensuring RVA static fields backing `ReadOnlySpan<T>` over types up to `Vector128<T>` retain their natural alignment.
- `IMAGE_SECTION_HEADER` that makes the reading code much simpler.
CoreCLR PEDecoder: native WebCIL parsing
- New `src/coreclr/inc/webcil.h` defines `WebcilHeader` and `WebcilSectionHeader` C structs with `static_assert` size checks.
- `PEDecoder` gains:
  - `FLAG_WEBCIL` flag; `HasWebcilHeaders()` validates magic, version, section count/bounds/ordering and synthesizes `IMAGE_SECTION_HEADER[]` from WebCIL sections so `RvaToSection()`/`RvaToOffset()` work uniformly.
  - `HasHeaders()`/`CheckHeaders()` as unified entry points that dispatch to WebCIL or NT paths.
  - `HasDirectoryEntry()`, `GetDirectoryEntryData()` (handles `COMHEADER` and `DEBUG` entries via `WebcilHeader` fields with explicit range validation against section data).
  - `CheckCorHeader()` and `CheckILOnly()` WebCIL branches (WebCIL images are always IL-only by definition).
  - `GetPEKindAndMachine()` reports `peILonly`/`IMAGE_FILE_MACHINE_I386` for WebCIL.
  - `GetNumberOfRvaAndSizes()` moved from public to private API; `GetNumberOfSections()`/`FindFirstSection()` delegate to synthesized headers for WebCIL.
- `CheckNTHeaders()` to `CheckHeaders()`/`HasHeaders()`.
PEImage / PEAssembly integration
- `PEImage` exposes `HasHeaders()` forwarding to `PEDecoder::HasHeaders()`.
- `PEAssembly` uses `HasHeaders()` instead of `HasNTHeaders()` for metadata access.
Browser host loader: WebCIL module instantiation
- `instantiateWebCILModule()` in host assets:
  - `.wasm` wrapper as a WebAssembly module
  - `getWebcilSize`/`getWebcilPayload` exports to extract the raw WebCIL
  - `.dll` virtual path.
- `getWasmMemory`/`getWasmTable`
- `WasmEnableWebcil`.
DAC support gap
All WebCIL code paths in `PEDecoder` are gated behind `#ifdef TARGET_BROWSER`, which is mutually exclusive with `DACCESS_COMPILE`.
Should we drop `#ifdef TARGET_BROWSER`? Or build target-specific DAC binaries?
Follow up in #124467
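As an illustration of the section checks the description attributes to `HasWebcilHeaders()` (count/bounds/ordering) together with the 16-byte alignment rule, here is a rough TypeScript sketch. The `SectionInfo` fields, the helper names, and the exact set of checks are assumptions for illustration; the real validation lives in the C++ `PEDecoder` and may differ.

```typescript
// Hypothetical view of a WebCIL section; not the actual WebcilSectionHeader layout.
interface SectionInfo {
    virtualAddress: number; // RVA of the section
    virtualSize: number;    // size in RVA space
    fileOffset: number;     // offset of the raw data in the payload
    fileSize: number;       // size of the raw data in the payload
}

const SECTION_ALIGNMENT = 16;

// Rounds value up to the next multiple of alignment (what a converter would do when emitting).
function alignUp(value: number, alignment: number): number {
    return Math.ceil(value / alignment) * alignment;
}

// Returns null when the table looks sane, otherwise a short reason.
function validateSections(sections: readonly SectionInfo[], imageSize: number): string | null {
    let prevFileEnd = 0;
    let prevRvaEnd = 0;
    for (let i = 0; i < sections.length; i++) {
        const s = sections[i];
        if (s.fileOffset % SECTION_ALIGNMENT !== 0) return `section ${i}: data is not 16-byte aligned`;
        if (s.fileOffset + s.fileSize > imageSize) return `section ${i}: data runs past the image`;
        if (s.fileOffset < prevFileEnd) return `section ${i}: data overlaps the previous section`;
        if (s.virtualAddress < prevRvaEnd) return `section ${i}: RVAs are out of order`;
        prevFileEnd = s.fileOffset + s.fileSize;
        prevRvaEnd = s.virtualAddress + s.virtualSize;
    }
    return null;
}
```

Rejecting malformed tables up front is what lets later RVA lookups skip per-call bounds checks against the payload.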