Hardware instruction set support for crossgen2#33274
Hardware instruction set support for crossgen2#33274davidwrighton merged 48 commits intodotnet:masterfrom
Conversation
|
@dotnet/crossgen-contrib This PR is a work in progress, and I haven't actually tested a large swath of code through this. Please review for general structure if you have the time, but I don't expect this to be the exact state of what I check in. |
There was a problem hiding this comment.
JitConfigProvider is a process-wide configuration (due to RyuJIT hosting limitations and no other reason, really). Putting more things in it than strictly necessary ties our hands when we e.g. want to add multi-versioning later.
I would add a InstructionSetSupport GetInstructionSetSupport(MethodWithGCInfo) method on Compilation and have CorInfoImpl call into that when computing the flags in getJitFlags. It would always return the same thing right now, but it will be an obvious point to update if we ever add multi-versioning. I like to leave obvious points like that because it avoids "creative" solutions like adding policies to JitConfigProvider.
(JitConfigProvider stores a bunch CORJIT_FLAGS because I didn't clean it up when we were moving it around in the past few months. I should have, to remove the precedent. It should only store the COMPlus flags for RyuJIT.)
There was a problem hiding this comment.
:) Yeah, agreed, I'll admit I ran out of energy for threading this all around the compiler and wanted to see if it all worked.
There was a problem hiding this comment.
Could you move this file to tools\Common\Compiler?
There was a problem hiding this comment.
Once this is in Common, it will trigger the rule that says "stuff in Common shouldn't reference the ReadyToRun namespace".
Instead of implementing ICompilationRootProvider, could you move the piece of logic in that method to the Compilation constructor where we can just _dependencyGraph.AddRoot it? It will also avoid having to expose NodeFactory on IRootingServiceProvider.
src/coreclr/src/inc/readytorun.h
Outdated
There was a problem hiding this comment.
Should there be a note about keeping this in sync (generally speaking) with https://github.com/dotnet/runtime/blob/master/src/coreclr/src/jit/instr.h#L295?
I'd imagine we wouldn't want to add support for a new ISA in one place and forget about adding it in the other.
There was a problem hiding this comment.
I agree, there is a problem here. There are a thicket of different ways in which various closely related portions of the system treat these instruction sets. I currently count 5 different representations (The string one I introduced throughout crossgen2, the ReadyToRunInstructionSet enum, the CORJIT_Flags, the jit internal instruction set enum, and the managed hardware intrinsics classes.) At the moment, I have logic which mostly enforces that the managed hardware intrinsics and the string names are kept in sync, the logic mapping from jit internal instruction sets to strings is setup to produce noway asserts if there isn't a mapping, and the logic mapping string names to CORJIT flags, produces errors if there is no mapping. Thus there is a decent amount of enforcement, but its really messy, and its likely that any actual attempt to add a new instruction set will be really frustrating as a developer finds more and more enums and such to update.
There was a problem hiding this comment.
Is it fundamentally necessary that these be distinct? Why can't they be shared in some way?
There was a problem hiding this comment.
All constants and tables used in the ReadyToRun file format are described in the ReadyToRun file format headers. (And nothing else is.) By policy, the common portions of the compiler do not use any ReadyToRun enums or data structures. (Fortunately, the translation between R2R enum and the strings in crossgen is straightforward and transparent.)
More complex is the handling of CORJIT_Flags, and the jit internal instruction set enums. I'd be willing to code up consolidating them on the R2R enum, by removing all of the CORJIT_Flags that are instruction set specific, and either using the R2R enum internally in the JIT, or by making sure the R2R enum values match up with the jit internal enum. (This would involve removing some enum values that have already been defined, under the theory that they cannot be used without feeding them through the rest of the system in any case.) I'd be happy to do this, as the current morass of translations is extremely confusing and risky.
src/coreclr/src/jit/compiler.h
Outdated
There was a problem hiding this comment.
So this one asks "is the ISA supported" without changing any flags, etc
compSupports asks but also marks the ISA as being used
compSupportsOptional asks and marks it as used if it is considered part of the baseline?
There was a problem hiding this comment.
Not quite.
compSupportsNoReporting asks is the ISA supported without reporting anything to the VM side. Its extremely dangerous to use, and really only valuable for setting up the set of instruction sets useable internally in the jit. Auditing usage of this is quite easy. If the result of this has any direct impact on codegen, it is a bug.
compSupportsOptional asks is the ISA supported. If it returns true, it will report the instruction set usage to the VM, but if false, it does no reporting. This function is only to be used when returning false will not cause the JIT to generate code that will not work correctly if the code is used on a machine which does support the instruction set. Use of this function should be audited with significant care, as incorrect usage of the return value may cause use of illegal instructions or other incorrect behavior at runtime. Unfortunately, I couldn't see a way to usefully do CPU feature usage detection without implementing an api like this, or more significant restructuring of the jit codebase.
compSupports asks is the ISA supported. The fact that the jit asked such a question is recorded and the generated code will not be used on hardware which does not exactly match the result of the call. For instance, if compSupports(Avx) is called, then if it returns false, the generated code will never be executed on hardware which supports Avx. This is a safe api to use at all times. Over-usage of this api may cause unnecessary prevention of functions from being used at runtime.
There was a problem hiding this comment.
I wonder if we could come up with names that make their usage more clear as to the intended usage 🤔
There was a problem hiding this comment.
I'm notoriously bad at naming (the names make sense to me...), so I'd welcome suggestions, but really I'd rather someone from the JIT team chime in on that, as they are the ones that will have to maintain them going forward, and I want it to make sense to them. @dotnet/jit-contrib
There was a problem hiding this comment.
I totally agree with @tannergooding - these are extremely confusing, and the idea of having anything that's "extremely dangerous to use" in the JIT just sounds like a bad idea. I think we really need to think this through carefully and figure out how to add the appropriate validation.
This statement is quite confusing with the triple negation: "This function is only to be used when returning false will not cause the JIT to generate code that will not work correctly if the code is used on a machine which does support the instruction set." I had to look through the code before I understood what it meant. AFAICT the only reason for this is to prevent mixing code with different ideas of the size of Vector<T>.
How about:
bool compCanUse(InstructionSet isa)- this tells the JIT that it's allowed to use a given ISA, but reports no usage to the VM.bool compSetUsed(InstructionSet isa)- this checks whether the JIT is allowed to use a given ISA and if so, tells the VM that it is doing so.void compDisallow(InstructionSet isa)- this tells the VM not to use this code on a machine that supports the given isa. This would handle theVector<T>case.
There was a problem hiding this comment.
An interesting scenario I thought of that this wouldn't cover is if you did a Sse41 check for Math.Round, the JIT said "supported with check", and we didn't emit it (we treat it as "unsupported"); but some other part of the method had its own Sse41.IsSupported check and did emit an Sse41 instruction; then we would miss out on that information (which is that "some path" could have benefited from this).
This wouldn't be an issue if we treat it as "supported", but it is something to consider. Likewise, I think we could potentially miss cases with the other proposal if we only rely on compExactlyDependsOn and compOpportunisticallyDependsOn. Such as if some code path emits an Sse41 instruction without the appropriate check.
There was a problem hiding this comment.
@tannergooding - I don't understand how the above captures the scenario where the JIT asked if XXX is supported, the answer was "no", but the code is still valid even if the answer had been "yes" (versus the case where it would be incorrect code for the "yes"/supported case).
There was a problem hiding this comment.
@CarolEidt, Could you give a small example of such a scenario in psuedo code?
There was a problem hiding this comment.
@tannergooding, @CarolEidt, also from the discussion, I am entirely unconvinced that the compSupports SupportedWithCheck return value is actually useful. It might be, but only as a possible optimization we could use in the future if we find that a true/false result is insufficient.
A scenario where there is a check for an instruction set, and the result was false, but the code should still be used is the Math.Round case, or Vector4.Dot case. More generally, while these cases are somewhat rare, they do exist. In our codebase, they can be seen in the spots where I used compSupportsOptional in my change, or where the code uses a construct like getSIMDSupportLevel() and has a viable fallback path that isn't that inefficient. Remember that R2R code isn't really required to be best in class performance, it is instead intended to be pretty good, and sometimes non-ideal output is ok if it works.
A pseudo code example might look like..
float ComplexMathThing(float a, float b, float c)
{
return MathF.Round(/*Some very complex expression involving a, b, and c*/ );
}
For all of those sorts of uses, I suspect we'd need to have a compSupportsLike function that doesn't record that an instruction is expected to be used. I suspect it be used in exactly the same locations and way as the compSupportsOptional function that started this whole discussion. Thus I believe the actual impact of following Tanner's model of how to think about the problem is just that we might get a better assert message, and it might be easier to explain the model, but the practical effect is much the same.
I still prefer the approach that I outlined about annotating the questions that are asked, but Tanner's approach has the benefit of allowing us to add an assert to the product that verifies that an instruction set was queried for before it was used by the jit.
There was a problem hiding this comment.
I am entirely unconvinced that the compSupports SupportedWithCheck return value is actually useful
I might be inclined to agree. The initial reasoning around this was from the perspective of S.P.Corelib in that we could trust all code paths were "equivalent", and therefore IsSupported could be a cached CPUID check.
But, if we are saying we can't trust user code and therefore we must reJIT if the method could target AVX and the actual hardware supports AVX, then it isn't important.
It might be impactful to some AOT scenarios, but not without some external switch allowing users to opt into that behavior.
| #if READYTORUN | ||
| _methodCodeNode.InitializeInliningInfo(_inlinedMethods.ToArray()); | ||
|
|
||
| // Detect cases where the instruction set support used is a superset of the baseline instruction set specification |
There was a problem hiding this comment.
Is this saying it will detect AVX as a superset of SSE2 or it only detects instruction sets that are a superset and share the same encoding?
There was a problem hiding this comment.
Its more subtle than that.
The default baseline instruction set is Sse, Sse2. However, the jit compiler is told to generate code assuming that it may use Sse, Sse2, Sse3, Ssse3, Sse41, Sse42, Lzcnt, Popcnt, and Pclmqdq. But if the compiler actually uses any of Sse3, Ssse3, Sse41, Sse42, Lzcnt, Popcnt, or Pclmqdq it will notify the crossgen2 managed code. That notification feeds into this logic and if so, we will then mark the method such that the generated code can only be used if the actual hardware supports the instruction set. That's all well and good, and considering that just about all hardware does support these instruction sets, we get crossgen R2R to generate ideal code for all these cases.
The less optimal case is where code attempts to use an Avx, or Avx2 feature, or such. In thoses cases, we will generate code as if the processor does NOT support the feature. In corelib, the idea is that we will audit our code reviews such that such a decision does not make the logic wrong (just slow) (This code auditing is already necessary for our current crossgen1 intrinsics support for corelib), but in the general case we cannot assume that the code will behave correctly if it runs on hardware that does support Avx, Avx2, etc. In that case, what this logic does is record the decision that Avx use was attempted and forbidden. Then the generated code will not be used at runtime, and the JIT will be invoked, if the hardware actually supports Avx/Avx2.
In addition, the --instruction-set argument allows adjusting the baseline instruction set. For instance, if the compiler is run with --instruction-set Sse42, --instruction-set Popcnt --instruction-set Lzcnt, --instruction-set Pclmqdq, then this logic will never kick in, and do anything, as the generated code will never exceed that of the baseline.
There was a problem hiding this comment.
Before this is merged, I'd really like to see copious documentation of the behavior. It is potentially quite confusing, and we need to be sure that the precise behavior is discoverable and understandable. Also, we should make sure that the JITDUMP indicates the information it has communicated to the VM, and that the log files at execution time include information about when and why methods are rejected for reasons such as this.
There was a problem hiding this comment.
Agreed that this needs documentation somewhere, the existing scheme is very nearly undescribed, and this is a substantial difference (and when this feature is complete, I don't expect to change the crossgen1 model, and this more advanced handling of intrinsics and such will remain restricted to crossgen2. I expect I'll put a document in the BoTR about this.
Adding JITDUMP data should be easy.
For logging, what sort of thing are you looking for? That's a feature all by itself, as we don't currently have any reasonable form of logging around R2R load behavior for functions. I could add an event, or possibly a STRESSLOG entry or an addition the the debug only LOG. @CarolEidt @tannergooding I'd like some feedback here.
There was a problem hiding this comment.
There's a method SetReadyToRunRejectedPrecompiledCode on PrepareCodeConfig that we already use to note when the precompiled R2R version of a method is not usable because of changes in dependent assemblies. This makes it into the ETL so jit event clients can see how many precompiled methods ended up not being usable.
Presumably you could extend this concept.
There was a problem hiding this comment.
Ah, that's convenient. I wasn't aware of that bit, but it should be fairly straightforward to add one bit that indicates that it was dropped due to an instruction set mismatch, and from there looking at the r2rdump output to figure out which exact instruction set was at fault should be straightforward.
There was a problem hiding this comment.
Also, the SetReadyToRunRejectedPrecompiledCode bit would be set for this condition as well. That bit is not tied to changes in dependent assemblies, its tied to changes in type layout, which are most likely triggered by changes in dependent assemblies, but there are other ways it could be different.
There was a problem hiding this comment.
For reference, this is what the x86 emulator for Windows 10 on ARM reports (for both my Samsung GalaxyBook 2 and my Microsoft Surface Pro X):
Notably, SSE4.2, POPCNT, and LZCNT aren't supported.
- SSE4.2 and POPCNT were introduced in November 2008, so 11 years old
- LZCNT is newer than both and is June 2013, so 6 years old
- AMD introduced LZCNT and POPCNT in 2007 (this was part of their ABM instruction set and uses the same CPUID bits, respectively, that Intel later used)
- LZCNT, for Intel, is actually newer than AVX (2011) and same time frame as AVX2 (2013)
- BMI1/BMI2 are the same time frame as AVX2 (2013, Haswell)
There was a problem hiding this comment.
Oh darn, I had thought that SSE4.2 was supported on the emulator, the general strategy here is optimized based on supporting a set that is extremely widely available. As the Arm emulation system only supports SSE4.1, what are your thoughts around changing the SIMDSupportLevel of SIMD_SSE4_Supported to only require SSE4.1? As far as I can tell, the only difference that the jit takes is around support for the pcmpgtq instruction. If we change line 215 in simdcodegenxarch.cpp to check for compSupports(Sse42) instead of using getSIMDSupportLevel() >= SIMD_SSE4_Supported, and changing getSIMDSupportLevel to only require Sse41, I think that would be the only change needed.
There was a problem hiding this comment.
One detail is that supporting a few extra hardware instruction sets isn't problematic, and causes only minimal jit behavior at runtime, but the difference between Sse41 and Sse42 support is very significant as the jit has the concept of simdsupport level which is quite impactful.
There was a problem hiding this comment.
I think that would be fine, the biggest improvements (with regards to the S.N.Vector types) are likely to come from SSE4.1 anyways, but I would appreciate hearing @CarolEidt's thoughts as well.
There was a problem hiding this comment.
Honestly, I don't feel strongly about this, but it seems reasonable to me. One wonders if there are any plans to update the emulator.
There was a problem hiding this comment.
I find it interesting that QEMU's CPU emulator has exactly the same restriction to only supporting SSE4.1. I don't intend to put the change into this PR (as it is too large already), but will prepare something separate when this PR is complete.
|
I think that @AndyAyersMS is going in the right direction with |
6e546b4 to
8ad656f
Compare
|
@tannergooding @CarolEidt @AndyAyersMS Could you take a read through the BotR chapter I wrote up for this? Its located at https://github.com/dotnet/runtime/blob/8ad656f3ea7fbad01165da80bb1c760c65a2fdc1/docs/design/coreclr/botr/vectors-and-intrinsics.md |
8ad656f to
619553e
Compare
AndyAyersMS
left a comment
There was a problem hiding this comment.
Some initial feedback -- need more time to look at the rest of the jit changes.
There was a problem hiding this comment.
| The notifyInstructionSetUsage api is used to notify the AOT compiler infrastructure that the code may only execute if the runtime environment of the code is exactly the same as the boolean parameter indicates it should be. For instance, if `notifyInstructionSetUsage(Avx, false)` is used, then the code generated must not be used if the `Avx` instruction set is useable. Similarly `notifyInstructionSetUsage(Avx, true)` will indicate that the code may only be used if the `Avx` instruction set is available. | |
| The notifyInstructionSetUsage api is used to notify the AOT compiler infrastructure that the code may only execute if the runtime environment of the code is exactly the same as the boolean parameter indicates it should be. For instance, if `notifyInstructionSetUsage(Avx, false)` is used, then the code generated must not be used if the `Avx` instruction set is available. Similarly `notifyInstructionSetUsage(Avx, true)` will indicate that the code may only be used if the `Avx` instruction set is available. |
…roughout the compiler
…e is properly loadable
There was a problem hiding this comment.
The
Ttype variable may be any of the primitive types
Maybe this could be reworded? It isn't allowed to be char or bool for example. Only the 8 basic integer types (signed and unsigned int8, int16, int32, and int64) and the 2 basic floating-point types (float32 and float64)
There was a problem hiding this comment.
The length and alignment of
Vector<T>is unknown to the developer, andVector<T>may not exist in any interop signature
This isn't quite true, it is known at Runtime by checking Vector<T>.Count which specifies it is a struct of exactly Count T fields.
There was a problem hiding this comment.
I agree with @tannergooding's comments, and would reword the last phrase (after the comma) as: although they are only hardware accelerated on platforms where the Vector.IsHardwareAccelerated property returns true.
|
|
||
| private static int UseAvx2(uint value) | ||
| { | ||
| // THIS IS A BUG!!!!! |
There was a problem hiding this comment.
Is it worth documenting that this is a bug because PopCount might have been rejitted to know Avx2.IsSupported but UseAvx2 might have not?
There was a problem hiding this comment.
Likewise, is this "ok" to do if UseAvx2 is marked AggressiveOptimization?
There was a problem hiding this comment.
I wouldn't think that this would be ok even if it is marked AggressiveOptimization - I don't think we should ever rely for correctness on the behavior of optimization.
There was a problem hiding this comment.
I agree with Carol here. At some point we may change the policy on AggressiveOptimization, and that shouldn't break the correctness of applications. We do parse another attribute, the Ssytem.Runtime.BypassReadyToRunAttribute, which can be used to simply declare that a method shouldn't be ready to runned. This attribute isn't actually in the BCL, the application must define it itself, but it can be used to safely forbid a method from being precompiled. I've updated the document, and added a comment that this pattern IS acceptable when applied to a subset of the x86 and x64 instruction sets, as we have special treatment for those instruction sets in the aot compiler when compiling CoreLib.
|
Reviewed the botr section on vectors-and-intrinsics and overall LGTM 👍 |
AndyAyersMS
left a comment
There was a problem hiding this comment.
Did one more pass.
@CarolEidt please look over this PR again when you get a chance.
| bool isaSupported = (opts.compSupportsISA & (1ULL << isa)) != 0; | ||
| if ((opts.compSupportsISAReported & isaBit) == 0) | ||
| { | ||
| notifyInstructionSetUsage(isa, isaSupported); |
There was a problem hiding this comment.
Maybe add a jitdump here? Eg
| notifyInstructionSetUsage(isa, isaSupported); | |
| JITDUMP("Notifying runtime that codegen for this method is %susable if runtime HW supports %s\n", | |
| isaSupported ? "" : "NOT ", InstructionSetToString(isa)); | |
| notifyInstructionSetUsage(isa, isaSupported); |
There was a problem hiding this comment.
notifyInstructionSetUsage does a jitdump internally. (All calls to the api are dumped)
| simdBaseType = TYP_UNKNOWN; | ||
| } | ||
| #endif // TARGET_XARCH | ||
|
|
There was a problem hiding this comment.
Seems odd that we didn't need this before but need this now -- is it just for the R2R case because you modified largestEnregisterableStructSize ?
There was a problem hiding this comment.
Yes, this only happens in the R2R case, and only because of my trickery in largestEnregisterableStructSize. It has the nice property of making the getBaseTypeAndSizeOfSIMDType safe to call at all times, which I thought was a nice upgrade.
src/coreclr/src/jit/importer.cpp
Outdated
There was a problem hiding this comment.
These changes seem fairly subtle... would be easy enough to think the upper check should still be maxSIMDStructBytes().
I think we may need to make it clearer why maxSIMDStructBytes and largestEnregisterableStructSize differ and which one should be called when.
There was a problem hiding this comment.
Looking at this it would make more sense to me to build a helper function which encapsulates the call to minSIMDStructBytes and largestEnregisterableStructSize. Something like structSizeMightRepresentSIMDType(size_t structSize) ?
There was a problem hiding this comment.
Yes, I think that might make the intention clearer.
- Change command line to take the same parameters as the mono aot compiler - Properly handle references to the Vector128 and Vector256 apis
- Add structSizeMightRepresentSIMDType instead of using largestEnresterableStructSize directly - Leaves a good spot for a comment
CarolEidt
left a comment
There was a problem hiding this comment.
I've only reviewed the document so far; will review the code next.
There was a problem hiding this comment.
I agree with @tannergooding's comments, and would reword the last phrase (after the comma) as: although they are only hardware accelerated on platforms where the Vector.IsHardwareAccelerated property returns true.
There was a problem hiding this comment.
I'd suggest reword starting with "very few features" as "very few features are supported directly on these types, other than creation. They are primarily used in the processor specific hardware intrinsic apis."
There was a problem hiding this comment.
This is a bit confusing as it tends to gloss over the fact that the developer must ensure that there are IsSupported checks for all apis being used.
There was a problem hiding this comment.
This tends to imply that the compiler will not recognize the intrinsics when compiling tier 0 code, but that's not the case. The recognition of intrinsics is independent of tier.
There was a problem hiding this comment.
I would suggest to use "ahead-of-time" to make it clear that this term is basically an adjective applied to compilation. In fact, I'd suggest to define it early, along with the acronym, and then use AOT in the rest of the document.
There was a problem hiding this comment.
This is a bit confusing, perhaps a little elaboration would be useful.
There was a problem hiding this comment.
after "concerns" add "because it is never precompiled".
|
|
||
| private static int UseAvx2(uint value) | ||
| { | ||
| // THIS IS A BUG!!!!! |
There was a problem hiding this comment.
I wouldn't think that this would be ok even if it is marked AggressiveOptimization - I don't think we should ever rely for correctness on the behavior of optimization.
There was a problem hiding this comment.
Suggest "out of the" => "from the"
There was a problem hiding this comment.
I think the "will be recorded" belongs after the "or any failure" clause.
CarolEidt
left a comment
There was a problem hiding this comment.
A question and a comment; otherwise I think this is getting close.
| case READYTORUN_INSTRUCTION_Popcnt: return InstructionSet_POPCNT; | ||
| #endif // TARGET_AMD64 | ||
| #ifdef TARGET_X86 | ||
| case READYTORUN_INSTRUCTION_Sse: return InstructionSet_SSE; |
There was a problem hiding this comment.
I think you may have addressed this in an earlier discussion, but I believe that these need to be the same as the TARGET_AMD64 ones - is there a reason they can't share code here?
There was a problem hiding this comment.
These are autogenerated, and it would add a fair amount of complexity to have the autogenerator write them out once. In this particular part of the code, they don't actually have to be the same (that's restricted to a small bit of logic that is in crossgen2, which have explicit asserts for the details that must remain the same.) At the same time, the way they are authored in the InstructionSetDesc.txt, the autogenerator will happen to keep things the same and as long as new code follows the pattern in there, stuff will remain the same.
There was a problem hiding this comment.
Ah yes, I recall that now. Seems reasonable.
src/coreclr/src/jit/compiler.h
Outdated
There was a problem hiding this comment.
This really feels uncomfortable. Might it be worth adding a compExactlyForbids? I know we've been over this before, but this looks like an assertion that it is present, not an assertion that it is not.
There was a problem hiding this comment.
We haven't been over this particular usage before, as it might have been obscured by some of the other changes you were looking at. I share your discomfort here, Perhaps
// Ensure that code will not execute if an instruction set is useable. Call only
// if the instruction set has previously reported as unuseable, but when
// that that status has not yet been recorded to the AOT compiler
void compVerifyInstructionSetUnuseable(CORINFO_InstructionSet isa)
{
// use compExactlyDependsOn to ensure that the exact state of the useability of isa is recorded
bool isaUseable = compExactlyDependsOn(isa);
// Assert that the useability of the isa is false. If true, this function should never be called.
assert(!isaUseable);
}
There was a problem hiding this comment.
@davidwrighton - that sounds like a great alternative; wordy but certainly a context where a little wordiness is probably justified.
CarolEidt
left a comment
There was a problem hiding this comment.
LGTM - just a couple of minor rewording suggestions on the doc.
There was a problem hiding this comment.
Needs a "returns true at the end, I think.
- This will work for now, and we can fix this later with an easier to review issue
|
Congratulations @davidwrighton on getting this completed! |
|
|
||
|
|
||
| InstructionSetFlags _actualInstructionSetSupported; | ||
| InstructionSetFlags _actualInstructionSetUnsupported; |
There was a problem hiding this comment.
@davidwrighton I was digging through this over the weekend and this will always have the default value of 0. Should we actually set it somewhere?
There was a problem hiding this comment.
Sigh. Yes. Just below the !supportEnabled path should be setting this.
Vector<T>becomes useable by crossgen'd code.- As a tool to control the behavior of crossgen2 and test some of the behavior around hardware intrinsics, add a command line parameter--compile-aggressive-optimization-methodswhich changes the default R2R compiler suppression of compiling code marked with theMethodImplOption.AggressiveOptimization.Work items remaining
[x] Finalize on internal JIT api to query/specify the instruction sets used
[x] Reduce the plethora of different representations of instruction set in the product
[ ] AddReadyToRunRejectedPrecompiledCode_InstructionSetMismatchflag to ETW Method event.[ ] Adjust SIMDSupportLevel to be dependent on Sse4.1 instead of 4.2 Also, change handling of intrinsic that generates pcmpgtq instruction to query on support for Sse42 instead of SIMDSupportLevel, so that change is legal to make.[x] Update
JITEEVersionIdentifier[x] BotR chapter on using hardware intrinsics in AOT code which covers both crossgen1, crossgen2, and runtime behaviors, as well as how the JIT internal logic is supposed to work.