This fixes emitGetVexPrefixSize to support detecting 2-byte prefix support #79478

tannergooding · 2022-12-09T23:36:18Z

As raised on #79363 and in various past PR/issues, we did not correctly estimate the size of the VEX prefix when used. This had negative side effects such as allocating more memory than necessary and when loop alignment support was added, it meant that we could no longer use the 2-byte prefix when also aligning loops.

This resolves that by updating emitGetVexPrefixSize to check the relevant instrDesc inputs to determine if the 2-byte or 3-byte prefix will be used.

As a side effect, this also removes some code that was dead and does some other minor cleanup to improve the general handling of the VEX prefix.

…tSimdPrefixIfNeeded to emitOutputRexOrSimdPrefixIfNeeded

…xAware to emitGetAdjustedSize

…rDesc

… can be computed correctly

ghost · 2022-12-09T23:36:27Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

As raised on #79363 and in various past PR/issues, we did not correctly estimate the size of the VEX prefix when used. This had negative side effects such as allocating more memory than necessary and when loop alignment support was added, it meant that we could no longer use the 2-byte prefix when also aligning loops.

This resolves that by updating emitGetVexPrefixSize to check the relevant instrDesc inputs to determine if the 2-byte or 3-byte prefix will be used.

As a side effect, this also removes some code that was dead and does some other minor cleanup to improve the general handling of the VEX prefix.

Author:	tannergooding
Assignees:	tannergooding
Labels:	`area-CodeGen-coreclr`
Milestone:	-

tannergooding · 2022-12-09T23:37:45Z

CC. @kunalspathak, @BruceForstall

This passed the full HardwareIntrinsics_r and HardwareIntrinsics_ro tests locally for "default", TieredCompilation=0, and ReadyToRun=0.

We'll want to also run the JitStress tests once CI shows there aren't any missed edge cases.

tannergooding · 2022-12-09T23:41:37Z

src/coreclr/jit/emitxarch.cpp

 //
-unsigned emitter::emitOutputSimdPrefixIfNeeded(instruction ins, BYTE* dst, code_t& code)
+emitter::code_t emitter::emitExtractEvexPrefix(instruction ins, code_t& code)


I initially extracted these methods as I thought the easiest thing was going to be to just pass the code_t through to emitGetVexPrefixSize and then do something like:

code_t vexPrefix = emitExtractVexPrefix(ins, code); assert(vexPrefix != 0); if ((vexPrefix & 0xFFFF7F80) == 0x00C46100) { return 2; } return 3;

However, there is quite a bit more logic that goes into correctly building up code and since we don't cache it anywhere, it was going to negatively impact throughput.

I ended up leaving the helper method here as it may still be useful in the future and it isolates a large chunk of complex logic that was just splatted inline before.

tannergooding · 2022-12-09T23:42:55Z

src/coreclr/jit/emitxarch.cpp

+unsigned emitter::emitGetEvexPrefixSize(instrDesc* id)
+{
+    instruction ins = id->idIns();
+    assert(IsEvexEncodedInstruction(ins));
+    return 4;


We only called this from emitGetAdjustedSize and only under an existing IsEvexEncodedInstruction check, so I simplified it to just assert and return the constant.

tannergooding · 2022-12-09T23:44:19Z

src/coreclr/jit/emitxarch.cpp

 //    code  -- The current opcode and any known prefixes
 //
 // Returns:
 //    Updated size.
 //
-unsigned emitter::emitGetAdjustedSizeEvexAware(instruction ins, emitAttr attr, code_t code)


When emitGetAdjustedSizeEvexAware was added emitGetAdjustedSize became fully dead code.

It was duplicating quite a bit of complex logic and if we really need it again, we can grab it from the git history, so I removed the dead code and renamed the EvexAware method to the same as the original name.

tannergooding · 2022-12-09T23:46:21Z

src/coreclr/jit/emitxarch.cpp

+// Returns:
+//    Prefix size in bytes.
+//
+unsigned emitter::emitGetVexPrefixSize(instrDesc* id)


This function had to be moved "down" so it could access hasCodeMR.

Like I mentioned above, I was originally going to go a different route but settled on the simpler approach here where we switch on the insFmt and do the couple minor checks rather than trying to build up the full code_t.

If we were caching the code_t somewhere so we didn't need to rebuild it 2-3 times, then another approach would be better. That is also a much more involved/complex change, but one that may be worthwhile long term.

tannergooding · 2022-12-09T23:47:23Z

src/coreclr/jit/emitxarch.cpp

+
+    if (EncodedBySSE38orSSE3A(ins))
+    {
+        // When the prefix is 0x0F38 or 0x0F3A, we must use the 3-byte encoding


This filters out a majority of the complex instructions, particularly those that take 3 inputs or do other "special things" with register representation.

tannergooding · 2022-12-09T23:59:31Z

src/coreclr/jit/emitxarch.cpp

+    if ((regForSibBits != REG_NA) && IsExtendedReg(regForSibBits))
+    {
+        // When the REX.X bit is present, we must use the 3-byte encoding
+        return 3;
+    }
+
+    if ((regFor012Bits != REG_NA) && IsExtendedReg(regFor012Bits))
+    {
+        // When the REX.B bit is present, we must use the 3-byte encoding
+        return 3;
+    }


insEncodeReg345 uses the REX.R bit and is always available
insEncodeReg3456 uses the vvvv field and is always available.

insEncodeReg012 uses the REX.B bit for extended registers and is only available in the 3-byte encoded
insEncodeRegSIB uses the REX.X bit for extended registers and is only available in the 3-byte encoded

Since SIB is only used for address encodings, we typically don't need to worry about it. Likewise, we normally only have to worry about the 012 case for scenarios where an operand can come from register or memory.

For VEX encoded binary instructions, like vaddps, this is normally the second operand:

ins tgt, op1, op2/mem scenario.

However, there are also some unary instructions, like vmovd, where this can be the destination or first operand:

ins tgt/mem, op1

ins tgt, op1/mem

tannergooding · 2022-12-10T00:06:57Z

src/coreclr/jit/emitxarch.cpp

+            break;
+        }
+
+        case IF_RRW_RRW_CNS:


I wanted to call out this format in particular.

It seems like we have some cases where the IF_* defined doesn't quite "make sense". This one, for example, should probably be IF_RWR_RRD_CNS.

It is currently used by emitIns_R_R_I and applies to instructions like:

pextrb

pextrd

pextrq

pextrw_sse41

extractps

vextractf128

vextracti128

shld

shrd

psrldq

pslldq

There are other formats I saw as well that don't exactly match the semantics of the instruction that's using them.

We should definitely identify and fix cases where the IF_ formats are incorrectly used; they are complicated enough as-is, without some being wrong.

tannergooding · 2022-12-10T04:36:48Z

Diffs are hugely positive, with similar diffs on Linux x64

Overall (-22,919 bytes)

Collection	Base size (bytes)	Diff size (bytes)
benchmarks.run.windows.x64.checked.mch	25,002,891	-1,718
coreclr_tests.run.windows.x64.checked.mch	362,743,027	-14,183
libraries.pmi.windows.x64.checked.mch	52,033,070	-4,440
libraries_tests.pmi.windows.x64.checked.mch	114,282,976	-2,578

MinOpts (-4,778 bytes)

Collection	Base size (bytes)	Diff size (bytes)
benchmarks.run.windows.x64.checked.mch	1,717,456	+0
coreclr_tests.run.windows.x64.checked.mch	266,521,814	-4,774
libraries.pmi.windows.x64.checked.mch	1,500,480	+0
libraries_tests.pmi.windows.x64.checked.mch	6,882,261	-4

FullOpts (-18,141 bytes)

Collection	Base size (bytes)	Diff size (bytes)
benchmarks.run.windows.x64.checked.mch	23,285,435	-1,718
coreclr_tests.run.windows.x64.checked.mch	96,221,213	-9,409
libraries.pmi.windows.x64.checked.mch	50,532,590	-4,440
libraries_tests.pmi.windows.x64.checked.mch	107,400,715	-2,574

There is a throughput regression, which is to be expected since we now have to do more checks/computation:

Overall (+0.03%)

Collection	PDIFF
benchmarks.run.windows.x64.checked.mch	+0.05%
coreclr_tests.run.windows.x64.checked.mch	-0.02%
libraries.crossgen2.windows.x64.checked.mch	+0.12%
libraries.pmi.windows.x64.checked.mch	+0.08%
libraries_tests.pmi.windows.x64.checked.mch	+0.07%

MinOpts (-0.05%)

Collection	PDIFF
benchmarks.run.windows.x64.checked.mch	+0.37%
coreclr_tests.run.windows.x64.checked.mch	-0.05%
libraries.crossgen2.windows.x64.checked.mch	+0.42%
libraries.pmi.windows.x64.checked.mch	+0.16%
libraries_tests.pmi.windows.x64.checked.mch	+0.20%

FullOpts (+0.05%)

Collection	PDIFF
benchmarks.run.windows.x64.checked.mch	+0.04%
coreclr_tests.run.windows.x64.checked.mch	+0.01%
libraries.crossgen2.windows.x64.checked.mch	+0.12%
libraries.pmi.windows.x64.checked.mch	+0.08%
libraries_tests.pmi.windows.x64.checked.mch	+0.07%

tannergooding · 2022-12-10T04:37:41Z

Could probably always estimate the 3-byte encoding in min-opts to save time and reduce the min-opts impact.

…sabled

… code

tannergooding · 2022-12-10T05:58:50Z

src/coreclr/jit/emitxarch.cpp

+            // prefix if optimizations are enabled or we know we won't negatively impact the
+            // estimated alignment sizes.
+
+            if (emitComp->opts.OptimizationEnabled() || (emitCurIG->igNum > emitLastAlignedIgNum))


Talked with @kunalspathak and this is currently needed as we still try to do alignment when optimizations are disabled.

We may want to revisit that since OSR + TC should handle all the important cases and aligning debug code likely isn't worth the cycles required.

…ate the emitter

tannergooding · 2022-12-10T14:35:40Z

-- Don't try to estimate the 2-byte VEX prefix when optimizations are disabled

This commit actually regressed throughput even more, which was unexpected. For example, MinOpts throughput changed from the above to

MinOpts (+0.25%)

Collection	PDIFF
benchmarks.run.windows.x64.checked.mch	+0.41%
coreclr_tests.run.windows.x64.checked.mch	+0.25%
libraries.crossgen2.windows.x64.checked.mch	+0.42%
libraries.pmi.windows.x64.checked.mch	+0.32%
libraries_tests.pmi.windows.x64.checked.mch	+0.23%

I've pushed a new commit that tries just MinOpts rather than OptimizationsDisabled to see if that helps at all. I'd guess, but haven't actually investigated yet, that the emitComp->opts.OptimizationsDisabled() call wasn't being inlined...

tannergooding · 2022-12-10T15:20:03Z

Turns out the OptimizationsDisabled call prevented MSVC from reordering the emitAttr size = id->idOpSize();

I manually reordered it and the codegen is a lot better. That being said, idOpSize is unnecessarily expensive and is accessing an array/lookup table to compute what is effectively 1 << _idOpSize. Going to submit a separate PR to fix that -- see #79493

tannergooding · 2022-12-10T17:42:55Z

Better but still regressed throughput for minopts, more so than for full-opts which doesn't really make sense since the check should mean we're doing "less work".

I'd guess its negatively interacting with something else, like the alignment support, and so the 4k savings we get makes up for the difference in time

tannergooding · 2022-12-10T20:12:10Z

/azp run runtime-coreclr jitstress, Fuzzlyn

azure-pipelines · 2022-12-10T20:12:54Z

Azure Pipelines successfully started running 2 pipeline(s).

tannergooding · 2022-12-11T05:24:20Z

Fuzzlyn failures are happening in morph and are unrelated. They are:

JIT assert failed:
Assertion failed '((tree->gtDebugFlags & GTF_DEBUG_NODE_MORPHED) == 0) && "ERROR: Already morphed this node!"' in 'S0:M7():short:this' during 'Morph - Global' (IL size 1578; hash 0xb09b63d3; FullOpts)

    File: /__w/1/s/src/coreclr/jit/morph.cpp Line: 12875

tannergooding · 2022-12-12T21:51:47Z

CC. @dotnet/jit-contrib this is ready for review.

Good size savings in known hot code at the cost of a small TP regression. Resolving the TP regression would require some non-trivial work/refactorings in the emitter.

jakobbotsch · 2022-12-13T19:15:34Z

Have you looked at a detailed throughput trace (e.g. using @SingleAccretion's tool)? +0.2% to +0.4% in MinOpts is quite a bit (e.g. it is more than we spent in FullOpts on tail merging recently, which was a rather large optimization).

tannergooding · 2022-12-13T19:22:49Z

Have you looked at a detailed throughput trace (e.g. using @SingleAccretion's tool)?

What is the tool and where is the documentation for running it, etc?

SingleAccretion · 2022-12-13T19:26:22Z

What is the tool and where is the documentation for running it, etc?

The general tool is the pintool, building documented here: https://github.com/SingleAccretion/Dotnet-Runtime.Dev#dotnet-runtimedev.

The particular part which Jakob refers to is this script: https://github.com/SingleAccretion/Dotnet-Runtime.Dev#analyze-pin-trace-diffps1---diff-the-traces-produced-by-the-pin-tool, which compares two traces captured using PIN and prints statistics on which methods are most responsible for regressions / improvements.

tannergooding · 2022-12-13T22:34:48Z

Numbers for 4d0c099 show the following (noting some methods were renamed and one method is new, so I tried to break it apart slightly):

Base: 99141678103, Diff: 99192173532, +0.0509%

?emitIns_R_I@emitter@@QEAAXW4instruction@@W4emitAttr@@W4_regNumber_enum@@_J@Z  : 17705226   : +17.85%  : 1.71%  : +0.0179%
memset                                                                         : 4455056    : +0.68%   : 0.43%  : +0.0045%
?TakesRexWPrefix@emitter@@SA_NW4instruction@@W4emitAttr@@@Z                    : 3581705    : +2.37%   : 0.35%  : +0.0036%
?emitIns_R_R@emitter@@QEAAXW4instruction@@W4emitAttr@@W4_regNumber_enum@@2@Z   : -1244214   : -4.76%   : 0.12%  : -0.0013%
?emitEndCodeGen@emitter@@QEAAIPEAVCompiler@@_N11IPEAI2PEAPEAX33@Z              : -1310818   : -0.58%   : 0.13%  : -0.0013%
?emitIns_Mov@emitter@@QEAAXW4instruction@@W4emitAttr@@W4_regNumber_enum@@2_N@Z : -1994274   : -0.78%   : 0.19%  : -0.0020%
?genAllocLclFrame@CodeGen@@IEAAXIW4_regNumber_enum@@PEA_NI@Z                   : -2293818   : -46.75%  : 0.22%  : -0.0023%
?emitInsSizeSVCalcDisp@emitter@@QEAAIPEAUinstrDesc@1@_KHH@Z                    : -10589308  : -30.60%  : 1.02%  : -0.0107%
?emitFindOffset@emitter@@IEAAIPEAUinsGroup@@I@Z                                : -51499536  : -23.77%  : 4.96%  : -0.0519%

?EncodedBySSE38orSSE3A@emitter@@QEBA_NW4instruction@@@Z                        : 14487091   : NA       : 1.40%  : +0.0146%
?EncodedBySSE38orSSE3A@emitter@@QEAA_NW4instruction@@@Z                        : -14487091  : -100.00% : 1.40%  : -0.0146%

?emitInsSize@emitter@@QEAAIPEAUinstrDesc@1@_K_N@Z                              : 56890150   : NA       : 5.48%  : +0.0574%
?emitInsSize@emitter@@QEAAI_K_N@Z                                              : -36429961  : -100.00% : 3.51%  : -0.0367%

?emitGetAdjustedSize@emitter@@QEBAIPEAUinstrDesc@1@_K@Z                        : 110127564  : NA       : 10.61% : +0.1111%
?emitGetAdjustedSizeEvexAware@emitter@@QEAAIW4instruction@@W4emitAttr@@_K@Z    : -66319632  : -100.00% : 6.39%  : -0.0669%

?emitInsSizeRR@emitter@@QEAAIPEAUinstrDesc@1@@Z                                : 103959389  : NA       : 10.02% : +0.1049%
?emitInsSizeRR@emitter@@QEAAIW4instruction@@W4_regNumber_enum@@1W4emitAttr@@@Z : -85630059  : -100.00% : 8.25%  : -0.0864%

?emitOutputRexOrSimdPrefixIfNeeded@emitter@@QEAAIW4instruction@@PEAEAEA_K@Z    : 225521710  : NA       : 21.73% : +0.2275%
?emitOutputSimdPrefixIfNeeded@emitter@@QEAAIW4instruction@@PEAEAEA_K@Z         : -220829033 : -100.00% : 21.28% : -0.2227%

?emitGetVexPrefixSize@emitter@@QEBAIPEAUinstrDesc@1@@Z                         : 7245402    : NA       : 0.70%  : +0.0073%

A slight refactoring (57d5725) changes it instead to be:

Base: 99141678103, Diff: 99179548037, +0.0382%

?emitIns_R_I@emitter@@QEAAXW4instruction@@W4emitAttr@@W4_regNumber_enum@@_J@Z  : 17705226   : +17.85%  : 1.73%  : +0.0179%
?TakesRexWPrefix@emitter@@SA_NW4instruction@@W4emitAttr@@@Z                    : 3581705    : +2.37%   : 0.35%  : +0.0036%
?emitIns_R_R@emitter@@QEAAXW4instruction@@W4emitAttr@@W4_regNumber_enum@@2@Z   : -1244214   : -4.76%   : 0.12%  : -0.0013%
?emitEndCodeGen@emitter@@QEAAIPEAVCompiler@@_N11IPEAI2PEAPEAX33@Z              : -1310818   : -0.58%   : 0.13%  : -0.0013%
?emitIns_Mov@emitter@@QEAAXW4instruction@@W4emitAttr@@W4_regNumber_enum@@2_N@Z : -1994274   : -0.78%   : 0.19%  : -0.0020%
?genAllocLclFrame@CodeGen@@IEAAXIW4_regNumber_enum@@PEA_NI@Z                   : -2293818   : -46.75%  : 0.22%  : -0.0023%
?emitInsSizeSVCalcDisp@emitter@@QEAAIPEAUinstrDesc@1@_KHH@Z                    : -10589308  : -30.60%  : 1.03%  : -0.0107%
?emitFindOffset@emitter@@IEAAIPEAUinsGroup@@I@Z                                : -51499536  : -23.77%  : 5.02%  : -0.0519%

?EncodedBySSE38orSSE3A@emitter@@QEBA_NW4instruction@@@Z                        : 14487091   : NA       : 1.41%  : +0.0146%
?EncodedBySSE38orSSE3A@emitter@@QEAA_NW4instruction@@@Z                        : -14487091  : -100.00% : 1.41%  : -0.0146%

?emitInsSize@emitter@@QEAAIPEAUinstrDesc@1@_K_N@Z                              : 56890150   : NA       : 5.55%  : +0.0574%
?emitInsSize@emitter@@QEAAI_K_N@Z                                              : -36429961  : -100.00% : 3.55%  : -0.0367%

?emitGetAdjustedSize@emitter@@QEBAIPEAUinstrDesc@1@_K@Z                        : 105018266  : NA       : 10.24% : +0.1059%
?emitGetAdjustedSizeEvexAware@emitter@@QEAAIW4instruction@@W4emitAttr@@_K@Z    : -66319632  : -100.00% : 6.47%  : -0.0669%

?emitInsSizeRR@emitter@@QEAAIPEAUinstrDesc@1@@Z                                : 101120612  : NA       : 9.86%  : +0.1020%
?emitInsSizeRR@emitter@@QEAAIW4instruction@@W4_regNumber_enum@@1W4emitAttr@@@Z : -85630059  : -100.00% : 8.35%  : -0.0864%

?emitOutputRexOrSimdPrefixIfNeeded@emitter@@QEAAIW4instruction@@PEAEAEA_K@Z    : 225521710  : NA       : 21.99% : +0.2275%
?emitOutputSimdPrefixIfNeeded@emitter@@QEAAIW4instruction@@PEAEAEA_K@Z         : -220829033 : -100.00% : 21.53% : -0.2227%

?emitGetVexPrefixSize@emitter@@QEBAIPEAUinstrDesc@1@@Z                         : 7245402    : NA       : 0.71%  : +0.0073%

It doesn't look to be profitable to skip the exact size estimation, even if the optimizationDisabled checks are outlined/etc. This looks to be because of the indirect benefits in genAllocLclFrame, emitFindOffset, and others where having smaller code sizes estimated results in less overall work.

Like I mentioned in Discord, the three biggest regressions are:

emitIns_R_I is more expensive because every instruction now has 1 more branch so that the id can be passed through to emitInsSize. This one could probably be refactored to avoid that branch with a goto. The function in general needs some cleanup though as its doing work that is normally handled by functions like emitInsSize* helpers and is doing work in a non-optimal way.

emitIns_R_R is going through emitInsSizeRR(instrDesc*) that probably needs to be merged in with emitInsSizeRR(instrDesc*, code_t) and which in general is doing things it probably needs some cleanup around (also pre-existing)

TakesRexWPrefix is doing a non-inlined jump table that is just a static bit of data for a few instructions. It should probably be a flag which would be quite a bit more efficient.

All three of these are really unrelated to this change and are pre-existing issues. They are showing regressions because they aren't doing the same thing as all the other paths and so the VEX only changes are showing up where you wouldn't expect them to. We should ideally work on cleaning these up so that the only real impact is in the new code paths.

BruceForstall · 2022-12-14T16:42:39Z

[unrelated to PR]

The general tool is the pintool, building documented here: https://github.com/SingleAccretion/Dotnet-Runtime.Dev#dotnet-runtimedev.

@SingleAccretion Looks like an awesome set of scripts for working with .NET and JIT. I'm sure everyone on the CodeGen team has their own similar set. It's too bad we don't share these kind of scripts more broadly, e.g., in jitutils where they could end up in jitutils/bin that will (likely) be on our PATH.

BruceForstall

LGTM. Thanks for doing this.

tannergooding added 7 commits December 9, 2022 09:57

Remove the dead emitOutputRexOrVexPrefixIfNeeded and rename emitOutpu…

ec19c12

…tSimdPrefixIfNeeded to emitOutputRexOrSimdPrefixIfNeeded

Remove the dead emitGetAdjustedSize and rename emitGetAdjustedSizeEve…

4d75d05

…xAware to emitGetAdjustedSize

Simplify the emitGetEvexPrefixSize and emitGetVexPrefixSize methods

0932ac1

Create helper emitExtractVexPrefix and emitExtractEvexPrefix functions

7bd14be

Update emitGetVexPrefixSize and emitGetEvexPrefixSize to take an inst…

b20a6d5

…rDesc

Update emitGetPrefixSize to take the instrDesc so the VEX prefix size…

4d94897

… can be computed correctly

Update emitGetVexPrefixSize to support returning 2 or 3

ae744eb

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 9, 2022

ghost assigned tannergooding Dec 9, 2022

tannergooding commented Dec 9, 2022

View reviewed changes

tannergooding commented Dec 10, 2022

View reviewed changes

Ensure IF_RWR_CNS is handled in emitGetVexPrefixSize

f2ff3c4

build-analysis bot mentioned this pull request Dec 10, 2022

Precondition failure: File has not had execution verified #79439

Closed

Don't try to estimate the 2-byte VEX prefix when optimizations are di…

e131b6b

…sabled

tannergooding marked this pull request as ready for review December 10, 2022 05:24

Ensure we don't negatively impact estimated alignment sizes for debug…

ea56d5f

… code

tannergooding commented Dec 10, 2022

View reviewed changes

tannergooding added 2 commits December 10, 2022 06:11

Just check MinOpts not OptimizationsDisabled

5014355

Mark some of the modified methods as const to indicate they don't mut…

989c1ec

…ate the emitter

Resolve the throughput issue by not getting optSize untill necessary

0f2514d

Switch back to doing accurate vex prefix size estimation always

4d0c099

BruceForstall self-requested a review December 12, 2022 22:26

Small cleanup to reduce regression

57d5725

BruceForstall approved these changes Dec 14, 2022

View reviewed changes

BruceForstall mentioned this pull request Dec 14, 2022

Enable AVX512 Additional 16 SIMD Registers #79544

Merged

tannergooding merged commit 58e82be into dotnet:main Dec 14, 2022

tannergooding mentioned this pull request Dec 14, 2022

Ensure that TryGetContainableHWIntrinsicOp is non-mutating #79363

Merged

kunalspathak mentioned this pull request Jan 9, 2023

[Perf] Linux/arm64: 13 Regressions on 12/15/2022 1:59:35 AM #79973

Closed

ghost locked as resolved and limited conversation to collaborators Jan 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

This fixes emitGetVexPrefixSize to support detecting 2-byte prefix support #79478

This fixes emitGetVexPrefixSize to support detecting 2-byte prefix support #79478

tannergooding commented Dec 9, 2022

ghost commented Dec 9, 2022

tannergooding commented Dec 9, 2022

tannergooding Dec 9, 2022

tannergooding Dec 9, 2022

tannergooding Dec 9, 2022

tannergooding Dec 9, 2022

tannergooding Dec 9, 2022

tannergooding Dec 9, 2022

tannergooding Dec 10, 2022 •

edited

Loading

BruceForstall Dec 14, 2022

tannergooding commented Dec 10, 2022

tannergooding commented Dec 10, 2022

tannergooding Dec 10, 2022

tannergooding commented Dec 10, 2022

tannergooding commented Dec 10, 2022 •

edited

Loading

tannergooding commented Dec 10, 2022

tannergooding commented Dec 10, 2022

azure-pipelines bot commented Dec 10, 2022

tannergooding commented Dec 11, 2022

tannergooding commented Dec 12, 2022

jakobbotsch commented Dec 13, 2022

tannergooding commented Dec 13, 2022

SingleAccretion commented Dec 13, 2022

tannergooding commented Dec 13, 2022 •

edited

Loading

BruceForstall commented Dec 14, 2022

BruceForstall left a comment

This fixes emitGetVexPrefixSize to support detecting 2-byte prefix support #79478

This fixes emitGetVexPrefixSize to support detecting 2-byte prefix support #79478

Conversation

tannergooding commented Dec 9, 2022

ghost commented Dec 9, 2022

tannergooding commented Dec 9, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tannergooding Dec 10, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tannergooding commented Dec 10, 2022

tannergooding commented Dec 10, 2022

Choose a reason for hiding this comment

tannergooding commented Dec 10, 2022

tannergooding commented Dec 10, 2022 • edited Loading

tannergooding commented Dec 10, 2022

tannergooding commented Dec 10, 2022

azure-pipelines bot commented Dec 10, 2022

tannergooding commented Dec 11, 2022

tannergooding commented Dec 12, 2022

jakobbotsch commented Dec 13, 2022

tannergooding commented Dec 13, 2022

SingleAccretion commented Dec 13, 2022

tannergooding commented Dec 13, 2022 • edited Loading

BruceForstall commented Dec 14, 2022

BruceForstall left a comment

Choose a reason for hiding this comment

tannergooding Dec 10, 2022 •

edited

Loading

tannergooding commented Dec 10, 2022 •

edited

Loading

tannergooding commented Dec 13, 2022 •

edited

Loading