CFG Part 0: Max Stack Size & Peephole Optimizer Fixes/Improvements#2561
CFG Part 0: Max Stack Size & Peephole Optimizer Fixes/Improvements#2561ike709 wants to merge 16 commits into
Conversation
|
I'm calling this PR done barring any review comments. I did another testing pass without any of my other non-committed experimental changes, and confirmed that TG and goon both compile, while also making sure I can start a round on TG. |
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
| public bool LastInstructionTransfersControl() { | ||
| List<IAnnotatedBytecode> bytecode = AnnotatedBytecode.GetAnnotatedBytecode(); | ||
| HashSet<string>? referencedLabels = null; | ||
| // Man sometimes it'd be nice if we had a list of just instructions and didn't need loops like this just to grab the last actual instruction |
There was a problem hiding this comment.
You could also save the last opcode written by WriteOpcode()
| continue; | ||
| case AnnotatedBytecodeLabel label: | ||
| referencedLabels ??= GetReferencedLabels(bytecode); | ||
| if (referencedLabels is not null && referencedLabels.Contains(label.LabelName)) |
There was a problem hiding this comment.
Wouldn't this not work if the label is only referenced later on in the code?
| ProcessBlockInner(statement.Body); | ||
| proc.EndScope(); | ||
| proc.Jump(endLabel); | ||
| if (!proc.LastInstructionTransfersControl()) |
There was a problem hiding this comment.
Could LastInstructionTransfersControl() be put into Jump() instead of littering these checks everywhere?
| /// Recomputes the maximum possible stack size from the current annotated bytecode. | ||
| /// Used after optimization because peephole rewrites can change the max stack size. | ||
| /// </summary> | ||
| public int RecalculateMaxStackSize() { |
There was a problem hiding this comment.
Why bother tracking stack size while generating opcodes if we just recalculate it at the end?
| foreach (var optPass in _passes) { | ||
| changed |= RunPass((byte)optPass, input); | ||
| } | ||
| } while (changed); |
There was a problem hiding this comment.
This adds a second pass to practically every proc at minimum. I'd like to avoid that if possible, what additional optimizations is this catching?
| if (currentOpt.Optimization?.CheckPreconditions(input, i - optSize) is true) { | ||
| currentOpt.Optimization.Apply(_compiler, input, i - optSize); | ||
| int startIndex = currentOptStartIndex; | ||
| var skippedVariables = RemoveTransparentVariables(input, startIndex); |
There was a problem hiding this comment.
Removing every AnnotatedBytecodeVariable surrounding every instruction, checking for a possible optimization, then inserting those variables back, seems ridiculous. Is there no better way to do this?
Not to mention all the O(n) operations this introduces.
| RewriteLabelAliases(input, labelAliases); | ||
| } | ||
|
|
||
| private Dictionary<string, string>? RemoveJumpsAfterTerminalInstructions(List<IAnnotatedBytecode> input, Dictionary<string, string>? labelAliases) { |
There was a problem hiding this comment.
Could we just avoid emitting these in DMProc in the first place, instead of looping through every opcode and potentially running multiple O(n) removals?
This PR does a few things:
Jumpoptimizations from Control Flow Graph #2558AttemptCurrentOpt()switch()cases no longer emit aJumpinstruction if the last operation is a terminal change in control flow (e.g.return)spawn()blocks no longer emitPushNullandReturnif the last operation is a terminal change in control flow (e.g.return)AnnotatedBytecodeVariableduring opt scanningJumpinstructions are more aggressively pruned by the bytecode optimizerI did before & after diffs on /tg/station compilation & disassembly to confirm that this resulted in further optimizations to the bytecode without introducing any apparent regressions. The
NullRefopt in particular was being left on the table in thousands of cases, while the other peephole optimizer changes are less dramatic but still present.The fixes to max stack size are also a huge improvement. Thousands of procs have either had their max stack size reduced to 0 or have seen a reduction from double to single digits. I ran TG through round init to confirm nothing exploded.
This sufficiently beats the codegen and optimizer into submission to the point that we can now implement dead code detection (after CFG is finished) and run it on /tg/station and goonstation without getting false positives.