Skip to content

Compiler optimisation #18

@rkalis

Description

@rkalis

Currently the output of a CashScript contract is quite inefficient when compared to hand-written Bitcoin Script. It is likely that CashScript contracts will always result in less efficient bytecode, but there optimisations that can be implemented to reduce the size of the output somewhat. I started with optimisations in branch v0.2.0. There are several low-hanging optimisations that I want to add in this version.

  • Using OP_ROLL instead of OP_PICK for the final use of a variable. This results in not needing to OP_DROP these variables off the stack at the end of the script. This results in one byte saved for each OP_ROLLed variable and more legible output. It also causes a smaller average stack-depth, which results in more byte savings when variables need to be replaced in the stack.
  • Removing the final occurrence of OP_VERIFY. Currently CashScript adds OP_TRUE as a final opcode to all execution paths since each script ends with an implicit OP_VERIFY that expects one truthy on the stack. However, the final use of OP_VERIFY in the code could be removed to save one to two additional bytes per execution path.
  • Merging OP_VERIFY with preceding opcode (e.g. OP_EQUAL OP_VERIFY -> OP_EQUALVERIFY). This results in one byte saved for every merged opcode.
  • Using more specific stack opcodes. Currently, OP_PICK / OP_ROLL is used to retrieve values from the stack. For shallow depths, this can be replaced with hardcoded opcodes (e.g. OP_OVER, OP_ROT, OP_SWAP). This results in saving one byte per hardcoded stack operation.

Besides this low-hanging fruit, there are other optimisations that can be implemented, but require more research.

  • Improved stack ordering. By looking at the script we know when values are needed by certain opcodes. If we are able to order the initial stack correctly, there is no need to OP_PICK or OP_ROLL these values when they're needed, as they are already available at the top of the stack.
  • Simplification. If a contract contains e.g. 2 + 2 == 4, we already know beforehand that this is going to evaluate to true, so this whole expression can be simplified to just true. However, most well-implemented contracts will likely not have many of these instances, as the implementer can manually do a lot of this simplification themselves. So the benefit of implementing this optimisation is questionable.
  • Improved deep replacement. When a variable is reassigned inside a conditional block, we replace the variable at its original position in the stack. This uses an algorithm that sort of digs into the stack with OP_SWAP OP_TOALTSTACK, and then puts all other values back where they belong with OP_FROMALTSTACK. This can be improved by using more specific stack operations for specific depths rather than OP_SWAP.

Additionally, these optimisations were proposed earlier, but have been dropped because they aren't viable.

  • Merging default case with final execution path. Multi-function contracts use if-else statements with function selectors to encode functions, where the final "default" case (no function selected) just fails the script by pushing OP_FALSE to the stack. Instead this default case can be merged with the final execution path to save one additional if-statement and the additional OP_FALSE push.

Metadata

Metadata

Assignees

No one assigned

    Labels

    cashc-compilerRelates to the cashc compiler

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions