This behavior, among others, was analyzed by Tynan as part of his ETH Zurich thesis in collaboration with ChainSecurity. In the following, we describe issues with undefined behavior in Solidity and how it was used to craft a benign-looking but malicious AMM smart contract for the 2022 Underhanded Solidity contest.
Undefined Behavior
Undefined behavior is a term many developers have likely encountered before. But what exactly does it mean? When using a programming language that has multiple competing compilers, it is vital to have a language specification in order for the same program to yield the same output regardless of which compiler is used. However, if the language specification is not precise enough, or intentionally leaves some edge cases out, this means that each compiler can decide for itself how to deal with certain situations. For example, in C the result of integer division by 0 is undefined.
This means that each C compiler can decide to compile code containing such a division however it likes. Usually, this means either choosing the most convenient implementation or reporting an error. However, the compiler could also do something completely different and still be a ‘correct’ implementation of the C language.
As a developer, it is important to avoid such undefined behavior. It means you no longer have complete control over the compiled code. While most of the time, one would expect that only obscure edge cases are left unspecified, in some cases it’s easier to trigger undefined behavior than you might expect.
Undefined Behavior in Solidity
The situation regarding Solidity is slightly different from C. There is only one Solidity compiler, and the language specification is evolving alongside it. This is not necessarily unusual, but it has led to some quirky behaviors.
For example, when using contract inheritance, it is not specified whether all state variables are initialized first, or whether each contract's state variables are initialized just before running the constructor of the contract they belong to. The Solidity docs demonstrate this with a contract B that inherits from a contract A: A’s constructor sets a state variable to 42, and B initializes its own state variable y by reading that value.
Here, y can be either 0 or 42, depending on whether y is initialized before A’s constructor is executed.
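A contract with this structure can be sketched as follows. This is a reconstruction along the lines of the example in the Solidity documentation, not a verbatim copy; apart from A and y, the names (B, x, f) are assumptions made for illustration.

```solidity
// SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.7.1;

contract A {
    uint256 x;

    constructor() {
        // Runs either before or after B's state variable initializers,
        // depending on the compilation pipeline.
        x = 42;
    }

    function f() public view returns (uint256) {
        return x;
    }
}

contract B is A {
    // If this initializer runs before A's constructor body, f() still
    // sees x == 0 and y becomes 0. If it runs afterwards, y becomes 42.
    uint256 public y = f();
}
```

Deploying B and reading y is enough to observe which initialization order a given compiler configuration uses.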
Of course, this example is quite contrived; it’s unlikely someone would actually create a contract with this structure. Additionally, as long as the compiler doesn’t change how it handles this behavior, it doesn’t actually matter if you rely on its handling of this case.
Unfortunately, you can get either behavior, depending on whether you use the default compiler settings or the new experimental compilation pipeline, which goes through the Yul intermediate representation. Given that the Yul pipeline is eventually supposed to replace the current one, it is important that developers do not rely on these behaviors.
Evaluation Order
Why does undefined behavior matter if we only run into it in specifically constructed examples? Alas, as we can see in the Solidity docs, the evaluation order of expressions is unspecified. This means that sub-expressions within another expression can be evaluated in any order. For example, let’s take a look at the expression f(g(...), h(...)). Of course, g and h need to be evaluated before f, since it depends on their outputs. However, the evaluation order could be g -> h -> f or h -> g -> f.
If g and h do not have side effects, everything is fine. If they read/write memory, storage, or make an external call, we can quickly run into issues.
uint256 i = 0;
f(i, i++);
In this example, intuitively we might expect the result to be f(0, 0). However, the compiler could also choose to evaluate i++ before i, so the result could be f(1, 0) while still following the language specification. Naturally, this doesn’t matter too much unless the Solidity compiler actually does sometimes evaluate expressions in an unexpected order.
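The snippet above can be turned into a self-contained sketch. The contract name, the body of f, and the explicitOrder variant are hypothetical and only serve to make the two possible outcomes distinguishable; the sketch makes no claim about which order a particular compiler version picks.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract EvaluationOrder {
    // Hypothetical helper: encodes both arguments in the return value
    // so the caller can tell which values were passed.
    function f(uint256 a, uint256 b) internal pure returns (uint256) {
        return a * 10 + b;
    }

    // Order-dependent: yields 0 for f(0, 0) or 10 for f(1, 0),
    // depending on whether `i` or `i++` is evaluated first.
    function ambiguous() external pure returns (uint256) {
        uint256 i = 0;
        return f(i, i++);
    }

    // Order-independent: each read and side effect is its own statement,
    // so the call always receives (0, 0).
    function explicitOrder() external pure returns (uint256) {
        uint256 i = 0;
        uint256 first = i;
        uint256 second = i++;
        return f(first, second);
    }
}
```

Splitting side effects into separate statements is one way to keep the result independent of the compiler's choice, since statements execute in sequence.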
addmod and mulmod
The first case we will look at is addmod and mulmod. These are not functions you see very often in Solidity code, but they are globally available in any Solidity contract. addmod(a, b, N) simply calculates (a + b) mod N, whereas mulmod(a, b, N) calculates (a * b) mod N. But what happens when N = 0? At the EVM level, the respective opcodes simply return 0. The Solidity team didn’t want this potentially unexpected result to be exposed to the developer, so they assert that N != 0 and revert otherwise. If this check fails, there is no need to evaluate a and b, so the compiler evaluates the arguments in right-to-left order. Hence, the following example results in 1 rather than 0: a++ is evaluated first and yields 1 while incrementing a to 2, so the call computes (2 + 1) mod 2.
uint256 a = 1;
addmod(a, a++, 2);
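As a minimal sketch, the example can be wrapped in a contract next to a variant that does not depend on argument order. The contract and function names are hypothetical, and the value in the first comment assumes the right-to-left order described above.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract AddmodOrder {
    // Right-to-left evaluation: a++ yields 1 and sets a to 2, then a
    // yields 2, giving addmod(2, 1, 2) == 1.
    // Left-to-right evaluation would give addmod(1, 1, 2) == 0.
    function orderDependent() external pure returns (uint256) {
        uint256 a = 1;
        return addmod(a, a++, 2);
    }

    // Arguments are computed in separate statements, so the result is
    // always addmod(1, 1, 2) == 0 regardless of argument order.
    function explicitOrder() external pure returns (uint256) {
        uint256 a = 1;
        uint256 first = a;
        uint256 second = a++;
        return addmod(first, second, 2);
    }
}
```

Comparing the two return values under a given compiler configuration shows whether the arguments were evaluated in source order.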
Events
The situation with events is more complicated. An event can have indexed and non-indexed parameters. Again, when emitting an event, the evaluation order of nested expressions is unspecified. In this case, the compiler chooses to first evaluate the indexed parameters in right-to-left order, followed by the non-indexed parameters in left-to-right order. Thus, the following example evaluates in the order h -> f -> g -> i.
event Hello(uint indexed a, uint b, uint indexed c, uint d);
emit Hello(f(...), g(...), h(...), i(...));
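To make the order observable, each argument can record the position at which it was evaluated. The following is a sketch under that assumption; the contract name, the counter, and the next helper are hypothetical stand-ins for f, g, h, and i.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract EventOrder {
    event Hello(uint256 indexed a, uint256 b, uint256 indexed c, uint256 d);

    // Incremented on every call to next(), so each event argument
    // records when it was evaluated.
    uint256 private counter;

    function next() private returns (uint256) {
        counter += 1;
        return counter;
    }

    function emitHello() external {
        counter = 0;
        // With the order described above (indexed right-to-left, then
        // non-indexed left-to-right) the event would carry
        // a = 2, b = 3, c = 1, d = 4.
        // With strict left-to-right evaluation it would carry 1, 2, 3, 4.
        emit Hello(next(), next(), next(), next());
    }
}
```

Inspecting the emitted log after calling emitHello reveals the order used by the compiler configuration at hand.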
To complicate things further, compilation via the Yul IR tries to evaluate everything in left-to-right order, but does not guarantee it. Therefore, the examples mentioned above do yield the expected behavior if you specify the --experimental-via-ir compiler option. In fact, there are many small differences between the default compilation and the experimental Yul compiler. The Solidity team has compiled a list of such differences in the Solidity documentation.
Underhanded Solidity
While it’s unlikely that undefined behavior leads to an accidental exploit in a smart contract, this doesn’t stop bad-faith developers from intentionally introducing it. This was demonstrated by the winning submission to the Underhanded Solidity Contest 2022.
The submission is a relatively simple decentralized exchange. It implements a constant product trading system. Each trade accrues some fees, which are put into the liquidity pool. The owner of the exchange can claim an admin fee, which is calculated based on the increase in liquidity since the last time the fee was claimed.
Importantly, this calculation means that admin fees can be claimed retroactively: if the owner changes the fees and then claims them, the liquidity accrued before the claim will still be included in the calculation. In order to mitigate this, the admin fee changing process forces the owner to claim fees when changing them. Additionally, there is a 7-day waiting period during which the admin can’t claim fees, in order to allow liquidity providers to withdraw their funds before the higher fees are claimed.
However, as we have seen, indexed event parameters are evaluated right-to-left. This means that setNewAdminFee is executed before retireOldAdminFee. As a consequence, the fees for the previous period are actually claimed at the newly set rate! In fact, in this particular contract there is no maximum admin fee, so the owner can set an arbitrarily high fee in order to drain the entire underlying balance.
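The pattern can be illustrated with a heavily simplified sketch. This is not the contest submission: only the names setNewAdminFee and retireOldAdminFee come from the text above, while the event, the state variables, the fee arithmetic, and the explicit variant are assumptions, and access control and the waiting period are omitted.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract AdminFeeSketch {
    // Both parameters are indexed, so per the order described above
    // the second argument is evaluated before the first.
    event AdminFeeChanged(uint256 indexed claimedAtOldFee, uint256 indexed newFee);

    uint256 public adminFee;         // hypothetical: fee in basis points
    uint256 public accruedLiquidity; // hypothetical: growth since last claim
    uint256 public lastClaimed;      // hypothetical: amount of the last claim

    // Intended to run first: claims the past period at the current rate.
    function retireOldAdminFee() internal returns (uint256 claimed) {
        claimed = (accruedLiquidity * adminFee) / 10_000;
        accruedLiquidity = 0;
        lastClaimed = claimed;
    }

    // Intended to run second: installs the new rate.
    function setNewAdminFee(uint256 newFee) internal returns (uint256) {
        adminFee = newFee;
        return newFee;
    }

    // Reads as "retire, then set", but if setNewAdminFee runs first,
    // the past period is claimed at the new rate.
    function changeAdminFeeUnderhanded(uint256 newFee) external {
        emit AdminFeeChanged(retireOldAdminFee(), setNewAdminFee(newFee));
    }

    // Order made explicit with separate statements.
    function changeAdminFeeExplicit(uint256 newFee) external {
        uint256 claimed = retireOldAdminFee();
        uint256 fee = setNewAdminFee(newFee);
        emit AdminFeeChanged(claimed, fee);
    }
}
```

In the explicit variant the side effects happen in statement order, so the claim always uses the old rate regardless of how event arguments are evaluated.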
Closing remarks
For a dive into what the other winners were able to use, take a look at the excellent summary from the Solidity team. If you are interested in working with us on uncovering quirks in blockchain-based systems, reach out to jobs@chainsecurity.com, and if you require a smart contract audit or other assurance that your blockchain project is secure, get in touch with us at info@chainsecurity.com.