CoW DAO

Closed · Ended 3 years ago · Snapshot (Offchain)

CIP-14: Risk-adjusted solver rewards

In CoW Protocol, solvers bear the execution risk: the possibility that their solution may revert. Currently the protocol rewards solvers with R COW tokens per order on average, regardless of execution risk, even though that risk varies between orders and depends on network volatility. In some cases, the reward may not offset execution risk, which discourages solvers from submitting solutions and leads to orders not being processed. In other cases, solvers might be earning relatively high rewards for bearing little to no risk (see appendix for an example of a no-risk batch and an example of a batch that first reverted, incurring a large cost to the winning solver, before eventually getting mined).

The goal of this post is to introduce a method for redistributing the COW rewards across user orders so that they better reflect execution risk, while still averaging a parameterizable amount R of COW per order.

Note that this is most likely a provisional measure - it can be implemented immediately via the weekly solver payouts, and can also serve as a baseline that solvers may optionally use for pricing their solutions if we later progress towards a model where solvers bid for the right to settle the batch.

Even with these adjustments, we foresee that some batches will continue to pay too much or too little in rewards. However, we believe the adjustments can significantly reduce the margin of error compared to the status quo.

Risk-adjusted rewards

In expectation, solver profits for a user order are given by the following expression:

E(profits) = (1 - p) * rewards - p * costs
           = (1 - p) * (rewards + costs) - costs

where

p is the probability of the transaction reverting,
costs=gas_units * gas_price is the cost incurred if the transaction reverts,
rewards is the amount we reward the solver with for this order.

We would like to find rewards that, in expectation, yield a profit of T COW to solvers:

    E(profits) = T COW
<-> (1 - p) * (rewards + costs) - costs = T COW
<-> rewards = (T COW + costs) / (1 - p) - costs

So the rewards a solver should get for an order are a function of the probability of revert and the costs of reverting:

rewards(p, costs) = (T COW + costs) / (1 - p) - costs         [1]

Interesting values for T are T=0 COW, where rewards will be enough only to cover costs, and T=37 COW, the average net profit solvers are getting today for the average reward of R=73 COW (see the first notebook in the appendix for how these values were computed).
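As a quick sanity check of eq. [1], the following minimal sketch computes the reward and then plugs it back into the expected-profit expression. It assumes that costs are already expressed in COW (the conversion from gas_units * gas_price is not shown), and the input numbers are purely illustrative, not taken from the notebooks.

```python
def risk_adjusted_reward(p: float, costs: float, target_profit: float) -> float:
    """Eq. [1]: rewards(p, costs) = (T + costs) / (1 - p) - costs.

    Assumption: `costs` and `target_profit` are both in COW, and 0 <= p < 1.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return (target_profit + costs) / (1.0 - p) - costs


def expected_profit(p: float, costs: float, reward: float) -> float:
    """E(profits) = (1 - p) * rewards - p * costs."""
    return (1.0 - p) * reward - p * costs


# Illustrative inputs only: 10% revert probability, revert cost worth 50 COW.
reward = risk_adjusted_reward(p=0.1, costs=50.0, target_profit=37.0)
profit = expected_profit(p=0.1, costs=50.0, reward=reward)
```

With these inputs the reward comes out at roughly 46.67 COW, and the expected profit recovers the target T=37 COW. With p=0 the reward reduces to T itself.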

Computing the probability p of revert

We can model the probability that a batch reverts as a function of costs, or more concretely as a function of gas_units and gas_price, since cost is equal to the product of these two quantities:

 p = 1/(1 + exp(-β - ɑ1 * gas_units - ɑ2 * gas_price))

where β, ɑ1, and ɑ2 are obtained by (logistic) regression. See the appendix for the full analysis, including the data exploration justifying the choice of these predictors and the regression computation.
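The logistic form above can be sketched as follows. The coefficient values are hypothetical placeholders chosen only to make the example run; they are not the fitted values from the notebooks, and the unit of gas_price is likewise an assumption.

```python
import math


def revert_probability(gas_units: float, gas_price: float,
                       beta: float, alpha1: float, alpha2: float) -> float:
    """p = 1 / (1 + exp(-beta - alpha1 * gas_units - alpha2 * gas_price))."""
    z = beta + alpha1 * gas_units + alpha2 * gas_price
    # Evaluate the sigmoid in a form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients, NOT the regression output from the appendix.
BETA, ALPHA1, ALPHA2 = -3.0, 1e-6, 0.01

p_small = revert_probability(200_000, 20.0, BETA, ALPHA1, ALPHA2)
p_large = revert_probability(1_000_000, 100.0, BETA, ALPHA1, ALPHA2)
```

With positive ɑ1 and ɑ2, as assumed here, the modeled probability of revert increases with both gas_units and gas_price, so p_large exceeds p_small.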

Capping rewards

With this model for p, whenever the probability of revert approaches one, the risk-adjusted rewards (eq. 1) go to infinity. To account for possible model inaccuracies (training data can be too thin and noisy), we suggest further capping the maximum amount of rewards that a solver can earn, as well as the maximum gas_units:

gas_units_capped(gas_units) = min(1250K, gas_units)

rewards_capped(p, gas_units, gas_price) =
    min(
        2500 COW,
        rewards(p, gas_units_capped(gas_units) * gas_price)
    )                                                         [2]

The cap constants were empirically selected by looking at points where rewards would blow up, on a 5-month period of data, and are further partially validated by the outcome analysis at the very end of the modeling notebook (the long version, see appendix).
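A minimal sketch of eq. [2], combining the gas cap and the reward cap. It assumes gas_price is quoted in COW per gas unit so that costs come out in COW (a simplification; the actual unit conversion is not specified here), and it returns the cap directly when p reaches one, which is the limit of eq. [1] under the cap.

```python
GAS_UNITS_CAP = 1_250_000   # 1250K gas units
REWARD_CAP_COW = 2500.0     # maximum reward per order, in COW


def rewards(p: float, costs: float, target_profit: float) -> float:
    """Eq. [1], with costs assumed to be in COW."""
    return (target_profit + costs) / (1.0 - p) - costs


def rewards_capped(p: float, gas_units: float, gas_price: float,
                   target_profit: float = 37.0) -> float:
    """Eq. [2]: cap gas_units first, then cap the resulting reward."""
    if p >= 1.0:
        # Eq. [1] diverges as p -> 1, so the cap applies.
        return REWARD_CAP_COW
    capped_units = min(GAS_UNITS_CAP, gas_units)
    costs = capped_units * gas_price  # assumed unit: COW per gas unit
    return min(REWARD_CAP_COW, rewards(p, costs, target_profit))


# Illustrative inputs only.
moderate = rewards_capped(p=0.5, gas_units=2_000_000, gas_price=1e-4)
extreme = rewards_capped(p=0.999, gas_units=2_000_000, gas_price=1e-4)
```

In the first call the gas cap reduces costs to about 125 COW and the reward is about 199 COW; in the second the uncapped reward would be far above 2500 COW, so the cap binds.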

Pictorially, this looks as follows:

Implementation

This proposal can be implemented by updating the driver to compute eq. [2] for every user order of every batch, and to include the result in a “rewards” key for the order in the instance json file sent to the solvers. Given that the average number of orders per batch, as well as the gas price and gas limit distributions, can be expected to change over time, the regression model will need to be recalibrated periodically. As a starting point, we propose setting T=37 COW.

Note that we suggest that liquidity orders carry no COW rewards (a liquidity order should only be included in the solution if this improves the objective function).
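The driver-side step could look roughly like the sketch below. The instance layout used here (an "orders" mapping and an "is_liquidity_order" flag) is a hypothetical stand-in for illustration and is not confirmed by this proposal; only the “rewards” key is named in the text. The per-order reward function is passed in as a parameter, so any implementation of eq. [2] can be plugged in.

```python
import copy
from typing import Callable


def annotate_rewards(instance: dict,
                     reward_for_order: Callable[[dict], float]) -> dict:
    """Return a copy of the instance with a "rewards" key on every order.

    Assumed (hypothetical) layout: instance["orders"] maps order ids to
    dicts carrying an "is_liquidity_order" flag.
    """
    annotated = copy.deepcopy(instance)
    for order in annotated["orders"].values():
        if order.get("is_liquidity_order", False):
            order["rewards"] = 0.0  # liquidity orders carry no COW rewards
        else:
            order["rewards"] = reward_for_order(order)
    return annotated


# Toy instance and a constant reward function, for illustration only.
instance = {
    "orders": {
        "0": {"is_liquidity_order": False},
        "1": {"is_liquidity_order": True},
    }
}
annotated = annotate_rewards(instance, lambda order: 42.0)
```

Working on a deep copy keeps the original instance untouched, which makes it easy to compare the annotated file with the input.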

The case of internal settlements

In the case where a settlement (potentially) uses internal buffers and has zero external interactions, the risk of revert is essentially zero. In this case, we propose ignoring the “rewards” value specified per order in the input json and replacing it with the value T (per executed user order contained in the batch); note that this is equivalent to setting the probability of revert to p=0 in eq. [1]. We also clarify that liquidity orders provided in the input json that have no atomic execution, i.e., where the corresponding field “has_atomic_execution” is set to FALSE, will be treated as internal interactions. This means, for example, that a perfect CoW between a user order and such a liquidity order will be treated as a purely internal settlement and will thus lead to a reward of value T. As an additional example, a perfect CoW between two user orders is also a purely internal settlement and will lead to a total reward of value 2T (since two user orders were involved).
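The internal-settlement rule can be sketched as below. The function names and the argument shapes are hypothetical, and summing the per-order “rewards” values for settlements that are not internal is an assumption made for this illustration; the text only specifies the internal case.

```python
def is_internal_settlement(num_external_interactions: int,
                           liquidity_orders_used: list) -> bool:
    """True if there are no external interactions and every liquidity order
    used has "has_atomic_execution" set to False (treated as internal)."""
    return num_external_interactions == 0 and all(
        not lo["has_atomic_execution"] for lo in liquidity_orders_used
    )


def settlement_reward(executed_user_orders: list, internal: bool,
                      target_profit: float = 37.0) -> float:
    """Internal settlements pay T per executed user order; otherwise the
    per-order "rewards" values are summed (an assumption of this sketch)."""
    if internal:
        return target_profit * len(executed_user_orders)
    return sum(order["rewards"] for order in executed_user_orders)


# Perfect CoW between two user orders: purely internal, total reward 2T.
two_users = [{"rewards": 120.0}, {"rewards": 90.0}]
cow_reward = settlement_reward(
    two_users, is_internal_settlement(0, []))

# One user order matched against a non-atomic liquidity order: reward T.
one_user = [{"rewards": 120.0}]
lo_reward = settlement_reward(
    one_user, is_internal_settlement(0, [{"has_atomic_execution": False}]))
```

With T=37 the two examples yield 74 COW and 37 COW respectively, matching the 2T and T cases described above.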

Conclusion

This proposal introduces a method for distributing rewards among solvers as a function of indicators of risk of revert. Its main limitations are:

The model is sensitive to the calibration period. Calibrating the model on, e.g., a non-volatile period and then using it in a volatile period will provide inaccurate results. Ideally, the historical training period should reflect future conditions.
We are not considering slippage risk, only revert risk. Our data contains different kinds of solvers, and their approaches to the trade-off between revert risk and slippage risk differ.
We are not aiming at a full model that captures probability of revert. Solvers set the slippage tolerance differently, which is a huge driving factor, but this is unobservable to us.

Potential extensions:

The DAO might wish to revise the estimated solver profit (variable T above).

Appendix

Please refer to the following notebooks:

Computing current rewards and profits per order. https://colab.research.google.com/drive/16mEkj827Mr0sj5o8vGU3568RW3LoRs0T?usp=sharing
Modeling COW rewards:
Short version (reward logic derivation, final model specification, and calculation of COW rewards under the proposed mechanism): https://colab.research.google.com/drive/1OWJghPykB0Ix2f2v0JYzMi9fye3Ywl6F?usp=sharing
Full version (superset of the above; includes feature selection and additional analysis regarding final suggested COW rewards): https://colab.research.google.com/drive/1JThTE9jnL4vWbMaBEuUsCa1n4t-A6nuk?usp=sharing
Accompanying exploratory data analysis https://colab.research.google.com/drive/1fiJ98-iiv0dnTosaiNg6Lg6LE-rnKTvY?usp=sharing

Examples of no-risk and high-risk batches

No-risk batch: In this case, there was a single user order being settled, selling a small amount of USDC for ETH. Quasimodo (the winning solver) identified one of the baseline pools (i.e., the pools provided by the driver) as the pool to match this order, but also realized that internal buffers suffice, and so it ended up internalizing the trade. In other words, this ended up being an almost zero-risk trade (order could still be canceled), where the winning solver only needed to look at liquidity provided in the input json. Here is the settlement itself: https://etherscan.io/tx/0x1c5c3a663250a3ad6465a26381925c7263e2e637dfbfd88103280f9787f4fbaf
High-risk batch: Example of high gas costs and volatility. In this case, there was a single user order being settled, that was selling a substantial amount of BTRFLY for USDC. The winning solver was Gnosis_Paraswap and in its first attempt to settle on-chain, the transaction reverted, incurring a cost of ~0.062 ETH. Eventually, the second attempt succeeded. Here are the corresponding logs:


Votes 665

Voter	Cast Power	Vote & Rationale
0xF7AC...D6EAf5	14.286M	For
0x15fE...E2a1FE	7.229M	For
0x0B8d...Ef19CF	6.235M	For
0x9Dbc...21C199	3.571M	For
0x21e6...1EcbC5	3.571M	For


Proposal Status

Voting Period Starts: Mon October 10 2022, 03:42 pm
End Voting Period: Mon October 17 2022, 03:42 pm

Current Results

1-For: 42.589M (98.48%)
2-Against: 644,328.175 (1.49%)
3-Abstain: 12,153.98 (0.03%)
