Cardano Improvement Proposals


CIP 2 - Coin Selection Algorithms for Cardano

Contents

Abstract

This article describes, in human-readable terms, the coin selection algorithms used by Cardano Wallet and other parts of the Cardano ecosystem.

In the context of this article, coin selection refers to the process of choosing unspent coins from a wallet (or UTxO set) in order to pay money to one or more recipients.

Motivation

This document was written to help people gain an understanding of the coin selection algorithms used in Cardano without having to read through and understand Cardano source code.

We aim to provide descriptions of algorithms that:

Where appropriate, we also provide mathematical descriptions, for added clarity.

Scope

Coin selection is a large, complex topic, with many difficult problems to solve. However, all software that performs coin selection must ultimately deal with at least the following pair of problems:

This article concerns itself with the former problem of how to generate a coin selection.

Problems relating to network fees, and how to adjust coin selections to pay for such fees, are outside the scope of this article.

Structure

The Background section introduces the fundamental concepts behind coin selection, provides a discussion of why coin selection is a non-trivial problem, and describes the goals of coin selection algorithms.

The Interface section gives a description of the common interface unifying all coin selection algorithms used within Cardano Wallet, the standard parameter types, result types, and error types used by this interface, and a description of the properties that all conforming implementations are expected to satisfy.

The Algorithms section gives detailed descriptions of each of the individual coin selection algorithms used in Cardano Wallet, along with step-by-step descriptions of the computations involved.

The Reference Implementations section provides links to reference implementations of each algorithm in various languages.

Contents

Background

This section introduces the fundamental concepts behind coin selection, provides a discussion of why coin selection is a non-trivial problem, and describes important goals of coin selection algorithms.

What is Coin Selection?

Coin selection is the process of choosing unspent coins from a wallet in order to pay money to one or more recipients.

Coin Selection in the Physical World

In the familiar world of physical money, our wallets hold value in the form of coins and banknotes.

When making a payment to someone, we typically select a combination of coins and banknotes from a wallet that, when added together, have enough value to cover the amount required.

Ideally, we'd always be able to select just enough to cover the exact amount. However, given that coins and banknotes have fixed values (and cannot be subdivided without destroying their value), it's often impossible to select the exact amount required. In such cases, we typically give the recipient more than the required amount, and then receive the excess value back as change.

:bulb: Example

Alice wishes to pay for her lunch.

The total price comes to €2.50 (2 euros and 50 cents). In her wallet, she has five one-euro coins, and one ten-euro note.

Note that there is no combination of coins (or notes) in her wallet that when added together give a total of €2.50, but there are several possible combinations that exceed the total.

To solve this problem, Alice selects one of these combinations: three one-euro coins. She uses the coins to make the payment, and then receives one 50-cent coin as change.

Coin Selection in Cardano

Similarly to how a physical wallet holds value in the form of unspent coins and banknotes, a Cardano wallet holds value in the form of unspent transaction outputs. An unspent transaction output is the result of a previous transaction that transferred money to the wallet, where the value has not yet been spent by another transaction. Each unspent transaction output has an associated coin value, and the total value of a wallet is the sum of these coin values. Collectively, the set of unspent transaction outputs is known as the UTxO set.

When using a Cardano wallet to make a payment, the wallet software must select a combination of unspent outputs from the wallet's UTxO set, so that the total value of selected outputs is enough to cover the target amount.

Just as with physical coins and notes, unspent outputs from the UTxO set cannot be subdivided, and must either be spent completely in a given transaction, or not be spent at all. Similarly to a transaction with physical money, the wallet software must select a combination of unspent outputs whose total value is greater than the target amount, and then arrange that change is paid back to the wallet.

Coin selection refers to the process of selecting a combination of unspent outputs from a wallet's UTxO set in order to make one or more payments, and computing the set of change to be paid back to the wallet.

Why is Coin Selection Non-Trivial?

There are a number of issues which make the problem of coin selection more complicated than would initially appear.

The Transaction Size Limitation

Each transaction has a maximum size, as defined by the protocol. The size of a transaction increases as we add more inputs or outputs.

Therefore, there's a practical limit on the number of coins we can select for any given transaction.

The Problem of Dust

One simple strategy for selecting coins might be to mimic what we do when making a payment with coins and banknotes in the physical world. By giving the recipient an amount that's as close as possible to the amount they're expecting, we can minimize the amount of change they need to return to us.

However, trying to replicate this strategy with a UTxO-based wallet has an undesirable effect: minimizing the total value of selected coins will also minimize the value of change returned to the wallet. When repeated over time, this strategy will tend to cause an accumulation of tiny outputs in the wallet's UTxO set known as dust.

Dust outputs are a problem, because even if the total value of dust in a wallet is more than enough to cover a given target amount, if we attempt to include that dust in a given transaction, we may run out of space (by reaching the transaction size limit) before we can cover the target amount.

For more information on dust avoidance, see Self Organisation in Coin Selection.

The Problem of Concurrency

One simple strategy for generating change might be to mimic what a shop assistant does when accepting a payment in the real world, which is to minimize the number of coins and banknotes that they return to the customer. This is beneficial for the shop, as it reduces the chances of them running out of change, and beneficial for the customer, as it reduces the amount of change that they have to carry around in their wallet.

Analogously, when generating change for a UTxO-based wallet, we might be tempted to use the simple strategy of just creating a single change output with the exact excess value.

However, using this strategy has an undesirable effect: the repeated act of minimizing the number of change outputs will tend (over time) to reduce the number of entries in a wallet's UTxO set. This is bad for two reasons:

  1. Having a small UTxO set limits the number of future payments that we can make in parallel.

  2. The approach of coalescing all change into a single output is widely considered to have negative privacy implications.

Goals of Coin Selection Algorithms

In light of the issues described above, we'd ideally like for our coin selection algorithms to be able to:

Definitions

This section defines common terms that are used throughout this document.

Address

An address is a unique identifier that represents a payment recipient, a destination for a payment.

Addresses are typically owned (and generated) by individual wallets.

In general, coin selection algorithms are agnostic to the type of addresses used to identify payment recipients. Any address type may be used, so long as the set of possible addresses is totally-ordered.

Coin Value

A coin value is a non-negative integer value that represents a number of Lovelace.

One Ada is exactly equal to one million Lovelace.

Transaction

In a UTxO-based blockchain, a transaction is a binding between inputs and outputs.

input #1  >---+          +---> output #1
                   \        /
    input #2  >-----+------+
                   /        \
    input #3  >---+          +---> output #2
    

Transaction Input

A transaction input is a unique reference to a single output from a previous transaction.

In general, coin selection algorithms are agnostic to the type of references used to identify outputs from previous transactions. Any type may be used, so long as the set of possible references is totally-ordered, and so long as it is possible to determine the coin value associated with any given reference.

In the case of Cardano and other UTxO-based blockchains, this reference generally consists of a pair of values (h, n), where:

Transaction Output

A transaction output consists of a pair of values (a, v), where:

Spent Transaction Output

A spent transaction output is an output from an existing transaction that has already been referenced as an input within a later transaction on the blockchain.

In effect, the coin value associated with that transaction output has been spent, and cannot be reused.

Unspent Transaction Output

An unspent transaction output is an output from an existing transaction that has not yet been referenced as an input within a later transaction.

In effect, the coin value associated with that transaction output has not yet been spent, and is still available.

UTxO Set

A UTxO set is a set of unspent transaction outputs.

This term is commonly used in two ways:

From the point of view of a coin selection algorithm, each member of a UTxO set can be represented as a pair of the form (u, v), where:

In general, coin selection algorithms are agnostic to the type of references used to identify unspent outputs from previous transactions. Any type may be used, so long as the set of possible references is totally-ordered.

In practice however, the type of each unique reference u is equivalent to the type of a transaction input, as transaction inputs are simply references to unspent outputs from previous transactions.

Change Output

In the context of a wallet, a change output is a transaction output that transfers value back to the wallet, rather than to an external payment recipient. The address associated with a change output is generated by the wallet, and belongs to the wallet.

Change ouputs are necessary in a UTxO-based blockchain, as the value associated with any given transaction input must be spent entirely by the transaction that includes it.

When selecting entries from a UTxO set to include as inputs in a transaction, a coin selection algorithm will generally not be able to select inputs that precisely match the total value of all payments to external recipients, and will therefore need to select more than is strictly required. To avoid the destruction of value, selection algorithms create change outputs to return the excess value back to the wallet.

Dust Output

A dust output is a transaction output with an associated coin value that is:

Dust outputs are a problem, because even if the total value of dust in a wallet is more than enough to cover a given payment amount, if we attempt to include that dust in a given transaction, we may run out of space (by reaching the transaction size limit) before we can cover the target amount.

Interface

All coin selection algorithms used by Cardano Wallet implement a common interface.

At the most fundamental level, a coin selection algorithm is a mathematical function that when applied to a requested output set and an initial UTxO set, will produce a coin selection: the basis for a transaction in a UTxO-based blockchain.

This section describes:

In this section, the terms coin selection algorithm and coin selection function will be used interchangeably.

Parameters

All coin selection functions accept the following parameters:

  1. Requested Output Set

    A list of payments to be made to recipient addresses, encoded as a list of transaction outputs.

  2. Initial UTxO Set

    A UTxO set from which the coin selection algorithm can select entries, to cover payments listed in the requested output set.

    In the context of a wallet, this parameter would normally be assigned with the wallet's complete UTxO set, giving the coin selection algorithm access to the total value associated with that wallet.

  3. Maximum Input Count

    An upper bound on the number of UTxO entries that the coin selection algorithm is permitted to select from the initial UTxO set.

    This parameter is necessary for blockchains that impose an upper limit on the size of transactions.

Results

All coin selection functions produce the following result values:

  1. Coin Selection

    A coin selection is the basis for a transaction in a UTxO-based blockchain.

    It is a record with three fields:

    • A set of inputs, equivalent to a subset of the initial UTxO set.

      From the point of view of a wallet, this represents the value that has been selected from the wallet in order to cover the total payment value.

    • A set of outputs (see transaction output).

      Represents the set of payments to be made to recipient addresses.

    • A set of change values (see change output), where each change value is simply a coin value.

      From the point of view of a wallet, this represents the change to be returned to the wallet.

  2. Remaining UTxO Set

    The remaining UTxO set is a subset of the initial UTxO set.

    It represents the set of values that remain after the coin selection algorithm has removed values to pay for entries in the requested output set.

    In the context of a wallet, if a coin selection algorithm is applied to the wallet's complete UTxO set, then the remaining UTxO set represents the updated UTxO set of that wallet.

Properties

All coin selection algorithms satisfy a common set of properties: general rules that govern the relationship between the parameters supplied to coin selection functions and the results they are allowed to produce.

Coverage of Payments

This property states that the total value of inputs in the resulting coin selection result is sufficient to cover the total value of the requested output set.

In particular:

Where:

Correctness of Change

This property states that the correct amount of change was generated.

In particular:

Where:

Conservation of UTxO

This property states that every entry in the initial UTxO set is included in either the inputs set of the generated coin selection, or in the remaining UTxO set, but not both.

The following laws hold:

And:

Where:

Conservation of Outputs

This property states that the requested output set is conserved in the coin selection result.

In particular, the outputs field of the coin selection result should be equal to the requested output set.

Failure Modes

There are a number of ways in which a coin selection algorithm can fail:

Algorithms

This section describes the coin selection algorithms used by Cardano Wallet, along with step-by-step descriptions of the computations involved.

All algorithms implement a common interface, as described in the Interface section.

There are two main algorithms used by Cardano Wallet:

In general, Cardano Wallet gives priority to the Random-Improve algorithm, as experimental evidence shows that it performs better at minimising dust and maintaining a UTxO set with useful outputs. (See Self Organisation in Coin Selection for more details.)

However, in rare cases, the Random-Improve algorithm may fail to produce a result. In such cases, Cardano Wallet will fall back to the Largest-First algorithm.

Largest-First

The Largest-First coin selection algorithm considers UTxO set entries in descending order of value, from largest to smallest.

When applied to a set of requested outputs, the algorithm repeatedly selects entries from the initial UTxO set until the total value of selected entries is greater than or equal to the total value of requested outputs.

The name of the algorithm is taken from the idea that the largest UTxO entry is always selected first. Specifically:

A given UTxO entry u1 with value v1 can be selected if and only if there is no other unselected entry u2 with value v2 where v2 > v1.

State

At all stages of processing, the algorithm maintains the following pieces of state:

  1. Available UTxO List

    This is initially equal to the initial UTxO set, sorted into descending order of coin value.

    The head of the list is always the remaining UTxO entry with the largest coin value.

    Entries are incrementally removed from the head of the list as the algorithm proceeds, until enough value has been selected.

  2. Selected UTxO Set

    This is initially equal to the empty set.

Computation

The algorithm proceeds according to the following sequence of steps:

Random-Improve

The Random-Improve coin selection algorithm works in two phases:

After the above phases are complete, for each output of value voutput and accompanying UTxO selection of value vselection, the algorithm generates a single change output of value vchange, where:

vchange = vselectionvoutput

Since the goal of the second phase was to expand each selection to the point where its total value is approximately twice the value of its associated output, this corresponds to a change output whose target value is approximately equal to the value of the output itself:

vchange = vselectionvoutput

vchange2voutputvoutput

vchangevoutput

Motivating Principles

There are several motivating principles behind the design of the algorithm.

Principle 1: Dust Management

The probability that random selection will choose dust entries from a UTxO set increases with the proportion of dust in the set.

Therefore, for a UTxO set with a large amount of dust, there's a high probability that a random subset will include a large amount of dust.

Over time, selecting entries randomly in this way will tend to limit the amount of dust that accumulates in the UTxO set.

Principle 2: Change Management

As mentioned in the Goals section, it is desirable that coin selection algorithms, over time, are able to create UTxO sets that have useful outputs: outputs that will allow us to process future payments with a reasonably small number of inputs.

If for each payment request of value v we create a change output of roughly the same value v, then we will end up with a distribution of change values that matches the typical value distribution of payment requests.

:bulb: Example

Alice often buys bread and other similar items that cost around €1.00 each.

When she instructs her wallet software to make a payment for around €1.00, the software attempts to select a set of unspent transaction outputs with a total value of around €2.00.

As she frequently makes payments for similar amounts, transactions created by her wallet will also frequently produce change coins of around €1.00 in value.

Over time, her wallet will self-organize to contain multiple coins of around €1.00, which are useful for the kinds of payments that Alice frequently makes.

Principle 3: Performance Management

Searching the UTxO set for additional entries to improve our change outputs is only useful if the UTxO set contains entries that are sufficiently small enough. But it is precisely when the UTxO set contains many small entries that it is less likely for a randomly-chosen UTxO entry to push the total above the upper bound.

Cardinality

The Random-Improve algorithm imposes the following cardinality restriction:

As a result of this restriction, the algorithm will fail with a UTxO Not Fragmented Enough error if the number of entries in the initial UTxO set is smaller than the number of entries in the requested output set.

State

At all stages of processing, the algorithm maintains the following pieces of state:

  1. Available UTxO Set

    This is initially equal to the initial UTxO set.

  2. Accumulated Coin Selection

    The accumulated coin selection is a coin selection where all fields are initially equal to the empty set.

Computation

The algorithm proceeds in two phases.

Phase 1: Random Selection

In this phase, the algorithm iterates through each of the requested outputs in descending order of coin value, from largest to smallest.

For each output of value v, the algorithm repeatedly selects entries at random from the available UTxO set, until the total value of selected entries is greater than or equal to v. The selected entries are then associated with that output, and removed from the available UTxO set.

This phase ends when every output has been associated with a selection of UTxO entries.

Phase 2: Improvement

In this phase, the algorithm attempts to improve upon each of the UTxO selections made in the previous phase, by conservatively expanding the selection made for each output in order to generate improved change values.

During this phase, the algorithm:

For each output of value v, the algorithm:

  1. Calculates a target range for the total value of inputs used to pay for that output, defined by the triplet:

    (minimum, ideal, maximum) = (v, 2v, 3v)

  2. Attempts to improve upon the existing UTxO selection for that output, by repeatedly selecting additional entries at random from the available UTxO set, stopping when the selection can be improved upon no further.

    A selection with value v1 is considered to be an improvement over a selection with value v0 if all of the following conditions are satisfied:

    • Condition 1: we have moved closer to the ideal value:

      abs (idealv1) < abs (idealv0)

    • Condition 2: we have not exceeded the maximum value:

      v1maximum

    • Condition 3: when counting cumulatively across all outputs considered so far, we have not selected more than the maximum number of UTxO entries specified by Maximum Input Count.

  3. Creates a change value for the output, equal to the total value of the improved UTxO selection for that output minus the value v of that output.

  4. Updates the accumulated coin selection:

    • Adds the output to the outputs field;
    • Adds the improved UTxO selection to the inputs field;
    • Adds the change value to the change values field.

This phase ends when every output has been processed, or when the available UTxO set has been exhausted, whichever occurs sooner.

Termination

When both phases are complete, the algorithm terminates.

The accumulated coin selection is returned to the caller as the coin selection result.

The available UTxO set is returned to the caller as the remaining UTxO set result.

Reference Implementations

Largest-First

Reference implementations of the Largest-First algorithm are available in the following languages:

Language Documentation Source
Haskell Documentation Source

Random-Improve

Reference implementations of the Random-Improve algorithm are available in the following languages:

Language Documentation Source
Haskell Documentation Source

External Resources

Self Organisation in Coin Selection

Title Self Organisation in Coin Selection
Author Edsko de Vries
Year 2018
Location https://cardanofoundation.org/en/news/self-organisation-in-coin-selection/

This article introduces the Random-Improve coin selection algorithm, invented by Edsko de Vries.

It describes the three principles of self-organisation that inform the algorithm's design, and provides experimental evidence to demonstrate the algorithm's effectiveness at maintaining healthy UTxO sets over time.

Copyright

This CIP is licensed under CC-BY-4.0.