How output descriptors enable wallet recovery
Because an extended public or private key can produce a very large number of child keys, a Bitcoin wallet cannot easily recover the keys it has used unless it knows the derivation path and the range of indices. If a user switches wallet software, recovering bitcoins from the mnemonic alone can be challenging.
Without that information, the wallet can pick a derivation path, generate keys, and search the UTXO set or the mined blocks for outputs and transactions that involve those keys. After deriving 20 to 50 consecutive keys without finding a matching transaction or UTXO, the wallet will most likely stop searching. This process is ambiguous and inefficient: bitcoins locked to a key on a derivation path, or at an index, that the wallet never checks will not be found and may appear to be lost.
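The stop-after-a-gap behaviour described above can be sketched in a few lines of Python. This is only an illustration: `scan_with_gap_limit` and the `used_indices` set are hypothetical stand-ins for a real wallet's key derivation and its lookups against the UTXO set or block history.

```python
def scan_with_gap_limit(used_indices, gap_limit=20):
    """Return the used key indices that a gap-limit scan would discover.

    used_indices is a set of child indices that actually received funds.
    It stands in for querying the UTXO set or the mined blocks.
    """
    found = []
    consecutive_unused = 0
    index = 0
    # Stop once gap_limit consecutive indices show no activity.
    while consecutive_unused < gap_limit:
        if index in used_indices:
            found.append(index)
            consecutive_unused = 0
        else:
            consecutive_unused += 1
        index += 1
    return found


# Keys 0-2 were used, then the old wallet skipped ahead to index 40.
used = {0, 1, 2, 40}
print(scan_with_gap_limit(used, gap_limit=20))  # [0, 1, 2]
print(scan_with_gap_limit(used, gap_limit=50))  # [0, 1, 2, 40]
```

With a gap limit of 20 the funds at index 40 are never seen, which is the ambiguity that an explicit derivation path and range remove.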
Output descriptors address this issue by describing the derivation path and the range of keys that the wallet has used. Wallet software can use this information to find the wallet's UTXOs and prior transaction history; descriptors also provide the information required to unlock the scriptPubKeys.
Bitcoin Core has supported output descriptors since v0.17. They were originally specified in seven BIPs (Bitcoin Improvement Proposals), BIP380 through BIP386, a combined effort by Pieter Wuille and Andrew Chow.
The abstract of BIP380 reads:
"Output Script Descriptors are a simple language which can be used to describe collections of output scripts. There can be many different descriptor fragments and functions. This document describes the general syntax for descriptors, descriptor checksums, and common expressions."
The terms output descriptor and wallet descriptor are interchangeable: both refer to the same thing, which can describe a single output scriptPubKey or a range of scriptPubKeys.
A descriptor is made up of a SCRIPT expression, the # separator, and a CHECKSUM, written as SCRIPT#CHECKSUM.
The script expression is made up of a descriptor function, such as pkh or wpkh, and its argument, which is a key or another script expression. The checksum is an error-detection code that tells whether a given descriptor was typed or copied correctly.
The # and the checksum are optional in the descriptor language, but some Bitcoin implementations, such as Bitcoin Core, reject a descriptor without a checksum when importing it.
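As a small illustration of the SCRIPT#CHECKSUM layout, the following Python sketch splits a descriptor string into its two parts and reads the outermost function name. The helper names `split_descriptor` and `outer_function` are hypothetical, and `wpkh(KEY)#CHECKSUM` is a placeholder rather than a real descriptor.

```python
def split_descriptor(descriptor):
    """Split 'SCRIPT#CHECKSUM' into (script, checksum).

    The checksum is None when the descriptor carries no '#' separator.
    """
    script, separator, checksum = descriptor.rpartition("#")
    if not separator:
        return descriptor, None
    return script, checksum


def outer_function(script):
    """Return the outermost descriptor function name, e.g. 'wpkh'."""
    return script.split("(", 1)[0]


script, checksum = split_descriptor("wpkh(KEY)#CHECKSUM")
print(script)                  # wpkh(KEY)
print(checksum)                # CHECKSUM
print(outer_function(script))  # wpkh
print(split_descriptor("wpkh(KEY)"))  # ('wpkh(KEY)', None)
```

This only separates the parts; it does not validate the checksum or parse nested expressions such as sh(wpkh(KEY)).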
Non-segwit legacy descriptors
- pk(KEY) takes a public key and describes a P2PK scriptPubKey. As we already know, a P2PK output is unlocked with a signature from the corresponding private key.
- pkh(KEY) takes a public key and describes a P2PKH scriptPubKey.
- sh(SCRIPT) takes a script and describes a P2SH scriptPubKey.
Segwit output descriptors
- wsh(SCRIPT) takes a script and describes a P2WSH scriptPubKey.
- wpkh(KEY) takes a compressed public key and describes a P2WPKH scriptPubKey.
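To make "describes a scriptPubKey" concrete, here is a minimal Python sketch of the output script that wsh(SCRIPT) stands for: a version-0 witness program holding the SHA256 hash of the script. The function name `p2wsh_script_pubkey` and the one-byte witness script are hypothetical choices for illustration.

```python
import hashlib


def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    """Build a P2WSH scriptPubKey: OP_0 followed by a 32-byte SHA256 push."""
    script_hash = hashlib.sha256(witness_script).digest()
    # 0x00 is OP_0 (witness version 0); 0x20 pushes the next 32 bytes.
    return bytes([0x00, 0x20]) + script_hash


# Hypothetical witness script for illustration only: a single OP_1 byte.
witness_script = bytes([0x51])
script_pubkey = p2wsh_script_pubkey(witness_script)
print(len(script_pubkey))   # 34
print(script_pubkey.hex())
```

wpkh(KEY) follows the same pattern with a 20-byte HASH160 of the compressed public key; it is left out here because RIPEMD-160 is not available in every Python build.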
Multisig and combo output descriptors
- combo(KEY) takes a public key and describes the pk(KEY) and pkh(KEY) scriptPubKeys. If the key is compressed, it also includes the wpkh(KEY) and sh(wpkh(KEY)) scriptPubKeys.
- multi(k, KEY_1, KEY_2,…, KEY_n) takes a threshold and a list of keys and describes a k-of-n multisig scriptPubKey using OP_CHECKMULTISIG, with the keys in the order given.
- sortedmulti(k, KEY_1, KEY_2,…, KEY_n) takes a threshold and a list of keys and describes a k-of-n multisig scriptPubKey with the keys sorted lexicographically in the resulting script.
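The difference between multi and sortedmulti can be sketched in Python as follows. This is a simplified illustration, not a wallet implementation: `multisig_script` is a hypothetical helper, it handles at most 16 keys, and the two keys are placeholder byte strings rather than real public keys.

```python
OP_CHECKMULTISIG = 0xAE


def small_int_opcode(n: int) -> int:
    """OP_1 through OP_16 are encoded as 0x51 through 0x60."""
    if not 1 <= n <= 16:
        raise ValueError("n must be between 1 and 16")
    return 0x50 + n


def multisig_script(k: int, pubkeys: list, sort_keys: bool = False) -> bytes:
    """Build OP_k <key_1> ... <key_n> OP_n OP_CHECKMULTISIG."""
    if not 1 <= k <= len(pubkeys):
        raise ValueError("threshold must be between 1 and the number of keys")
    # sortedmulti orders the keys bytewise; multi keeps the given order.
    keys = sorted(pubkeys) if sort_keys else list(pubkeys)
    script = bytes([small_int_opcode(k)])
    for key in keys:
        script += bytes([len(key)]) + key  # direct push of the key bytes
    script += bytes([small_int_opcode(len(keys)), OP_CHECKMULTISIG])
    return script


# Placeholder 33-byte values standing in for compressed public keys.
key_a = bytes([0x03]) + bytes([0xAA]) * 32
key_b = bytes([0x02]) + bytes([0xBB]) * 32

# multi: the key order changes the script.
print(multisig_script(1, [key_a, key_b]) == multisig_script(1, [key_b, key_a]))  # False
# sortedmulti: the key order does not matter.
print(multisig_script(1, [key_a, key_b], sort_keys=True)
      == multisig_script(1, [key_b, key_a], sort_keys=True))  # True
```

Because the sorted form yields the same script whatever order the cosigners list their keys in, each participant can rebuild the same scriptPubKey independently.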
Taproot Output descriptors
- sortedmulti_a(k, KEY_1, KEY_2,…, KEY_n) works like multi_a, which describes a k-of-n multisig script for use inside tr() using OP_CHECKSIGADD, except that the public keys are sorted lexicographically.
- tr(KEY) or tr(KEY, TREE) takes a Taproot public key and, optionally, a tree of script paths, and describes a P2TR output scriptPubKey with the specified key as the internal key.
- rawtr(KEY) takes a Taproot public key and describes a P2TR output scriptPubKey with the specified key as the output key. This has downsides, such as being unable to prove that no hidden script path exists.
Address and raw output descriptors
- addr(ADDR) takes an address and describes the scriptPubKey that ADDR expands to.
- raw(HEX) takes a raw hex value and describes the scriptPubkey whose hex encoding is HEX.
Refer to the documentation for examples of how to use all the above output descriptors.
We will use the wpkh descriptor as an example and see how it can recover the keys of an HD wallet.
The mnemonic of an HD wallet can be used to derive an extended private key and, from it, an extended public key.
Assume we used BIP84 derivation paths to generate all our addresses.
The receiving path is m/84'/0'/0'/0/*, whereas the change path is m/84'/0'/0'/1/*.
The descriptor to recover our keys has the form wpkh(xprv/derivation path)#checksum, as indicated in the wpkh documentation, where xprv is our extended private key.
Assume we have a new wallet in a fully synced Bitcoin Core node.
Example mnemonic
inform pumpkin sting toss wood mesh now hammer lawsuit scrub flame seek
The extended private key is xprv9s21ZrQH143K4FVVuFzDzY3fG2KvGubYquQBZrwuxvrvunx7q6Q5v1mgxGQknHsCcNaK7K43uBZxkDxZgSuRSCn9A3FJc2i5KSsrGeMattP
The receiving account path is /84h/0h/0h/0/*
The change account path is /84h/0h/0h/1/*
The receiving account checksum is gf9hh7nn
The change account checksum is eaqk2trt
The checksums are calculated with the reference implementation algorithm.
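For readers who want to see what the checksum calculation looks like, the following Python sketch is adapted from the reference algorithm published in BIP380. The constants were transcribed by hand, so treat this as an illustration and check it against the BIP380 test vectors before relying on it.

```python
INPUT_CHARSET = (
    "0123456789()[],'/*abcdefgh@:$%{}"
    "IJKLMNOPQRSTUVWXYZ&+-.;<=>?!^_|~"
    "ijklmnopqrstuvwxyzABCDEFGH`#\"\\ "
)
CHECKSUM_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0xF5DEE51989, 0xA9FDCA3312, 0x1BAB10E32D, 0x3706B1677A, 0x644D626FFD]


def descsum_polymod(symbols):
    """Compute the checksum polynomial over a list of 5-bit symbols."""
    chk = 1
    for value in symbols:
        top = chk >> 35
        chk = (chk & 0x7FFFFFFFF) << 5 ^ value
        for i in range(5):
            chk ^= GENERATOR[i] if ((top >> i) & 1) else 0
    return chk


def descsum_expand(s):
    """Map descriptor characters to symbols; None if a character is invalid."""
    groups = []
    symbols = []
    for c in s:
        v = INPUT_CHARSET.find(c)
        if v < 0:
            return None
        symbols.append(v & 31)
        groups.append(v >> 5)
        if len(groups) == 3:
            symbols.append(groups[0] * 9 + groups[1] * 3 + groups[2])
            groups = []
    if len(groups) == 1:
        symbols.append(groups[0])
    elif len(groups) == 2:
        symbols.append(groups[0] * 3 + groups[1])
    return symbols


def descsum_create(s):
    """Return the descriptor s with '#' and its 8-character checksum appended."""
    expanded = descsum_expand(s)
    if expanded is None:
        raise ValueError("descriptor contains an invalid character")
    checksum = descsum_polymod(expanded + [0] * 8) ^ 1
    chars = (CHECKSUM_CHARSET[(checksum >> (5 * (7 - i))) & 31] for i in range(8))
    return s + "#" + "".join(chars)


def descsum_check(s):
    """Return True if s ends with '#' plus a checksum that matches its body."""
    if len(s) < 9 or s[-9] != "#":
        return False
    if not all(c in CHECKSUM_CHARSET for c in s[-8:]):
        return False
    expanded = descsum_expand(s[:-9])
    if expanded is None:
        return False
    symbols = expanded + [CHECKSUM_CHARSET.find(c) for c in s[-8:]]
    return descsum_polymod(symbols) == 1


with_checksum = descsum_create("raw(deadbeef)")
print(descsum_check(with_checksum))  # True
```

Passing the body of one of the article's descriptors, for example the wpkh(...) string without its #checksum suffix, to `descsum_create` lets you recompute the checksum and compare it with the value quoted above.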
Our descriptors are:
wpkh(xprv9s21ZrQH143K4FVVuFzDzY3fG2KvGubYquQBZrwuxvrvunx7q6Q5v1mgxGQknHsCcNaK7K43uBZxkDxZgSuRSCn9A3FJc2i5KSsrGeMattP/84h/0h/0h/0/*)#gf9hh7nn
wpkh(xprv9s21ZrQH143K4FVVuFzDzY3fG2KvGubYquQBZrwuxvrvunx7q6Q5v1mgxGQknHsCcNaK7K43uBZxkDxZgSuRSCn9A3FJc2i5KSsrGeMattP/84h/0h/0h/1/*)#eaqk2trt
Assume we have used 300 keys from both the receiving and the change accounts.
We import both descriptors with the Bitcoin Core importdescriptors RPC, setting timestamp to 0 so that the whole chain is rescanned for past transactions (the value "now" would skip the rescan):
bitcoin-cli importdescriptors '[{ "desc": "wpkh(xprv9s21ZrQH143K4FVVuFzDzY3fG2KvGubYquQBZrwuxvrvunx7q6Q5v1mgxGQknHsCcNaK7K43uBZxkDxZgSuRSCn9A3FJc2i5KSsrGeMattP/84h/0h/0h/0/*)#gf9hh7nn","timestamp": 0 }, { "desc": "wpkh(xprv9s21ZrQH143K4FVVuFzDzY3fG2KvGubYquQBZrwuxvrvunx7q6Q5v1mgxGQknHsCcNaK7K43uBZxkDxZgSuRSCn9A3FJc2i5KSsrGeMattP/84h/0h/0h/1/*)#eaqk2trt","timestamp": 0 }]'
We recover our wallet and can spend the coins in it.
https://developer.bitcoin.org/reference/rpc/importdescriptors.html
Note: By default, Bitcoin Core scans key indices 0 to 999. The range is flexible: if we have used more than 1000 keys, we can change it with the keypool setting in the bitcoin.conf file or by adding a range value to the importdescriptors command.
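As a sketch of how an explicit range could be attached, the Python snippet below builds the JSON argument for importdescriptors from the two descriptors in this article. The helper `build_import_request` and the end index 1999 are illustrative assumptions; a timestamp of 0 asks Bitcoin Core to rescan the whole chain, which a recovery normally needs.

```python
import json

XPRV = (
    "xprv9s21ZrQH143K4FVVuFzDzY3fG2KvGubYquQBZrwuxvrvunx7q6Q5v1mgx"
    "GQknHsCcNaK7K43uBZxkDxZgSuRSCn9A3FJc2i5KSsrGeMattP"
)


def build_import_request(descriptors, last_index, timestamp=0):
    """Build the importdescriptors argument with an explicit key range."""
    return [
        {"desc": desc, "timestamp": timestamp, "range": [0, last_index]}
        for desc in descriptors
    ]


receive = f"wpkh({XPRV}/84h/0h/0h/0/*)#gf9hh7nn"
change = f"wpkh({XPRV}/84h/0h/0h/1/*)#eaqk2trt"

# last_index=1999 is an arbitrary example for a wallet that used over 1000 keys.
payload = json.dumps(build_import_request([receive, change], last_index=1999))
print(f"bitcoin-cli importdescriptors '{payload}'")
```

The printed line can be pasted into a shell; single quotes around the JSON are safe here because the paths use h rather than ' for hardened derivation.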
With these two descriptors, we can effectively recover all the outputs and transaction history of our wallet and spend bitcoins controlled by the keys.
Conclusion
In conclusion, output descriptors provide a simple and effective solution to the problem of recovering the keys derived from an extended public or private key in a Bitcoin wallet. By describing the derivation path and the range of keys in use, they let users import their receiving and change scriptPubKeys during a wallet import and recover their bitcoins and previous transaction history. With the support of output descriptors and Miniscript in Bitcoin Core, users can enjoy a more efficient and secure Bitcoin wallet experience. Miniscript is beyond the scope of this article; we shall look at it in a subsequent article.