**Introduction**
- Alice pays Bob by convincing full nodes to update their database to say that some of Alice's bitcoins are now controlled by Bob
- Everything in bitcoin is designed to ensure that transactions can be created, propagated on the network, validated, and finally added to the global ledger of transactions (the blockchain)
- Transactions are data structures that encode the transfer of value between participants in the bitcoin system
- Each transaction is a public entry in the blockchain, the global double-entry bookkeeping ledger
- The term "wallet" in this chapter refers to the software that constructs transactions, not just the database of keys
### Transactions in Detail
- c.f. Block Explorer transaction between Bob and Alice
- Much of the information constructed by the block explorer is not actually in the transaction
**Behind the Scenes**
- Raw transaction ![[Pasted image 20241210200440.png]]![[Pasted image 20241210200504.png]]
- Questions we will consider in these notes
- Where is Alice's address?
- Where is Bob's address?
- Where is the 0.1 input sent by Alice?
- In bitcoin there are no coins, no senders, no recipients, no balances, no accounts, and no addresses
- They are constructed at a higher level for the benefit of the user
**Transactions Outputs and Inputs**
- The fundamental building block of a bitcoin transaction is a *transaction output*
- Transaction outputs are indivisible chunks of bitcoin currency, recorded on the blockchain and recognized by the entire network
- Bitcoin full-nodes track all available and spendable outputs (known as *unspent transaction outputs* or UTXO)
- The collection of all UTXO is known as the UTXO set and currently numbers in the millions
- Grows as new UTXO is created and shrinks when UTXO is consumed
- Every transaction represents a change (state transition) in the UTXO set
- When we say that a user's wallet has *received bitcoin*, what we mean is that the wallet has *detected a UTXO that can be spent with one of the keys controlled by that wallet*
- Thus a user's bitcoin "balance" is the sum of all UTXO that user's wallet can spend
- It may be scattered among hundreds of transactions and hundreds of blocks
- The concept of a "balance" is created by the wallet application which calculates the user's balance by scanning the blockchain and aggregating the value of any UTXO the wallet can spend with the keys it controls
- Most wallets maintain a database or use a database service to store a quick reference set of all UTXO they can spend with the keys they control
- A transaction output can have arbitrary (integer) value denominated as a multiple of satoshis
- Although an output can have any arbitrary value, once created it is **indivisible**
- This is an important characteristic: discrete and indivisible
- An unspent output can only be consumed in its entirety by a transaction
- If a UTXO is larger than the desired value of the transaction, it must still be consumed in its entirety, and change must be generated
- As a result of the indivisible nature of transaction outputs, most bitcoin transactions will have to generate change
- The exception to the output and input chain is a special type of transaction called the *coinbase* transaction, which is the first transaction in each block
- This transaction is placed there by the "winning" miner and creates brand-new bitcoin payable to that miner as a reward for mining
- This special coinbase transaction does not consume UTXO; instead, it has a special type of input called the "coinbase"
- This is how bitcoin's money supply is created during the mining process c.f. Chapter 10
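
The bookkeeping described in this section can be sketched in a few lines of Python. This is a minimal illustration, not how any real node stores its data: the `Output` class, the `owner` field (a stand-in for a locking script), the transaction names, and all amounts are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Output:
    value: int   # amount in satoshis
    owner: str   # hypothetical stand-in for the locking script

# UTXO set: maps (txid, output index) -> unspent Output
utxo_set = {
    ("tx_a", 0): Output(10_000_000, "alice"),
    ("tx_b", 1): Output(2_500_000, "alice"),
    ("tx_c", 0): Output(7_000_000, "bob"),
}

def balance(utxos, owner):
    """A wallet 'balance' is the sum of the UTXO that wallet can spend."""
    return sum(o.value for o in utxos.values() if o.owner == owner)

def apply_transaction(utxos, txid, inputs, outputs):
    """Consume each referenced UTXO whole, add the new outputs, return the fee."""
    missing = [ref for ref in inputs if ref not in utxos]
    if missing:
        raise ValueError(f"unknown or already spent: {missing}")
    total_in = sum(utxos[ref].value for ref in inputs)
    total_out = sum(o.value for o in outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")
    for ref in inputs:                       # the UTXO set shrinks
        del utxos[ref]
    for index, out in enumerate(outputs):    # and grows
        utxos[(txid, index)] = out
    return total_in - total_out

# Alice pays Bob 1,500,000 satoshis from her 10,000,000-satoshi UTXO.
fee = apply_transaction(
    utxo_set,
    "tx_d",
    inputs=[("tx_a", 0)],
    outputs=[Output(1_500_000, "bob"), Output(8_450_000, "alice")],  # payment + change
)
print(fee, balance(utxo_set, "alice"), balance(utxo_set, "bob"))
```

Because the 10,000,000-satoshi output must be consumed whole, the transaction sends change back to Alice; the 50,000 satoshis not assigned to any output are left over as the fee.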
**Transaction Outputs**
- Every bitcoin transaction creates outputs, which are recorded on the bitcoin ledger
- Almost all of these outputs (with one exception c.f. Data Recording Output page 155) create spendable chunks of bitcoin called UTXO, which are then recognized by the whole network and available for the owner to spend in a future transaction
- UTXO are tracked by every full-node bitcoin client in the UTXO set
- Transaction outputs consist of 2 parts
1. An amount of bitcoin (denominated in satoshis)
2. A cryptographic puzzle that determines the conditions required to spend the output
- The cryptographic puzzle is also known as a locking script, witness script, or a $\text{scriptPubKey}$
- Alice's transaction
- In the JSON encoding, the outputs are in an array (list) named `vout`
- As you can see, the transaction contains 2 outputs
- The second part of the output is the cryptographic puzzle that sets the conditions for spending
- Bitcoin Core shows this as a $\text{scriptPubKey}$ and shows us a human-readable representation of the script
**Transaction Serialization - Outputs**
- When transactions are transmitted over the network or exchanged between applications, they are serialized
- This is a process of converting the internal representation of the data structure into a format that can be transmitted one byte at a time (byte stream)
- Serialization is most commonly used for encoding data structures for transmission over a network or for storage in a file![[Pasted image 20241210204454.png]]
- The serialization format of a transaction output is shown above
- Most bitcoin libraries and frameworks do not store transactions internally as byte streams
- Would require complex parsing every time you need to access a single field
- For convenience and readability, bitcoin libraries store transactions internally in data structures (usually object oriented)
- Deserialization or Transaction Parsing is the process of converting the byte stream representation of a transaction to a library's internal representation data structure
- Converting back to a byte stream for transmission over the network, for hashing, or for storage on disk is serialization
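
Serialization and parsing of a single output can be sketched as a round trip. The sketch assumes the commonly documented layout (an 8-byte little-endian amount, a compact-size script length, then the locking script); the function names and the 20-byte placeholder hash are hypothetical.

```python
import struct

def encode_compact_size(n: int) -> bytes:
    """Variable-length integer used for length prefixes."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)

def decode_compact_size(data: bytes, pos: int):
    """Return (value, position after the integer)."""
    first = data[pos]
    if first < 0xFD:
        return first, pos + 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    return int.from_bytes(data[pos + 1:pos + 1 + width], "little"), pos + 1 + width

def serialize_output(value_sat: int, locking_script: bytes) -> bytes:
    """Internal representation -> byte stream."""
    return (struct.pack("<Q", value_sat)
            + encode_compact_size(len(locking_script))
            + locking_script)

def parse_output(data: bytes, pos: int = 0):
    """Byte stream -> internal representation; returns (output, next position)."""
    (value_sat,) = struct.unpack_from("<Q", data, pos)
    length, pos = decode_compact_size(data, pos + 8)
    script = data[pos:pos + length]
    return {"value": value_sat, "script": script}, pos + length

# A P2PKH-shaped locking script around a made-up 20-byte hash.
placeholder_hash = bytes(range(20))
script = bytes.fromhex("76a914") + placeholder_hash + bytes.fromhex("88ac")
raw = serialize_output(1_500_000, script)
parsed, end = parse_output(raw)
print(raw.hex())
```

Keeping the parsed form as a dictionary (or an object) is what lets library code read a single field without re-walking the byte stream each time.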
**Transaction Inputs**
- These help identify (by reference) which UTXO will be consumed and provide proof of ownership through an unlocking script
- For each UTXO that will be consumed to make this payment, the wallet creates one input pointing to the UTXO and unlocks it with an unlocking script
- Components of an input
1. The first part of an input is a pointer to a UTXO by reference to the *transaction hash* and the *output index*, which together identify where the UTXO is recorded in the blockchain
2. The second part is an unlocking script, which the wallet constructs in order to satisfy the spending conditions set in the UTXO
3. Sequence number
- Unlocking script
- Most often the unlocking script is a digital signature and public key proving ownership of the bitcoin
- However, not all unlocking scripts contain signatures
- Example
![[Pasted image 20241210210114.png]]
- The transaction inputs are an array called `vin`
- There is only one input in the list (because one UTXO contained sufficient value to make this payment)
- The input contains 4 elements
1. Transaction ID: referencing the transaction that contains the UTXO being spent
2. An output index: identifying which UTXO from that transaction is referenced (first one is $0$)
3. A `scriptSig`: which satisfies the conditions placed on the UTXO, unlocking it for spending
4. Sequence number: discussed later
- In Alice's transaction, the input points to transaction ID $\text{7957a3}\dots \text{6f18}$ and output index $0$ i.e. the first UTXO created by that transaction
- The unlocking script is constructed by Alice's wallet by first retrieving the referenced UTXO, examining its locking script, and then using it to build the necessary unlocking script to satisfy it
- Notice we know nothing about the UTXO other than a reference to the transaction containing it
- We do not know its value in satoshi, nor the locking script that sets the conditions for spending it
- To find this info, we must retrieve the referenced UTXO by retrieving the underlying transaction
- Since the value of the input is not explicitly stated, we must also use the referenced UTXO in order to calculate the fees that will be paid in this transaction
- It is not just Alice's wallet that needs to retrieve UTXO referenced in the inputs
- Once this transaction is broadcast to the network, every validating node will also retrieve the UTXO referenced in the transaction inputs in order to validate the transaction
- Transactions on their own seem incomplete because they lack context
- They reference UTXO in their inputs but without retrieving that UTXO we cannot know the value of their inputs or their locking conditions
- When writing bitcoin software, anytime you decode a transaction with the intent of validating it, or counting the fees, or checking the unlocking script, your code will first have to retrieve the referenced UTXO from the blockchain in order to build the context implied but not present in the UTXO references of the inputs
**Transaction Serialization - Inputs**
- When transactions are serialized for transmission over the network, their inputs are encoded into a byte stream ![[Pasted image 20241210212501.png]]![[Pasted image 20241210210114.png]]
- We can identify the following fields in the serialized hex encoding![[Pasted image 20241210212730.png]]
- Hints
- The transaction ID is serialized in reverse byte order, so it starts with $18_{\text{hex}}$ and ends with $79_{\text{hex}}$
- The output index is a $4$-byte group of zeros
- The length of the `scriptSig` is $139$ bytes or $8\text{b}_{\text{hex}}$
- The sequence number is set to $\text{FFFFFFFF}$
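
The hints above can be turned into a small parser. This is a sketch under the assumption that a serialized input is laid out as a 32-byte txid (reversed), a 4-byte little-endian output index, a compact-size script length, the `scriptSig`, and a 4-byte sequence number. The sample bytes are made up (a hypothetical txid and a 139-byte filler script); they are not Alice's real transaction.

```python
import struct

def read_compact_size(data: bytes, pos: int):
    """Return (value, position after the integer)."""
    first = data[pos]
    if first < 0xFD:
        return first, pos + 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    return int.from_bytes(data[pos + 1:pos + 1 + width], "little"), pos + 1 + width

def parse_input(data: bytes, pos: int = 0):
    """Parse one serialized input; returns (fields, next position)."""
    txid = data[pos:pos + 32][::-1].hex()            # undo the reversed byte order
    (vout,) = struct.unpack_from("<I", data, pos + 32)
    script_len, pos = read_compact_size(data, pos + 36)
    script_sig = data[pos:pos + script_len]
    (sequence,) = struct.unpack_from("<I", data, pos + script_len)
    fields = {"txid": txid, "vout": vout,
              "scriptSig": script_sig, "sequence": sequence}
    return fields, pos + script_len + 4

# Hypothetical sample data, shaped like the example in the notes.
display_txid = bytes(range(32)).hex()                # made-up txid as shown to users
raw = (bytes.fromhex(display_txid)[::-1]             # serialized in reverse byte order
       + (0).to_bytes(4, "little")                   # output index 0
       + bytes([139]) + b"\xaa" * 139                # length 0x8b + filler scriptSig
       + bytes.fromhex("ffffffff"))                  # sequence
parsed, end = parse_input(raw)
print(parsed["txid"], parsed["vout"], hex(parsed["sequence"]))
```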
**Transaction Fees**
- Most transactions include transaction fees which compensate miners for their work while also acting as a security mechanism by making it economically infeasible for attackers to flood the network with transactions
- Most wallets calculate and include transaction fees automatically
- Though if you are constructing transactions programmatically or via command line, you must manually account for them
- They serve as an incentive to include (mine) a transaction into the next block and also as a disincentive against abuse of the system by imposing a small cost on every transaction
- Transaction fees are collected by the miner who mines the block that records the transaction on that blockchain
- Transaction fees are calculated based on the size of the transaction in kilobytes (not the value of the bitcoin transaction)
- Overall, transaction fees are set based on market forces within the bitcoin network
- Miners prioritize transactions based on many different criteria, including fees (and might even process transactions for free under certain circumstances)
- Transaction fees affect the processing priority, meaning that a transaction with sufficient fees is likely to be included in the next block mined, while one without might be delayed
- Transaction fees are not mandatory, and transactions without fees might be processed eventually
- Over time, the way transaction fees are calculated and the effect they have on transaction prioritization has evolved
- At first they were fixed and constant across the network
- Since 2016, capacity limits in bitcoin have created competition between transactions, resulting in higher fees and effectively making free transactions a thing of the past
- Zero or very low fee transactions rarely get mined and sometimes will not even be propagated across the network
- In Bitcoin Core, fee relay policies are set by the `minrelaytxfee` option, whose default is currently 0.00001 BTC per kilobyte
- Thus by default transactions with a fee less than this are treated as free and only relayed if there is space in the mempool, else dropped
**Dynamic Fees**
- Any bitcoin service that creates transactions *must* implement dynamic fees
- These can be implemented through a third-party fee estimation service or with a built-in fee estimation algorithm (which you can implement yourself)
- Fee estimation algorithms calculate the appropriate fee based on capacity and the fees offered by "competing" transactions
- These range from simplistic average or median fee in the last block to sophisticated statistical analysis
- They estimate the necessary fee (in satoshis per byte) that will give a transaction a high probability of being selected and included within a certain number of blocks
![[Pasted image 20241210214850.png]]
- This chart shows the real-time estimate of fees in 10 satoshi/byte increments and the expected confirmation time (in minutes and number of blocks) for transactions with fees in each range
- For each range, two horizontal bars show the number of unconfirmed transactions and the total number of transactions in the past 24 hours
- Based on the graph, the recommended high priority fee at this time was 80 satoshi/byte, a fee likely to result in the transaction being mined in the very next block (zero block delay)
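
The simplest estimator mentioned above, the median fee rate of the last block, might look like the following. The block contents are invented for illustration, and a production service would use a far richer model than this.

```python
from statistics import median

def estimate_fee_rate(last_block):
    """Median fee rate (satoshis per byte) over (fee_sat, size_bytes) pairs."""
    return median(fee / size for fee, size in last_block)

# Hypothetical transactions from the most recent block: (fee in satoshis, size in bytes)
last_block = [
    (22_600, 226),   # 100 sat/byte
    (11_300, 226),   #  50 sat/byte
    (40_000, 500),   #  80 sat/byte
    (3_720, 372),    #  10 sat/byte
    (45_200, 226),   # 200 sat/byte
]

rate = estimate_fee_rate(last_block)
print(rate)
```

A median is less sensitive than an average to a single transaction that wildly overpays, which is one reason to prefer it even in a simplistic estimator.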
**Adding Fees to Transactions**
- The data structure does not have a field for fees
- Instead, they are implied as the difference between the sum of inputs and sum of outputs
- A transaction made up of many little UTXO could be several kilobytes in size requiring a much higher fee than the median-sized transaction
- Eugenia's wallet application will calculate the appropriate fee by measuring the size of the transaction and multiplying that by the per-kilobyte fee
- The fee is independent of the transaction's bitcoin value
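
Because the fee is only implied, a wallet has to work it out from the referenced UTXO and then size the change output accordingly. The following is a minimal sketch; the `prevout_values` lookup table, the 226-byte size, and the fee rate are assumptions chosen for illustration.

```python
def fee_for_size(size_bytes: int, rate_sat_per_kb: int) -> int:
    """Fee for a transaction of this size, rounded up to a whole satoshi."""
    return -(-size_bytes * rate_sat_per_kb // 1000)

def implied_fee(input_refs, output_values, prevout_values):
    """Fee = sum of the referenced input values - sum of the output values."""
    total_in = sum(prevout_values[ref] for ref in input_refs)
    return total_in - sum(output_values)

# Hypothetical context: the value of each UTXO the inputs refer to.
prevout_values = {("tx_a", 0): 10_000_000}

payment = 1_500_000
fee = fee_for_size(size_bytes=226, rate_sat_per_kb=100_000)
change = prevout_values[("tx_a", 0)] - payment - fee

with_change = implied_fee([("tx_a", 0)], [payment, change], prevout_values)
without_change = implied_fee([("tx_a", 0)], [payment], prevout_values)
print(fee, change, with_change, without_change)
```

The last value shows why change matters when building transactions by hand: if the change output is left out, the whole remainder is implicitly paid as the fee.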
**Transaction Scripts and Script Language**
- The bitcoin transaction script language (called Script) is a Forth-like reverse-polish notation stack-based execution language
- Both the locking script placed on a UTXO and the unlocking script are written in this scripting language
- When a transaction is validated, the unlocking script in each input is executed alongside the corresponding locking script to see if it satisfies the spending condition
- Script is a very simple language which was designed to be limited in scope and executable on a range of hardware (perhaps as simple as an embedded device)
- For its use in validating programmable money, this is a deliberate security feature
- Today, most transactions processed through the bitcoin network have the form where you pay to a specific address and are based on a script called "Pay-to-Public-Key-Hash" script
- Bitcoin transactions are not limited to this however
- Locking scripts can be written to express a vast variety of complex conditions
- Bitcoin transaction validation is not based on a static pattern, but instead is achieved through the execution of a scripting language
- This language allows for a nearly infinite variety of conditions to be expressed
- Thus, programmable money
**Turing Incompleteness**
- The bitcoin transaction script language contains many operators, but is deliberately limited in one important way
- There are no loops or complex flow control capabilities, other than conditional flow control
- This ensures the language is not *Turing Complete*
- i.e. the scripts have limited complexity and predictable execution times
- Script is not a general-purpose language and its limitations ensure that the language cannot be used to create an infinite loop or "logic bomb" that could be embedded in a transaction in a way that causes a DoS attack against the bitcoin network
- Recall that every transaction is validated by every full node on the bitcoin network
- A limited language prevents the transaction validation mechanism from being used as a vulnerability
**Stateless Verification**
- The bitcoin transaction script language is stateless, in that there is no state prior to the execution of the script, or state saved after the execution of the script
- Thus all information needed to execute the script is contained within the script
- A script will predictably execute the same way on any system
- If your system verifies a script, you can be sure that every other system in the bitcoin network will also verify the script
- Meaning that a valid transaction is valid for everyone and everyone knows this
**Script Construction - Lock + Unlock**
- Bitcoin's transaction validation engine relies on 2 types of scripts to validate transactions i.e. locking script and unlocking script
- A locking script is a spending condition placed on an output
- It specifies the conditions that must be met to spend the output in the future
- Historically, the locking script was called a `scriptPubKey`
- It usually contained a public key or bitcoin address (public key hash)
- Also referred to as witness script or cryptographic puzzle
- An unlocking script is a script that "solves" or satisfies the conditions placed on an output by a locking script and allows the output to be spent
- They are a part of every transaction input
- Often contain a digital signature provided by the user's wallet from his or her private key
- Historically, the unlocking script was called `scriptSig` because it usually contained a digital signature
- Also referred to as witness
- Every bitcoin validating node will validate transactions by executing the locking and unlocking scripts together
- Each input contains an unlocking script and refers to a previously existing UTXO
- The validation software will copy the unlocking script, retrieve the UTXO referenced by the input, and copy the locking script from that UTXO
- The locking and unlocking script are then executed in sequence
- All inputs are validated independently as part of the overall validation of the transaction
- Only a valid transaction that correctly satisfies the conditions of the output results in the output being considered as "spent" and removed from the UTXO set
- ![[Pasted image 20241210230637.png]]
**The Script Execution Stack**
- Bitcoin's scripting language is called a stack-based language because it uses a data structure called a stack
- Allows push (adds item to top of stack) and pop (removes the top item from the stack) operations
- Also known as a LIFO queue
- The scripting language processes each item from left to right
- Numbers (data constants) are pushed onto the top
- Operators push or pop one or more parameters from the stack, act on them, and then might push a result onto the stack
- e.g. `OP_ADD` will pop two items from the stack, add them, and push the resulting sum onto the stack
- Conditional operators evaluate a condition, producing a boolean result
- e.g. `OP_EQUAL` pops two items from the stack and pushes true (i.e. $1$) if equal
- Bitcoin transaction scripts usually contain a conditional operator, so that they can produce the $1$ which signifies a valid transaction
**Simple Script**
- The script `2 3 OP_ADD 5 OP_EQUAL` adds two numbers and puts the result on the stack, followed by the conditional operator which checks that the resulting sum is equal to five
- Although most locking scripts refer to a public key hash (essentially a bitcoin address) thereby requiring proof of ownership to spend the funds, the script does not have to be that complex
- Any combination of locking and unlocking scripts that results in a boolean true is valid
- Even the example could be used
- Locking script `3 OP_ADD 5 OP_EQUAL`
- Unlocking script `2`
- The validation software combines them and runs the combination
- The UTXO of this locking script can be spent by anyone with the skills to know that `2` satisfies the script
- Transactions are valid if the top result on the stack after script execution is boolean TRUE (noted as 0x01) or any other nonzero value
- Transactions are invalid if the top value of the stack is boolean FALSE (a zero-length empty value {}), if the stack is empty after script execution, or if the script execution is halted explicitly by an operator such as `OP_VERIFY` or `OP_RETURN` or by an unbalanced conditional such as a stray `OP_ENDIF`
- ![[Pasted image 20241210232325.png]]
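
The step-by-step evaluation of the simple script can be reproduced with a toy interpreter. It is a sketch that supports only number pushes, `OP_ADD`, and `OP_EQUAL`, and it works on whitespace-separated text rather than real script bytes.

```python
def run_script(script: str):
    """Evaluate a whitespace-separated script from left to right.

    Returns a list of (token, stack snapshot) pairs, one per step.
    """
    stack = []
    trace = []
    for token in script.split():
        if token == "OP_ADD":
            b, a = stack.pop(), stack.pop()   # pop two items
            stack.append(a + b)               # push their sum
        elif token == "OP_EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)  # push 1 (true) or 0 (false)
        else:
            stack.append(int(token))          # data constant: push it
        trace.append((token, list(stack)))
    return trace

trace = run_script("2 3 OP_ADD 5 OP_EQUAL")
for token, snapshot in trace:
    print(f"{token:<9} {snapshot}")
final_stack = trace[-1][1]
```

The right-hand side of each snapshot is the top of the stack, so the final `[1]` is the true value that marks the script as satisfied.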
**Separate Execution of Unlocking and Locking Scripts**
- In the original bitcoin client, the unlocking and locking scripts were concatenated and executed in sequence
- For security reasons, this was changed in 2010 because of a vulnerability that allowed a malformed unlocking script to push data onto the stack and corrupt the locking script
- Currently the scripts are executed separately, with the stack transferred between 2 executions
- First, the unlocking script is executed, using the stack execution engine
- If the unlocking script executes without errors, the main stack is copied and the locking script is executed
- If the result of executing the locking script with the stack data copied from the unlocking script is TRUE, the unlocking script has succeeded in resolving the conditions imposed by the locking script
- Else the input is invalid
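
The separate execution described above can be sketched as two runs that share only the stack contents. This toy version handles number pushes, `OP_ADD`, and `OP_EQUAL`, and it counts an empty final stack as a failure (an assumption of this sketch); the function names are hypothetical.

```python
def execute(script: str, stack: list) -> bool:
    """Run a whitespace-separated script against `stack` in place.

    Returns False if execution fails (for example, popping an empty stack).
    """
    for token in script.split():
        try:
            if token == "OP_ADD":
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            elif token == "OP_EQUAL":
                b, a = stack.pop(), stack.pop()
                stack.append(1 if a == b else 0)
            else:
                stack.append(int(token))
        except (IndexError, ValueError):
            return False
    return True

def verify_input(unlocking: str, locking: str) -> bool:
    """Execute the unlocking script, copy the stack, then execute the locking script."""
    stack = []
    if not execute(unlocking, stack):        # step 1: unlocking script alone
        return False
    locking_stack = list(stack)              # step 2: only the stack data carries over
    if not execute(locking, locking_stack):  # step 3: locking script on the copy
        return False
    return bool(locking_stack) and locking_stack[-1] != 0

ok = verify_input("2", "3 OP_ADD 5 OP_EQUAL")
wrong = verify_input("1", "3 OP_ADD 5 OP_EQUAL")
empty = verify_input("", "3 OP_ADD 5 OP_EQUAL")
print(ok, wrong, empty)
```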
**Pay-to-Public-Key-Hash - P2PKH**
- The vast majority of transactions processed on the bitcoin network spend outputs locked with a P2PKH script
- These outputs contain a locking script that locks the output to a public key hash (i.e. bitcoin address)
- An output locked by a P2PKH script can be unlocked (spent) by presenting a public key and a digital signature created by the corresponding private key
- e.g. Alice's payment to Bob's Café
- Alice's transaction as a payment of 0.015 BTC would have a locking script of the form:
- `OP_DUP OP_HASH160 <Cafe Public Key Hash> OP_EQUALVERIFY OP_CHECKSIG`
- The café public key hash is equivalent to the bitcoin address of the café *without* the Base58Check encoding
- Most applications would show the public key hash in hex encoding and not the familiar bitcoin address Base58Check format that begins with a $1$
- `OP_DUP` pushes a copy of the topmost stack item onto the stack
- The locking script can be satisfied with an unlocking script of the form (provided by the café when it wants to spend the output):
- `<Cafe Signature> <Cafe Public Key>`
- The result will be TRUE if the unlocking script has a valid signature from the cafe's private key that corresponds to the public key hash set as an encumbrance
![[Pasted image 20241211011350.png]]
![[Pasted image 20241211011412.png]]
**Digital Signatures - ECDSA**
- Elliptic Curve Digital Signature Algorithm
- We investigate how digital signatures work and how they can present proof of ownership of a private key without revealing that private key
- ECDSA is used by the script functions
- `OP_CHECKSIG`
- `OP_CHECKSIGVERIFY`
- `OP_CHECKMULTISIG`
- `OP_CHECKMULTISIGVERIFY`
- Any time you see those in a locking script, the unlocking script must contain an ECDSA signature
- A digital signature serves 3 purposes
1. The signature proves that the owner of the private key has authorized the spending of those funds
- Authentication
2. The proof of authorization is undeniable
- Nonrepudiation
3. The signature proves that the transaction (or specific parts of it) have not and cannot be modified by anyone after it has been signed
- Integrity
- Each transaction input is *signed independently*
- This is critical as neither the signatures nor the inputs have to belong to or be applied by the same owners
- A specific transaction scheme called "CoinJoin" uses this fact to create multiparty transactions for privacy
- Each transaction input and any signature it can contain is completely independent of any other input or signature
- Multiple parties can collaborate to construct transactions and sign only one input each
### How Digital Signatures Work
- This mathematical scheme consists of 2 parts
1. Algorithm for creating the signature, using a private key (the signing key), from a message (the transaction)
2. Algorithm that allows anyone to verify the signature, given also the message and public key
**Creating a Digital Signature**
- In bitcoin's implementation of the ECDSA algorithm, the "message" being signed is the transaction, or more accurately, a hash of a specific subset of data in the transaction (c.f. page 141)
- $\text{Sig}=F_{\text{sig}}(F_{\text{hash}}(m),dA)$
- Signing private key $dA$
- Transaction (or parts of it) $m$
- Hashing function $F_{\text{hash}}$
- Signing algorithm $F_{\text{sig}}$
- Resulting signature $\text{Sig}$
- $F_{\text{sig}}$ produces a signature $\text{Sig}$ that is composed of two values $R$ and $S$
- $\text{Sig}=(R,S)$
- Now that the two values have been calculated, they are serialized into a byte stream using an international standard encoding scheme called Distinguished Encoding Rules (DER)
**Serialization of Signatures (DER)**
- Taking a look at the transaction Alice created
- In the transaction input there is an unlocking script that contains the following DER-encoded signature from Alice's wallet
- $304502210088\dots \text{f6e381301}$
- This signature is a serialized byte stream of the $R$ and $S$ values produced by Alice's wallet to prove she owns the private key authorized to spend that output (contains 9 elements)
- $30$
- Start of the DER sequence
- $45$
- Length of the sequence (69 bytes)
- $02$
- An integer value follows
- $21$
- Length of the integer (33 bytes)
- $0088\dots \text{24cb}$
- $R$
- $02$
- Another integer follows
- $20$
- Length of the integer (32 bytes)
- $\text{4b9f}\dots 3813$
- $S$
- Suffix
- Indicating the type of hash used
- e.g. $01$ indicates `SIGHASH_ALL`
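
The nine elements listed above can be produced and read back with a short encoder and decoder. This sketch handles only short-form DER lengths (under 128 bytes), which is enough for signatures of this size, and the $R$ and $S$ values are made up: they are chosen only so that the encoding starts with the same prefix as the example. Note how an $R$ whose top bit is set gains a leading `00` byte, which is why its length is 33 bytes rather than 32.

```python
def der_int(n: int) -> bytes:
    """DER INTEGER: marker 0x02, length, big-endian value."""
    raw = n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")
    if raw[0] & 0x80:
        raw = b"\x00" + raw            # leading zero keeps the integer positive
    return b"\x02" + bytes([len(raw)]) + raw

def encode_signature(r: int, s: int, sighash: int = 0x01) -> bytes:
    """DER SEQUENCE of R and S, followed by the one-byte hash type suffix."""
    body = der_int(r) + der_int(s)
    return b"\x30" + bytes([len(body)]) + body + bytes([sighash])

def decode_signature(sig: bytes):
    """Return (r, s, sighash) from a DER signature with its suffix byte."""
    if sig[0] != 0x30:
        raise ValueError("missing DER sequence marker")
    seq_len, pos, values = sig[1], 2, []
    for _ in range(2):
        if sig[pos] != 0x02:
            raise ValueError("expected a DER integer")
        length = sig[pos + 1]
        values.append(int.from_bytes(sig[pos + 2:pos + 2 + length], "big"))
        pos += 2 + length
    if pos != 2 + seq_len:
        raise ValueError("sequence length mismatch")
    return values[0], values[1], sig[pos]

# Hypothetical values: 32-byte numbers starting with 0x88 and 0x4b.
r = int.from_bytes(b"\x88" + bytes(range(1, 32)), "big")
s = int.from_bytes(b"\x4b" + bytes(range(1, 32)), "big")
sig = encode_signature(r, s)
print(sig.hex())
```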
**Verifying the Signature**
- To verify it, one must have
- The signature ($R$ and $S$)
- The serialized transaction
- The public key (corresponding to the private key used to create the signature)
- "Only the owner of the private key that generated this public key could have produced this signature on this transaction"
- The signature verification algorithm takes the message (a hash of the transaction, or parts of it), the signer's public key, and the signature ($R$ and $S$ values)
- Returns TRUE if the signature is valid for this message and public key
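
Signing and verification can be tried end to end with a pure-Python ECDSA over secp256k1. This is an educational sketch only: it is slow, not constant-time, and uses a random nonce. It also assumes $F_{\text{hash}}$ is double SHA-256 and signs an arbitrary byte string rather than a real transaction digest; the private key is a made-up number.

```python
import hashlib
import secrets

# secp256k1 domain parameters
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        m = 3 * x1 * x1 * pow(2 * y1, -1, P) % P       # tangent slope
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, P) % P        # chord slope
    x3 = (m * m - x1 - x2) % P
    return x3, (m * (x1 - x3) - y1) % P

def point_mul(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

def message_hash(message: bytes) -> int:
    """Assumed F_hash: double SHA-256, read as a big-endian integer."""
    digest = hashlib.sha256(hashlib.sha256(message).digest()).digest()
    return int.from_bytes(digest, "big")

def sign(message: bytes, private_key: int):
    """F_sig: returns the pair (R, S)."""
    z = message_hash(message)
    while True:
        k = secrets.randbelow(N - 1) + 1               # ephemeral nonce in [1, N-1]
        r = point_mul(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s

def verify(message: bytes, signature, public_key) -> bool:
    """True if (R, S) is valid for this message and public key."""
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    z = message_hash(message)
    w = pow(s, -1, N)
    point = point_add(point_mul(z * w % N, G), point_mul(r * w % N, public_key))
    return point is not None and point[0] % N == r

private_key = 0x1E99423A4ED27608A15A2616A2B0E9E52CED330AC530EDCC32C8FFC6A526AEDD % N
public_key = point_mul(private_key, G)
message = b"hypothetical serialized transaction data"
signature = sign(message, private_key)
print(verify(message, signature, public_key))
```

Verification needs only the message, the signature, and the public key; changing a single byte of the message makes the check fail.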
**Signature Hash Types - SIGHASH**
- The signature implies a commitment by the signer to specific transaction data
- In the simplest form, the signature applies to the entire transaction thereby committing all the inputs, outputs, and other transaction fields
- However, a signature can commit to only a subset of the data in a transaction (which is useful for a number of scenarios)
- Bitcoin signatures have a way of indicating which part of the transaction's data is included in the hash signed by the private key using the `SIGHASH` flag
- This flag is a single byte that is appended to the signature
- Every signature has this flag and it can be different from input to input
- A transaction with 3 signed inputs may have 3 different signatures with different `SIGHASH` flags, each signature signing (committing) different parts of the transaction
- Recall each input may contain a signature in its unlocking script
- As a result, a transaction that contains several inputs may have signatures with different `SIGHASH` flags that commit different parts of the transaction in each of the inputs
- Note also that bitcoin transactions may contain inputs from different "owners" who may sign only one input in a partially constructed (and invalid) transaction, collaborating with others to gather all the necessary signatures to make a valid transaction
- Many of the `SIGHASH` flag types only make sense in the context of multiple participants collaborating outside the bitcoin network and updating a partially signed transaction
- ![[Pasted image 20241211034151.png]]![[Pasted image 20241211034201.png]]
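
Reading the flag from the end of a signature can be sketched as follows, assuming the commonly documented flag values (`ALL` = 0x01, `NONE` = 0x02, `SINGLE` = 0x03, with the `ANYONECANPAY` modifier = 0x80). The helper names and the sample bytes are hypothetical.

```python
SIGHASH_ALL = 0x01
SIGHASH_NONE = 0x02
SIGHASH_SINGLE = 0x03
SIGHASH_ANYONECANPAY = 0x80   # modifier, combined with one of the above

BASE_NAMES = {SIGHASH_ALL: "ALL", SIGHASH_NONE: "NONE", SIGHASH_SINGLE: "SINGLE"}

def split_signature(sig_with_flag: bytes):
    """Separate the DER signature from the single SIGHASH byte appended to it."""
    return sig_with_flag[:-1], sig_with_flag[-1]

def describe_sighash(flag: int) -> str:
    """Human-readable name for a SIGHASH byte."""
    base = flag & ~SIGHASH_ANYONECANPAY
    if base not in BASE_NAMES:
        raise ValueError(f"unknown SIGHASH type: {flag:#04x}")
    name = BASE_NAMES[base]
    if flag & SIGHASH_ANYONECANPAY:
        name += "|ANYONECANPAY"
    return name

# Each input carries its own flag, so one transaction can mix types.
flags_per_input = [0x01, 0x83, 0x02]
names = [describe_sighash(f) for f in flags_per_input]
der_part, flag = split_signature(bytes.fromhex("3006020101020101") + b"\x01")
print(names, describe_sighash(flag))
```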
-------------------------------------------------------
# 3rd Edition
**A Serialized Bitcoin Transaction**
- Other than Bitcoin Core's serialization format, the only other widely used transaction serialization format is the Partially Signed Bitcoin Transaction (PSBT) format documented in BIP-174 and BIP-370
- PSBT allows an untrusted program to produce a transaction template that can be verified and updated by trusted programs (e.g. hardware signing devices) that have the necessary private keys or other sensitive data to fill in the template
- PSBT is less compact but recommended to developers of wallets that plan to support signing with multiple keys
![[Pasted image 20241211165353.png]]![[Pasted image 20241211165411.png]]
- The original version of Bitcoin transactions was version 1
- Version 2 bitcoin transactions were introduced in the BIP68 soft fork change to bitcoin's consensus rules
- BIP68 places additional constraints on the sequence field
- The next 2 fields were added as a part of the segregated witness (segwit) soft fork change to Bitcoin's consensus rules
- If the transaction includes a witness structure, the marker must be zero and the flag must be nonzero
- If the transaction does not need a witness stack, the marker and flag must not be present (compatible with legacy serialization)
- In legacy serialization, the marker byte would have been interpreted as the number of inputs (zero); because a transaction cannot have zero inputs, the marker signals to modern programs that extended serialization is being used
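
A parser can tell the two serializations apart by looking at the byte that follows the 4-byte version. The sketch below assumes the version is a little-endian 32-bit integer; the function name and the sample byte strings are hypothetical, truncated headers rather than complete transactions.

```python
import struct

def read_header(raw: bytes):
    """Return (version, uses_extended_serialization, offset of the input count)."""
    if len(raw) < 5:
        raise ValueError("too short to be a transaction")
    (version,) = struct.unpack_from("<I", raw, 0)
    if raw[4] == 0x00:                 # marker: a real input count cannot be zero
        if len(raw) < 6 or raw[5] == 0x00:
            raise ValueError("marker present but flag is missing or zero")
        return version, True, 6        # inputs start after marker and flag
    return version, False, 4           # legacy: this byte is the input count

legacy = bytes.fromhex("01000000" "01")          # version 1, then input count 1
extended = bytes.fromhex("02000000" "0001" "01") # version 2, marker 00, flag 01, count 1
print(read_header(legacy), read_header(extended))
```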
**Protecting Presigned Transactions**
- The last step before broadcasting a transaction to the network for inclusion in the blockchain is to sign it
- However, it is possible to sign a transaction without broadcasting it immediately
- Adding new constraints (such as BIP68 did to the sequence field) may invalidate some presigned transactions
- If there is no way to create a new signature for an equivalent transaction, the money being spent in the presigned transactions is permanently lost
- This problem is solved by reserving some transaction features for upgrades, such as version numbers
- Anyone creating presigned transactions prior to BIP68 should have been using version 1 transactions, so applying BIP68's additional constraints on the sequence field only to transactions with version 2 or higher should not invalidate any presigned transactions
- If you implement a protocol that uses presigned transactions, ensure that it does not use any features that are reserved for future upgrades
**Inputs**
- The input field contains several other fields
- ![[Pasted image 20241211170548.png]]
- The transaction input list starts with an integer indicating the number of inputs in the transaction
- Each input in a transaction must contain 3 fields
1. An outpoint field
2. A length-prefixed input script field
3. A sequence
- Some inputs also include a witness stack, but this is serialized at the end of the transaction
**Outpoint**
- For Alice to transfer control of some of her bitcoins to Bob, she first needs to tell full nodes how to find the previous transfer where she received those bitcoins
- Since control over bitcoins is assigned in transaction outputs, Alice *points* to the previous output using the outpoint field
- Each input must contain a single outpoint
- The outpoint contains a 32-byte txid for the transaction where Alice received the bitcoins she now wants to spend
- This txid is in Bitcoin's internal byte order for hashes
- Since transactions can contain multiple outputs, Alice needs to identify which particular output from that transaction to use i.e. *output index*
- These indices are 4-byte unsigned integers starting from 0
- Upon finding the previous output, the full node obtains several critical pieces of information from it
- The amount of bitcoins assigned to that previous output
- The authorization conditions for that previous output
- For confirmed transactions, the height of the block that confirmed it and the median time past (MTP) for that block (which is required for relative timelocks and outputs of coinbase transactions)
- Proof that the previous output exists in the blockchain (or as a known unconfirmed transaction) and that no other transaction has spent it (rule against double spending)
- Each time a new block of transactions arrives, all of the outputs they spend are removed from the UTXO database and all of the outputs they create are added to the database
**Internal and Display Byte Orders**
- Using the txid from the outpoint shown in the example transaction to retrieve a raw transaction in Bitcoin Core will show an error because internally the txid is reversed
- Thus, internal byte order is used for the data that appears within transactions and blocks (little-endian byte order)
- Display byte order is used for the form displayed to users (big-endian byte order)
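
Converting between the two byte orders is a plain byte reversal, and the same step appears when reading an outpoint. A minimal sketch, assuming a 36-byte outpoint (a 32-byte txid in internal order followed by a 4-byte little-endian output index); the sample txid is made up.

```python
def swap_byte_order(txid_hex: str) -> str:
    """Convert a txid between internal and display byte order (same operation both ways)."""
    return bytes.fromhex(txid_hex)[::-1].hex()

def parse_outpoint(raw: bytes):
    """Return (txid in display order, output index) from a 36-byte outpoint."""
    if len(raw) != 36:
        raise ValueError("an outpoint is 36 bytes")
    txid_display = raw[:32][::-1].hex()
    output_index = int.from_bytes(raw[32:], "little")
    return txid_display, output_index

# Hypothetical outpoint: made-up txid bytes followed by output index 1.
internal_txid = bytes(range(32))
outpoint = internal_txid + (1).to_bytes(4, "little")
txid_display, output_index = parse_outpoint(outpoint)
print(txid_display, output_index)
```

The display-order string is the one to use when looking a transaction up by txid in user-facing tools.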
**Input Script**
- The input script field is a remnant of the legacy transaction format
- Our example transaction input spends a native segwit output that does not require any data in the input script, so the length prefix for the input script is set to zero
- Pre-SegWit
- The input script (known as `ScriptSig`) was used to provide the necessary data to unlock and spend the outputs from a previous transaction
- This script usually contained a signature and public key
- Post-SegWit
- Now the witness data (which includes the signatures) is separated from the transaction inputs and is not counted towards the block size limit in the same way
- However, the input script (`ScriptSig`) still exists and can be used, but for SegWit transactions, the witness data is mainly used to satisfy the unlocking conditions
- Traditional transactions: input scripts (`ScriptSig`) contain the signature and public key needed to spend funds
- SegWit transactions: input scripts still exist, but the main unlocking data (signatures) is moved to the witness part of the transaction
### Sequence
- The final 4 bytes of an input are its sequence number
- The use and meaning of this field has changed over time
**Original Sequence-Based Transaction Replacement**
- The sequence field was originally intended to allow creation of multiple versions of the same transaction, with later versions replacing earlier versions as candidates for confirmation
- The sequence number tracked the version of the transaction
- e.g. Alice and Bob bet on a game of cards
- They start by each signing a transaction that deposits some money into an output with a script that requires signatures from both to spend (i.e. a multisignature script)
- This is called a setup transaction
- They then create a transaction that spends that output
1. The first version of the transaction, with sequence 0, pays Alice and Bob back the money they initially deposited (i.e. a refund transaction), but neither of them broadcasts the refund transaction at the time (it is only there in case there is a problem)
2. Alice wins the first round, so the second version of the transaction (with sequence 1) increases the amount paid to Alice and decreases Bob's share; they both sign the updated transaction (and do not broadcast it unless there is a problem)
3. After many more rounds where the sequence is incremented, the funds are redistributed, and the resulting transaction is signed but not broadcast, they decide to finalize the transaction. They create a transaction with the final balance of funds and set sequence to its maximum value (0xffffffff), which finalizes it. They broadcast this version of the transaction; it is relayed across the network and eventually confirmed by the miners
- If Alice broadcasts the final transaction but Bob broadcasts an earlier version, full nodes running the original bitcoin code will not relay Bob's version to the miners, and miners who also run the original code will not mine it
- If Bob broadcasts his version before Alice, miners and relaying nodes will drop it when Alice's version with a higher sequence number arrives
- Unless Bob gets lucky and a block is discovered before Alice's version arrives, it's Alice's version of the transaction that will get confirmed
- This type of protocol is what we now call a *payment channel*
- Problems with purely sequence-based payment channels
1. The rules for replacing a lower-sequence transaction with a higher-sequence transaction were only a matter of software policy, i.e. there was no direct incentive for miners to prefer one version of the transaction over another
2. The first person to send their transaction might get lucky and have it confirmed even if it was not the highest-sequence version
3. It was possible to replace one version of a transaction with a different version an unlimited number of times, where each replacement would consume the bandwidth of all the relaying nodes on the network
- To eliminate this risk of attack, the original type of sequence-based transaction replacement was disabled in an early version of bitcoin software
- For several years, bitcoin full nodes would not allow an unconfirmed transaction containing a particular input (as indicated by its outpoint) to be replaced by a different transaction containing the same input (though this did not last forever)
**Opt-In Transaction Replacement Signaling**
- After the original sequence-based transaction replacement was disabled, a solution was proposed
- i.e. programming Bitcoin Core and other relaying full node software to allow a transaction that paid a higher transaction fee rate to replace a conflicting transaction that paid a lower fee rate
- This is called *replace by fee* (RBF)
- As documented in BIP125, an unconfirmed transaction with any input whose sequence is set to a value below 0xfffffffe (i.e. at least 2 below the maximum value) signals to the network that the *signer wants* it to be replaceable by a conflicting transaction paying a higher fee rate
- Bitcoin Core allowed those unconfirmed transactions to be replaced and continued to disallow other transactions from being replaced
- This allowed users and businesses that objected to replacement to simply ignore unconfirmed transactions containing the BIP125 signal until they became confirmed
- c.f. RBF Fee Bumping for more modern transaction replacement policies
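The BIP125 signal can be checked with a single comparison per input. A minimal sketch, assuming the transaction's inputs are given as a list of integer sequence values; the constant and function names are illustrative, not Bitcoin Core identifiers.

```python
MAX_SEQUENCE = 0xFFFFFFFF
# Highest sequence value that still signals replaceability (maximum minus 2).
HIGHEST_SIGNALING_SEQUENCE = MAX_SEQUENCE - 2

def signals_replaceability(input_sequences) -> bool:
    """Return True if any input opts in to replacement under BIP125."""
    return any(seq <= HIGHEST_SIGNALING_SEQUENCE for seq in input_sequences)

# One signaling input is enough to mark the whole transaction replaceable.
mixed = signals_replaceability([0xFFFFFFFF, 0xFFFFFFFD])
final_only = signals_replaceability([0xFFFFFFFF, 0xFFFFFFFE])
```

This only covers explicit signaling by the transaction's own inputs; it says nothing about the fee-rate rules a replacement must then satisfy.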
**Sequence as a Consensus-Enforced Relative Timelock**
- In "Version" on page 121, we learned that the BIP68 soft fork added a new constraint to transactions with version numbers 2 or higher, and this constraint applies to the sequence field
- Transaction inputs with sequence values less than $2^{31}$ are interpreted as having a relative timelock
- Such a transaction may only be included in the blockchain once the previous output (referenced by the outpoint) has aged by the relative timelock amount
- Since sequence is a per-input field, a transaction may contain any number of timelocked inputs, all of which must have sufficiently aged for the transaction to be valid
- The sequence value is specified in either blocks or seconds
- Differentiated by a type-flag (in the 23rd least significant bit)
- When interpreting sequence as a relative timelock, only the 16 least significant bits are considered
- Maximum relative timelock is a bit more than a year
- ![[Pasted image 20241211221114.png]]
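The bit layout described above can be decoded as follows. This is a sketch under the BIP68 rules as I understand them: bit 31 disables the timelock, bit 22 selects time-based locks, the low 16 bits carry the value, and time-based values count units of 512 seconds (that granularity comes from BIP68, not from these notes). The names are illustrative.

```python
DISABLE_FLAG = 1 << 31   # sequence >= 2**31 means no relative timelock
TYPE_FLAG = 1 << 22      # the 23rd least significant bit
VALUE_MASK = 0xFFFF      # only the 16 least significant bits are used
SECONDS_PER_UNIT = 512   # granularity of time-based locks per BIP68

def decode_relative_timelock(sequence: int, tx_version: int = 2):
    """Return ("blocks", n), ("seconds", n), or None if no timelock applies."""
    if tx_version < 2 or sequence & DISABLE_FLAG:
        return None
    value = sequence & VALUE_MASK
    if sequence & TYPE_FLAG:
        return ("seconds", value * SECONDS_PER_UNIT)
    return ("blocks", value)

one_day_of_blocks = decode_relative_timelock(144)
max_time_lock = decode_relative_timelock(TYPE_FLAG | 0xFFFF)
```

The maximum time-based lock works out to 65535 × 512 = 33,553,920 seconds, roughly 388 days, which matches "a bit more than a year".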
**Outputs**
![[Pasted image 20241211221226.png]]
- The outputs field of a transaction contains several fields related to specific outputs
- Outputs Count
- Identical to the start of the inputs section of a transaction, the outputs field begins with a count indicating the number of outputs in this transaction
- In this example, we have $2$
- Amount
- The first field of a specific output
- Also called "value" in Bitcoin Core
- 8-byte signed integer indicating the number of satoshis to transfer
- $\in [0,21\text{ million BTC}]$
- Uneconomical outputs and disallowed dust
- Uneconomical outputs are those whose value is less than the fee it would cost to spend them (e.g. a zero-value output, but possibly also some low-value outputs)
- For UTXOs containing significant value, there is an incentive to eventually spend them, but there is no incentive for the person controlling an uneconomical UTXO to ever spend it, potentially making it a perpetual burden on operators of full nodes
- Several full node implementations such as Bitcoin Core discourage the creation of uneconomical outputs using policies that affect the relay and mining of unconfirmed transactions
- These are called *dust policies*
- Many programs assume outputs with less than 546 satoshis are dust and will not be relayed or mined by default
- Anyone using presigned transactions or multiparty protocols is encouraged to check whether these policies have changed
- Utreexo will still require some nodes to store all UTXO data, especially nodes serving miners and other operations that need to quickly validate new blocks
- Thus uneconomical outputs can still be a problem for full nodes even in a possible future where most nodes use Utreexo
- Bitcoin Core's policy rules about dust do have one exception
- Output scripts starting with `OP_RETURN` (which causes the script to immediately fail) called *data carrier outputs* can have a value of zero
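A rough sketch of the amount field and the dust rule of thumb from this section. It assumes the 546-satoshi figure quoted above as a single fixed threshold (real policy depends on the output type and can change), and treats any script starting with `OP_RETURN` (byte 0x6a) as exempt. All names are illustrative.

```python
import struct

SATS_PER_BTC = 100_000_000
MAX_MONEY = 21_000_000 * SATS_PER_BTC
ASSUMED_DUST_THRESHOLD = 546  # assumption: the common default cited in the notes
OP_RETURN = 0x6A

def serialize_amount(sats: int) -> bytes:
    """Encode an amount as an 8-byte signed little-endian integer of satoshis."""
    if not 0 <= sats <= MAX_MONEY:
        raise ValueError("amount outside the valid range")
    return struct.pack("<q", sats)

def looks_like_dust(sats: int, output_script: bytes) -> bool:
    """Apply the simplified dust rule, exempting data carrier outputs."""
    if output_script[:1] == bytes([OP_RETURN]):
        return False
    return sats < ASSUMED_DUST_THRESHOLD

tenth_of_a_bitcoin = serialize_amount(10_000_000).hex()
```

The encoded value reads backwards compared to how the number is normally written, since the least significant byte comes first.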
**Output Scripts**
- The output amount is followed by a compactSize integer indicating the length of the output script
- i.e. the script that contains the conditions needed to be fulfilled in order to spend the bitcoins
- An output script can be almost as large as the transaction containing it, and a transaction can be almost as large as the block containing it
- No explicit limit on the size of an output script however
- An output script with zero length can be spent by an input script containing `OP_TRUE`
- Anyone can create the input script, which means anyone can spend an empty output script
- Bitcoin Core's policy for relaying and mining transactions effectively limits output scripts to just a few templates called *standard transaction outputs*
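The length prefix in front of an output script uses the compactSize encoding. A minimal sketch of that encoding and of a serialized output (amount, script length, script); the function names are illustrative.

```python
def encode_compact_size(n: int) -> bytes:
    """Encode a non-negative integer as a Bitcoin compactSize value."""
    if n < 0:
        raise ValueError("compactSize cannot encode negative numbers")
    if n <= 0xFC:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")

def serialize_output(sats: int, output_script: bytes) -> bytes:
    """Amount (8 bytes, little-endian), then script length, then the script."""
    amount = sats.to_bytes(8, "little", signed=True)
    return amount + encode_compact_size(len(output_script)) + output_script

# An output with an empty script: 8 amount bytes followed by a zero length.
empty_script_output = serialize_output(1000, b"").hex()
```

Scripts up to 252 bytes need only a single length byte, which covers all of the standard output templates.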
**Witness Structure**
- When spending bitcoins, the important problem we want to solve is determining whether the spend was authorized by the person or people who control those bitcoins
- The thousands of full nodes that enforce bitcoin's consensus rules cannot interrogate human witnesses, but can accept witnesses that consist entirely of data for solving math problems
- An *unforgeable* digital signature scheme uses an equation that can only be solved by someone in possession of certain data they are able to keep secret
- They are able to reference that secret data using a public identifier (i.e. public key)
- A solution to the equation is called a signature
- The following script contains a public key and an opcode that requires a corresponding signature that commits to the data in the spending transaction
- `<public key> OP_CHECKSIG`
- Witnesses (i.e. the values used to solve the math problems that protect bitcoins) need to be included in the transactions where they're used in order for full nodes to verify them
- In the legacy transaction format for all early bitcoin transactions
- Signatures and other data were placed in the input script field
- However when devs started to implement contract protocols on bitcoin (c.f. Original Sequence based Transaction Replacement page 127) they discovered several significant problems with placing witnesses in the input script field
**Circular Dependencies**
- Many contract protocols for bitcoin involve a series of transactions that are signed out of order
- e.g. Alice and Bob want to deposit funds into a script that can only be spent with signatures from both of them, but they each also want to get their money back if the other person becomes unresponsive
- Simple solution is to sign transactions out of order
- $\text{Tx}_{0}$ pays money from Alice and money from Bob into an output with a script that requires signatures from both Alice and Bob to spend
- $\text{Tx}_{1}$ spends the previous output to 2 outputs, one for refunding Alice her money and one refunding Bob his money (minus tx fee)
- If Alice and Bob sign $\text{Tx}_{1}$ *before* $\text{Tx}_{0}$, then they are both *guaranteed* to get a refund at any time (trustless protocol)
- A problem with this construction in the legacy transaction format is that every field (including the input script field that contains signatures) is used to derive a transaction identifier (txid)
- The txid for $\text{Tx}_{0}$ is part of the input's outpoint in $\text{Tx}_{1}$
- This means there is *no way* for Alice and Bob to construct $\text{Tx}_{1}$ until both signatures for $\text{Tx}_{0}$ are known
- But if they know the signatures for $\text{Tx}_{0}$, one of them can broadcast that transaction before signing the refund transaction, eliminating the guarantee of a refund
- This is a *circular dependency*
**Third Party Transaction Malleability**
- A more complex series of transactions can sometimes *eliminate* a circular dependency, but many protocols will then encounter a new concern
- It is often possible to solve the same script in different ways![[Pasted image 20241211231616.png]]
- Each alternative encoding of the number 2 in an input script will produce a slightly different transaction with a completely different txid
- Each different version of the transaction spends the same inputs (outpoints) as every other version of the transaction, making them all *conflict* with each other
- *Only* *one* version of a set of conflicting transactions can be contained within a valid blockchain
- *Unwanted third party malleability* refers to the ability of an external party to alter the txid of a bitcoin transaction before it is confirmed in a block
- (Copilot) SegWit solves this by separating the witness data (signatures) from the rest of the transaction data, which ensures that the txid calculation does not include the witness data
- There are cases when people want their transactions to be malleable and bitcoin provides features to support that, most notably signature hashes (sighash)
- e.g. Alice can use a sighash to allow Bob to help pay some transaction fees
- This mutates Alice's transaction but only in a way that Alice wants
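To see why alternative encodings matter, here is a toy demonstration. It does not use the real transaction format: it simply hashes a made-up byte layout that includes the input script, the way legacy txids cover the input script. The three encodings of the number 2 (`OP_2`, a 1-byte push, and `OP_PUSHDATA1`) are standard script encodings; everything else is an illustrative assumption.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash used for txids."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def toy_txid(outpoint: bytes, input_script: bytes, rest: bytes) -> str:
    # Toy layout: outpoint | 1-byte script length | input script | everything else.
    serialized = outpoint + bytes([len(input_script)]) + input_script + rest
    return sha256d(serialized)[::-1].hex()

outpoint = b"\x11" * 32 + (0).to_bytes(4, "little")  # made-up outpoint
rest = b"outputs-and-locktime"                        # placeholder bytes

encodings_of_two = {
    "OP_2": bytes([0x52]),
    "1-byte push": bytes([0x01, 0x02]),
    "OP_PUSHDATA1": bytes([0x4C, 0x01, 0x02]),
}
txids = {name: toy_txid(outpoint, script, rest)
         for name, script in encodings_of_two.items()}
```

All three versions spend the same outpoint, so they conflict with each other, yet each has a different identifier. Any later transaction that was presigned against one of these txids is useless if a different version confirms.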
**Second Party Transaction Malleability**
- Even if devs were able to entirely eliminate third party malleability, users of contract protocols faced another problem
- If they required a signature from someone else involved in their protocol, that person could generate alternative signatures and change the txid
- e.g. Alice and Bob deposited their money into a script requiring a signature from both of them to spend, and have also created a refund transaction that allows each of them to get their money back
- Alice decides she wants to spend just some of the money, so she cooperates with Bob to create a chain of transactions
- $\text{Tx}_{0}$ includes signatures from both Alice and Bob, spending its bitcoins in 2 outputs
- The 1st output spends some of Alice's money
- The 2nd output returns the remainder of the bitcoins back to the script requiring Alice and Bob's signatures
- Before signing this transaction, they create a new refund transaction $\text{Tx}_{1}$
- $\text{Tx}_{1}$ spends the 2nd output of $\text{Tx}_{0}$ to 2 new outputs
- One for Alice for her share of joint funds
- One for Bob for his share
- Alice and Bob both sign this transaction before they sign $\text{Tx}_{0}$
- There is no circular dependency here
- If we ignore the 3rd party transaction malleability, this looks like it should provide us with a trustless protocol
- However, it is a property of bitcoin signatures that the signer has to choose a large random number when creating their signatures
- Choosing a different random number will produce a different signature, even if everything being signed stays the same
- This mutability of signatures means that if Alice tries to broadcast $\text{Tx}_{0}$ (which contains Bob's signature), Bob can generate an *alternative signature* to create a conflicting transaction with a different txid
- If Bob's alternative version of $\text{Tx}_{0}$ gets confirmed, then Alice cannot use the presigned version of $\text{Tx}_{1}$ to claim her refund
- This type of mutation is called *unwanted second party malleability*
- I think the point is that the presigned $\text{Tx}_{1}$ (which has not been broadcast yet) spends an outpoint containing the txid of the exact version of $\text{Tx}_{0}$ that Alice holds. If Bob re-signs $\text{Tx}_{0}$ with a different random number and broadcasts that alternative version, and it gets confirmed first, the confirmed txid differs from the one $\text{Tx}_{1}$ references, so $\text{Tx}_{1}$ can never be confirmed
**Segregated Witness**
- As early as 2011, protocol developers knew how to solve the problems of circular dependence, third party and second party malleability
- The idea was to avoid including the input script in the calculation that produces a transaction's txid
- However, the early form of this idea required a hard fork
- An alternative approach to segwit, described in late 2015, could be deployed as a backward-compatible soft fork
- Backward compatible means that full nodes implementing the change must not accept any blocks that full nodes without the change would consider invalid
- As long as they obey that rule, newer full nodes can reject blocks that older full nodes would accept, giving them the ability to enforce new consensus rules if the newer full nodes represent the economic consensus among bitcoin users
- The soft fork segwit approach is based on anyone-can-spend output scripts (to ensure backward compatibility)
- Copilot
- Older nodes which have not been upgraded to SegWit see SegWit transactions as if they have "anyone-can-spend" scripts
- This means they can validate these transactions without needing to understand the new SegWit structure
- New nodes that understand segwit will look for valid witness data in the transaction, and if valid, the transaction is accepted
- Legacy nodes will see the segwit output as if it can be spent by anyone, but they will not be able to actually spend it without the correct witness data
- A script that starts with any of the numbers $0$ to $16$ (i.e. the witness version), followed by a push of $2$ to $40$ bytes of data, is defined as a segwit output script template
- The data is called a witness program
- It is also possible to wrap the segwit template in a P2SH commitment
- From the perspective of old nodes, these output script templates can be spent with an empty input script
- Old nodes allow an empty input script
- From the perspective of a new node that is aware of the new segwit rules, any payment to a segwit output script template must only be spent with an empty input script
- New nodes require an empty input script
- An empty input script keeps witnesses from affecting the txid, eliminating circular dependencies and third/second party transaction malleability
- But with no ability to put data in an input script, users of segwit output script templates need a new field (called *witness structure*)
- Bitcoin whitepaper describes a system where bitcoins were received to *public keys* (pubkeys) and spent with *signatures* (sigs)
- The public key defined who was *authorized* to spend the bitcoins (whoever controlled the corresponding private key) and the signature provided authentication that the spending transaction came from someone who controlled the private key
- To make the system more flexible, the initial release of bitcoin introduced scripts that allowed bitcoins to be received to *output scripts* and spent with *input scripts*
- Later experience with contract protocols inspired allowing bitcoins to be received to *witness programs* and spent with the *witness structure*
- ![[Pasted image 20241212020759.png]]
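The segwit output script template described earlier in this section (a version number from 0 to 16 followed by a single push of 2 to 40 bytes) can be recognized with a few byte checks. A sketch, assuming the standard opcode values `OP_0` = 0x00 and `OP_1` to `OP_16` = 0x51 to 0x60; the function name and sample scripts are illustrative.

```python
def parse_witness_program(script: bytes):
    """Return (version, witness_program) if the script matches the template, else None."""
    # 1 version byte + 1 push-length byte + 2..40 bytes of data.
    if not 4 <= len(script) <= 42:
        return None
    version_op, push_len = script[0], script[1]
    if version_op == 0x00:
        version = 0
    elif 0x51 <= version_op <= 0x60:
        version = version_op - 0x50
    else:
        return None
    # The push must cover exactly the rest of the script.
    if push_len != len(script) - 2:
        return None
    return version, script[2:]

v0_key_hash = parse_witness_program(b"\x00\x14" + b"\xaa" * 20)
v1_program = parse_witness_program(b"\x51\x20" + b"\xbb" * 32)
legacy_script = parse_witness_program(
    b"\x76\xa9\x14" + b"\xcc" * 20 + b"\x88\xac")
```

The legacy pay-to-public-key-hash script in the last example does not fit the template, so it is not treated as a witness program.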
**Witness Structure Serialization**
- Similar to the inputs and outputs fields, the witness structure contains other fields![[Pasted image 20241212020911.png]]
- One input and one witness item
- Unlike the inputs and outputs fields, the overall witness structure does not start with any indication of the total number of witness stacks it contains
- Instead this is implied by the inputs field
- There is one witness stack for every input in a transaction
- The witness structure for a particular input (its *witness stack*) does start with a count of the number of elements it contains (i.e. *witness items*) c.f. Chapter 7
- Each witness item is prefixed by a compactSize integer indicating its size
- Legacy inputs do not contain any witness items, so their witness stack consists entirely of a count of zero (0x00)
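A sketch of how the witness structure could be parsed, given that the number of witness stacks is implied by the number of inputs. The helper names and the sample bytes are made up for illustration, and the parser does no bounds checking.

```python
def read_compact_size(data: bytes, pos: int):
    """Read a compactSize integer at pos; return (value, new position)."""
    first = data[pos]
    if first < 0xFD:
        return first, pos + 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    value = int.from_bytes(data[pos + 1:pos + 1 + width], "little")
    return value, pos + 1 + width

def parse_witness_structure(data: bytes, input_count: int):
    """Return one list of witness items per input."""
    stacks, pos = [], 0
    for _ in range(input_count):
        item_count, pos = read_compact_size(data, pos)
        items = []
        for _ in range(item_count):
            size, pos = read_compact_size(data, pos)
            items.append(data[pos:pos + size])
            pos += size
        stacks.append(items)
    return stacks

# Two inputs: the first has two witness items, the second is a legacy input (0x00).
raw = bytes.fromhex("02" "03aaaaaa" "02bbbb" "00")
stacks = parse_witness_structure(raw, input_count=2)
```

The legacy input contributes a single zero byte and an empty list of items.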
**Lock Time**
- The final field in a serialized transaction is its lock time
- This field was part of bitcoin's original serialization format but was initially only enforced by bitcoin's policy for choosing which transactions to mine
**Coinbase Transactions**
- The first transaction in each block is a special case
- Coinbase transactions are created by the miner of the block that includes them and give the miner the option to claim any fees paid by transactions in that block
- Special rules for coinbase transactions
- They may only have 1 input
- The single input must have an outpoint with a null txid (entirely zeros)
- The single input's outpoint must have the maximal output index (0xffffffff)
- This prevents the coinbase transaction from referencing a previous transaction output which would be confusing given that the coinbase transaction pays out fees and subsidy
- The field that would contain an input script in a normal transaction is called a coinbase
- Must be at least 2 bytes and no longer than 100
- This script is not executed
- The sum of the outputs must not exceed the value of the fees collected from all the transactions in that block plus the subsidy
- Since the 2017 segwit soft fork (BIP141), any block that contains a transaction spending a segwit output must contain an output in its coinbase transaction that commits to all of the transactions in the block (including their witnesses) c.f. Chapter 12
- A coinbase transaction can have any other outputs that would be valid in a normal transaction
- However, a transaction spending one of those outputs cannot be included in any block until after the coinbase transaction has received 100 confirmations
- This is called the *maturity rule*
- Coinbase transaction outputs that do not yet have 100 confirmations are called *immature*
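Several of the coinbase rules above reduce to simple checks. A sketch with illustrative names; the maturity check assumes block heights are used, so that a coinbase in block `h` has 100 confirmations once block `h + 99` exists and can first be spent in block `h + 100`.

```python
NULL_TXID = bytes(32)
MAX_OUTPUT_INDEX = 0xFFFFFFFF
COINBASE_MATURITY = 100

def is_coinbase_outpoint(txid: bytes, index: int) -> bool:
    """A coinbase input references a null txid and the maximal output index."""
    return txid == NULL_TXID and index == MAX_OUTPUT_INDEX

def coinbase_field_ok(coinbase: bytes) -> bool:
    """The coinbase field must be between 2 and 100 bytes long."""
    return 2 <= len(coinbase) <= 100

def outputs_within_limit(output_amounts, total_fees: int, subsidy: int) -> bool:
    """The outputs may claim at most the block's fees plus the subsidy."""
    return sum(output_amounts) <= total_fees + subsidy

def is_mature(coinbase_height: int, spending_height: int) -> bool:
    """True if a block at spending_height may spend the coinbase outputs."""
    return spending_height - coinbase_height >= COINBASE_MATURITY

spendable_too_early = is_mature(coinbase_height=800_000, spending_height=800_099)
spendable_on_time = is_mature(coinbase_height=800_000, spending_height=800_100)
```

A miner is allowed to claim less than the full fees plus subsidy, which is why the output check is an inequality.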
**Weight and Vbytes**
- The modern unit for measuring the size of transactions and blocks is *weight*; an alternative unit is the *vbyte*
- 4 units of weight = 1 vbyte
- Blocks are limited to 4 million weight
- Block header takes up 320 weight (80 bytes × 4)
- Transaction count uses either 4 or 12 weight
- All the remaining weight may be used for transaction data
- To calculate the weight of a particular field in a transaction, the size of that serialized field in bytes is multiplied by a factor
- To calculate the weight of a transaction, sum together the weights of all its fields
- The factors for each of the fields in a transaction are shown below![[Pasted image 20241212023551.png]]
- The factors and the fields to which they are applied were chosen to reduce the weight used when spending a UTXO
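A sketch of the weight calculation, assuming the BIP141 factors: bytes in the marker, flag, and witness structure count as 1 weight unit each, and bytes in every other field count as 4. The function names and the byte counts in the example are made up.

```python
def transaction_weight(non_witness_bytes: int, witness_bytes: int) -> int:
    """Sum the field weights: factor 4 for non-witness data, factor 1 for witness data.

    witness_bytes includes the marker and flag bytes as well as the witness structure.
    """
    return 4 * non_witness_bytes + 1 * witness_bytes

def weight_to_vbytes(weight: int) -> int:
    """Convert weight to vbytes, rounding up to a whole vbyte."""
    return (weight + 3) // 4

# Hypothetical transaction: 100 bytes of non-witness data, 110 bytes of witness data.
weight = transaction_weight(non_witness_bytes=100, witness_bytes=110)
vbytes = weight_to_vbytes(weight)
```

The same 210 bytes would weigh 840 if all of them were non-witness data, which shows how the lower factor reduces the cost of the data needed to spend a UTXO.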
**Legacy Serialization**
- The serialization format described in this chapter is used for the majority of new bitcoin transactions
- The older format must be used on the Bitcoin P2P network for any transaction with an empty witness structure
- Which is only valid if the transaction does not spend any witness programs
- Legacy serialization does not include the marker, flag, and witness structure fields
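A parser can tell the two formats apart by looking at the bytes right after the 4-byte version. This sketch assumes the marker and flag values 0x00 and 0x01; in legacy serialization the byte at that position is the input count, which is never zero in a valid transaction. The function name and sample bytes are illustrative.

```python
def uses_segwit_serialization(raw_tx: bytes) -> bool:
    """Return True if the marker (0x00) and flag (0x01) follow the version field."""
    return len(raw_tx) > 6 and raw_tx[4] == 0x00 and raw_tx[5] == 0x01

# Truncated, made-up prefixes: version, then either marker+flag or an input count.
segwit_prefix = bytes.fromhex("02000000" "0001" "01") + bytes(36)
legacy_prefix = bytes.fromhex("01000000" "01") + bytes(36)

is_segwit = uses_segwit_serialization(segwit_prefix)
is_legacy = not uses_segwit_serialization(legacy_prefix)
```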
**Conclusion**
- We briefly looked at the input/output script and witness structure that allow specifying and satisfying conditions that restrict who can spend what bitcoins
- Understanding how to construct and use these conditions is essential to ensuring that only Alice can spend her bitcoins, so they will be the subject of the next chapter