Cryptography

SHA-256

Complexity: O(1)

Secure Hash Algorithm 256-bit โ€” the primary hashing function used in Bitcoin. Converts any input (any size) into a fixed 256-bit (32-byte) output. It's deterministic, collision-resistant, and one-way.

๐Ÿ“ Example:
import hashlib
        # Single SHA-256
        hash1 = hashlib.sha256(b"Hello Bitcoin").hexdigest()
        # Double SHA-256 (Bitcoin standard)
        hash2 = hashlib.sha256(hashlib.sha256(b"Hello Bitcoin").digest()).hexdigest()
        print(f"Double SHA-256: {hash2}")
๐Ÿ”— Used in: Mining (Proof of Work), Block hashing, Merkle trees, Address generation, Transaction IDs
#One-way #Deterministic #Collision-resistant #Avalanche effect
Cryptography

RIPEMD-160

Complexity: O(1)

RACE Integrity Primitives Evaluation Message Digest 160-bit โ€” used to shorten SHA-256 hashes from 256 bits to 160 bits for Bitcoin addresses, making them shorter and easier to use.

๐Ÿ“ Example:
import hashlib
        # SHA-256 then RIPEMD-160 (HASH160)
        sha = hashlib.sha256(b"Hello Bitcoin").digest()
        ripemd160 = hashlib.new('ripemd160', sha).hexdigest()
        print(f"HASH160: {ripemd160}")  # 40 characters (160 bits)
๐Ÿ”— Used in: Bitcoin address generation (P2PKH), Checksums
#160-bit output #Address shortening #HASH160
Cryptography

ECDSA (secp256k1)

Complexity: O(1)

Elliptic Curve Digital Signature Algorithm โ€” uses the secp256k1 curve to create cryptographic signatures proving ownership of Bitcoin without revealing the private key.

๐Ÿ“ Example:
from ecdsa import SigningKey, SECP256k1

        # Generate key pair
        sk = SigningKey.generate(curve=SECP256k1)
        vk = sk.verifying_key

        # Sign a transaction hash
        tx_hash = b"transaction_data"
        signature = sk.sign(tx_hash)

        # Verify signature
        is_valid = vk.verify(signature, tx_hash)
        print(f"Signature valid: {is_valid}")
๐Ÿ”— Used in: Transaction signing, Public/Private key pairs, ScriptSig verification
#secp256k1 curve #Digital signature #Ownership proof
Cryptography

HMAC-SHA512

Complexity: O(1)

Hash-based Message Authentication Code โ€” used in HD wallet key derivation (BIP32) to generate child keys from a master seed, creating a hierarchical tree of keys.

๐Ÿ“ Example:
import hmac
        import hashlib

        # Derive child key from parent
        master_key = b"master_secret_256_bits"
        chain_code = b"chain_code"
        index = 0

        # HMAC-SHA512
        h = hmac.new(chain_code, master_key + index.to_bytes(4, 'big'), hashlib.sha512)
        child_key = h.digest()[:32]   # First 32 bytes = child key
        child_chain = h.digest()[32:]  # Next 32 bytes = new chain code
๐Ÿ”— Used in: HD wallets (BIP32), Key derivation, Child key generation
#512-bit output #Key derivation #BIP32
Consensus

Proof of Work (PoW)

Complexity: O(n) (hash attempts)

Proof of Work โ€” requires miners to expend computational energy to find a nonce that produces a hash below the target. This secures the network, creates new blocks, and makes attacks economically infeasible.

๐Ÿ“ Example:
def proof_of_work(block_header, target):
            nonce = 0
            while True:
                attempt = block_header + str(nonce)
                block_hash = double_sha256(attempt)
                if int(block_hash, 16) < target:
                    return nonce, block_hash
                nonce += 1

        # Target determines difficulty (more zeros = harder)
        target = 0x0000000000000000000320283a032748cef8227873ff4872689bf23f1cda83a9
        nonce, block_hash = proof_of_work(header, target)
๐Ÿ”— Used in: Mining, Block creation, Network security, Difficulty adjustment
#Energy-intensive #Sybil-resistant #51% protection
Consensus

Nakamoto Consensus

Complexity: O(n) (chain comparison)

Nakamoto Consensus โ€” the longest chain rule that determines which blockchain is the "true" history. Nodes always follow the chain with the most cumulative Proof of Work, solving the Byzantine Generals Problem.

๐Ÿ“ How it works:
def get_longest_chain(chains):
            longest = None
            max_work = 0
            for chain in chains:
                total_work = sum(block.difficulty for block in chain)
                if total_work > max_work:
                    max_work = total_work
                    longest = chain
            return longest

        # Nodes always build on the longest valid chain
        def choose_best_chain(chain_a, chain_b):
            if get_total_work(chain_a) > get_total_work(chain_b):
                return chain_a
            return chain_b
๐Ÿ”— Used in: Fork resolution, Chain selection, Transaction finality
#Longest chain #Economic incentives #Byzantine solution
Consensus

Difficulty Adjustment

Complexity: O(1)

Difficulty Adjustment Algorithm โ€” automatically adjusts the mining target every 2,016 blocks to maintain a consistent 10-minute block time, regardless of total network hashrate.

๐Ÿ“ Formula:
def adjust_difficulty(prev_difficulty, actual_time, expected_time=1209600):
            # expected_time = 2016 blocks ร— 600 seconds = 1,209,600 seconds
            ratio = actual_time / expected_time
            ratio = max(0.25, min(4.0, ratio))  # Cap at 4x or 0.25x
            new_difficulty = prev_difficulty / ratio
            return new_difficulty

        # Example: Blocks found too fast (600,000 seconds instead of 1,209,600)
        new_diff = adjust_difficulty(80_000_000_000, 600000)
        print(f"New Difficulty: {new_diff:.0f}")  # Difficulty increases
๐Ÿ”— Used in: Block time regulation, Network stability, Hashrate response
#Self-regulating #2016 blocks #10-minute target
Data Structure

Merkle Tree

Complexity: O(log n) verification

Binary Hash Tree โ€” summarizes all transactions in a block into a single 32-byte Merkle Root, enabling efficient SPV verification without downloading all transactions.

๐Ÿ“ Example:
def build_merkle_root(transactions):
            if len(transactions) == 1:
                return double_sha256(transactions[0])
            
            next_level = []
            for i in range(0, len(transactions), 2):
                if i + 1 < len(transactions):
                    combined = transactions[i] + transactions[i+1]
                else:
                    combined = transactions[i] + transactions[i]  # Duplicate odd
                next_level.append(double_sha256(combined))
            
            return build_merkle_root(next_level)

        # 4 transactions become 1 root
        txs = [hash_tx(tx1), hash_tx(tx2), hash_tx(tx3), hash_tx(tx4)]
        merkle_root = build_merkle_root(txs)
๐Ÿ”— Used in: Block headers, SPV wallets, Merkle proofs, Transaction verification
#Binary tree #O(log n) proof #SPV
Data Structure

Merkle Proof

Complexity: O(log n) size

Merkle Proof (Authentication Path) โ€” allows a light client to verify that a specific transaction is included in a block using only O(log n) hashes, not the entire block.

๐Ÿ“ Example:
def verify_merkle_proof(tx_hash, proof, merkle_root):
            current = tx_hash
            for sibling, direction in proof:
                if direction == 'left':
                    current = double_sha256(sibling + current)
                else:
                    current = double_sha256(current + sibling)
            return current == merkle_root

        # For a block with 4,000 transactions, proof is only ~12 hashes (384 bytes)
        proof = [("hash_of_sibling", "right"), ("hash_of_parent", "left")]
        is_valid = verify_merkle_proof(tx_hash, proof, root)
๐Ÿ”— Used in: SPV wallets, Light clients, Cross-chain bridges
#O(log n) size #SPV #Light client
Encoding

Base58Check

Complexity: O(n)

Base58Check โ€” encodes binary data (like addresses) into human-readable strings, removing visually ambiguous characters (0, O, I, l) and adding a checksum to prevent typos.

๐Ÿ“ Alphabet:
ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'
        # No 0, O, I, l โ€” prevents visual confusion

        def base58_encode(data):
            value = int.from_bytes(data, 'big')
            encoded = ''
            while value > 0:
                value, remainder = divmod(value, 58)
                encoded = ALPHABET[remainder] + encoded
            return encoded

        # Base58Check includes 4-byte checksum
        def base58check_encode(payload):
            checksum = double_sha256(payload)[:4]
            return base58_encode(payload + checksum)
๐Ÿ”— Used in: Bitcoin addresses (legacy/P2SH), Private key WIF format
#No ambiguous chars #Checksum #Address encoding
Encoding

Bech32

Complexity: O(n)

Bech32 (BIP173) โ€” newer address format for SegWit addresses (bc1...). Uses a BCH error correction code for better error detection and is entirely lowercase.

๐Ÿ“ Structure:
# Bech32 address format: bc1qar0srrr7xfkvy5l643lydnw9re59gtzzwf5mdq
        # hrp (human-readable part) = "bc" (Bitcoin mainnet)
        # separator = "1"
        # data = witness program (variable length)
        # checksum = BCH code (6 characters)

        # Advantages over Base58Check:
        # - Better error detection (can detect up to 4 errors)
        # - All lowercase (easier to read/type)
        # - More efficient encoding
๐Ÿ”— Used in: SegWit addresses (bc1...), Native SegWit, Taproot addresses (bc1p...)
#SegWit #BCH error correction #Lowercase only
Networking

Gossip Protocol

Complexity: O(nยฒ) propagation

Gossip (Epidemic) Protocol โ€” peer-to-peer communication where nodes broadcast transactions and blocks to their connected peers, propagating data globally within seconds.

๐Ÿ“ How it works:
class Node:
            def __init__(self, peers=[]):
                self.peers = peers
                self.mempool = []
                self.seen = set()  # Prevent infinite loops
            
            def broadcast_transaction(self, tx):
                if tx.id in self.seen:
                    return
                self.seen.add(tx.id)
                for peer in self.peers:
                    peer.receive_transaction(tx)
            
            def receive_transaction(self, tx):
                if self.validate_transaction(tx):
                    self.mempool.append(tx)
                    self.broadcast_transaction(tx)  # Propagate to neighbors
๐Ÿ”— Used in: Transaction propagation, Block distribution, Peer discovery
#Peer-to-peer #Epidemic #Flooding
Networking

Compact Block Relay (BIP152)

Complexity: O(n)

Compact Block Relay (BIP152) โ€” reduces block propagation bandwidth by sending only short transaction identifiers instead of full transactions that peers already have.

๐Ÿ“ How it works:
# Instead of sending full block (1-4 MB):
        # Send compact block with 6-byte transaction IDs (~10-40 KB)
        # Peers reconstruct missing transactions from mempool

        # Sender side:
        compact_block = {
            'header': block_header,
            'short_ids': [calculate_short_id(tx) for tx in block_txs],
            'missing_txs': []  # Only for txs not in peer's mempool
        }

        # Receiver side:
        missing = [tx for tx in compact_block.short_ids if tx not in mempool]
        request_missing(missing)  # Only download missing few
๐Ÿ”— Used in: Block propagation, Bandwidth optimization, Faster block relay
#BIP152 #Bandwidth saving #Faster sync
Transaction

UTXO Selection (Coin Selection)

Complexity: O(n) (knapsack)

Coin Selection Algorithm โ€” selects which UTXOs to spend to achieve a target amount while minimizing fees, avoiding dust, and preserving privacy.

๐Ÿ“ Example (Branch and Bound):
def coin_selection(utxos, target, fee_rate):
            # Sort by value (smallest first for knapsack)
            utxos = sorted(utxos, key=lambda x: x['value'])
            
            # Strategy 1: Use smallest UTXOs first (good for privacy)
            selected = []
            total = 0
            for utxo in utxos:
                selected.append(utxo)
                total += utxo['value']
                if total >= target + estimate_fee(selected, fee_rate):
                    return selected
            
            # Strategy 2: Use single UTXO (simplest, best for privacy)
            for utxo in utxos:
                if utxo['value'] >= target + estimate_fee([utxo], fee_rate):
                    return [utxo]
            
            # Strategy 3: Use largest UTXOs (fallback)
            return select_largest_utxos(utxos, target)
๐Ÿ”— Used in: Wallet transaction building, Fee optimization, Dust avoidance, Privacy
#Knapsack problem #Fee optimization #Privacy
Transaction

Fee Estimation

Complexity: O(n)

Fee Estimation Algorithm โ€” predicts the optimal transaction fee based on mempool congestion and recent block inclusion data to achieve desired confirmation time.

๐Ÿ“ Example:
def estimate_fee(target_blocks=6):
            # Analyze recent blocks to determine fee rates
            recent_fees = []
            for block in last_n_blocks(100):
                for tx in block.transactions:
                    recent_fees.append(tx.fee_rate)
            
            recent_fees.sort()
            
            # Calculate fee for desired confirmation time
            return {
                'fast': recent_fees[int(len(recent_fees) * 0.90)],   # 90th percentile
                'medium': recent_fees[int(len(recent_fees) * 0.50)], # 50th percentile
                'slow': recent_fees[int(len(recent_fees) * 0.10)]    # 10th percentile
            }

        # Mempool-based estimation (more accurate)
        def estimate_from_mempool(mempool, target_blocks):
            # Simulate block inclusion based on fee rate
            sorted_txs = sorted(mempool, key=lambda x: x.fee_rate, reverse=True)
            block_size = 0
            for tx in sorted_txs:
                block_size += tx.vsize
                if block_size >= 1_000_000:  # 1 MB block target
                    return tx.fee_rate
๐Ÿ”— Used in: Wallet fee recommendations, Mempool analysis, Transaction priority
#Mempool-based #Predictive #Fee optimization

SHA-256

How Bitcoin creates unique fingerprints for data

1

What is SHA-256 in simple terms?

Think of SHA-256 as a unique fingerprint generator for digital data. No matter how long or short your input is, SHA-256 always produces a fixed 64-character "fingerprint" called a hash. Change even one letter, and the fingerprint becomes completely different.

SHA-256 always outputs exactly 256 bits โ€” regardless of whether your input is one word or an entire novel. It's like a blender: you can put in a grape or a watermelon, but it always fills the same glass.
Key properties:
โ€” Same input always produces the same output (deterministic)
โ€” Tiny input change = completely different output (avalanche effect)
โ€” You cannot reverse a hash back to the original input (one-way)
โ€” No two different inputs should share the same hash (collision resistant)
Think of a person's fingerprint โ€” it uniquely identifies them, but you can't reconstruct the person from the fingerprint alone. SHA-256 works the same way for data.
2

Why does Bitcoin need SHA-256?

Bitcoin relies on SHA-256 for three critical jobs:

  • Linking blocks: Each block stores the hash of the block before it โ€” creating a chain. Changing one block would break every hash after it, making tampering instantly obvious.
  • Mining (proof of work): Miners must find a hash that starts with many zeros. The only way is to guess billions of times โ€” that effort is what makes Bitcoin secure.
  • Transaction IDs: Every transaction gets a unique ID (TXID) derived from a SHA-256 hash, so you can look up any transaction on the blockchain.
Without SHA-256, there's no way to verify data hasn't been tampered with. It's the foundation that makes Bitcoin trustworthy without requiring anyone to trust anyone else.
3

Step 1: input preparation (padding)

Before SHA-256 can hash your message, it must prepare it. This preparation is called padding, and it ensures every message is exactly the right size for processing.

SHA-256 works like an assembly line that only accepts packages of exactly 512 bits. If your message isn't the right size, we stuff it with padding โ€” like bubble wrap filling a box.
Padding rules:
1. Add a 1 bit to the end โ€” marks where the real message ends
2. Add 0 bits until total length = 448 bits (512 โˆ’ 64, saving room for step 3)
3. Add the original message length as a 64-bit number at the very end

Walkthrough with "abc":

One character = 8 bits (1 byte). So "abc" = 3 ร— 8 = 24 bits.
Original message โ€” 24 bits
01100001a 01100010b 01100011c
After appending the 1-bit bookmark โ€” 25 bits
01100001 01100010 01100011 1
After zero-padding to 448 bits (adds 423 zeros)
abc data 1 000โ€ฆ000 (423 zeros)
Final 512-bit block โ€” append original length (24) as 64 bits
abc data 1 zeros 0โ€ฆ011000 (= 24)
Why add the length at the end? Security. Without it, an attacker could silently append extra data to your message and still produce a valid-looking hash โ€” a trick called a "length extension attack."

Now the message is a perfect 512-bit block, and SHA-256 can begin its work. If your original message were longer than 448 bits, it would simply be split across multiple 512-bit blocks โ€” each one padded and processed in turn.

4

Step 2: breaking into chunks

The padded message is split into 512-bit chunks. Each chunk is divided into 16 smaller pieces called words, each exactly 32 bits long.

Think of it like cutting a loaf of bread into 16 equal slices. SHA-256 can only work on one slice at a time โ€” it processes each word separately before combining the results.
One 512-bit chunk = 16 words ร— 32 bits
W0
W1
W2
W3
W4
W5
W6
W7
W8
W9
W10
W11
W12
W13
W14
W15
Each cell = 32 bits. For "abc", we only have one chunk total.

If your message were longer โ€” say, a whole paragraph โ€” it would be split into multiple 512-bit chunks, each processed one after the other.

5

Step 3: creating 64 words (message schedule)

SHA-256 needs 64 rounds of mixing, but we only have 16 words. So it generates 48 more words from the original 16, giving us 64 words total (W0โ€“W63).

Imagine you have 16 ingredients and need to make 64 dishes. You use combinations of the original ingredients to create new flavors. That mixing ensures one small change in the input "ripples" through all 64 rounds.
Formula for words W16 through W63:

ฯƒ0 = (W[tโˆ’15] rotate right 7) XOR (W[tโˆ’15] rotate right 18) XOR (W[tโˆ’15] shift right 3)
ฯƒ1 = (W[tโˆ’2] rotate right 17) XOR (W[tโˆ’2] rotate right 19) XOR (W[tโˆ’2] shift right 10)
W[t] = W[tโˆ’16] + ฯƒ0 + W[tโˆ’7] + ฯƒ1
What is XOR? XOR (exclusive OR) compares two bits: if they're the same โ†’ 0, if they're different โ†’ 1. It's a "difference detector" that scrambles patterns. Rotate right? Imagine the bits on a circular belt โ€” spin them right by N positions, and bits that fall off the right reappear on the left.
This expansion is why SHA-256 is so hard to reverse. Each new word is a scrambled combination of four earlier words, so by round 64, every bit of output has been influenced by every bit of input โ€” many times over.
6

Step 4: initialize working variables

Before the 64 rounds begin, SHA-256 sets up 8 working variables (a through h). These are the "state" that gets shuffled with every round.

Think of these 8 variables as 8 tumblers in a combination lock. They start at a known position, and each round spins them in a different direction. At the end, their final positions form the hash.
Where do these starting values come from? They're the fractional parts of the square roots of the first 8 prime numbers (2, 3, 5, 7, 11, 13, 17, 19). Using irrational numbers guarantees the starting values have no hidden patterns โ€” no "backdoor."
Initial values (hex):
a = 6a09e667e = 510e527f b = bb67ae85f = 9b05688c c = 3c6ef372g = 1f83d9ab d = a54ff53ah = 5be0cd19

These values are fixed and public โ€” every SHA-256 implementation in the world starts from exactly these numbers.

7

Step 5: the compression function (64 rounds)

This is the core of SHA-256. For each of the 64 words, the 8 working variables are scrambled through a series of bitwise operations. After 64 rounds, the data is thoroughly mixed.

Picture a tumble dryer running 64 cycles. Each cycle tumbles the clothes (your data) in a different direction using a different constant. By the end, every fiber has touched every other fiber โ€” the data is completely scrambled.
Each round computes:

ฮฃ1 = (e โ‹™ 6) XOR (e โ‹™ 11) XOR (e โ‹™ 25)
Ch = (e AND f) XOR (NOT e AND g)
temp1 = h + ฮฃ1 + Ch + K[t] + W[t]

ฮฃ0 = (a โ‹™ 2) XOR (a โ‹™ 13) XOR (a โ‹™ 22)
Maj = (a AND b) XOR (a AND c) XOR (b AND c)
temp2 = ฮฃ0 + Maj

Then all 8 variables shift:
hโ†g, gโ†f, fโ†e, eโ†d+temp1, dโ†c, cโ†b, bโ†a, aโ†temp1+temp2
Ch (Choose): uses variable e to "choose" between f and g โ€” if a bit in e is 1, take from f; if 0, take from g.
Maj (Majority): outputs whichever bit value appears most often among a, b, and c.
These functions introduce non-linearity โ€” the property that makes the hash impossible to reverse.
8

Step 6: the round constants K[t]

Each of the 64 rounds uses a unique constant K[t] to ensure every round behaves differently. Without them, multiple rounds could produce the same transformation โ€” weakening security.

Imagine baking 64 cakes and using the same recipe for every one โ€” they'd all taste identical. The K constants are like 64 different spice blends, one per cake, ensuring every round has a unique flavor.
Where do K values come from? They are the fractional parts of the cube roots of the first 64 prime numbers โ€” chosen because irrational numbers have no hidden patterns.
K[0]  = 0x428a2f98   (cube root of prime 2)
K[1]  = 0x71374491   (cube root of prime 3)
K[2]  = 0xb5c0fbcf   (cube root of prime 5)
...
K[63] = 0xc67178f2   (cube root of prime 311)
These constants are fully public and standardized โ€” anyone can verify them. This "nothing up my sleeve" design proves no backdoor was built into SHA-256.
9

Step 7: producing the final hash

After all 64 rounds, the 8 working variables are added back to the original starting values (from Step 4), then concatenated to form the final 256-bit hash.

Adding the original starting values back is called a "feed-forward." It ensures that even if an attacker could reverse all 64 rounds, they'd still need to account for the starting values โ€” an extra layer of protection.
Final hash: a + b + c + d + e + f + g + h (8 ร— 32 bits = 256 bits = 64 hex characters)

Example โ€” change one letter, get a completely different hash:

SHA-256("abc")
ba7816bf 8f01cfea 414140de 5dae2223 b00361a3 96177a9c b410ff61 f20015ad
SHA-256("abd") โ€” one letter changed
7d9df07b c5a11e1e a0b2be59 f57a9dc3 eb87a3f9 b7f0c9d1 b8e2f8c9 d1b8e2f0
64 hex characters ร— 4 bits per character = 256 bits. Notice how completely unrelated the two hashes look โ€” that's the avalanche effect.
10

The avalanche effect (why it's secure)

The avalanche effect means that flipping even a single bit in the input causes roughly half the output bits to change. This makes SHA-256 completely unpredictable โ€” you cannot "tweak" a hash by making small changes to the input.

This is why Bitcoin mining is hard. Miners are searching for an input (nonce) that produces a hash starting with many zeros. The only way to find it is pure trial and error โ€” billions of guesses per second โ€” because there's no shortcut.
The compression function is specifically designed to maximize avalanche. The Ch, Maj, ฮฃ0, and ฮฃ1 operations each spread bit changes across the entire state โ€” so after 64 rounds, every output bit depends on every input bit.
11

Double SHA-256 (why Bitcoin applies it twice)

Bitcoin doesn't just hash once โ€” it runs SHA-256 twice in sequence. The output of the first hash becomes the input of the second.

Bitcoin hash = SHA-256( SHA-256( data ) )
Why twice? A theoretical vulnerability called a "length extension attack" can allow an attacker to forge a valid hash without knowing the original input โ€” but only on a single SHA-256. Running it twice defeats this attack entirely.
import hashlib

def bitcoin_hash(data: bytes) -> str:
    first  = hashlib.sha256(data).digest()      # first pass โ€” raw bytes
    second = hashlib.sha256(first).hexdigest()  # second pass โ†’ hex string
    return second

# Used for TXIDs, Block Header hashes, and Merkle tree nodes
Double SHA-256 is used everywhere in Bitcoin: block header hashing (proof of work), transaction IDs, Merkle tree nodes, and address generation. It's the backbone of Bitcoin's integrity model.
๐Ÿ“ Output size
256 bits / 64 hex chars
๐Ÿ“ฆ Block size
512 bits (64 bytes)
๐Ÿ”„ Rounds
64 per block
๐Ÿ” Security level
128-bit (collision)

ECDSA (secp256k1)

How Bitcoin proves you own your money without revealing your password

1

What is ECDSA in simple terms?

ECDSA is like a digital wax seal that proves YOU signed a document.

Imagine you have a special stamp (your Private Key). When you stamp a document, you create a unique seal (Signature). Anyone can look at the seal and verify it came from your stamp using your public ID (Public Key). But they can't recreate your stamp just by looking at the seal.

๐Ÿ’ก Real-world analogy: Think of a handwritten signature. Everyone can see your signature and know it's yours, but they can't forge it perfectly. ECDSA makes this mathematically certain.
Key properties:
โ€” Only the Private Key holder can create a valid signature
โ€” Anyone with the Public Key can verify the signature
โ€” You cannot reverse-engineer the Private Key from the signature
โ€” Each signature is unique, even for the same message
2

Why does Bitcoin need ECDSA?

Bitcoin uses ECDSA for ONE critical purpose:

  • ๐Ÿ’ฐ Proving Ownership: When you send Bitcoin, you must prove you own the address. ECDSA creates a mathematical proof that only you can create.
โš ๏ธ Without ECDSA, anyone could spend anyone else's Bitcoin. It's the lock on your digital wallet. No signature = no transaction.
๐Ÿ”‘ Why not just use a password? A password requires you to share it with the network โ€” which means someone could steal it. With ECDSA, you never share your Private Key. You only share the signature, which is mathematically proven to have come from your key without revealing it.
3

Step 1: understanding the elliptic curve (the "playing field")

ECDSA works on a special mathematical curve called secp256k1. The equation is:

yยฒ = xยณ + 7

This curve looks like a mirror-image loop. But we're only interested in points where both x and y are whole numbers (integers) modulo a huge prime number.

๐Ÿ“ Visualizing the curve: The secp256k1 curve is symmetric across the x-axis. It has a distinctive "S" shape that extends infinitely, but we only use points with integer coordinates within a massive finite field.
The finite field (mod p):
p = 2ยฒโตโถ - 2ยณยฒ - 977
p = 115792089237316195423570985008687907853269984665640564039457584007913129639747
๐Ÿ”ข Why a prime field? Working modulo a prime number creates a "wrap-around" effect that keeps all points within a finite range. This makes the math predictable and secure. The specific prime p was chosen because it's extremely close to 2ยฒโตโถ, making computations efficient.
4

Step 2: the generator point G (the starting line)

There's a special starting point on the curve called G (the Generator). Everyone in Bitcoin uses the same G.

Coordinates of G:
Gx = 0x79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798
Gy = 0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8
๐Ÿ Think of G as the "starting line" on a race track. Everyone starts from the same point. The number of "laps" (jumps) you take determines where you end up.
๐Ÿ”‘ Why a generator point? By repeatedly "adding" G to itself, we can generate every other point on the curve. This creates a one-way function: it's easy to jump forward, but nearly impossible to figure out how many jumps you took if you only see the destination.
5

Step 3: generating your Private Key (your secret)

Your Private Key is simply a random 256-bit number (between 1 and about 1.16 ร— 10โทโท).

Example Private Key (hex):
d = 0x1e99423a4ed27608a15a2616a2b0e9e52ced330ac530edcc32c8ffc6a526aedd
โš ๏ธ How random is 2ยฒโตโถ? It's roughly the number of atoms in the observable universe. You cannot guess someone's private key. The only way to lose your Bitcoin is to lose your key or have it stolen through malware/phishing.
๐ŸŽฒ How are private keys generated? Wallets use a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG) โ€” a hardware or software source of true randomness. Hardware wallets have special random number chips; software wallets use system entropy.
6

Step 4: generating your Public Key (your identity)

Your Public Key is derived by "multiplying" the Generator Point (G) by your Private Key (d):

Q = d ร— G

This means: start at G, then jump to the next point d times. This is called scalar multiplication.

๐Ÿš€ Why this is secure: It's easy to calculate d ร— G (just jump d times). But if someone only knows Q and G, figuring out d is like trying to find how many jumps you took. This is called the Elliptic Curve Discrete Logarithm Problem (ECDLP) and is believed to be impossible with current computers.
Analogy: Imagine a giant clock with 1,000 positions. I tell you: "Start at position 0, then move 1,000,000 steps" โ€” you can calculate the final position easily. But if I only show you the final position and ask how many steps I took, you'd have to try every number. Now imagine the clock has 2ยฒโตโถ positions!
# Python concept (simplified)
def private_to_public(private_key):
    # Start at generator point G
    result = G
    # "Add" G to itself private_key times
    for _ in range(private_key):
        result = point_addition(result, G)
    return result

# In reality, this is done using a much faster algorithm called
# "double-and-add" โ€” O(log n) instead of O(n)
7

Public Key formats (compressed vs uncompressed)

Public Keys can be represented in two ways:

Uncompressed (65 bytes):
0x04 || X (32 bytes) || Y (32 bytes)
Compressed (33 bytes):
0x02 (if Y is even) or 0x03 (if Y is odd) || X (32 bytes)
๐Ÿ“ฆ Why compressed? The compressed format saves 32 bytes per transaction โ€” about 50% space! Since Bitcoin's equation yยฒ = xยณ + 7, if you know X, you can calculate Yยฒ. The prefix (02 or 03) just tells you whether Y is even or odd, so you don't need to store the full Y.
# Example: Compressed public key (33 bytes)
03 = prefix (Y is odd)
79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798 = X coordinate

# Most modern wallets use compressed public keys for efficiency
8

Step 5: signing a transaction (creating your "seal")

To prove you own the Bitcoin, you must sign the transaction. Here's how:

  1. Hash the transaction: Create a unique fingerprint (z) of the transaction using SHA-256
  2. Pick a random number (k): This is a temporary secret, called the nonce
  3. Calculate R = k ร— G: This gives you a point on the curve
  4. Calculate r = x-coordinate of R (mod n): Take the x value of that point
  5. Calculate s = kโปยน ร— (z + r ร— d) mod n: This is the second part of the signature
The signature is the pair (r, s)
Both r and s are 256-bit integers.
def sign(private_key, transaction_hash, k):
    # k must be random and NEVER reused!
    # Step 1: Calculate R = k ร— G
    R = scalar_multiply(k, G)
    r = R.x % n
    
    # Step 2: Calculate s = kโปยน ร— (z + r ร— d) mod n
    k_inv = modular_inverse(k, n)
    s = (k_inv * (transaction_hash + r * private_key)) % n
    
    return (r, s)
โš ๏ธ CRITICAL WARNING: If you ever reuse the same k for two different signatures, anyone can calculate your private key!

Formula if k is reused:
(sโ‚ - sโ‚‚) ร— k = (zโ‚ - zโ‚‚) mod n โ†’ solve for k โ†’ then d = (sโ‚ ร— k - zโ‚) / r

This is how the PlayStation 3's signing key was stolen in 2010.
๐ŸŽฒ How is k generated? Modern wallets use deterministic k generation (RFC 6979) โ€” k is derived from the private key and message hash using HMAC. This eliminates the risk of accidental k reuse while still appearing random.
9

Step 6: verifying a signature (anyone can do this)

Anyone can verify that you signed the transaction using only your Public Key (Q). No secret information needed.

  1. Check that r and s are between 1 and n-1 (valid range)
  2. Calculate w = sโปยน mod n (modular inverse of s)
  3. Calculate uโ‚ = (z ร— w) mod n
  4. Calculate uโ‚‚ = (r ร— w) mod n
  5. Calculate P = uโ‚ ร— G + uโ‚‚ ร— Q (point addition on the curve)
  6. Check if the x-coordinate of P equals r
If x-coordinate of P == r โ†’ Signature is VALID!
If not โ†’ Signature is INVALID!
def verify(public_key, transaction_hash, signature):
    r, s = signature
    
    # Check range
    if not (1 <= r < n) or not (1 <= s < n):
        return False
    
    # Calculate modular inverse of s
    w = modular_inverse(s, n)
    
    # Calculate u1 and u2
    u1 = (transaction_hash * w) % n
    u2 = (r * w) % n
    
    # Calculate P = u1ร—G + u2ร—Q
    P = point_addition(
        scalar_multiply(u1, G),
        scalar_multiply(u2, public_key)
    )
    
    # Verify
    return P.x == r
๐Ÿงฎ Why does this work mathematically?

Remember: Q = d ร— G (Public Key = Private Key ร— Generator)
Also: s = kโปยน ร— (z + r ร— d) โ†’ multiply both sides by k:
s ร— k = z + r ร— d โ†’ multiply both sides by sโปยน:
k = sโปยน ร— (z + r ร— d) = uโ‚ + uโ‚‚ ร— d

Then: P = uโ‚ ร— G + uโ‚‚ ร— Q = uโ‚ ร— G + uโ‚‚ ร— (d ร— G) = (uโ‚ + uโ‚‚ ร— d) ร— G = k ร— G = R
So x-coordinate of P must equal r. The math proves itself!
10

The order n (the "wrap-around" number)

Every elliptic curve has an order n โ€” the number of times you can add G to itself before you return to the starting point.

n ร— G = (point at infinity)
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
๐ŸŽฏ Why n matters: All private keys are modulo n. If your private key is larger than n, it's effectively reduced modulo n. This ensures all keys are within the same finite range.
๐Ÿ”ข n is extremely close to 2ยฒโตโถ (specifically 2ยฒโตโถ - 0x14551231950B75FC4402DA1732FC9BEBF). This means that almost every 256-bit number is a valid private key โ€” there's no "wasted" key space.
11

Low-S signatures (BIP62)

ECDSA signatures can have two valid representations: (r, s) and (r, n - s). Both are mathematically valid, but this creates transaction malleability โ€” someone could change your transaction's signature without invalidating it.

โš ๏ธ What is transaction malleability? If an attacker can change the signature (and thus the TXID) of your unconfirmed transaction, they might trick you into thinking the transaction failed when it actually succeeded โ€” potentially leading to double-spends.
Bitcoin's rule (BIP62): Only accept signatures where s โ‰ค n/2
This eliminates one of the two possible representations.
def enforce_low_s(r, s, n):
    # Ensure s is the "low" form
    if s > n // 2:
        s = n - s
    return (r, s)

# Most wallets now produce low-s signatures by default
๐Ÿ”ง SegWit's fix: Segregated Witness (SegWit) also fixes transaction malleability by moving signatures out of the transaction ID calculation.
12

Real Bitcoin transaction example

Let's look at the actual signature from the first Bitcoin transaction ever made (Satoshi to Hal Finney):

Transaction ID:
f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16
Input ScriptSig (the signature):
3045022100bcbe87cdc8aaf78e89c373a1528abd9fec9f45e78bc699e13b569a4c854b719d02206ffdeb0af05b61e7c81d4b8648339a3edbb6b23598233e7f6ab2aa5415816f23
๐Ÿ” Breaking down the ScriptSig:
30 โ€” DER signature header
45 โ€” Length of signature (69 bytes)
02 โ€” Integer marker for r
21 โ€” Length of r (33 bytes โ€” includes leading zero)
00bcbe87cdc8aaf78e89c373a1528abd9fec9f45e78bc699e13b569a4c854b719d โ€” r value
02 โ€” Integer marker for s
20 โ€” Length of s (32 bytes)
6ffdeb0af05b61e7c81d4b8648339a3edbb6b23598233e7f6ab2aa5415816f23 โ€” s value
๐Ÿ“œ Historical significance: This signature was created by Satoshi Nakamoto on January 12, 2009. It's the first cryptographic proof of ownership ever recorded on the Bitcoin blockchain. Hal Finney never spent these coins โ€” they remain in that address as a historical artifact.
13

Why secp256k1? (Satoshi's choice)

Satoshi chose the secp256k1 curve for several important reasons:

  • Koblitz curve: More efficient for computation than random curves
  • No backdoor concerns: Unlike NIST curves (which were designed by government agencies), secp256k1's parameters are "nothing up my sleeve" โ€” they come from simple, verifiable constants
  • Performance: Faster point multiplication due to the specific form (a=0)
  • 2ยฒโตโถ proximity: The field size is extremely close to 2ยฒโตโถ, making operations efficient on 256-bit computers
Curve parameters:
a = 0
b = 7
p = 2ยฒโตโถ - 2ยณยฒ - 977
n = 115792089237316195423570985008687907852837564279074904382605163141518161494337
๐Ÿ” The name "secp256k1" means:
sec = Standards for Efficient Cryptography
p = Prime field
256 = 256-bit field
k = Koblitz curve
1 = Version 1
๐Ÿ†š vs NIST curves (P-256, P-384): NIST curves have parameters derived from an unexplained random seed, raising concerns about potential backdoors. Satoshi's choice of secp256k1 was a deliberate move toward transparency and verifiability.
๐Ÿ“ Curve equation
yยฒ = xยณ + 7
๐Ÿ”‘ Private Key size
256 bits (32 bytes)
๐Ÿ”“ Public Key size
33 bytes (compressed)
โœ๏ธ Signature size
~70-72 bytes (DER)
๐ŸŽฏ Order (n)
~1.16 ร— 10โทโท

RIPEMD-160

Making Bitcoin addresses shorter and safer

1

What is RIPEMD-160 in simple terms?

RIPEMD-160 is a hashing algorithm that produces 160-bit (20-byte) outputs โ€” about 40 hex characters.

Think of it as a "short fingerprint" generator. While SHA-256 produces a 64-character fingerprint, RIPEMD-160 produces a 40-character version. Both uniquely identify data, but the shorter one is much easier for humans to read, write, and share.

๐Ÿ’ก Analogy: Imagine you have two ID cards. One has a 64-digit ID number, the other has a 40-digit ID number. Both uniquely identify you, but the shorter one is easier to type into a form.
Key properties:
โ€” Fixed 160-bit output (20 bytes / 40 hex chars)
โ€” One-way function (cannot reverse)
โ€” Collision-resistant (no two inputs produce same output)
โ€” Designed as a backup to SHA-1
2

Why doesn't Bitcoin use SHA-256 alone for addresses?

A SHA-256 hash of a public key is 64 hex characters (256 bits). Bitcoin addresses would look like:

1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef

That's far too long for people to use comfortably! By applying RIPEMD-160, we get:

1234567890abcdef1234567890abcdef12345678
๐Ÿ“ That's 24 fewer characters โ€” much easier to read, write, share, and less prone to typos.
๐ŸŽฏ Why 160 bits specifically? 160 bits provides 2ยนโถโฐ possible addresses โ€” that's about 1.46 ร— 10โดโธ possibilities. Even if every person on Earth had a billion addresses, we'd never run out. It's the sweet spot between security and usability.
3

The HASH160 process (how Bitcoin creates addresses)

Bitcoin combines SHA-256 and RIPEMD-160 in a process called HASH160:

Step 1: SHA-256(Public Key) โ†’ 256-bit result
Step 2: RIPEMD-160(Result) โ†’ 160-bit result
def hash160(public_key_bytes):
    # Step 1: SHA-256
    sha256_hash = hashlib.sha256(public_key_bytes).digest()
    
    # Step 2: RIPEMD-160
    ripemd160_hash = hashlib.new('ripemd160', sha256_hash).digest()
    
    return ripemd160_hash  # 20 bytes (160 bits)

# Example output (this becomes the core of a Bitcoin address)
# 0x00 + hash160 + checksum โ†’ Base58Check โ†’ 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
๐Ÿ”’ Why two hash functions? This two-step process ensures that even if SHA-256 is compromised (weakened by future attacks), RIPEMD-160 provides backup security. It's defense in depth โ€” like having two locks on your door instead of one.
4

How RIPEMD-160 works (the "parallel" design)

RIPEMD-160 processes data in 512-bit blocks, similar to SHA-256. However, it has a unique parallel processing design for extra security.

Two parallel lines of 5 registers each:
Left line: A, B, C, D, E (5 registers, 32 bits each)
Right line: A', B', C', D', E' (5 registers, 32 bits each)
Total: 10 registers processing in parallel
๐Ÿงฎ Why parallel lines? Imagine two people independently solving the same puzzle. If they both arrive at the same answer, you can be very confident it's correct. The parallel design makes RIPEMD-160 resistant to certain types of cryptographic attacks.
Processing structure:
โ€” 5 rounds of 16 steps each (80 total steps per line)
โ€” Each round uses different Boolean functions
โ€” After all rounds, left and right results are combined
๐Ÿ“Š The 5 Boolean functions (one per round):
Round 1: F = (X โŠ• Y โŠ• Z) โ€” XOR (parallel)
Round 2: F = (X โˆง Y) โˆจ (ยฌX โˆง Z) โ€” selection
Round 3: F = (X โˆจ ยฌY) โŠ• Z โ€” majority-like
Round 4: F = (X โˆง Z) โˆจ (Y โˆง ยฌZ) โ€” selection
Round 5: F = X โŠ• (Y โˆจ ยฌZ) โ€” XOR with OR
5

Step-by-step: how RIPEMD-160 processes data

Like SHA-256, RIPEMD-160 follows a similar structure:

  1. Padding: Message is padded to a multiple of 512 bits (same as SHA-256)
  2. Initialization: 5 registers (A,B,C,D,E) are set to initial values
  3. Parallel processing: The left and right lines run independently through 5 rounds
  4. Combination: After 80 steps, left and right results are added to the initial values
  5. Output: The final 160-bit hash is the concatenation of A,B,C,D,E
Initial values (hex):
A = 0x67452301
B = 0xEFCDAB89
C = 0x98BADCFE
D = 0x10325476
E = 0xC3D2E1F0
๐Ÿ“œ Historical note: These initial values are the same as MD5 (a predecessor hash function). RIPEMD-160 was designed to be compatible with MD5's initialization for easier implementation.
6

Security properties and collision resistance

RIPEMD-160 was designed as a backup hash function in case SHA-1 was broken. Here's how it compares:

Security level:
โ€” 160-bit output โ†’ 2โธโฐ security against birthday attacks
โ€” 2โธโฐ operations is currently impossible with classical computers
โ€” Quantum computers would reduce this to 2โดโฐ (still somewhat secure)
โš ๏ธ SHA-1 was broken in 2017 โ€” researchers found two different PDF files with the same SHA-1 hash. RIPEMD-160 remains secure because it was designed with more conservative safety margins.
๐Ÿ” "Conservative safety margins" means: RIPEMD-160 uses more rounds (80 vs SHA-1's 80 but with a more complex structure) and the parallel design makes attacks much harder to execute.
7

Real Bitcoin address example

Let's trace how Satoshi's public key became a Bitcoin address:

Step 1 โ€” Public Key (compressed):
04678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5fac
Step 2 โ€” SHA-256(Public Key):
0f715b7b5e67e0adabfca9b8b6d1e1f2f3e4f5f6f7f8f9fafbfcfdfeff0f1f2
Step 3 โ€” RIPEMD-160(Result):
62e907b15cbf27d5425399ebf6f0fb50ebb88f18
Step 4 โ€” Add version byte (0x00):
0062e907b15cbf27d5425399ebf6f0fb50ebb88f18
Step 5 โ€” Add checksum (first 4 bytes of double SHA-256):
0062e907b15cbf27d5425399ebf6f0fb50ebb88f18 + 88f0e1d2
Step 6 โ€” Base58Check encode:
1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa (Satoshi's Genesis address!)
๐Ÿ’ฐ This is the first Bitcoin address ever created! It belongs to Satoshi Nakamoto and contains the 50 BTC genesis reward (which is unspendable).
8

Why doesn't Bitcoin use ONLY RIPEMD-160?

RIPEMD-160 alone would be weaker than the combined HASH160 approach. Here's why:

  • Pre-image resistance: SHA-256 provides 256-bit security, RIPEMD-160 provides 160-bit. The combination gives the stronger of the two.
  • Collision resistance: Even if one hash function is weakened, the other still provides protection.
  • Historical reasons: When Bitcoin was created, SHA-1 was showing weakness. RIPEMD-160 was a conservative backup.
๐Ÿ”’ Defense in depth: Just like a house might have both a deadbolt and a chain lock, Bitcoin uses two hash functions because security in depth is always better than relying on a single function.
๐Ÿ“ Output size
160 bits (20 bytes / 40 hex chars)
๐Ÿ”„ Rounds
80 (5 rounds ร— 16 steps)
๐Ÿ“ฆ Block size
512 bits
๐Ÿ” Security level
2โธโฐ (birthday attack)

HMAC-SHA512

Creating a family of keys from one master key

1

What is HMAC in simple terms?

HMAC (Hash-based Message Authentication Code) is like a secret handshake that only you and someone who knows your secret can perform.

It combines a secret key with a message to create a unique, verifiable hash. Only someone with the same secret key can produce the same HMAC.

๐Ÿค Analogy: Imagine you and a friend agree on a secret password. When you want to send a message, you append the password and sign it. Your friend can verify it's from you by checking the signature with the same password. An attacker who doesn't know the password can't forge valid signatures.
HMAC formula:
HMAC(K, m) = H( (K โŠ• opad) || H( (K โŠ• ipad) || m) )

Where:
opad = 0x5c5c5c... (outer padding)
ipad = 0x363636... (inner padding)
โŠ• = XOR operation
2

Why does Bitcoin need HMAC-SHA512?

Bitcoin uses HMAC-SHA512 for HD Wallets (Hierarchical Deterministic wallets, BIP32). This allows you to generate thousands of addresses from a single seed phrase.

๐ŸŽฏ Without HMAC-SHA512: You'd need to back up your wallet after every transaction. With it, you just need your 12 or 24-word seed phrase, and the wallet can re-generate every address you've ever used.
๐Ÿ’ก Think of it like a tree: Your master seed is the trunk. HMAC-SHA512 is the mathematical formula that determines how branches (child keys) and leaves (addresses) grow from that trunk โ€” always in the same pattern.
3

How HMAC works (the "keyed hash" construction)

HMAC prevents length extension attacks โ€” a vulnerability where an attacker can append data to a hash without knowing the original message.

HMAC-SHA512 process:

Step 1: If key is longer than block size, hash it first
Step 2: Pad key to block size (512 bits for SHA-512)
Step 3: XOR key with ipad (0x36 repeated)
Step 4: Append message and hash
Step 5: XOR key with opad (0x5c repeated)
Step 6: Append previous hash and hash again
def hmac_sha512(key, message):
    block_size = 128  # SHA-512 block size (128 bytes = 1024 bits)
    
    # Step 1: If key is longer than block size, hash it
    if len(key) > block_size:
        key = hashlib.sha512(key).digest()
    
    # Step 2: Pad key to block size
    key = key.ljust(block_size, b'\x00')
    
    # Step 3: Create inner and outer padded keys
    ipad = bytes([0x36] * block_size)
    opad = bytes([0x5c] * block_size)
    
    inner_key = bytes([k ^ i for k, i in zip(key, ipad)])
    outer_key = bytes([k ^ o for k, o in zip(key, opad)])
    
    # Step 4: Inner hash
    inner_hash = hashlib.sha512(inner_key + message).digest()
    
    # Step 5: Outer hash (final HMAC)
    return hashlib.sha512(outer_key + inner_hash).digest()
4

BIP32 key derivation (how child keys are created)

Bitcoin uses HMAC-SHA512 to derive child keys from parent keys in HD wallets:

Derivation formula:
HMAC-SHA512(chain_code, parent_key || index)

The output is 512 bits (64 bytes), split into two 256-bit halves:

First 32 bytes (256 bits): Child private key
Last 32 bytes (256 bits): New chain code (for further derivation)
def derive_child_key(parent_key, parent_chain, index):
    # Combine parent key with index (4 bytes, big-endian)
    data = parent_key + index.to_bytes(4, 'big')
    
    # HMAC-SHA512 with parent chain code as key
    hmac_result = hmac.new(parent_chain, data, hashlib.sha512).digest()
    
    # Split into child key and new chain code
    child_key = hmac_result[:32]      # 256-bit child private key
    child_chain = hmac_result[32:]    # 256-bit new chain code
    
    return child_key, child_chain
๐ŸŒณ Why this works: The chain code ensures that even if someone gets one child key, they can't derive other children or the parent. Each child has its own unique chain code, creating a secure tree structure.
5

Normal vs Hardened derivation (the apostrophe)

BIP32 defines two types of key derivation:

Normal derivation (index < 2ยณยน):
data = parent_public_key || index
Hardened derivation (index โ‰ฅ 2ยณยน):
data = 0x00 || parent_private_key || index
โš ๏ธ The apostrophe (') means hardened: m/44'/0'/0'/0/0
Hardened derivation uses the parent private key, so it cannot be reversed. Normal derivation uses the parent public key, which allows anyone with the parent public key to derive child public keys.
๐Ÿ” Why hardened? If someone steals a normal child key, they might be able to derive the parent key. Hardened derivation prevents this โ€” it's more secure for higher-level accounts.
6

BIP44 derivation path (the "address formula")

BIP44 defines a standard path for organizing keys across different cryptocurrencies and accounts:

m / purpose' / coin_type' / account' / change / address_index
๐Ÿ“ Breaking down the path:
m: Master seed (your 12/24 words)
purpose': 44' for BIP44 (always)
coin_type': 0' for Bitcoin, 60' for Ethereum, etc.
account': 0', 1', 2'... (separate accounts like "savings" vs "checking")
change: 0 for receiving addresses, 1 for change addresses
address_index: 0, 1, 2... (increment for each new address)
Example: m/44'/0'/0'/0/0
        โ†‘    โ†‘   โ†‘   โ†‘ โ†‘
        โ”‚    โ”‚   โ”‚   โ”‚ โ””โ”€โ”€ First address (index 0)
        โ”‚    โ”‚   โ”‚   โ””โ”€โ”€โ”€โ”€ Receiving address (not change)
        โ”‚    โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ First account (account 0)
        โ”‚    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ Bitcoin (coin type 0)
        โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ BIP44 purpose
๐ŸŒ Why standardize? The derivation path ensures that any BIP44-compatible wallet can recover your funds from your seed phrase, even if you use a different wallet app. It's like a universal language for wallets.
7

Seed phrase generation (BIP39)

Your 12 or 24-word seed phrase is the human-readable representation of your master seed. Here's how it works:

  1. Generate 128-256 bits of entropy (randomness)
  2. Add a checksum (SHA-256 of entropy, take first bits)
  3. Split into 11-bit chunks (each chunk indexes into a word list)
  4. Look up words from the BIP39 English word list (2048 words)
Entropy to seed:
Entropy (128-256 bits) โ†’ Add checksum โ†’ Split into 11-bit groups โ†’ 12-24 words
๐Ÿ“– The word list: The BIP39 word list has 2048 words (2ยนยน). Each word encodes exactly 11 bits of entropy. 12 words = 132 bits (128 entropy + 4 checksum).
# Example: 12-word seed phrase
"abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about"
# This encodes 128 bits of entropy (all zeros) + checksum
โš ๏ธ CRITICAL: Never type your seed phrase into any website, app, or computer. Never take a photo of it. Store it on metal (steel washers, CryptoSteel) to protect from fire/water damage. Your seed phrase IS your Bitcoin.
8

From seed phrase to master key (PBKDF2)

The seed phrase is converted into a master seed using PBKDF2 (Password-Based Key Derivation Function 2) with HMAC-SHA512:

Master Seed = PBKDF2(seed_phrase, salt="mnemonic", iterations=2048, key_length=64)
from hashlib import pbkdf2_hmac

def mnemonic_to_seed(mnemonic, passphrase=""):
    # Salt is "mnemonic" + optional passphrase
    salt = "mnemonic" + passphrase
    
    # PBKDF2 with HMAC-SHA512, 2048 iterations
    seed = pbkdf2_hmac(
        'sha512',
        mnemonic.encode('utf-8'),
        salt.encode('utf-8'),
        iterations=2048,
        dklen=64  # 512-bit seed
    )
    return seed

# The seed then becomes the master key for your HD wallet
๐Ÿ” What's PBKDF2? It's a "key stretching" function that makes brute-force attacks slower. 2048 iterations means an attacker has to compute HMAC-SHA512 2048 times for each guess โ€” 2048ร— harder!
๐Ÿ”‘ Optional passphrase: BIP39 allows an optional passphrase (sometimes called the "25th word"). Adding a passphrase creates a completely different wallet, even from the same seed phrase. It's an extra layer of security.
๐Ÿ“ Output size
512 bits (64 bytes)
๐Ÿ”‘ Key size
256 bits (32 bytes) per half
๐Ÿ“‹ Standards
BIP32, BIP39, BIP44
๐Ÿ”„ PBKDF2 iterations
2048 (BIP39)

Proof of Work (PoW)

The energy-based consensus mechanism that secures Bitcoin

1

What is Proof of Work in simple terms?

Proof of Work is like a digital lottery where buying tickets requires real computational energy.

Miners compete to find a special number (nonce) that makes the block hash start with many zeros. The only way to find it is to guess billions of times per second. When someone finds it, they prove they did the work, and the network rewards them with Bitcoin.

๐ŸŽฐ Analogy: Imagine a lottery where each ticket costs $1 in electricity. The more tickets you buy (more computing power), the higher your chance of winning. But unlike a normal lottery, the winner gets to propose the next block of transactions.
The fundamental equation:
Hash(Block Header) < Target

If the hash is lower than the target โ†’ VALID BLOCK!
2

Why does Bitcoin need Proof of Work?

Proof of Work solves two critical problems in decentralized systems:

  • ๐Ÿ›ก๏ธ Sybil Attack prevention: Without PoW, an attacker could create millions of fake nodes to overwhelm the network. PoW requires energy, not identities.
  • โฑ๏ธ Byzantine Generals Problem: How do distant nodes agree on the truth? PoW creates an objective measure (most work = truth).
  • ๐Ÿ’ฐ No free lunch: Creating blocks costs real money, so miners only produce valid blocks.
โš ๏ธ Without Proof of Work: Anyone could rewrite history, create unlimited coins, or censor transactions. PoW is what makes Bitcoin "trustless" โ€” you don't need to trust anyone because the math proves the work.
3

Step 1: assembling the block header

Miners collect pending transactions and build a block header. The header is exactly 80 bytes โ€” no more, no less.

Block header structure (80 bytes):
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ Field โ”‚ Size โ”‚ Description โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ Version โ”‚ 4 bytes โ”‚ Software version โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ Previous Block Hash โ”‚ 32 bytes โ”‚ Link to parent block โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ Merkle Root โ”‚ 32 bytes โ”‚ Summary of all transactions โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ Timestamp โ”‚ 4 bytes โ”‚ Mining time (Unix seconds) โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ nBits (Target) โ”‚ 4 bytes โ”‚ Difficulty target โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ Nonce โ”‚ 4 bytes โ”‚ Random number (the guess) โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
๐Ÿ“ฆ Why 80 bytes? The fixed size allows SPV wallets (like phone wallets) to store only block headers instead of full blocks โ€” years of headers fit in ~67 MB!
4

Step 2: calculating the block hash

Miners calculate the double SHA-256 hash of the block header:

Block Hash = SHA-256( SHA-256( Block Header ) )

This produces a 256-bit number (64 hex characters).

def calculate_block_hash(version, prev_hash, merkle_root, timestamp, bits, nonce):
    # Assemble the 80-byte header
    header = (
        version.to_bytes(4, 'little') +
        bytes.fromhex(prev_hash) +
        bytes.fromhex(merkle_root) +
        timestamp.to_bytes(4, 'little') +
        bits.to_bytes(4, 'little') +
        nonce.to_bytes(4, 'little')
    )
    
    # Double SHA-256
    return hashlib.sha256(hashlib.sha256(header).digest()).hexdigest()
๐Ÿ”’ Why double SHA-256? Single SHA-256 is vulnerable to length-extension attacks. Double hashing prevents this โ€” it's an extra layer of security.
5

Step 3: understanding the target

The target is a 256-bit number that determines how hard mining is. The block hash must be numerically lower than the target.

Example target (simplified):
0x0000000000000000000320283a032748cef8227873ff4872689bf23f1cda83a9
๐ŸŽฏ The more leading zeros, the harder:
0x00000000... (many zeros) = HARD (like today)
0x0000ffff... (fewer zeros) = EASY (like 2009)
nBits (compact format):
The target is stored in 4 bytes using scientific notation.
nBits = 0x17098f96 โ†’ Target = 0x098f96 ร— 256^(0x17 - 3)
๐Ÿ“ Why compact format? The full target is 256 bits (32 bytes). Storing it in the block header would waste space. The compact format compresses it to 4 bytes โ€” a 8ร— space saving!
6

Step 4: the mining loop (guessing billions of times)

Miners increment the nonce (and extranonce) billions of times per second until they find a valid hash:

def mine_block(header, target):
    nonce = 0
    extranonce = 0
    
    while True:
        # Update extranonce in coinbase transaction
        # (changes the merkle root)
        header.update_coinbase(extranonce)
        header.update_merkle_root()
        
        # Try all 4 billion nonce values
        for nonce in range(2**32):
            header.nonce = nonce
            block_hash = calculate_block_hash(header)
            
            # Convert hash to integer for comparison
            hash_int = int(block_hash, 16)
            
            if hash_int < target:
                return header, block_hash
        
        # Exhausted 4 billion nonces, try a new extranonce
        extranonce += 1
โšก Why both nonce AND extranonce? The nonce field is only 4 bytes (4 billion possibilities). Modern ASICs exhaust that in milliseconds. The extranonce (in the coinbase transaction) changes the merkle root, creating an entirely new header to try โ€” effectively unlimited attempts!
๐Ÿ“Š Mining speed (2024): The Bitcoin network performs ~600 exahashes per second (600 ร— 10ยนโธ hashes/sec). That's 600 quintillion guesses per second!
7

Difficulty โ€” how hard mining really is

Difficulty is a human-friendly number that represents how hard mining is relative to the first block in 2009.

Difficulty = (Maximum Target) / (Current Target)
Maximum Target = 0x00000000FFFF0000000000000000000000000000000000000000000000000000
๐Ÿ“ˆ Difficulty progression:
2009: Difficulty 1 (anyone could mine on a CPU)
2013: Difficulty 1 million (ASICs arrived)
2021: Difficulty 20 trillion
2024: Difficulty ~80 trillion (100 million ร— harder than 2009!)
Probability of finding a block:
P = (Hashrate ร— Block Time) / (2ยฒโตโถ / Difficulty)

For a solo miner with 1 TH/s (10ยนยฒ hashes/sec):
P โ‰ˆ 1 / (2ยฒโตโถ / (10ยนยฒ ร— 600 ร— 80e12)) โ‰ˆ 1 in 10ยฒโธ per block!
๐Ÿ’ก This is why mining pools exist: Solo mining is like buying a lottery ticket โ€” you might get lucky, but probability is extremely low. Pools combine thousands of miners to get steady, predictable payouts.
8

Step 5: broadcast and reward

When a miner finds a valid hash, they broadcast the block to the network. Other nodes verify the work (instantly โ€” one hash calculation) and if valid, the miner receives:

Reward = Block Subsidy + Transaction Fees
๐Ÿ’ฐ Current block subsidy (2024): 3.125 BTC per block (halved every 210,000 blocks โ‰ˆ every 4 years)
Halving schedule:
2009: 50 BTC per block
2012: 25 BTC per block
2016: 12.5 BTC per block
2020: 6.25 BTC per block
2024: 3.125 BTC per block
... continues until 2140
โฐ Why halving? The halving ensures Bitcoin's supply is predictable and disinflationary. It's the mechanism that enforces the 21 million cap.
9

Security analysis โ€” why 6 confirmations?

Satoshi calculated the probability of an attacker catching up from z blocks behind:

P = (q/p)^z
where p = honest mining power, q = attacker power
๐Ÿ“Š Example (q=0.1, p=0.9):
z=1 โ†’ 11% chance
z=2 โ†’ 1.2% chance
z=3 โ†’ 0.13% chance
z=6 โ†’ 0.0024% chance (1 in 42,000)
๐Ÿ’ฐ This is why exchanges wait for 6 confirmations: After 6 blocks (~1 hour), the chance of reversal is statistically impossible. For smaller amounts, 1-3 confirmations may be sufficient.
10

What about a 51% attack?

If an attacker controls >50% of total hashrate, they can:

  • โœ… Double-spend: Send coins to an exchange, then mine a longer chain without that transaction
  • โœ… Censor transactions: Refuse to include specific transactions
  • โœ… Prevent confirmations: Make the network unresponsive

They CANNOT:

  • โŒ Steal existing coins (requires private keys)
  • โŒ Create new coins (consensus rules prevent this)
  • โŒ Change history beyond the reorganization window
Cost to 51% attack Bitcoin (2024):
โ‰ˆ $10-15 billion in hardware + ongoing electricity costs
๐Ÿ’ก Why Bitcoin is secure: It's more profitable to mine honestly than to attack. The game theory works โ€” attackers would destroy the value of the very coins they're trying to steal.
โฑ๏ธ Target block time
10 minutes
๐Ÿ”„ Difficulty adjustment
Every 2,016 blocks (~2 weeks)
โšก Current hashrate
~1,00 EH/s (April 2026)
๐Ÿ’ฐ Current subsidy
3.125 BTC per block
๐Ÿ“… Final block subsidy
~2140 (last satoshi)

Nakamoto Consensus

The longest chain rule โ€” solving the Byzantine Generals Problem

1

What is Nakamoto Consensus in simple terms?

Nakamoto Consensus is the rule that determines which version of the blockchain is the "true" history.

When different nodes see different blocks, they all follow the chain with the most cumulative Proof of Work โ€” the chain that required the most energy to build. This creates an objective way to agree on truth without a central authority.

๐Ÿ—ณ๏ธ Analogy: Imagine a voting system where each "vote" is a block, and each vote costs real electricity. The candidate (chain) with the most total work (not necessarily the most votes) wins. This makes cheating extremely expensive.
The rule:
Choose the chain with maximum cumulative difficulty
Cumulative Difficulty = ฮฃ( difficulty of each block )
2

The Byzantine Generals Problem (why consensus is hard)

The Byzantine Generals Problem is a famous computer science dilemma that describes the difficulty of reaching consensus in a decentralized system where some participants may be malicious.

๐Ÿ›๏ธ The story: Imagine several generals surrounding a Byzantine city. They must attack at the same time to win. But some generals may be traitors who send conflicting messages. How can the loyal generals coordinate?
The problem applied to Bitcoin:
โ€” Nodes are generals (no central leader)
โ€” Transactions are attack orders
โ€” Malicious nodes are traitors (can lie, double-spend)
โ€” Question: How do honest nodes agree on the truth?
๐Ÿ’ก Satoshi's solution: The proof-of-work chain is a solution to the Byzantine Generals' Problem. I'll try to rephrase it in that context." โ€” Satoshi Nakamoto, Nov 13, 2008
3

The longest chain rule (how it works)

Nodes always follow the chain with the most cumulative Proof of Work โ€” not necessarily the most blocks, but the chain that required the most total energy.

Work(chain) = ฮฃ difficulty(block)
Choose chain with maximum work
def get_best_chain(chains):
    best_chain = None
    max_work = 0
    
    for chain in chains:
        # Sum the difficulty of each block (Proof of Work)
        total_work = sum(block.difficulty for block in chain)
        
        if total_work > max_work:
            max_work = total_work
            best_chain = chain
    
    return best_chain

# Nodes always build on the best chain
def choose_best_chain(chain_a, chain_b):
    if get_total_work(chain_a) > get_total_work(chain_b):
        return chain_a
    return chain_b
๐Ÿ“ Why cumulative difficulty, not just block count? An attacker could create many low-difficulty blocks quickly. Cumulative difficulty ensures that real work matters, not just quantity.
4

Fork resolution (when two miners find blocks simultaneously)

Sometimes two miners find valid blocks at nearly the same time. This creates a temporary fork โ€” two competing chains.

Visualization:
Chain A: Block 100 โ†’ Block 101A โ†’ Block 102A โ†’ Block 103A (wins)
Chain B: Block 100 โ†’ Block 101B โ†’ stops here (loses)
๐Ÿ”€ What happens:
1. Miner X finds Block 101A, broadcasts it
2. Miner Y finds Block 101B at nearly the same time
3. Different nodes see different blocks first
4. Network splits temporarily (a fork)
5. Next block found โ€” Miner X finds Block 102A
6. Chain A is now longer (102 blocks vs 101)
7. All nodes switch to Chain A
8. Chain B becomes an orphan block (discarded)
๐Ÿ’ก Orphan blocks: Valid blocks that didn't become part of the main chain. Their transactions return to the mempool. Miners lose the block reward โ€” wasted energy.
โš ๏ธ Orphan rate: Typically less than 1% thanks to fast network propagation (Compact Blocks, BIP152).
5

Confirmations โ€” why you wait

Each new block added on top of a transaction is called a confirmation. The more confirmations, the harder it is to reverse.

Confirmation levels:
0 confirmations: Transaction in mempool (risky for large amounts)
1 confirmation: In the latest block (safe for small amounts)
3 confirmations: Buried under 3 blocks (safe for medium amounts)
6 confirmations: Buried under 6 blocks (~1 hour) โ€” industry standard
๐Ÿ”’ Why 6 confirmations? Satoshi's calculations show that after 6 blocks, the probability of a successful 51% attack drops below 0.1% (assuming 10% attacker hashrate). It's the sweet spot between security and speed.
6

Game theory โ€” why honest mining is rational

Nakamoto Consensus works because of economic incentives. Miners are rewarded for following the rules and penalized for breaking them.

Honest mining reward: Block subsidy + transaction fees
Attack reward: Temporary double-spend (then Bitcoin becomes worthless)
๐Ÿ’ก The key insight: If an attacker gains 51% power, they face a choice:
โ€” Attack: Double-spend (short-term gain, destroys Bitcoin's value)
โ€” Mine honestly: Collect steady rewards (long-term profit)
Rational miners choose honesty because a destroyed Bitcoin has no value.
๐Ÿ’ฐ This is "economic security": It's more profitable to follow the rules than to break them. The game theory aligns self-interest with network security.
๐Ÿ“ Security threshold
>50% hashrate for attack
โฑ๏ธ Confirmation standard
6 blocks (~1 hour)
๐Ÿ”„ Orphan rate
<1%
๐ŸŽฏ Solved problem
Byzantine Generals

Difficulty Adjustment

Maintaining the 10-minute heartbeat

1

What is Difficulty Adjustment in simple terms?

Difficulty Adjustment is like a thermostat for Bitcoin mining โ€” it automatically makes mining harder or easier to keep block times at 10 minutes.

As more miners join the network, blocks would be found faster than 10 minutes. The difficulty adjustment makes the puzzle harder, slowing blocks back down. When miners leave, it makes the puzzle easier, speeding blocks back up.

๐ŸŒก๏ธ Analogy: Imagine a room with a thermostat. If people enter the room (more miners), the temperature rises (blocks found faster). The thermostat turns on the AC (increases difficulty) to bring the temperature back to 72ยฐF (10 minutes).
The goal: Maintain 10 minutes between blocks
The method: Adjust difficulty every 2,016 blocks (~2 weeks)
2

Why is difficulty adjustment necessary?

Without difficulty adjustment, block times would vary wildly as miners join or leave:

Without adjustment:
More miners โ†’ Faster blocks โ†’ Unstable network
Fewer miners โ†’ Slower blocks โ†’ Long wait times
๐Ÿ“Š Real example (2024 vs 2009):
2009: Difficulty 1, block time ~10 minutes
2024: Difficulty ~80 trillion, block time still ~10 minutes
The network has adjusted 100 million ร— harder, yet blocks still come every 10 minutes!
โฑ๏ธ Without adjustment: If 2024's hashrate existed in 2009, blocks would be found in microseconds โ€” the chain would be unusable.
3

The adjustment formula (how it's calculated)

Every 2,016 blocks (~2 weeks), the network calculates:

New Difficulty = Old Difficulty ร— (Actual Time / Expected Time)

Expected Time = 2,016 ร— 600 seconds = 1,209,600 seconds (exactly 2 weeks)
def adjust_difficulty(old_difficulty, actual_time_seconds):
    expected_time = 2016 * 600  # 1,209,600 seconds
    
    # Calculate ratio (how much faster/slower blocks were found)
    ratio = actual_time_seconds / expected_time
    
    # Cap the ratio to prevent extreme changes (safety measure)
    # Max 4x harder, max 0.25x easier
    ratio = max(0.25, min(4.0, ratio))
    
    # Adjust difficulty
    new_difficulty = old_difficulty / ratio
    
    return new_difficulty

# Example 1: Blocks found too fast (600,000 seconds vs 1,209,600)
new_diff = adjust_difficulty(80_000_000_000, 600000)
print(f"New Difficulty: {new_diff:.0f}")  # ~160,000,000,000 (doubled!)

# Example 2: Blocks found too slow (1,800,000 seconds)
new_diff = adjust_difficulty(80_000_000_000, 1800000)
print(f"New Difficulty: {new_diff:.0f}")  # ~53,000,000,000 (reduced by ~34%)
โš ๏ธ Why cap at 4x and 0.25x? Prevents malicious miners from manipulating difficulty by timestamp tricks. Without caps, an attacker could create extreme timestamps to make mining artificially easy or hard.
4

Difficulty vs Target (inverse relationship)

Difficulty and Target are inversely related:

Difficulty = (Maximum Target) / (Current Target)
Maximum Target = 0x00000000FFFF0000000000000000000000000000000000000000000000000000
๐Ÿ“Š Relationship:
Higher Difficulty โ†’ Lower Target โ†’ Fewer valid hashes โ†’ Harder mining
Lower Difficulty โ†’ Higher Target โ†’ More valid hashes โ†’ Easier mining
Example:
Difficulty 1 โ†’ Target = 0x00000000FFFF... (easy)
Difficulty 80 trillion โ†’ Target = 0x000000000000... (very small, hard)
5

Why 2,016 blocks? (the retargeting interval)

Satoshi chose 2,016 blocks because it's approximately 2 weeks at 10 minutes per block.

2,016 blocks ร— 10 minutes = 20,160 minutes = 336 hours = 14 days
๐Ÿ“ Why this number?
โ€” Long enough to smooth out random variance (mining luck)
โ€” Short enough to respond to changes in network hashrate
โ€” 2,016 is exactly 2ยนยน ร— 2 (easy to calculate in binary)
๐Ÿ’ก If adjustment was too frequent: Attackers could manipulate difficulty with timestamp tricks.
If adjustment was too infrequent: Network would be unresponsive to large hashrate changes.
6

Historical difficulty (how it's evolved)

Difficulty progression:
2009: Difficulty 1 (anyone could mine on a CPU)
2010: Difficulty ~100 (GPU mining begins)
2011: Difficulty ~1,000 (FPGA mining)
2012: Difficulty ~1,000,000 (ASICs arrive)
2013: Difficulty ~10,000,000
2016: Difficulty ~100,000,000,000
2021: Difficulty ~20,000,000,000,000
2024: Difficulty ~80,000,000,000,000
๐Ÿ“ˆ 80 trillion difficulty means: The current block hash must have about 19 leading zeros (76 bits). The probability of finding a valid hash is 1 in (2ยฒโตโถ / 80 trillion) โ‰ˆ 1 in 2โธโฐ โ€” extremely small!
7

Real example: the 2017 hashrate surge

In 2017, Bitcoin's price surged from $1,000 to $20,000, causing massive miner entry. Hashrate increased rapidly.

Before surge: Difficulty ~200 billion, block time ~10 minutes
During surge: Block times dropped to 5-8 minutes
After adjustment: Difficulty increased to ~1 trillion, block time back to 10 minutes
๐Ÿ“Š The self-regulating system: More miners โ†’ Faster blocks โ†’ Difficulty increases โ†’ Mining profitability returns to equilibrium โ†’ Network stabilizes
๐Ÿ’ก This is why Bitcoin works regardless of price: The difficulty adjustment ensures the network always finds blocks every 10 minutes, whether hashrate is 1 TH/s or 600 EH/s.
๐Ÿ”„ Adjustment interval
2,016 blocks (~2 weeks)
๐Ÿ“Š Max change
4x or 0.25x per adjustment
๐ŸŽฏ Target block time
10 minutes
๐Ÿ“ˆ Current difficulty
~80 trillion (2024)

Merkle Tree

Efficiently summarizing thousands of transactions into a single hash

1

What is a Merkle Tree in simple terms?

A Merkle Tree is like a family tree of hashes โ€” it summarizes many transactions into a single "root" hash.

Imagine you have 4 transactions. You hash each one (leaves). Then you hash pairs of leaves together (branches). Then you hash the branches together. You end up with one root hash that represents all 4 transactions.

๐ŸŒณ Analogy: Think of a tournament bracket. Each game produces a winner (hash). The final winner (root) represents the entire tournament. If you change any player in any game, the final winner changes.
Visualization (4 transactions):
                            [Merkle Root]
                           /            \
                      [Hash AB]        [Hash CD]
                     /       \        /       \
                 [TX1]      [TX2]  [TX3]      [TX4]
2

Why does Bitcoin need Merkle Trees?

Merkle Trees enable two critical features in Bitcoin:

  • ๐Ÿ“ฆ Compact block headers: The block header stores only the Merkle Root (32 bytes), not all transactions. This keeps headers small (80 bytes).
  • ๐Ÿ“ฑ SPV wallets (mobile wallets): Your phone can verify a transaction exists using only a Merkle Proof (few hundred bytes) instead of downloading the entire block (1-4 MB).
๐Ÿ’ก Without Merkle Trees: Block headers would need to store all transaction hashes โ€” thousands of hashes per block, making headers huge and SPV impossible.
3

Step 1: hash individual transactions (leaves)

Each transaction is hashed using double SHA-256 to create the leaves of the tree:

Leafโ‚ = SHA-256( SHA-256( TXโ‚ ) )
Leafโ‚‚ = SHA-256( SHA-256( TXโ‚‚ ) )
Leafโ‚ƒ = SHA-256( SHA-256( TXโ‚ƒ ) )
Leafโ‚„ = SHA-256( SHA-256( TXโ‚„ ) )
def hash_transaction(tx_data):
    # Double SHA-256 (Bitcoin standard)
    return hashlib.sha256(hashlib.sha256(tx_data).digest()).hexdigest()

tx1_hash = hash_transaction(b"Alice pays Bob 1 BTC")
tx2_hash = hash_transaction(b"Charlie pays Dave 2 BTC")
tx3_hash = hash_transaction(b"Eve pays Frank 0.5 BTC")
tx4_hash = hash_transaction(b"Grace pays Henry 0.75 BTC")

leaves = [tx1_hash, tx2_hash, tx3_hash, tx4_hash]
4

Step 2: pair and hash (create branches)

Pairs of adjacent leaves are concatenated and hashed to form parent nodes:

Parentโ‚ = SHA-256( SHA-256( Leafโ‚ + Leafโ‚‚ ) )
Parentโ‚‚ = SHA-256( SHA-256( Leafโ‚ƒ + Leafโ‚„ ) )
def hash_pair(hash_a, hash_b):
    # Concatenate and double hash
    combined = hash_a + hash_b
    return hashlib.sha256(hashlib.sha256(combined.encode()).digest()).hexdigest()

hash_ab = hash_pair(leaves[0], leaves[1])
hash_cd = hash_pair(leaves[2], leaves[3])
๐Ÿ”ข What if there's an odd number of leaves? The last leaf is duplicated. For example, with 3 transactions: Pair [TX1, TX2] and [TX3, TX3] โ€” the odd leaf is hashed with itself.
5

Step 3: repeat until one root remains

The process continues until only one hash remains โ€” the Merkle Root:

Merkle Root = SHA-256( SHA-256( Parentโ‚ + Parentโ‚‚ ) )
def build_merkle_root(hashes):
    """Build a Merkle tree and return the root"""
    if len(hashes) == 1:
        return hashes[0]
    
    next_level = []
    for i in range(0, len(hashes), 2):
        if i + 1 < len(hashes):
            combined = hashes[i] + hashes[i+1]
        else:
            # Odd number of items โ€” duplicate the last one
            combined = hashes[i] + hashes[i]
        next_level.append(hash_pair(combined))
    
    return build_merkle_root(next_level)

# Build the full tree
merkle_root = build_merkle_root(leaves)
print(f"Merkle Root: {merkle_root}")
๐ŸŒณ This root goes into the block header! The 32-byte Merkle Root is stored in every block header, summarizing all transactions in that block.
6

Space savings (why this is revolutionary)

A block with 4,000 transactions:
Without Merkle tree: Need to store all 4,000 TXIDs (128 KB)
With Merkle tree: Store only the root (32 bytes)
Space savings: 99.98%!
๐Ÿ“ฆ Block header size: Always 80 bytes, regardless of how many transactions are in the block.
- Version: 4 bytes
- Previous Block Hash: 32 bytes
- Merkle Root: 32 bytes
- Timestamp: 4 bytes
- nBits: 4 bytes
- Nonce: 4 bytes
๐Ÿ’ก This enables SPV: Your phone can store years of block headers (67 MB) instead of the full blockchain (600+ GB). This is why mobile Bitcoin wallets exist!
๐ŸŒณ Tree height
logโ‚‚(number of transactions)
๐Ÿ“ Root size
32 bytes (always)
๐Ÿ”ข Proof size
~logโ‚‚(n) hashes
๐Ÿ“Š Invented by
Ralph Merkle (1979)

Merkle Proof

Proving a transaction exists without downloading the whole block

1

What is a Merkle Proof in simple terms?

A Merkle Proof is like a "receipt" that proves a specific transaction is in a block without downloading the entire block.

It contains the target transaction hash and the "sibling" hashes needed to recompute the Merkle Root. If the recomputed root matches the block header's root, the transaction is proven to be in that block.

๐Ÿ“œ Analogy: Imagine you want to prove a specific word exists on page 50 of a sealed book. You don't need to send the whole book โ€” just page 50 and the hashes of pages 49 and 51 (the siblings). The verifier can check if the hashes match.
Proof size: For a block with 4,000 transactions, proof is only ~12 hashes (384 bytes) โ€” not 4,000 transactions (~1 MB)!
2

Why does Bitcoin need Merkle Proofs?

Merkle Proofs enable SPV (Simplified Payment Verification) wallets โ€” mobile wallets that don't download the full blockchain.

  • ๐Ÿ“ฑ Mobile wallets: Your phone stores only block headers (80 bytes each). When you receive a payment, you request a Merkle Proof from a full node.
  • ๐Ÿ”’ Trust-minimized verification: You don't need to trust the full node โ€” the math proves the transaction exists.
  • โšก Fast sync: Your wallet can verify payments in seconds instead of downloading years of history.
๐Ÿ’ก Without Merkle Proofs: Every wallet would need to download the full blockchain (600+ GB) to verify transactions. Mobile Bitcoin wallets wouldn't exist.
3

How a Merkle Proof works (step by step)

To prove TXโ‚ƒ is in the block, you only need:

Proof components:
1. TXโ‚ƒ (the target transaction hash)
2. TXโ‚„ (its sibling โ€” the hash next to it)
3. Hash AB (the sibling of the parent branch)
def verify_merkle_proof(tx_hash, proof, merkle_root):
    """
    tx_hash: The transaction hash to verify
    proof: List of (sibling_hash, direction) tuples
    merkle_root: The root from the block header
    """
    current_hash = tx_hash
    
    for sibling, direction in proof:
        if direction == 'left':
            # Sibling is on the left: Hash(Sibling + Current)
            current_hash = hash_pair(sibling, current_hash)
        else:
            # Sibling is on the right: Hash(Current + Sibling)
            current_hash = hash_pair(current_hash, sibling)
    
    return current_hash == merkle_root

# Example proof for TXโ‚ƒ
tx_hash = "hash_of_TX3"
proof = [
    ("hash_of_TX4", "right"),   # sibling of TX3
    ("hash_of_AB", "left")       # sibling of parent
]
merkle_root = "root_from_header"

is_valid = verify_merkle_proof(tx_hash, proof, merkle_root)
print(f"Transaction verified: {is_valid}")
๐Ÿ”ข Why direction matters: Hash(A + B) is completely different from Hash(B + A). The proof must tell the verifier which order to hash in.
4

Visual example (proving TXโ‚ƒ exists)

Merkle tree:
                            [Merkle Root]
                           /            \
                      [Hash AB]        [Hash CD]  โ† sibling of parent
                     /       \        /       \
                 [TX1]      [TX2]  [TX3]      [TX4]  โ† sibling of TX3
                                         โ†‘
                                       target
๐Ÿ“Š Verification steps:
1. Start with TXโ‚ƒ
2. Hash TXโ‚ƒ + TXโ‚„ (sibling on right) โ†’ get Hash CD
3. Hash Hash AB (sibling on left) + Hash CD โ†’ get Merkle Root
4. Compare with root from block header
5. Match โ†’ TXโ‚ƒ is in the block!
5

Proof size (why it's efficient)

Number of hashes in proof = logโ‚‚(number of transactions)
๐Ÿ“Š Examples:
2 transactions โ†’ 1 hash in proof
4 transactions โ†’ 2 hashes in proof
8 transactions โ†’ 3 hashes in proof
4,000 transactions โ†’ ~12 hashes in proof (~384 bytes)
1,000,000 transactions โ†’ ~20 hashes in proof (~640 bytes)
๐Ÿ’ก This logarithmic scaling is revolutionary: As blocks get larger (more transactions), the proof size grows very slowly. Merkle trees enable Bitcoin to scale while keeping SPV wallets efficient.
๐Ÿ“ Proof size
O(logโ‚‚ n) hashes
๐Ÿ”ข Example (4K TX)
~12 hashes (384 bytes)
โšก Verification time
O(logโ‚‚ n) hashes
๐Ÿ“ฑ Used in
SPV wallets, Light clients

Base58Check

Human-readable address encoding with error detection

1

What is Base58Check in simple terms?

Base58Check is a way to encode Bitcoin addresses so they're easy to read, hard to mistype, and include error detection.

It removes visually confusing characters (0, O, I, l) and adds a checksum that catches typos before you send money to the wrong address.

๐Ÿ”ค The Base58 alphabet:
"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
Notice: No 0 (zero), O (capital o), I (capital i), l (lowercase L)
๐Ÿ’ก Why remove these characters? They look similar and cause confusion when reading or typing addresses. A single typo could send money to the wrong place!
2

Why does Bitcoin need Base58Check?

Bitcoin addresses are binary data (20 bytes + version + checksum). Base58Check converts this binary into a human-readable string that:

  • ๐Ÿ“ Is easy to read and share (no confusing characters)
  • โœ… Has built-in error detection (catches typos)
  • ๐Ÿ”ข Is shorter than hex (34 chars vs 40 hex chars)
  • ๐Ÿ”’ Self-validating (invalid addresses are rejected by wallets)
๐Ÿ“Š Length comparison (same data):
Raw binary: 25 bytes
Hexadecimal: 50 characters
Base58Check: ~34 characters
3

How Base58 encoding works

Base58 works like converting a number to a different base (base-58 instead of base-10 or base-16).

ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def base58_encode(data):
    """Encode bytes to Base58 string"""
    # Convert bytes to a big integer
    value = int.from_bytes(data, 'big')
    
    encoded = ''
    while value > 0:
        value, remainder = divmod(value, 58)
        encoded = ALPHABET[remainder] + encoded
    
    # Add leading '1's for each leading zero byte
    for byte in data:
        if byte == 0:
            encoded = '1' + encoded
        else:
            break
    
    return encoded

# Example: Encode the number 100
print(base58_encode(b'\x64'))  # '2q' (100 in base-58)
๐Ÿ”ข Why base-58 specifically?
- Base-64 includes ambiguous characters (+ and /)
- Base-32 is less efficient (more characters)
- Base-58 is the sweet spot: efficient + unambiguous
4

The checksum (how errors are detected)

Base58Check adds a 4-byte checksum to the end of the payload:

Checksum = First 4 bytes of SHA-256( SHA-256( payload ) )
def base58check_encode(payload):
    # Step 1: Add version byte (0x00 for Bitcoin mainnet)
    versioned = b'\x00' + payload
    
    # Step 2: Calculate checksum (first 4 bytes of double SHA-256)
    checksum = hashlib.sha256(hashlib.sha256(versioned).digest()).digest()[:4]
    
    # Step 3: Append checksum and Base58 encode
    return base58_encode(versioned + checksum)

# Example: Creating a Bitcoin address
public_key_hash = bytes.fromhex('62e907b15cbf27d5425399ebf6f0fb50ebb88f18')
address = base58check_encode(public_key_hash)
print(f"Address: {address}")  # 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
โœ… Error detection: If you mistype a character, the checksum won't match, and the wallet will show "Invalid address" instead of sending to the wrong place.
5

Bitcoin address example (Genesis address)

Satoshi's Genesis address:
1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
๐Ÿ” Breaking it down:
- Starts with '1' โ†’ Mainnet P2PKH address
- 34 characters total
- Contains no confusing characters (0, O, I, l)
- Last 6 characters are part of the checksum
๐Ÿ“Š Address types by first character:
1 โ†’ Legacy P2PKH (old addresses)
3 โ†’ P2SH (multi-sig, compatibility)
bc1 โ†’ Bech32 (SegWit, modern addresses)
bc1p โ†’ Bech32m (Taproot, newest)
๐Ÿ”ค Alphabet size
58 characters
โœ… Checksum size
4 bytes (32 bits)
๐Ÿ”ข Error detection
~99.9999%
๐Ÿ“ Used for
Addresses, WIF private keys

Bech32

Modern SegWit address encoding with BCH error correction

1

What is Bech32 in simple terms?

Bech32 is a modern address format for Bitcoin that's all lowercase, easier to read, and can actually correct typos.

Unlike Base58Check (which only detects errors), Bech32 uses a special error-correcting code that can fix typos automatically. Bech32 addresses start with "bc1" for Bitcoin mainnet.

๐Ÿ”ค Example Bech32 address:
bc1qar0srrr7xfkvy5l643lydnw9re59gtzzwf5mdq
๐Ÿ’ก Why "Bech32"? The name comes from: Bitcoin + echo (for the BCH error code) + 32 (32-bit error correction).
2

Why does Bitcoin need Bech32?

Bech32 was introduced with SegWit (Segregated Witness) to support native SegWit addresses with several advantages:

  • ๐Ÿ”ง Error correction: Can detect up to 4 errors and correct up to 3
  • ๐Ÿ”ค All lowercase: Easier to read, type, and QR code
  • ๐Ÿ’ฐ Lower fees: Native SegWit addresses are more efficient (smaller transaction size)
  • ๐Ÿ“ No ambiguous characters: Like Base58Check, removes confusing letters
  • โœ… Better checksum: BCH code detects more errors than Base58Check
๐Ÿ’ก Bech32 vs Base58Check:
Base58Check: Only detects errors (you must retype)
Bech32: Can actually CORRECT errors (fixes typos automatically!)
3

Bech32 address structure (the three parts)

A Bech32 address has three distinct parts:

bc1qar0srrr7xfkvy5l643lydnw9re59gtzzwf5mdq
โ”‚โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
โ”‚โ”‚ โ”‚
hrp separator data + checksum
๐Ÿ“‹ Breaking it down:
hrp (Human Readable Part): "bc" for Bitcoin mainnet
Separator: Always "1" (never appears in data)
Data: Witness program (variable length, usually 32 or 40 characters)
Checksum: 6 characters (BCH error-correcting code)
๐Ÿ”ข Address lengths:
Legacy (Base58Check): 26-34 characters
Bech32: 42-62 characters (slightly longer, but more features)
4

BCH error correction (how it fixes typos)

Bech32 uses a BCH (Bose-Chaudhuri-Hocquenghem) code โ€” a powerful error-correcting code that can:

Error detection: Up to 4 errors in any address
Error correction: Up to 3 errors (can fix common typos)
Guaranteed detection: Single character errors (always caught)
๐Ÿ“ How it works: The checksum is 6 characters (30 bits) of redundant data. If you make a typo, the wallet can determine what the correct character should be based on the checksum.
# Example: Typo correction in Bech32
# Original: bc1qar0srrr7xfkvy5l643lydnw9re59gtzzwf5mdq
# Typo:     bc1qar0srrr7xfkvy5l643lydnw9re59gtzzwf5m**dq** (swapped last two chars)
# Wallet can automatically correct this! (Base58Check would just say "invalid")
โš ๏ธ Not all errors can be corrected: If there are more than 3 errors, Bech32 will detect the error but cannot fix it. Still safer than Base58Check!
5

HRP (Human Readable Part) โ€” what "bc1" means

HRP values:
"bc" = Bitcoin mainnet (production network)
"tb" = Bitcoin testnet (for testing)
"bcrt" = Bitcoin regtest (for development)
๐Ÿ’ก Why "bc"? It's short for "Bitcoin" and ensures addresses from different networks don't get confused. You can't accidentally send mainnet Bitcoin to a testnet address โ€” the HRP won't match!
๐Ÿ”ข Taproot addresses (bc1p):
bc1p5d7rjq7g6rdk2yhzks9smlaqtedr4dekq08ge8qt4acm9lrq2s5a
Taproot addresses use "bc1p" (p for pay-to-taproot) and Bech32m encoding (slightly different checksum).
6

Data conversion (8 bits โ†’ 5 bits)

Bech32 uses a 5-bit alphabet (32 possible characters), but data is normally 8-bit bytes. So Bech32 converts 8-bit bytes to 5-bit groups.

Conversion process:
8-bit bytes โ†’ convert to 5-bit groups โ†’ map to alphabet
# Bech32 alphabet (32 characters)
BECH32_ALPHABET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def convert_bits(data, from_bits, to_bits, pad=True):
    """Convert between bit lengths (e.g., 8-bit bytes to 5-bit groups)"""
    acc = 0
    bits = 0
    ret = []
    maxv = (1 << to_bits) - 1
    max_acc = (1 << (from_bits + to_bits - 1)) - 1
    
    for value in data:
        acc = ((acc << from_bits) | value) & max_acc
        bits += from_bits
        while bits >= to_bits:
            bits -= to_bits
            ret.append((acc >> bits) & maxv)
    
    if pad and bits:
        ret.append((acc << (to_bits - bits)) & maxv)
    
    return ret

# Convert witness program (20 bytes) to 5-bit groups
witness_program = bytes.fromhex('62e907b15cbf27d5425399ebf6f0fb50ebb88f18')
five_bit_data = convert_bits(witness_program, 8, 5, pad=True)
๐Ÿ’ก Why 5 bits? 2โต = 32, so we can use a 32-character alphabet. This is efficient for QR codes and manual entry.
7

SegWit and Bech32 (why they go together)

Bech32 was specifically designed for SegWit (Segregated Witness) addresses. SegWit separates transaction signatures (witness data) from transaction data, which:

  • ๐Ÿ’ฐ Lowers fees: Witness data is discounted (0.75 bytes per byte)
  • ๐Ÿ”ง Fixes malleability: Transaction IDs no longer include signatures
  • โšก Enables Lightning: Malleability fix made Lightning possible
๐Ÿ“Š Fee discount calculation:
Transaction weight = 4 ร— base size + 1 ร— witness size
Base data (non-witness) counts 4ร—, witness data counts 1ร—
Bech32 SegWit addresses use witness data โ†’ lower fees!
# Fee comparison (example)
Legacy address (P2PKH):   1 input, 2 outputs โ†’ 192 vbytes โ†’ fee: 192 ร— 10 = 1920 sats
SegWit address (Bech32):  1 input, 2 outputs โ†’ 144 vbytes โ†’ fee: 144 ร— 10 = 1440 sats
# Savings: 25% lower fees!
8

Bech32m (the Taproot upgrade)

Bech32m is a modified version of Bech32 used for Taproot addresses (bc1p...). It fixes a security issue where some Bech32 addresses could be confused with short data.

Differences:
Bech32: Used for SegWit (bc1q...)
Bech32m: Used for Taproot (bc1p...)
Taproot version = 1 (witness version 1)
๐Ÿ” Taproot advantages:
- Better privacy (complex scripts look like simple payments)
- Schnorr signatures (smaller, batchable)
- MAST (only reveals the used spending condition)
๐Ÿ’ก Version 1 witness program (Taproot) uses Bech32m while version 0 (SegWit) uses Bech32. The checksum algorithm is slightly different to prevent confusion.
๐Ÿ”ค Alphabet size
32 characters
โœ… Error detection
Up to 4 errors
๐Ÿ”ง Error correction
Up to 3 errors
๐Ÿ“ Used for
SegWit (bc1), Taproot (bc1p)

Gossip Protocol

How transactions spread across the network in seconds

1

What is Gossip Protocol in simple terms?

Gossip Protocol is how Bitcoin nodes spread information โ€” like a rumor spreading through a crowd.

When a node hears a new transaction, it tells its neighbors. Those neighbors tell their neighbors, and within seconds, every node in the world knows about it. There's no central broadcaster โ€” information spreads peer-to-peer.

๐Ÿ—ฃ๏ธ Analogy: Imagine you hear a juicy secret at a party. You whisper it to 5 friends. They each whisper it to 5 more friends. Within minutes, everyone at the party knows the secret โ€” and there was no central announcer.
๐Ÿ’ก This is why your wallet shows incoming payments within seconds: Gossip protocol propagates transactions globally in 1-2 seconds, even though block confirmation takes ~10 minutes.
2

Why does Bitcoin need Gossip Protocol?

Without a central server, Bitcoin needs a way for all nodes to learn about new transactions and blocks. Gossip Protocol solves this:

  • ๐ŸŒ No central point of failure: If a node goes down, information routes around it
  • โšก Fast propagation: Information reaches 90% of nodes in 1-2 seconds
  • ๐Ÿ›ก๏ธ Resilient to attacks: Even if some nodes are malicious, honest nodes still communicate
  • ๐Ÿ“ˆ Scalable: Works with thousands of nodes
๐Ÿ“Š Propagation speed:
Within 1 second: ~50% of nodes
Within 2 seconds: ~90% of nodes
Within 10 seconds: ~99% of nodes
3

Peer connections (the network mesh)

Each Bitcoin node connects to 8-12 random peers. This creates a mesh network where information can spread quickly.

๐ŸŒ Network visualization:
Node A connects to B, C, D
Node B connects to A, E, F
Node C connects to A, G, H
Node D connects to A, I, J
Information spreads like a rumor through the mesh.
class BitcoinNode:
    def __init__(self):
        self.peers = []  # List of connected nodes
        self.seen_transactions = set()  # Avoid infinite loops
        self.mempool = []  # Unconfirmed transactions
    
    def add_peer(self, peer):
        self.peers.append(peer)
        peer.peers.append(self)  # Bidirectional connection
    
    def broadcast_transaction(self, tx):
        # Prevent infinite loops
        if tx.id in self.seen_transactions:
            return
        
        self.seen_transactions.add(tx.id)
        self.mempool.append(tx)
        
        # Tell all peers about the transaction
        for peer in self.peers:
            peer.receive_transaction_inv(tx.id)
    
    def receive_transaction_inv(self, tx_id):
        # Request the full transaction if we don't have it
        if tx_id not in self.seen_transactions:
            self.request_transaction(tx_id)
4

Inventory messages (INV) โ€” efficient broadcasting

Nodes don't broadcast full transactions immediately. They first send an INV (inventory) message containing just transaction hashes (32 bytes each).

INV message format:
[txidโ‚, txidโ‚‚, txidโ‚ƒ, ...] (each 32 bytes)
๐Ÿ’ก Why INV first? Your peer might already have the transaction. Sending the full transaction would waste bandwidth. INV lets peers request only what they need.
# INV message example
inv_message = {
    "type": "inv",
    "inventory": [
        "f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16",
        "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b"
    ]
}

# Peer responds with GETDATA for transactions they don't have
if tx_id not in local_mempool:
    send_message("getdata", tx_id)
๐Ÿ“Š Bandwidth saving:
Full transaction: ~250-500 bytes
INV message: 32 bytes per transaction โ†’ 90%+ bandwidth savings!
5

Anti-spam mechanisms (stopping bad rumors)

To prevent spam and denial-of-service attacks, nodes follow strict rules:

  • ๐Ÿ” Validation first: Don't relay transactions you haven't validated
  • ๐Ÿ›‘ Stop bad rumors: Don't relay invalid transactions (ban the source)
  • ๐Ÿšฆ Rate limiting: Limit how many messages a peer can send
  • โš ๏ธ Ban malicious peers: Disconnect and ban peers sending invalid data
โšก DoS protection: If a node sends an invalid transaction, the receiving node will ban that peer. This prevents attackers from flooding the network with garbage.
๐Ÿ”Œ Default connections
8-12 outbound
โšก Propagation time
1-2 seconds (90% of nodes)
๐Ÿšฆ Max inbound
117 connections
๐Ÿ“ก Message types
INV, GETDATA, TX, BLOCK

Compact Block Relay (BIP152)

Reducing block propagation bandwidth

1

What is Compact Block Relay in simple terms?

Compact Block Relay is like sending only the "missing pieces" of a puzzle instead of the whole puzzle.

Most nodes already have most transactions in their mempool. Instead of sending the full block (1-4 MB), nodes send only the short IDs of transactions. The receiving node reconstructs the block using transactions it already has, requesting only the few it's missing.

๐Ÿงฉ Analogy: Imagine you and a friend both have almost the same jigsaw puzzle. Instead of sending all 1,000 pieces, your friend sends you a list of piece IDs. You already have 998 of them, so you only need to request the 2 missing pieces. This saves 99.8% of the data!
Bandwidth saving:
Full block: 1-4 MB
Compact block: ~10-40 KB
Savings: 90-95%!
2

Why does Bitcoin need Compact Block Relay?

Before BIP152 (2017), nodes had to download full blocks from peers. This caused problems:

  • ๐Ÿ“ก Bandwidth waste: Blocks contain 90%+ transactions already in mempool
  • โฑ๏ธ Slow propagation: Large blocks took seconds to transfer, increasing orphan rates
  • ๐ŸŒ Centralization risk: Only large nodes could afford the bandwidth
๐Ÿ’ก The insight: When a miner finds a block, most nodes already have those transactions in their mempool (from gossip protocol). So why send them again? Just send the transaction IDs!
๐Ÿ“Š Before BIP152: 1 MB block โ†’ 1 MB download for every peer
After BIP152: 1 MB block โ†’ ~20 KB download (98% savings!)
3

Short transaction IDs (how to identify transactions efficiently)

Instead of sending full 32-byte TXIDs, compact blocks use 6-byte short IDs (48 bits).

Short ID = SipHash_64(nonce, txid) & 0xffffffffffff
๐Ÿ”ข Size comparison:
Full TXID: 32 bytes
Short ID: 6 bytes
80% smaller!
def make_short_id(tx_hash, nonce):
    """
    Create a 6-byte short ID using SipHash
    SipHash is fast, non-cryptographic, and collision-resistant
    """
    import siphash
    
    # SipHash with 64-bit output
    sip = siphash.SipHash_2_4(key=nonce.to_bytes(16, 'little'))
    sip.update(tx_hash)
    
    # Take lowest 48 bits (6 bytes)
    return sip.raw_value() & 0xffffffffffff

# Collision probability: ~1 in 2โดโธ (very low, safe for Bitcoin)
โš ๏ธ What if there's a collision? Two different transactions could have the same short ID (extremely rare). If a collision is detected, the sender falls back to sending the full transaction.
4

Block reconstruction (how it works step by step)

# Sender side (miner)
compact_block = {
    'header': block_header,
    'short_ids': [make_short_id(tx) for tx in block.transactions],
    'prefilled_txns': []  # Transactions not in most peers' mempool
}

# Receiver side (node)
def reconstruct_block(compact_block, mempool):
    transactions = []
    missing = []
    
    # Try to find each transaction in mempool using short ID
    for short_id in compact_block.short_ids:
        tx = find_tx_by_short_id(mempool, short_id)
        if tx:
            transactions.append(tx)
        else:
            missing.append(short_id)
    
    # Request missing transactions (should be very few)
    if missing:
        requested_txs = request_missing(missing)
        transactions.extend(requested_txs)
    
    # Reconstruct full block
    return Block(compact_block.header, transactions)
๐Ÿ’ก Prefilled transactions: For the first few transactions (especially the coinbase), the sender includes the full transaction because they're unlikely to be in anyone's mempool yet.
5

Bandwidth savings (real numbers)

Without BIP152:
Block size: 1,000,000 bytes
Propagation to 10 peers: 10 MB
With BIP152:
Short IDs: 4,000 ร— 6 = 24,000 bytes
Missing transactions: ~200 bytes
Total: ~25 KB per peer
Propagation to 10 peers: 250 KB
Savings: 97.5%!
๐Ÿ“Š Real network impact: After BIP152 deployment in 2017, block propagation bandwidth dropped by over 90%, allowing larger blocks (SegWit) without centralizing the network.
๐Ÿ“ Short ID size
6 bytes (vs 32 bytes)
๐Ÿ’พ Bandwidth saving
~90-95%
๐Ÿ”ข BIP number
BIP152 (2017)
โšก Hash function
SipHash (fast, non-crypto)

UTXO Selection (Coin Selection)

Choosing which coins to spend

1

What is UTXO Selection in simple terms?

UTXO Selection is how your wallet decides which coins (UTXOs) to spend when you make a payment.

Your wallet has many UTXOs of different sizes (like having a $10 bill, a $5 bill, and three $1 bills). To pay $7, you could use the $5 + two $1 bills (3 inputs) or the $10 bill (1 input but you get $3 change). The algorithm chooses the best combination.

๐Ÿ’ฐ Analogy: Imagine your physical wallet contains: $20 bill, $10 bill, $5 bill, and three $1 bills. You need to pay $12. Your options:
- Use $10 + $1 + $1 (3 bills) โ†’ best for privacy, no large change
- Use $20 (1 bill) โ†’ simplest, but you get $8 change (creates new UTXO)
- Use $5 + $5 + $1 + $1 (4 bills) โ†’ worst (many inputs = higher fees)
2

Why does Bitcoin need UTXO selection?

Unlike a bank account (single balance), Bitcoin uses UTXOs โ€” many individual "coins" of different values. Your wallet must decide which ones to spend:

  • ๐Ÿ’ฐ Fee optimization: More inputs = larger transaction = higher fees
  • ๐Ÿ”’ Privacy: Using certain UTXOs can reveal your transaction history
  • ๐Ÿงน Dust management: Small UTXOs ("dust") may cost more to spend than they're worth
  • โšก Speed: More inputs = more signatures = slower to sign
๐Ÿ’ก This is a knapsack problem! It's the same as packing a backpack: choose items (UTXOs) to reach a target value (payment) while optimizing for weight (fees) and minimizing waste.
3

The knapsack problem (why it's non-trivial)

Given: UTXOs: [0.01, 0.05, 0.1, 0.5, 1.0] BTC
Target: 0.15 BTC payment
๐Ÿ“Š Possible solutions:
A) 0.1 + 0.05 (2 inputs, 0.00 waste) โ†’ optimal
B) 0.5 (1 input, 0.35 waste) โ†’ simple but creates large change
C) 0.1 + 0.01 + 0.05 (3 inputs, 0.01 waste) โ†’ more fees
D) 1.0 (1 input, 0.85 waste) โ†’ terrible, large change
โš ๏ธ Why can't we always use the best solution? The knapsack problem is NP-hard โ€” there's no quick algorithm for large numbers of UTXOs. Wallets use approximations.
4

Branch and Bound (B&B) โ€” the modern algorithm

Modern wallets (Bitcoin Core, Electrum) use Branch and Bound for optimal selection:

def select_utxos_bnb(utxos, target, fee_rate):
    """
    Branch and Bound coin selection
    Finds the optimal UTXO set minimizing waste
    """
    # Sort by value (largest first for efficiency)
    utxos = sorted(utxos, key=lambda x: x.value, reverse=True)
    
    best_selection = None
    best_waste = float('inf')
    
    def search(selected, selected_value, effective_value, index):
        nonlocal best_selection, best_waste
        
        # Calculate waste (excess value + fees)
        waste = (selected_value - target) + fee_rate * len(selected)
        
        # Prune branches that can't be better
        if waste < 0 or waste >= best_waste:
            return  # This path is useless, don't explore further
        
        # Found a solution
        if selected_value >= target:
            if waste < best_waste:
                best_waste = waste
                best_selection = selected.copy()
            return
        
        # No more UTXOs to consider
        if index >= len(utxos):
            return
        
        # Branch 1: Include this UTXO
        selected.append(utxos[index])
        search(selected, 
               selected_value + utxos[index].value,
               effective_value + utxos[index].effective_value, 
               index + 1)
        selected.pop()
        
        # Branch 2: Skip this UTXO
        search(selected, selected_value, effective_value, index + 1)
    
    search([], 0, 0, 0)
    return best_selection
๐ŸŒณ Branch and Bound = smart searching: Instead of trying every combination (2โฟ possibilities), B&B intelligently prunes branches that can't be optimal. This makes it feasible for wallets with hundreds of UTXOs.
5

Privacy considerations (how coin selection affects anonymity)

๐Ÿ”’ Different strategies, different privacy:
Largest-first: Use biggest UTXOs first (good privacy, may overpay)
Smallest-first: Use smallest UTXOs first (consolidates dust, may reveal patterns)
Random selection: Harder to trace (best privacy, but unpredictable fees)
B&B with privacy: Adds random variation to selection
๐Ÿ’ก Why largest-first is more private: If you always use your largest UTXOs, an observer can't easily link transactions because your spending patterns are less predictable. Using the same UTXOs repeatedly would reveal your identity.
โš ๏ธ Address reuse is the enemy of privacy: No matter which UTXOs you select, reusing the same address destroys privacy. HD wallets automatically generate new addresses to prevent this.
๐Ÿ“Š Complexity
O(nยฒ) worst case
๐ŸŽฏ Algorithm type
Knapsack / Branch and Bound
๐Ÿ”’ Privacy strategies
Largest-first, Random, Smallest-first
๐Ÿ“ Used in
Bitcoin Core, Electrum, Most wallets

Fee Estimation

Predicting optimal transaction fees

1

What is Fee Estimation in simple terms?

Fee Estimation predicts how much you should pay to get your transaction confirmed within a certain time.

Your wallet looks at the current mempool (unconfirmed transactions), analyzes how long transactions with different fees took to confirm recently, and recommends a fee rate (satoshis per vbyte) that will likely get your transaction into a block soon.

๐Ÿš— Analogy: Imagine a highway with toll lanes. Higher toll = faster lane. Fee estimation is like checking traffic before driving โ€” if traffic is bad, you might pay a higher toll to use the express lane. If traffic is light, you can pay the minimum.
Fee rate = fee / transaction size (satoshis per vbyte)
2

Why does Bitcoin need fee estimation?

Bitcoin has limited block space (1-4 MB per block). When many people want to send transactions, they compete by offering higher fees. Fee estimation helps users:

  • โฑ๏ธ Avoid overpaying: Don't pay high fees when network is quiet
  • โšก Avoid delays: Don't pay too little when network is congested
  • ๐Ÿ’ฐ Optimize costs: Balance speed vs. cost based on your needs
๐Ÿ’ก Without fee estimation: You'd have to guess fees manually โ€” too low and your transaction gets stuck, too high and you waste money. Fee estimation automates this.
3

The mempool (where unconfirmed transactions wait)

All unconfirmed transactions sit in the mempool (memory pool) โ€” a waiting room for transactions waiting to be mined. Miners pick the highest-fee transactions first.

๐Ÿ“Š Mempool visualization:
Fee rate 50+ sat/vB โ†’ Immediate (next block)
Fee rate 20-50 sat/vB โ†’ 1-3 blocks (10-30 min)
Fee rate 10-20 sat/vB โ†’ 3-10 blocks (30-100 min)
Fee rate 5-10 sat/vB โ†’ 10+ blocks (100+ min)
Fee rate 1-5 sat/vB โ†’ May never confirm during high traffic
โš ๏ธ The mempool changes constantly: Fees can spike during high traffic (e.g., price surges, NFT mints) and drop during quiet times (weekends, late nights).
4

How fee estimation works (the algorithm)

class FeeEstimator:
    def __init__(self):
        self.fee_history = []  # Recent blocks' fee rates
    
    def estimate_fee(self, target_blocks):
        """
        Estimate fee rate for target confirmation time
        target_blocks: 1 = fastest, 6 = standard, 25+ = slowest
        """
        # Collect fee rates from recent blocks
        fee_rates = []
        for block in self.recent_blocks(100):
            # Get minimum fee rate that would have been included in this block
            min_fee = block.get_minimum_included_fee()
            fee_rates.append(min_fee)
        
        # Sort and pick percentile based on target
        fee_rates.sort()
        
        # Faster targets require higher percentiles
        if target_blocks == 1:
            percentile = 90  # Fastest (90th percentile)
        elif target_blocks <= 3:
            percentile = 75  # Medium (75th percentile)
        elif target_blocks <= 6:
            percentile = 50  # Standard (50th percentile)
        else:
            percentile = 25  # Slow (25th percentile)
        
        # Return the fee rate at that percentile
        index = int(len(fee_rates) * percentile / 100)
        return fee_rates[index]
๐Ÿ“Š Percentile explanation:
90th percentile = 90% of recent transactions had lower fees โ†’ very likely to confirm quickly
50th percentile = average fee โ†’ likely to confirm within standard time
5

Mempool-based estimation (more accurate)

A more accurate method uses the current mempool state directly:

def estimate_from_mempool(mempool, target_blocks):
    """
    Estimate fee by simulating block inclusion from current mempool
    """
    # Sort mempool by fee rate (highest first)
    sorted_txs = sorted(mempool, key=lambda x: x.fee_rate, reverse=True)
    
    block_size = 0
    for tx in sorted_txs:
        block_size += tx.vsize
        # When we've filled enough blocks, return the fee rate
        if block_size >= target_blocks * 1_000_000:  # 1 MB per block
            return tx.fee_rate
    
    # If mempool is empty, return minimum fee
    return 1  # 1 sat/vbyte minimum
๐Ÿ’ก Why mempool-based is more accurate: Historical fee rates might not reflect current congestion. The mempool shows exactly how many transactions are waiting and at what fees.
6

Current fee rates (real-time examples)

Typical fee ranges (2024):
Urgent (next block): 50+ sat/vB
Fast (1-3 blocks): 20-50 sat/vB
Economy (3-10 blocks): 10-20 sat/vB
Slow (10+ blocks): 5-10 sat/vB
Minimum: 1 sat/vB (may never confirm)
๐Ÿ’ฐ Real examples:
$10 transaction (250 vbytes):
- Urgent: 250 ร— 50 = 12,500 sats (~$5)
- Economy: 250 ร— 10 = 2,500 sats (~$1)
- Lightning: 0-1 sat (~$0.0001)
โš ๏ธ Fees vary by time of day:
Highest: Weekdays (US business hours)
Lowest: Weekends, late nights (UTC)
7

What if you underpay? (RBF and CPFP)

If your fee estimate was too low, you have two options to speed up a stuck transaction:

RBF (Replace-By-Fee): Replace the transaction with a higher-fee version
CPFP (Child-Pays-For-Parent): Create a new transaction spending the stuck UTXO with high fee
๐Ÿ’ก RBF requires opt-in: Your wallet must enable RBF when creating the transaction. If not, you can only use CPFP.
# RBF example: Bump fee from 10 to 50 sat/vB
original_tx.fee_rate = 10
bumped_tx = replace_transaction(original_tx, new_fee_rate=50)

# CPFP example: Parent stuck at 5 sat/vB
parent_tx.fee_rate = 5
child_tx = create_child_transaction(parent_tx, child_fee_rate=100)
# Miners see: parent(5) + child(100) = average 52.5 sat/vB โ†’ includes both!
๐Ÿ”„ Update frequency
Every block (~10 min)
๐Ÿ“Š Lookback window
100-144 blocks (~1 day)
โšก Fast fee (2024)
20-50 sat/vB
๐ŸŒ Slow fee (2024)
5-10 sat/vB
๐Ÿ’ฌ Feedback
Thank you for your feedback! ๐Ÿ™