Bitcoin Parser: Understanding Raw Transactions

Learn how to parse raw Bitcoin transactions manually — from hex to human-readable

📖 What is a Raw Bitcoin Transaction?

A raw Bitcoin transaction is the actual data that gets broadcast to the network and stored on the blockchain. It's not human-readable: it's a string of hexadecimal digits that encodes the transaction's version, inputs, outputs, and locktime.

💡 Why Learn to Parse Raw Transactions?

When you use a wallet, it hides all this complexity. But understanding the raw structure helps you:

  • Debug transaction issues
  • Understand how Bitcoin actually works
  • Build custom transaction tools
  • Verify what your wallet is doing

🏗️ Transaction Structure Overview

🔢 Note: Little-Endian vs Big-Endian

Bitcoin uses little-endian for most integer fields. That means the bytes are stored in reverse order. For example, 01000000 in little-endian means 1.
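
As a quick illustration of the byte order described above, Python's built-in int.from_bytes can decode a little-endian field directly. This is only a minimal sketch; the variable names are chosen for illustration.

```python
# Decode the 4-byte version field from its little-endian hex form.
raw = bytes.fromhex("01000000")

as_little = int.from_bytes(raw, "little")  # least significant byte first
as_big = int.from_bytes(raw, "big")        # the result if the byte order is misread

print(as_little)  # 1
print(as_big)     # 16777216

# Going the other way: encode the integer 1 as a 4-byte little-endian field.
print((1).to_bytes(4, "little").hex())  # 01000000
```

Reading the same four bytes as big-endian gives a very different number, which is the most common mistake when parsing these fields by hand.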

📦 RAW TRANSACTION STRUCTURE (total size: variable)

| Field | Size | Example (hex) | Decoded | Description |
|---|---|---|---|---|
| Version | 4 bytes | 01000000 | 1 (little-endian) | Transaction format version. Currently 1 or 2; determines which rules apply. |
| Input Count | varint | 01 | 1 input | Number of inputs (UTXOs being spent). Encoded as a variable-length integer (varint). |
| INPUT #1: Previous TX Hash | 32 bytes | a530bdca8a35b98eb5a62c196191b9782cb119a19c423f765c32a7d33877f8dd | reversed for display | Hash of the previous transaction. The UTXO being spent comes from this transaction. |
| INPUT #1: Output Index | 4 bytes | 00000000 | 0 | Which output from the previous TX. 0-indexed position of the UTXO. |
| INPUT #1: ScriptSig Length | varint | 8a | 138 bytes | Length of the unlocking script. Variable-length integer. |
| INPUT #1: ScriptSig (Unlocking Script) | 138 bytes | 47 + 30440220... (DER signature) + 01 + 41 + 04010203... (public key) | push opcode, DER signature, sighash type, push opcode, public key | Unlocking script. Proves ownership of the private key. |
| INPUT #1: Sequence | 4 bytes | ffffffff | 4294967295 (max) | Sequence number. Used for replace-by-fee (RBF) and locktime; ffffffff = disabled. |
| Output Count | varint | 01 | 1 output | Number of outputs, i.e. where the bitcoins are being sent. |
| OUTPUT #1: Amount | 8 bytes | 40420f0000000000 | 1,000,000 satoshis (0.01 BTC) | Value in satoshis. 1 BTC = 100,000,000 satoshis. |
| OUTPUT #1: ScriptPubKey Length | varint | 19 | 25 bytes | Length of the locking script. Variable-length integer. |
| OUTPUT #1: ScriptPubKey (Locking Script) | 25 bytes | 76 a9 14 0000000000000000000000000000000000000000 88 ac | P2PKH (Pay to Public Key Hash) | Locking script (spending conditions). 76a914{20-byte hash}88ac is the script behind a standard P2PKH address. |
| Locktime | 4 bytes | 00000000 | 0 (no locktime) | Earliest time/block the TX can be mined. 0 = no locktime; otherwise a block height or UNIX timestamp. |

🐍 Bitcoin Transaction Parser (Python)

This parser reads a raw transaction hex string and breaks it down into its components:

bitcoin_parser.py — Complete transaction parser
def parse_varint(data_bytes, offset):
    """
    Parse a variable-length integer (varint) used in Bitcoin transactions.
    
    Bitcoin uses a compact encoding for integers:
    - If value < 0xFD (253): stored as a single byte
    - If value <= 0xFFFF: stored as 0xFD followed by 2 bytes
    - If value <= 0xFFFFFFFF: stored as 0xFE followed by 4 bytes
    - Otherwise: stored as 0xFF followed by 8 bytes
    
    Returns:
        (value, bytes_consumed)
    """
    fbyte = data_bytes[offset]

    if fbyte < 0xFD:
        return (fbyte, 1)
    elif fbyte == 0xFD:
        value = int.from_bytes(data_bytes[offset+1:offset+3], 'little')
        return (value, 3)
    elif fbyte == 0xFE:
        value = int.from_bytes(data_bytes[offset+1:offset+5], 'little')
        return (value, 5)
    else:
        value = int.from_bytes(data_bytes[offset+1:offset+9], 'little')
        return (value, 9)
    
def parse_tx(tx):
    """
    Parse a raw Bitcoin transaction hex string into a structured dictionary.
    
    Transaction structure:
    - Version (4 bytes, little-endian)
    - Input Count (varint)
    - For each input:
        - Previous TX Hash (32 bytes, reversed for display)
        - Output Index (4 bytes, little-endian)
        - ScriptSig Length (varint)
        - ScriptSig (variable bytes)
        - Sequence (4 bytes, little-endian)
    - Output Count (varint)
    - For each output:
        - Amount (8 bytes, little-endian, satoshis)
        - ScriptPubKey Length (varint)
        - ScriptPubKey (variable bytes)
    - Locktime (4 bytes, little-endian)
    """
    hbytes = bytes.fromhex(tx)
    offset = 0

    # Version (4 bytes, little-endian)
    version = int.from_bytes(hbytes[offset:offset+4], 'little')
    offset += 4

    # Input Count (varint)
    input_count, vbytes = parse_varint(hbytes, offset)
    offset += vbytes

    inputs = []
    for _ in range(input_count):
        # Previous Transaction Hash (32 bytes, reversed for display)
        prev_tx_hash = hbytes[offset:offset+32][::-1].hex()
        offset += 32

        # Output Index (4 bytes, little-endian)
        output_index = int.from_bytes(hbytes[offset:offset+4], 'little')
        offset += 4

        # ScriptSig Length (varint)
        scriptSig_len, vbytes = parse_varint(hbytes, offset)
        offset += vbytes

        script_bytes = hbytes[offset:offset+scriptSig_len]

        # Best-effort split of a P2PKH-style ScriptSig:
        #   <push> <DER signature + sighash byte> <push> <public key>
        # Anything else (empty SegWit ScriptSig, multisig, ...) is left unparsed.
        der_sign, sighash_type, pub_key = None, None, None
        pos = 0
        # Skip the push opcode (e.g. 0x47) that normally precedes the signature
        if scriptSig_len > 1 and script_bytes[0] != 0x30:
            pos += 1
        if scriptSig_len > pos + 1 and script_bytes[pos] == 0x30:
            # DER signature: 0x30, one length byte, then that many bytes
            der_len = script_bytes[pos+1]
            der_sign = script_bytes[pos:pos+der_len+2]
            pos += der_len + 2

            # Sighash type (1 byte: 0x01 = SIGHASH_ALL)
            if pos < scriptSig_len:
                sighash_type = script_bytes[pos]
                pos += 1

            # Public key: one push-length byte, then the key itself
            if pos < scriptSig_len:
                pubkey_len = script_bytes[pos]
                pos += 1
                pub_key = script_bytes[pos:pos+pubkey_len]
        offset += scriptSig_len

        # Sequence number (4 bytes, little-endian)
        sequence = int.from_bytes(hbytes[offset:offset+4], 'little')
        offset += 4

        inputs.append({
            "prev_hash": prev_tx_hash,
            "index": output_index,
            "scriptSig_len": scriptSig_len,
            "der_sign": der_sign.hex() if der_sign else None,
            "sighash_type": sighash_type,
            "pub_key": pub_key.hex() if pub_key else None,
            "sequence": sequence
        })

    # Output Count (varint)
    output_count, vbytes = parse_varint(hbytes, offset)
    offset += vbytes

    outputs = []
    for _ in range(output_count):
        # Amount (8 bytes, little-endian, in satoshis)
        amount = int.from_bytes(hbytes[offset:offset+8], 'little')
        offset += 8

        # ScriptPubKey Length (varint)
        script_pubkey_len, vbytes = parse_varint(hbytes, offset)
        offset += vbytes

        # ScriptPubKey (locking script)
        script_pubkey = hbytes[offset:offset+script_pubkey_len].hex()
        offset += script_pubkey_len

        outputs.append({
            "amount_satoshis": amount,
            "amount_btc": amount / 100000000,
            "script_pubkey_len": script_pubkey_len,
            "script_pubkey": script_pubkey
        })

    # Locktime (4 bytes, little-endian)
    locktime = int.from_bytes(hbytes[offset:offset+4], 'little')

    return {
        "version": version,
        "input_count": input_count,
        "inputs": inputs,
        "output_count": output_count,
        "outputs": outputs,
        "locktime": locktime
    }

# Example usage
if __name__ == "__main__":
    # Raw transaction hex (simplified for demonstration)
    tx_hex = "0100000001a530bdca8a35b98eb5a62c196191b9782cb119a19c423f765c32a7d33877f8dd000000008a47304402200102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f2002202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f400141040102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40ffffffff0140420f00000000001976a914000000000000000000000000000000000000000088ac00000000"
    
    result = parse_tx(tx_hex)
    
    print("=" * 60)
    print("BITCOIN TRANSACTION PARSER")
    print("=" * 60)
    print(f"Version:           {result['version']}")
    print(f"Input Count:       {result['input_count']}")
    print("-" * 60)
    
    for i, inp in enumerate(result['inputs']):
        print(f"INPUT #{i+1}:")
        print(f"  Previous TX Hash: {inp['prev_hash']}")
        print(f"  Output Index:     {inp['index']}")
        print(f"  ScriptSig Length: {inp['scriptSig_len']}")
        if inp['der_sign']:
            print(f"  DER Signature:    {inp['der_sign'][:64]}...")
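        if inp['sighash_type'] is not None:
            # Only label the flag as SIGHASH_ALL when it really is 0x01
            sighash_name = "SIGHASH_ALL" if inp['sighash_type'] == 0x01 else "other"
            print(f"  Sighash Type:     {inp['sighash_type']} ({sighash_name})")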
        if inp['pub_key']:
            print(f"  Public Key:       {inp['pub_key'][:64]}...")
        print(f"  Sequence:         {inp['sequence']}")
        print("-" * 60)
    
    print(f"Output Count:      {result['output_count']}")
    for i, out in enumerate(result['outputs']):
        print(f"OUTPUT #{i+1}:")
        print(f"  Amount:          {out['amount_btc']} BTC ({out['amount_satoshis']} satoshis)")
        print(f"  ScriptPubKey Len: {out['script_pubkey_len']}")
        print(f"  ScriptPubKey:     {out['script_pubkey']}")
        print("-" * 60)
    
    print(f"Locktime:          {result['locktime']}")
    print("=" * 60)

🔍 Understanding Each Field

Let's break down every component of a raw Bitcoin transaction. Each field has a specific purpose and follows strict rules.

4 bytes Version Example: 01000000
What it does: Specifies which transaction format rules apply. Currently version 1 or 2.
Version 1: Original Bitcoin transaction format. Still widely used.
Version 2: Adds support for relative locktimes (BIP68) and optional sequence number semantics.
Little-endian: 01000000 = 1, 02000000 = 2
💡 Most transactions use version 1 or 2. Version 2 is required for BIP68 relative locktimes. Replace-By-Fee (RBF) is signaled through the sequence field, not the version.
varint Input Count Example: 01 (1 input)
What it does: Tells the parser how many inputs (UTXOs) are being spent in this transaction.
Varint encoding: If value < 253: 1 byte. If value ≤ 65,535: 0xFD + 2 bytes. If value ≤ 4,294,967,295: 0xFE + 4 bytes. Otherwise: 0xFF + 8 bytes.
Most common: 01 (1 input) or 02 (2 inputs)
💡 A typical transaction spends 1-2 UTXOs. Coinbase transactions (mining rewards) have 1 input with special rules.
32 bytes Previous TX Hash a530bdca...77f8dd
What it does: Points to the previous transaction that contains the UTXO you want to spend.
Double SHA-256: This hash is calculated as SHA256(SHA256(previous_tx_data))
Byte order: Inside the transaction, the 32 bytes are stored in the reverse of the order that block explorers and wallets display. Reverse them before looking the hash up.
Coinbase special case: In mining transactions (coinbase), this field is all zeros because there's no previous transaction.
💡 You can look up this hash on any block explorer to see the previous transaction. It's like a "receipt number" from a previous payment.
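
The hashing and byte reversal described above can be sketched with Python's standard hashlib module. The helper names (double_sha256, txid_from_raw, prev_hash_for_display) are hypothetical, and txid_from_raw assumes a legacy (non-SegWit) serialization, where the whole raw transaction is hashed.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Return SHA256(SHA256(data))."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def txid_from_raw(raw_tx_hex: str) -> str:
    """Hash a legacy raw transaction and return the ID in display order."""
    digest = double_sha256(bytes.fromhex(raw_tx_hex))
    return digest[::-1].hex()  # reverse the 32 bytes for display

def prev_hash_for_display(raw_field_hex: str) -> str:
    """Reverse a 32-byte Previous TX Hash field as it appears in the raw hex."""
    return bytes.fromhex(raw_field_hex)[::-1].hex()

# The Previous TX Hash field from the example transaction, as stored in the raw hex.
stored = "a530bdca8a35b98eb5a62c196191b9782cb119a19c423f765c32a7d33877f8dd"
print(prev_hash_for_display(stored))
```

The printed value is the form you would paste into a block explorer. The example transaction uses placeholder data, so this particular hash is not expected to resolve to a real transaction.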
4 bytes Output Index (vout) Example: 00000000 (0)
What it does: Specifies which output from the previous transaction you're spending.
0-indexed: First output is index 0, second is index 1, etc.
Why needed: A transaction can have multiple outputs. This tells the network exactly which one you're spending.
💡 Think of it like: "I want to spend output #3 from transaction XYZ" — that's exactly what this field does.
varint ScriptSig Length Example: 8a (138 bytes)
What it does: Tells the parser how many bytes to read for the unlocking script (ScriptSig).
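Typical length: A P2PKH ScriptSig is roughly 106-108 bytes with a compressed public key and roughly 138-140 bytes with an uncompressed one, as in the example. Legacy multisig can be larger.
SegWit: Native SegWit inputs have an empty ScriptSig (length 0) because the signature data moved to the witness.
💡 This is a varint because scripts vary in size: the signature length, the public key format, and the script type all change the total.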
⚡ SegWit Note

Native SegWit inputs have scriptSig_len = 0. The signature moves to the witness field, which the parser on this page does not read.
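
One way to tell the two serializations apart before parsing is to look at the two bytes right after the version: in the SegWit format (BIP144) they are a marker byte 00 followed by a flag byte 01, where a legacy transaction would have its input count. The sketch below assumes that layout; the function name is hypothetical and the sample hex strings are truncated stand-ins, not complete transactions.

```python
def is_segwit_serialization(tx_hex: str) -> bool:
    """Return True if the bytes after the 4-byte version are marker 00 and flag 01."""
    raw = bytes.fromhex(tx_hex)
    return len(raw) > 5 and raw[4] == 0x00 and raw[5] == 0x01

# Truncated stand-ins: only the first bytes matter for this check.
legacy_start = "0100000001" + "00" * 10    # version 1, then input count 01
segwit_start = "020000000001" + "00" * 10  # version 2, marker 00, flag 01

print(is_segwit_serialization(legacy_start))  # False
print(is_segwit_serialization(segwit_start))  # True
```

A parser that only understands the legacy layout could call a check like this first and refuse SegWit input instead of returning misleading fields.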

~70-73 bytes DER Signature 30440220... (70-73 bytes)
What it does: A cryptographic proof that you own the private key corresponding to the public key.
DER format: Standardized encoding for ECDSA signatures. Always starts with 0x30.
Structure: 0x30 | length | 0x02 | r-length | r | 0x02 | s-length | s
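r and s: Two integers of up to 32 bytes each that make up the signature. DER integers are signed, so if the first byte of r or s has its high bit set (0x80 or above), the encoding prepends a 0x00 byte to keep the value positive.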
💡 You can't reverse a signature to get the private key. But anyone can verify it with your public key.
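
The DER structure listed above can be walked byte by byte. The following is a minimal sketch that assumes a well-formed, single-byte-length signature with the sighash byte already removed; parse_der_signature is a hypothetical name, and the input is the placeholder signature from the example transaction rather than a real one.

```python
def parse_der_signature(sig: bytes):
    """Split a DER-encoded ECDSA signature into its integers r and s."""
    if len(sig) < 8 or sig[0] != 0x30:
        raise ValueError("not a DER sequence")
    if sig[1] != len(sig) - 2:
        raise ValueError("length byte does not match signature size")
    pos = 2
    values = []
    for _ in range(2):  # first r, then s
        if sig[pos] != 0x02:
            raise ValueError("expected INTEGER marker 0x02")
        length = sig[pos + 1]
        start = pos + 2
        values.append(int.from_bytes(sig[start:start + length], "big"))
        pos = start + length
    return values[0], values[1]

# Placeholder signature from the example: r = 01..20, s = 21..40 (hex).
r_bytes = bytes(range(1, 33))
s_bytes = bytes(range(33, 65))
sig = bytes.fromhex("30440220") + r_bytes + bytes.fromhex("0220") + s_bytes

r, s = parse_der_signature(sig)
print(hex(r))
print(hex(s))
```

Note that r and s are read as big-endian here: DER is one of the places where Bitcoin data is not little-endian.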
1 byte Sighash Type Example: 01 (SIGHASH_ALL)
What it does: Specifies which parts of the transaction are covered by the signature.
SIGHASH_ALL (0x01): Signs all inputs and outputs. Most common. Anyone who modifies the transaction invalidates the signature.
SIGHASH_NONE (0x02): Signs all inputs but no outputs, so anyone can change where the coins go. Rarely used.
SIGHASH_SINGLE (0x03): Signs all inputs but only one output (the one with same index as the input).
SIGHASH_ANYONECANPAY (0x80): A modifier that is ORed with one of the others. Signs only the current input, allowing others to add more inputs (the basis of crowdfunding-style transactions).
💡 The vast majority of transactions use SIGHASH_ALL (0x01). It means "I agree to this exact transaction, no changes allowed."
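
Because SIGHASH_ANYONECANPAY is a modifier bit combined with one of the three base types, decoding the byte takes two steps. The sketch below shows one way to do it; describe_sighash and the constant names are hypothetical, and the 0x1f mask for the base type follows the convention used by Bitcoin Core.

```python
SIGHASH_BASE = {0x01: "SIGHASH_ALL", 0x02: "SIGHASH_NONE", 0x03: "SIGHASH_SINGLE"}
SIGHASH_ANYONECANPAY = 0x80

def describe_sighash(flag: int) -> str:
    """Turn a sighash byte into a readable name."""
    base_bits = flag & 0x1F  # low bits select the base type
    name = SIGHASH_BASE.get(base_bits, f"unknown(0x{base_bits:02x})")
    if flag & SIGHASH_ANYONECANPAY:  # high bit is the modifier
        name += "|SIGHASH_ANYONECANPAY"
    return name

print(describe_sighash(0x01))  # SIGHASH_ALL
print(describe_sighash(0x81))  # SIGHASH_ALL|SIGHASH_ANYONECANPAY
print(describe_sighash(0x83))  # SIGHASH_SINGLE|SIGHASH_ANYONECANPAY
```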
33 or 65 bytes Public Key 04010203... (65B) or 020102... (33B)
What it does: The public key that matches the private key used to create the signature.
Compressed (33 bytes): Starts with 02 or 03 (even/odd y-coordinate), followed by 32-byte x-coordinate. More efficient.
Uncompressed (65 bytes): Starts with 04, followed by 32-byte x and 32-byte y coordinates. Older format.
Verification: Anyone can use this public key to verify your signature without knowing your private key.
💡 The public key is derived from the private key using elliptic curve multiplication. It's safe to share — you cannot reverse it to find the private key.
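
The two public key formats can be told apart from the length and the first byte alone. This is a small sketch of that check; classify_pubkey is a hypothetical helper, and it only inspects the encoding without verifying that the point lies on the curve.

```python
def classify_pubkey(pubkey: bytes) -> str:
    """Classify a serialized public key by its length and prefix byte."""
    if len(pubkey) == 33 and pubkey[0] in (0x02, 0x03):
        parity = "even" if pubkey[0] == 0x02 else "odd"
        return f"compressed ({parity} y)"
    if len(pubkey) == 65 and pubkey[0] == 0x04:
        return "uncompressed"
    return "unrecognized"

# Placeholder keys: correct prefix and length, dummy coordinate bytes.
print(classify_pubkey(bytes([0x02]) + bytes(32)))  # compressed (even y)
print(classify_pubkey(bytes([0x03]) + bytes(32)))  # compressed (odd y)
print(classify_pubkey(bytes([0x04]) + bytes(64)))  # uncompressed
```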
4 bytes Sequence (nSequence) Example: ffffffff (4294967295)
What it does: Used for transaction replacement and relative locktimes.
ffffffff (max value): The input is final. It does not signal replaceability and does not set a relative locktime. This is the traditional default.
Replace-By-Fee (RBF): If sequence < 0xffffffff - 1 (that is, 0xfffffffd or lower), the transaction signals that it can be replaced with a higher-fee version (BIP125).
Relative locktime (BIP68): In version 2 transactions, sequence values can specify a minimum number of blocks or seconds (in 512-second units) that must pass before the input can be spent.
💡 If you've ever seen "transaction replaced by fee" in a wallet, that happened because the sender used RBF (sequence not maxed out).
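
The different meanings of the sequence field can be separated with a few bit tests. The sketch below follows the bit layout defined in BIP68 (bit 31 disables the relative locktime, bit 22 selects 512-second units, the low 16 bits hold the value) and the BIP125 signaling threshold; describe_sequence and the constant names are hypothetical. Keep in mind that the relative locktime is only enforced for transactions with version 2 or higher, which this helper does not check.

```python
SEQUENCE_FINAL = 0xFFFFFFFF
DISABLE_FLAG = 1 << 31   # BIP68: relative locktime is not applied when set
TYPE_FLAG = 1 << 22      # BIP68: set = units of 512 seconds, clear = blocks
VALUE_MASK = 0x0000FFFF  # BIP68: low 16 bits carry the value

def describe_sequence(sequence: int) -> dict:
    """Summarize what a 32-bit sequence value signals."""
    info = {
        "final": sequence == SEQUENCE_FINAL,
        "signals_rbf": sequence < SEQUENCE_FINAL - 1,
        "relative_lock": None,
    }
    if not sequence & DISABLE_FLAG:
        value = sequence & VALUE_MASK
        if sequence & TYPE_FLAG:
            info["relative_lock"] = ("seconds", value * 512)
        else:
            info["relative_lock"] = ("blocks", value)
    return info

print(describe_sequence(0xFFFFFFFF))
print(describe_sequence(0xFFFFFFFD))
print(describe_sequence(10))
print(describe_sequence((1 << 22) | 2))
```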
varint Output Count Example: 01 (1 output) or 02 (2 outputs)
What it does: Tells the parser how many outputs (recipients) this transaction has.
Minimum outputs: A valid transaction has at least 1 output. A transaction with 0 outputs is invalid, and that includes coinbase transactions.
Typical: Most transactions have 1-2 outputs. One to the recipient, one for change back to the sender.
💡 If you send 1 BTC and have 2 BTC in your wallet, you'll see 2 outputs: 1 BTC to the recipient, and ~0.999 BTC back to you (after fees).
8 bytes Amount Example: 00e1f50500000000 (100,000,000 satoshis = 1 BTC)
What it does: Specifies how many satoshis are being sent to this output.
Satoshis (sats): The smallest unit of Bitcoin. 1 BTC = 100,000,000 satoshis.
Little-endian storage: Bytes are stored in reverse order (least significant byte first).
Example breakdown: Raw hex: 00e1f50500000000
Reverse bytes: 0000000005f5e100
Hexadecimal value: 0x05F5E100
Decimal value: 100,000,000 satoshis = 1 BTC
Common values:
00e1f50500000000 = 100,000,000 sats (1 BTC)
00ca9a3b00000000 = 1,000,000,000 sats (10 BTC)
0065cd1d00000000 = 500,000,000 sats (5 BTC)
0094357700000000 = 2,000,000,000 sats (20 BTC)
Maximum value: The total supply of Bitcoin is capped at 21,000,000 BTC = 2,100,000,000,000,000 satoshis (0x775F05A074000 in hex), which is too large for 4 bytes but fits comfortably in 8.
💡 Always think in satoshis when parsing raw transactions! The amount field is always an integer number of satoshis, never a decimal.
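
To make the little-endian breakdown above concrete, here is a minimal sketch that decodes the 8-byte amount field and formats it as BTC. It uses Decimal rather than float division so the eight decimal places stay exact; decode_amount and format_btc are hypothetical helper names.

```python
from decimal import Decimal

SATOSHIS_PER_BTC = 100_000_000

def decode_amount(field_hex: str) -> int:
    """Decode an 8-byte little-endian amount field into satoshis."""
    raw = bytes.fromhex(field_hex)
    if len(raw) != 8:
        raise ValueError("amount field must be exactly 8 bytes")
    return int.from_bytes(raw, "little")

def format_btc(satoshis: int) -> str:
    """Format an integer satoshi amount as BTC with 8 decimal places."""
    return f"{Decimal(satoshis) / SATOSHIS_PER_BTC:.8f}"

for field in ("00e1f50500000000", "40420f0000000000"):
    sats = decode_amount(field)
    print(field, sats, format_btc(sats))
```

The first field decodes to 100,000,000 satoshis (1 BTC) and the second, taken from the example transaction, to 1,000,000 satoshis (0.01 BTC).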
variable ScriptPubKey (Locking Script) 76a914{20 bytes}88ac
What it does: A mini-program that sets the conditions for spending this output.
P2PKH (Standard): 76a914{20-byte pubkey hash}88ac — "Pay to Public Key Hash". Most common. Requires a signature + public key.
P2SH (Multisig): a914{20-byte script hash}87 — "Pay to Script Hash". Used for multisignature wallets.
P2WPKH (SegWit): 0014{20-byte pubkey hash} — Native SegWit. Lower fees.
OP_RETURN: 6a{data} — Provably unspendable. Used to store data on the blockchain.
💡 The ScriptPubKey is like a lock. The ScriptSig is the key. If the key fits the lock, the output can be spent.
4 bytes Locktime (nLockTime) Example: 00000000 (0 = no locktime)
What it does: The earliest time or block height when the transaction can be added to the blockchain.
0: No locktime — transaction can be mined immediately.
Block height (if < 500,000,000): Transaction cannot be mined until that block number. Example: 0000ff00 = 16,711,680 (far in the future).
UNIX timestamp (if ≥ 500,000,000): Transaction cannot be mined until after that date (in seconds since 1970).
Sequence requirement: For locktime to be enforced, at least one input must have a sequence < 0xffffffff.
💡 Locktime is like a post-dated check. You create a transaction now, but it can only be sent after a future date or block height.
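
The three cases above (zero, block height, UNIX timestamp) come down to one threshold comparison. The following sketch shows that rule; describe_locktime is a hypothetical helper, and it only interprets the number without checking whether any input's sequence actually enables the locktime.

```python
from datetime import datetime, timezone

LOCKTIME_THRESHOLD = 500_000_000  # below: block height, at or above: UNIX time

def describe_locktime(locktime: int) -> str:
    """Interpret a decoded nLockTime value."""
    if locktime == 0:
        return "no locktime"
    if locktime < LOCKTIME_THRESHOLD:
        return f"block height {locktime}"
    when = datetime.fromtimestamp(locktime, tz=timezone.utc)
    return f"timestamp {when.isoformat()}"

# Decode the raw little-endian field first, then interpret it.
raw_field = bytes.fromhex("00000000")
print(describe_locktime(int.from_bytes(raw_field, "little")))  # no locktime
print(describe_locktime(800_000))                              # block height 800000
print(describe_locktime(1_700_000_000))
```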

📊 Quick Reference Table

| Field | Size | Purpose | Typical Value |
|---|---|---|---|
| Version | 4 bytes | Transaction format | 01000000 (1) |
| Input Count | varint | Number of inputs | 01 |
| Previous TX Hash | 32 bytes | Points to previous UTXO | a530bdca... |
| Output Index | 4 bytes | Which output to spend | 00000000 |
| ScriptSig Length | varint | Unlocking script size | 8a (138) |
| DER Signature | ~70-73 bytes | ECDSA proof of ownership | 30440220... |
| Sighash Type | 1 byte | What is signed | 01 (ALL) |
| Public Key | 33/65 bytes | Your public key | 04010203... |
| Sequence | 4 bytes | Replacement/locktime | ffffffff |
| Output Count | varint | Number of outputs | 01 |
| Amount | 8 bytes | Value in satoshis | 40420f0000000000 |
| ScriptPubKey Length | varint | Locking script size | 19 (25) |
| ScriptPubKey | variable | Spending conditions | 76a914...88ac |
| Locktime | 4 bytes | Earliest spend time | 00000000 |

📏 Variable-Length Integers (Varint)

Bitcoin uses a compact encoding for integers to save space. This is called varint or compact size uint.

| Value Range | Encoding | Bytes Used |
|---|---|---|
| 0 to 252 | As a single byte | 1 |
| 253 to 65,535 | 0xFD + 2 bytes little-endian | 3 |
| 65,536 to 4,294,967,295 | 0xFE + 4 bytes little-endian | 5 |
| 4,294,967,296 to 2^64-1 | 0xFF + 8 bytes little-endian | 9 |

📝 Why Varint?

Most transactions have 1-2 inputs and outputs, so using 1 byte for counts saves space. Larger counts use more bytes only when necessary.
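
The table above describes decoding. The reverse direction, which you need when building a transaction, follows the same ranges. This is a minimal sketch of an encoder; encode_varint is a hypothetical name and is not part of the parser on this page.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Bitcoin varint (CompactSize)."""
    if value < 0 or value > 0xFFFFFFFFFFFFFFFF:
        raise ValueError("value out of range for a varint")
    if value < 0xFD:
        return value.to_bytes(1, "little")
    if value <= 0xFFFF:
        return b"\xfd" + value.to_bytes(2, "little")
    if value <= 0xFFFFFFFF:
        return b"\xfe" + value.to_bytes(4, "little")
    return b"\xff" + value.to_bytes(8, "little")

# One sample from each row of the table, including the range boundaries.
for n in (1, 252, 253, 65_535, 65_536, 2**32):
    print(n, encode_varint(n).hex())
```

The boundary values are worth testing in any implementation: 252 still fits in one byte, while 253 already needs the 0xFD prefix because 0xFD itself is reserved as a marker.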

💻 Running the Parser

How to run
# Save the code to a file
python bitcoin_parser.py

# Or run it directly in Python
python -c "exec(open('bitcoin_parser.py').read())"
🔧 Try It Yourself

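Replace the tx_hex variable with a raw legacy (non-SegWit) transaction from a block explorer. The parser on this page does not read the SegWit marker, flag, or witness fields, so SegWit transactions will not parse correctly.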

  • Go to Mempool.space
  • Click on any transaction
  • Look for "Raw Transaction" or "Hex"
  • Copy the hex and paste it into the code

📜 Understanding Scripts

ScriptSig (Unlocking Script)

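Contains the signature and public key that "unlock" the UTXO. For P2PKH the format is: <push> <DER signature + sighash byte> <push> <public key>, where each push opcode gives the length of the item that follows.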

ScriptPubKey (Locking Script)

Contains the conditions for spending. Common types:

| Type | Script | Meaning |
|---|---|---|
| P2PKH | 76a914{20-byte hash}88ac | Pay to Public Key Hash (standard address) |
| P2SH | a914{20-byte hash}87 | Pay to Script Hash (multisig, etc.) |
| P2WPKH | 0014{20-byte hash} | Pay to Witness Public Key Hash (SegWit) |
| OP_RETURN | 6a{data} | Provably unspendable (store data) |
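
The patterns in the table can be matched directly against the script bytes. The sketch below recognizes only these four templates and labels everything else "unknown"; classify_script_pubkey is a hypothetical helper, and the sample scripts use dummy hash bytes.

```python
def classify_script_pubkey(script_hex: str) -> str:
    """Match a ScriptPubKey against the four templates from the table."""
    s = bytes.fromhex(script_hex)
    if len(s) == 25 and s[:3] == b"\x76\xa9\x14" and s[23:] == b"\x88\xac":
        return "P2PKH"
    if len(s) == 23 and s[:2] == b"\xa9\x14" and s[22] == 0x87:
        return "P2SH"
    if len(s) == 22 and s[:2] == b"\x00\x14":
        return "P2WPKH"
    if len(s) >= 1 and s[0] == 0x6A:
        return "OP_RETURN"
    return "unknown"

samples = [
    "76a914" + "00" * 20 + "88ac",  # the output script from the example transaction
    "a914" + "11" * 20 + "87",
    "0014" + "22" * 20,
    "6a0568656c6c6f",               # OP_RETURN followed by a 5-byte push
]
for script in samples:
    print(classify_script_pubkey(script))
```

Checking the total length as well as the prefix avoids misclassifying longer scripts that merely start with the same opcodes.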
