RLP Decoding

Decode RLP-encoded bytes back to data structures with comprehensive validation and error handling.

Overview

RLP decoding parses compact byte representations back into nested data structures. The decoder performs comprehensive validation to ensure canonical encoding and prevent malformed input attacks.

Key Features:
  • Canonical validation - Rejects non-minimal encodings
  • Depth limiting - Prevents stack overflow on deeply nested data
  • Stream support - Decode multiple values from a byte stream
  • Detailed errors - Clear error messages for debugging

decode

Decodes RLP-encoded bytes into data structures.

    Decoding Algorithm

    The RLP decoder uses the first byte (the prefix) to determine the data type and length:

    Prefix Ranges

    // Prefix byte determines encoding type:
    // 0x00-0x7f: Single byte (value itself)
    // 0x80-0xb7: Short string (0-55 bytes)
    // 0xb8-0xbf: Long string (56+ bytes)
    // 0xc0-0xf7: Short list (0-55 bytes total)
    // 0xf8-0xff: Long list (56+ bytes total)
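
The prefix ranges above can be sketched as a small classifier. This is only an illustration of the table, not the library's implementation; `classifyPrefix` and `PrefixKind` are hypothetical names that do not come from tevm.

```typescript
// Hypothetical helper illustrating the prefix table; not part of the tevm API.
type PrefixKind =
  | 'single-byte'
  | 'short-string'
  | 'long-string'
  | 'short-list'
  | 'long-list'

function classifyPrefix(prefix: number): PrefixKind {
  if (!Number.isInteger(prefix) || prefix < 0 || prefix > 0xff) {
    throw new RangeError(`Not a byte: ${prefix}`)
  }
  if (prefix <= 0x7f) return 'single-byte'   // 0x00-0x7f
  if (prefix <= 0xb7) return 'short-string'  // 0x80-0xb7
  if (prefix <= 0xbf) return 'long-string'   // 0xb8-0xbf
  if (prefix <= 0xf7) return 'short-list'    // 0xc0-0xf7
  return 'long-list'                         // 0xf8-0xff
}
```

Note that every byte value from 0x00 to 0xff falls into one of the five ranges, so the classifier has no "invalid prefix" branch.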
    

    Single Byte (0x00-0x7f)

    Bytes with value < 0x80 encode as themselves:
    import { Rlp } from 'tevm'
    
    const encoded = new Uint8Array([0x7f])
    const result = Rlp.decode(encoded)
    // => { data: { type: 'bytes', value: Uint8Array([0x7f]) }, remainder: Uint8Array([]) }
    
    const zero = new Uint8Array([0x00])
    const zeroResult = Rlp.decode(zero)
    // => { data: { type: 'bytes', value: Uint8Array([0x00]) }, remainder: Uint8Array([]) }
    

    Short String (0x80-0xb7)

    Length encoded in prefix: length = prefix - 0x80
    import { Rlp } from 'tevm'
    
    // Empty string
    const empty = new Uint8Array([0x80])
    const result = Rlp.decode(empty)
    // => { data: { type: 'bytes', value: Uint8Array([]) }, remainder: Uint8Array([]) }
    
    // 3-byte string
    const bytes = new Uint8Array([0x83, 1, 2, 3])
    const bytesResult = Rlp.decode(bytes)
    // 0x83 - 0x80 = 3 bytes
    // => { data: { type: 'bytes', value: Uint8Array([1, 2, 3]) }, remainder: Uint8Array([]) }
    
    // 55-byte string (maximum short form)
    const max = new Uint8Array([0xb7, ...Array(55).fill(0x42)])
    const maxResult = Rlp.decode(max)
    // 0xb7 - 0x80 = 55 bytes
    

    Long String (0xb8-0xbf)

    Length-of-length encoding: lengthOfLength = prefix - 0xb7
    import { Rlp } from 'tevm'
    
    // 56-byte string (minimum long form)
    const min = new Uint8Array([0xb8, 56, ...Array(56).fill(0x42)])
    const result = Rlp.decode(min)
    // 0xb8 - 0xb7 = 1 (length needs 1 byte)
    // Next byte: 56 = actual length
    // => { data: { type: 'bytes', value: Uint8Array(56).fill(0x42) }, remainder: Uint8Array([]) }
    
    // 256-byte string (length needs 2 bytes)
    const large = new Uint8Array([0xb9, 0x01, 0x00, ...Array(256).fill(0x42)])
    const largeResult = Rlp.decode(large)
    // 0xb9 - 0xb7 = 2 (length needs 2 bytes)
    // Next 2 bytes: [0x01, 0x00] = 256
    // => { data: { type: 'bytes', value: Uint8Array(256).fill(0x42) }, remainder: Uint8Array([]) }
    

    Short List (0xc0-0xf7)

    Total payload length in prefix: length = prefix - 0xc0
    import { Rlp } from 'tevm'
    
    // Empty list
    const empty = new Uint8Array([0xc0])
    const result = Rlp.decode(empty)
    // => { data: { type: 'list', value: [] }, remainder: Uint8Array([]) }
    
    // List with 2 single bytes
    const list = new Uint8Array([0xc2, 0x01, 0x02])
    const listResult = Rlp.decode(list)
    // 0xc2 - 0xc0 = 2 bytes total payload
    // => {
    //   data: {
    //     type: 'list',
    //     value: [
    //       { type: 'bytes', value: Uint8Array([1]) },
    //       { type: 'bytes', value: Uint8Array([2]) }
    //     ]
    //   },
    //   remainder: Uint8Array([])
    // }
    
    // List with encoded strings
    const strings = new Uint8Array([0xc7, 0x82, 1, 2, 0x83, 3, 4, 5])
    const stringsResult = Rlp.decode(strings)
    // 0xc7 - 0xc0 = 7 bytes payload
    // Contains: [0x82, 1, 2] (3 bytes) and [0x83, 3, 4, 5] (4 bytes)
    

    Long List (0xf8-0xff)

    Length-of-length encoding: lengthOfLength = prefix - 0xf7
    import { Rlp } from 'tevm'
    
    // List with 60 bytes total payload
    const items = Array(30).fill([0x01, 0x02]).flat()
    const long = new Uint8Array([0xf8, 60, ...items])
    const result = Rlp.decode(long)
    // 0xf8 - 0xf7 = 1 (length needs 1 byte)
    // Next byte: 60 = payload length
    
    // List with 256 bytes total payload
    const large = new Uint8Array([0xf9, 0x01, 0x00, ...Array(256).fill(0x01)])
    const largeResult = Rlp.decode(large)
    // 0xf9 - 0xf7 = 2 (length needs 2 bytes)
    // Next 2 bytes: [0x01, 0x00] = 256
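
The five forms described in this section differ only in how the header states the payload length. The sketch below reads a header and reports where the payload starts and how long it is. It is a simplified illustration under stated assumptions: `readHeader`, `longForm`, and `RlpHeader` are hypothetical names, and the sketch performs no canonical checks and no payload bounds checks.

```typescript
// Hypothetical header reader; not the tevm implementation.
interface RlpHeader {
  isList: boolean
  payloadOffset: number  // index of the first payload byte
  payloadLength: number  // number of payload bytes
}

// Long forms: the next `lengthOfLength` bytes hold the big-endian payload length.
function longForm(bytes: Uint8Array, lengthOfLength: number, isList: boolean): RlpHeader {
  if (bytes.length < 1 + lengthOfLength) {
    throw new Error('Input too short for length field')
  }
  let payloadLength = 0
  for (let i = 1; i <= lengthOfLength; i++) {
    // Multiply instead of shifting so lengths above 2^31 do not wrap.
    payloadLength = payloadLength * 256 + bytes[i]
  }
  return { isList, payloadOffset: 1 + lengthOfLength, payloadLength }
}

function readHeader(bytes: Uint8Array): RlpHeader {
  if (bytes.length === 0) throw new Error('Empty input')
  const prefix = bytes[0]
  // Single byte: the prefix is the payload itself.
  if (prefix <= 0x7f) return { isList: false, payloadOffset: 0, payloadLength: 1 }
  // Short string: length = prefix - 0x80
  if (prefix <= 0xb7) return { isList: false, payloadOffset: 1, payloadLength: prefix - 0x80 }
  // Long string: lengthOfLength = prefix - 0xb7
  if (prefix <= 0xbf) return longForm(bytes, prefix - 0xb7, false)
  // Short list: length = prefix - 0xc0
  if (prefix <= 0xf7) return { isList: true, payloadOffset: 1, payloadLength: prefix - 0xc0 }
  // Long list: lengthOfLength = prefix - 0xf7
  return longForm(bytes, prefix - 0xf7, true)
}
```

For example, `readHeader(new Uint8Array([0xb9, 0x01, 0x00]))` reports a string payload of 256 bytes starting at offset 3.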
    

    Decoding Patterns

    Extract Transaction Data

    Decode transaction bytes and extract fields:
    import { Rlp } from 'tevm'
    
    // Legacy transaction RLP
    const txBytes = new Uint8Array([...])  // RLP-encoded transaction
    const result = Rlp.decode(txBytes)
    
    if (result.data.type === 'list') {
      const [nonce, gasPrice, gas, to, value, data, v, r, s] = result.data.value
    
      // Each field is a bytes data structure
      if (nonce.type === 'bytes') {
        console.log('Nonce:', nonce.value)
      }
      if (to.type === 'bytes') {
        console.log('To:', to.value)
      }
    }
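
Decoded fields arrive as raw byte arrays, so numeric fields such as the nonce or value usually need converting before use. A minimal sketch of two conversions follows. `bytesToBigInt` and `bytesToHex` are hypothetical helpers written for this example, not tevm functions; the sketch assumes the usual RLP integer convention of big-endian bytes, with the empty byte string meaning zero.

```typescript
// Hypothetical conversion helpers; not part of the tevm API.

// Interpret bytes as a big-endian unsigned integer. Empty input yields 0n.
function bytesToBigInt(bytes: Uint8Array): bigint {
  let value = 0n
  for (let i = 0; i < bytes.length; i++) {
    value = (value << 8n) | BigInt(bytes[i])
  }
  return value
}

// Render bytes as a 0x-prefixed lowercase hex string (useful for the `to` address).
function bytesToHex(bytes: Uint8Array): string {
  return '0x' + Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('')
}
```

With these helpers, `bytesToBigInt(nonce.value)` gives the nonce as a `bigint` and `bytesToHex(to.value)` gives the recipient as a hex string.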
    

    Recursive Flattening

    Flatten nested lists to extract all byte values:
    import { Rlp } from 'tevm'
    
    const nested = new Uint8Array([...])  // Deeply nested RLP
    const result = Rlp.decode(nested)
    
    // Flatten recursively extracts all bytes
    const allBytes = Rlp.flatten(result.data)
    // => Array of { type: 'bytes', value: Uint8Array }
    
    for (const item of allBytes) {
      console.log('Bytes:', item.value)
    }
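
To make the flattening behaviour concrete, here is a sketch of how such a traversal could work over the decoded shape shown in the examples above. It is not the source of `Rlp.flatten`. `RlpData` and `flattenData` are hypothetical names, and depth-first, left-to-right ordering is an assumption.

```typescript
// Assumed shape of decoded data, inferred from the examples in this document.
type RlpData =
  | { type: 'bytes'; value: Uint8Array }
  | { type: 'list'; value: RlpData[] }

type RlpBytes = { type: 'bytes'; value: Uint8Array }

// Hypothetical depth-first flatten: collects every bytes leaf in order.
function flattenData(data: RlpData): RlpBytes[] {
  if (data.type === 'bytes') return [data]
  const leaves: RlpBytes[] = []
  for (const item of data.value) {
    leaves.push(...flattenData(item))
  }
  return leaves
}
```

Empty lists contribute nothing to the result, so a structure made only of nested empty lists flattens to an empty array.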
    

    Stream Decoding

    Decode multiple RLP values from a stream:
    import { Rlp } from 'tevm'
    
    function* decodeStream(bytes: Uint8Array) {
      let remainder = bytes
    
      while (remainder.length > 0) {
        const result = Rlp.decode(remainder, true)
        yield result.data
        remainder = result.remainder
      }
    }
    
    // Use stream decoder
    const stream = new Uint8Array([0x01, 0x02, 0x03, 0x04])
    for (const data of decodeStream(stream)) {
      console.log('Decoded:', data)
    }
    // Outputs each byte as a separate data structure
    

    Validate and Extract

    Decode with validation and type checking:
    import { Rlp } from 'tevm'
    
    function decodeTransaction(bytes: Uint8Array) {
      const result = Rlp.decode(bytes)
    
      // Must be a list
      if (result.data.type !== 'list') {
        throw new Error('Transaction must be RLP list')
      }
    
      // Must have 9 fields (legacy tx)
      if (result.data.value.length !== 9) {
        throw new Error('Invalid transaction field count')
      }
    
      // Must consume all bytes
      if (result.remainder.length > 0) {
        throw new Error('Extra data after transaction')
      }
    
      return result.data
    }
    

    Canonical Validation

    The RLP decoder enforces canonical encoding rules to prevent malleability:

    Non-canonical Single Byte

    Single bytes < 0x80 must not have a length prefix:
    import { Rlp } from 'tevm'
    
    // Invalid: single byte 0x7f with prefix
    const invalid = new Uint8Array([0x81, 0x7f])
    try {
      Rlp.decode(invalid)
    } catch (error) {
      // Error: NonCanonicalSize
      // Single byte < 0x80 should not be prefixed
    }
    
    // Valid: 0x7f encodes as itself
    const valid = new Uint8Array([0x7f])
    const result = Rlp.decode(valid)  // OK
    

    Non-canonical Short Form

    Strings < 56 bytes must use short form:
    import { Rlp } from 'tevm'
    
    // Invalid: 3-byte string using long form
    const invalid = new Uint8Array([0xb8, 0x03, 1, 2, 3])
    try {
      Rlp.decode(invalid)
    } catch (error) {
      // Error: NonCanonicalSize
      // String < 56 bytes should use short form
    }
    
    // Valid: 3-byte string in short form
    const valid = new Uint8Array([0x83, 1, 2, 3])
    const result = Rlp.decode(valid)  // OK
    

    Leading Zeros

    Length encodings must not have leading zeros:
    import { Rlp } from 'tevm'
    
    // Invalid: length with leading zero
    const invalid = new Uint8Array([0xb9, 0x00, 0x38, ...Array(56).fill(0x42)])
    try {
      Rlp.decode(invalid)
    } catch (error) {
      // Error: LeadingZeros
      // Length encoding has leading zeros
    }
    
    // Valid: minimal length encoding
    const valid = new Uint8Array([0xb8, 56, ...Array(56).fill(0x42)])
    const result = Rlp.decode(valid)  // OK
    

    Length Mismatches

    Declared length must match actual data:
    import { Rlp } from 'tevm'
    
    // Invalid: declared 5 bytes but only 3 provided
    const invalid = new Uint8Array([0x85, 1, 2, 3])
    try {
      Rlp.decode(invalid)
    } catch (error) {
      // Error: InputTooShort
      // Expected 6 bytes, got 4
    }
    
    // Invalid: list length mismatch
    const invalidList = new Uint8Array([0xc5, 0x01, 0x02])  // Declares 5 payload bytes, only 2 present
    try {
      Rlp.decode(invalidList)
    } catch (error) {
      // Error: InputTooShort
    }
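
The string rules in this section can be summarised as a few checks on the header. The sketch below applies them to a single string item and returns the name of the violated rule, reusing the error names from this page as plain labels. `checkStringHeader` and `Violation` are hypothetical, and the order in which the real decoder applies its checks may differ.

```typescript
// Labels borrowed from this page's error names; the function itself is hypothetical.
type Violation = 'InputTooShort' | 'NonCanonicalSize' | 'LeadingZeros'

// Returns the violated rule for a single string item, or null if the header is canonical.
function checkStringHeader(bytes: Uint8Array): Violation | null {
  if (bytes.length === 0) return 'InputTooShort'
  const prefix = bytes[0]
  if (prefix <= 0x7f) return null // a single byte is always canonical

  if (prefix <= 0xb7) {
    const length = prefix - 0x80
    if (bytes.length < 1 + length) return 'InputTooShort'
    // A lone byte below 0x80 must be encoded as itself, without a prefix.
    if (length === 1 && bytes[1] < 0x80) return 'NonCanonicalSize'
    return null
  }

  if (prefix <= 0xbf) {
    const lengthOfLength = prefix - 0xb7
    if (bytes.length < 1 + lengthOfLength) return 'InputTooShort'
    // The length field must be minimal: no leading zero byte.
    if (bytes[1] === 0) return 'LeadingZeros'
    let length = 0
    for (let i = 1; i <= lengthOfLength; i++) {
      length = length * 256 + bytes[i]
    }
    // Payloads under 56 bytes must use the short form.
    if (length < 56) return 'NonCanonicalSize'
    if (bytes.length < 1 + lengthOfLength + length) return 'InputTooShort'
    return null
  }

  throw new Error('Not a string prefix')
}
```

Applied to the inputs above, `[0x81, 0x7f]` and `[0xb8, 0x03, 1, 2, 3]` yield `'NonCanonicalSize'`, the `0xb9, 0x00, 0x38` header yields `'LeadingZeros'`, and `[0x85, 1, 2, 3]` yields `'InputTooShort'`.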
    

    Error Handling

    Comprehensive error handling for malformed input:
    import { Rlp } from 'tevm'
    
    // Empty input
    try {
      Rlp.decode(new Uint8Array([]))
    } catch (error) {
      console.error('InputTooShort: Cannot decode empty input')
    }
    
    // Extra data (non-stream mode)
    try {
      const bytes = new Uint8Array([0x01, 0x02])
      Rlp.decode(bytes, false)
    } catch (error) {
      console.error('InvalidRemainder: Extra data after decoded value: 1 bytes')
    }
    
    // Truncated long-list header
    try {
      // 0xff is a long-list prefix announcing an 8-byte length field (0xff - 0xf7 = 8),
      // but only 1 byte follows
      Rlp.decode(new Uint8Array([0xff, 0x00]))
    } catch (error) {
      console.error('Decode failed:', error)
    }
    
    // Recursion depth exceeded
    try {
      // Create deeply nested structure (> 32 levels)
      let nested = new Uint8Array([0xc0])  // Empty list
      for (let i = 0; i < 40; i++) {
        // Wrap in a list: prefix = 0xc0 + payload length (payload stays under 56 bytes here)
        nested = new Uint8Array([0xc0 + nested.length, ...nested])
      }
      Rlp.decode(nested)
    } catch (error) {
      console.error('RecursionDepthExceeded: Maximum recursion depth 32 exceeded')
    }
    
    // Incomplete data
    try {
      const incomplete = new Uint8Array([0x83, 1, 2])  // Says 3 bytes, only has 2
      Rlp.decode(incomplete)
    } catch (error) {
      console.error('InputTooShort: Expected 4 bytes, got 3')
    }
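
When many inputs are decoded in a row, repeating `try`/`catch` gets noisy. One option is a small wrapper that turns a throwing call into a result value. The sketch below is generic and makes no assumptions about the shape of the library's error objects beyond them being thrown; `attempt` and `Result` are hypothetical names.

```typescript
// Hypothetical result wrapper; works with any function that may throw.
type Result<T> =
  | { ok: true; value: T }
  | { ok: false; message: string }

function attempt<T>(fn: () => T): Result<T> {
  try {
    return { ok: true, value: fn() }
  } catch (error) {
    // Thrown values are not guaranteed to be Error instances.
    const message = error instanceof Error ? error.message : String(error)
    return { ok: false, message }
  }
}

// Intended usage (assumes the Rlp import from the examples above):
//   const outcome = attempt(() => Rlp.decode(bytes))
//   if (!outcome.ok) console.error(outcome.message)
```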
    

    Error Types Reference

    Error | Cause | Fix
    InputTooShort | Not enough bytes for declared length | Provide complete data
    InvalidRemainder | Extra bytes after value (non-stream) | Use stream=true or trim input
    NonCanonicalSize | Non-minimal length encoding | Use canonical encoding
    LeadingZeros | Length has leading zero bytes | Remove leading zeros from length
    InvalidLength | List payload length mismatch | Fix list item encodings
    RecursionDepthExceeded | Nested > 32 levels deep | Reduce nesting depth
    UnexpectedInput | Invalid prefix or format | Check input is valid RLP

    Performance Considerations

    Depth Limiting

    Maximum recursion depth is 32 to prevent stack overflow:
    import { Rlp } from 'tevm'
    
    // This is fine (depth = 3)
    const shallow = [[[new Uint8Array([1])]]]
    const encoded = Rlp.encode(shallow)
    const decoded = Rlp.decode(encoded)  // OK
    
    // This will fail (depth > 32)
    const deep = Array(40).fill(null).reduce(
      (acc) => [acc],
      new Uint8Array([1])
    )
    const deepEncoded = Rlp.encode(deep)
    try {
      Rlp.decode(deepEncoded)
    } catch (error) {
      // RecursionDepthExceeded
    }
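
If data is built locally before encoding, its nesting depth can be checked up front rather than discovered through a decode failure. The sketch below measures depth with an explicit stack, so the check itself cannot overflow the call stack. `Nested` and `nestingDepth` are hypothetical names; the sketch counts `[[[bytes]]]` as depth 3, matching the example above, and whether the library counts depth the same way is an assumption.

```typescript
// Hypothetical input shape: byte arrays nested inside plain arrays.
type Nested = Uint8Array | Nested[]

// Iterative depth measurement: a bare Uint8Array has depth 0, each enclosing list adds 1.
function nestingDepth(input: Nested): number {
  let deepest = 0
  const stack: Array<{ node: Nested; depth: number }> = [{ node: input, depth: 0 }]
  while (stack.length > 0) {
    const { node, depth } = stack.pop()!
    if (node instanceof Uint8Array) {
      deepest = Math.max(deepest, depth)
      continue
    }
    // Entering a list adds one level, even if the list is empty.
    deepest = Math.max(deepest, depth + 1)
    for (const child of node) {
      stack.push({ node: child, depth: depth + 1 })
    }
  }
  return deepest
}
```

A caller could compare the result against the documented limit of 32 before calling `Rlp.encode`.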
    

    Stream Mode Efficiency

    Use stream mode to avoid re-parsing when decoding multiple values:
    import { Rlp } from 'tevm'
    
    // Inefficient: slicing each value by hand requires knowing every item's length in advance
    const data = new Uint8Array([0x01, 0x02, 0x03])
    const first = Rlp.decode(data.slice(0, 1))
    const second = Rlp.decode(data.slice(1, 2))
    const third = Rlp.decode(data.slice(2, 3))
    
    // Efficient: stream mode
    let remainder = data
    const values = []
    while (remainder.length > 0) {
      const result = Rlp.decode(remainder, true)
      values.push(result.data)
      remainder = result.remainder
    }
    

    Validation Overhead

    Canonical validation adds minimal overhead but catches malformed inputs:
    import { Rlp } from 'tevm'
    
    // Decode validates:
    // - Canonical encoding (minimal representation)
    // - Length consistency (declared vs actual)
    // - Depth limits (max 32 levels)
    // - No leading zeros in lengths
    // - Proper prefix ranges
    
    // This ensures decoded data can be safely re-encoded
    const bytes = new Uint8Array([0x83, 1, 2, 3])
    const decoded = Rlp.decode(bytes)
    const reencoded = Rlp.encode(decoded.data)
    // reencoded contains the same bytes as the input (compare element-wise, not with ===)
    

    Round-trip Encoding

    Decode and re-encode produces identical bytes (for canonical input):
    import { Rlp } from 'tevm'
    
    // Original data
    const original = new Uint8Array([0x83, 1, 2, 3])
    
    // Decode
    const decoded = Rlp.decode(original)
    
    // Re-encode
    const reencoded = Rlp.encode(decoded.data)
    
    // Should match original (if canonical)
    console.log(original.every((b, i) => b === reencoded[i]))  // true
    
    // Compare bytes
    function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
      if (a.length !== b.length) return false
      return a.every((byte, i) => byte === b[i])
    }
    
    console.log(bytesEqual(original, reencoded))  // true
    
    Non-canonical input never reaches re-encoding, because the decoder rejects it. Encoding the same payload directly produces the canonical form:
    import { Rlp } from 'tevm'
    
    // Non-canonical: the single byte 0x7f wrapped in a length prefix.
    // Rlp.decode rejects this input (NonCanonicalSize):
    // const nonCanonical = new Uint8Array([0x81, 0x7f])
    
    // Encoding the payload itself yields the canonical form:
    // const canonical = Rlp.encode(new Uint8Array([0x7f]))
    // => Uint8Array([0x7f])  // Canonical form
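
To show why each byte string has exactly one canonical encoding, here is a minimal sketch of the string-encoding rules. It is not the library's encoder: `encodeBytes` is a hypothetical function that handles byte strings only (no lists) and always picks the shortest form.

```typescript
// Hypothetical canonical encoder for a single byte string; not the tevm implementation.
function encodeBytes(payload: Uint8Array): Uint8Array {
  // A single byte below 0x80 is its own encoding.
  if (payload.length === 1 && payload[0] < 0x80) {
    return new Uint8Array([payload[0]])
  }

  const header: number[] = []
  if (payload.length <= 55) {
    // Short form: prefix = 0x80 + length
    header.push(0x80 + payload.length)
  } else {
    // Long form: prefix = 0xb7 + lengthOfLength, then the big-endian length
    // with no leading zero bytes.
    const lengthBytes: number[] = []
    for (let n = payload.length; n > 0; n = Math.floor(n / 256)) {
      lengthBytes.unshift(n % 256)
    }
    header.push(0xb7 + lengthBytes.length, ...lengthBytes)
  }

  const out = new Uint8Array(header.length + payload.length)
  out.set(header, 0)
  out.set(payload, header.length)
  return out
}
```

Under these rules the payload `[0x7f]` can only become `[0x7f]`, while `[0x80]` needs a prefix and becomes `[0x81, 0x80]`.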