Testing: Golden Master Testing

Overview

Golden Master Testing (also called Characterization Testing) captures the current behavior of legacy code as a baseline, enabling safe refactoring. The challenge is that output formats, timestamps, and generated IDs change even when behavior is identical. Normalizing outputs lets you compare behavior while ignoring cosmetic differences.
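
To make the problem concrete, the sketch below compares two hypothetical runs of the same code on the same input. The log lines, the masking patterns, and the `scrub` helper are illustrative assumptions, not part of any tool described here.

```python
import re

# Two runs of the same code on the same input (hypothetical log lines).
run_1 = ("Payment processed via credit card ending in 4532 "
         "(Transaction ID: TXN-20241115-000123, Processing time: 125ms)")
run_2 = ("Payment processed via credit card ending in 4532 "
         "(Transaction ID: TXN-20241115-000124, Processing time: 131ms)")

# Volatile fields to mask before comparing (illustrative patterns).
VOLATILE = [
    (re.compile(r"TXN-\d{8}-\d+"), "<TXN>"),
    (re.compile(r"\d+(?:\.\d+)?ms"), "<DURATION>"),
]

def scrub(line: str) -> str:
    """Replace every volatile field with a stable placeholder."""
    for pattern, placeholder in VOLATILE:
        line = pattern.sub(placeholder, line)
    return line

print(run_1 == run_2)                # raw comparison: False
print(scrub(run_1) == scrub(run_2))  # comparison after masking: True
```

Masking of this kind is enough only while both runs share one log layout. Once the refactored system changes the layout itself, each line has to be mapped to a canonical form instead.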

Core Problem Statement

"Legacy code has no tests, but you need to refactor it safely." Direct output comparison fails because timestamps, transaction IDs, and formatting differ between runs. You need to verify that refactored code produces the same business logic results as the original.

Example Scenario

Your e-commerce order processing system is a legacy monolith with verbose, inconsistent logging. You're refactoring it to use structured logging and modern patterns. To verify the refactoring preserves behavior:

  1. Capture output from the legacy system (the "golden master")
  2. Run the same inputs through the refactored system
  3. Normalize both outputs
  4. Verify that they match
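
The four steps can be sketched in a few lines of Python. This is a minimal illustration with hand-written regular expressions and made-up input lines; the `RULES` table and `normalize` function are hypothetical stand-ins for a real normalizer.

```python
import re

# Each rule pairs a regex with a canonical template. Both layouts of the
# same event map to the same canonical line (illustrative patterns).
RULES = [
    (re.compile(r"^Order total before tax: \$(?P<amount>[\d.]+)$"),
     "[SUBTOTAL:{amount}]"),
    (re.compile(r"^\[[^\]]+\] ORDER_SUBTOTAL amount=(?P<amount>[\d.]+)$"),
     "[SUBTOTAL:{amount}]"),
]

def normalize(lines):
    """Step 3: map each raw line to its canonical form."""
    out = []
    for line in lines:
        for regex, template in RULES:
            match = regex.match(line)
            if match:
                out.append(template.format(**match.groupdict()))
                break
        else:
            # Assumption: unmatched lines pass through unchanged, so an
            # unexpected line shows up as a difference instead of vanishing.
            out.append(line)
    return out

legacy = ["Order total before tax: $980.00"]                              # step 1
refactored = ["[2024-11-15T14:30:22.501Z] ORDER_SUBTOTAL amount=980.00"]  # step 2
print(normalize(legacy) == normalize(refactored))                         # step 4
```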

Input Data

Legacy System Output
Processing customer record: ID=1001, Name="Alice Johnson", Email=alice@example.com, Balance=$1,234.56 (Timestamp: 2024-11-15 10:00:01.123)
Applying discount: 10% off for loyalty tier GOLD (Calculation time: 0.45ms)
Order total before tax: $1,111.10
Tax calculation (State: CA, Rate: 9.5%): Tax=$105.55
Order total after tax: $1,216.65
Payment processed via credit card ending in 4532 (Transaction ID: TXN-20241115-000123, Processing time: 125ms)
Inventory updated: SKU-ABC-123 quantity decreased from 50 to 47 (Warehouse: Building-A, Aisle-3, Shelf-12)
Shipping label generated: Tracking #1Z999AA10123456784 via UPS Ground (Weight: 2.3 lbs, Estimated delivery: 2024-11-20)
Order #ORD-5001 completed successfully (Total processing time: 342ms, Server: prod-us-east-1a)
Email confirmation sent to alice@example.com (Queue: high-priority, Message ID: MSG-abc123def456)

The legacy system logs verbose prose, with thousands separators in numbers and mixed formats.

Refactored System Output
[2024-11-15T14:30:22.456Z] CUSTOMER_PROCESSED customer_id=1001 name="Alice Johnson" email=alice@example.com balance=1234.56
[2024-11-15T14:30:22.489Z] DISCOUNT_APPLIED discount_pct=10 tier=GOLD compute_time_ms=0.38
[2024-11-15T14:30:22.501Z] ORDER_SUBTOTAL amount=1111.10
[2024-11-15T14:30:22.523Z] TAX_CALCULATED state=CA rate_pct=9.5 tax_amount=105.55
[2024-11-15T14:30:22.545Z] ORDER_TOTAL amount=1216.65
[2024-11-15T14:30:22.678Z] PAYMENT_PROCESSED method=credit_card last_four=4532 txn_id=TXN-20241115-000456 duration_ms=133
[2024-11-15T14:30:22.701Z] INVENTORY_UPDATED sku=SKU-ABC-123 prev_qty=50 new_qty=47 location=Building-A/Aisle-3/Shelf-12
[2024-11-15T14:30:22.734Z] SHIPPING_LABEL_CREATED tracking=1Z999AA10123456784 carrier=UPS service=Ground weight_lbs=2.3 est_delivery=2024-11-20
[2024-11-15T14:30:22.756Z] ORDER_COMPLETED order_id=ORD-5001 duration_ms=300 server=prod-us-west-2b
[2024-11-15T14:30:22.789Z] EMAIL_QUEUED recipient=alice@example.com queue=high-priority msg_id=MSG-xyz789ghi012

The refactored system uses structured logging with ISO timestamps and a consistent key=value format.
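
One practical consequence of the key=value layout is that a line can be split into fields with the standard library alone. The `parse_event` helper below is a hypothetical sketch for illustration; it is not how any particular normalization tool parses these lines.

```python
import shlex

def parse_event(line: str):
    """Split '[timestamp] EVENT key=value ...' into (timestamp, event, fields)."""
    timestamp, _, rest = line.partition("] ")
    tokens = shlex.split(rest)  # keeps name="Alice Johnson" together as one token
    event, pairs = tokens[0], tokens[1:]
    fields = dict(pair.split("=", 1) for pair in pairs)
    return timestamp.lstrip("["), event, fields

line = ('[2024-11-15T14:30:22.456Z] CUSTOMER_PROCESSED customer_id=1001 '
        'name="Alice Johnson" email=alice@example.com balance=1234.56')
ts, event, fields = parse_event(line)
print(event, fields["name"], fields["balance"])
```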

Normalization Rules

Create rules that extract business logic while ignoring format differences:

Golden Master Normalization Rules
rules:
  # Legacy: Customer record processing
  - name: legacy_customer
    pattern:
      - text: "Processing customer record: ID="
      - field: id
      - text: ", Name="
      - field: name
      - text: ", Email="
      - field: email
      - text: ", Balance=$"
      - field: balance
      - text: " (Timestamp: "
      - field: timestamp
      - text: ")"
    output: "[CUSTOMER:id={id},email={email}]"

  # Refactored: Customer processed
  - name: refactored_customer
    pattern:
      - text: "["
      - field: timestamp
      - text: "] CUSTOMER_PROCESSED customer_id="
      - field: id
      - text: " name="
      - field: name
      - text: " email="
      - field: email
      - text: " balance="
      - field: balance
    output: "[CUSTOMER:id={id},email={email}]"

  # Legacy: Discount applied
  - name: legacy_discount
    pattern:
      - text: "Applying discount: "
      - field: pct
      - text: "% off for loyalty tier "
      - field: tier
      - text: " (Calculation time: "
      - field: time
      - text: ")"
    output: "[DISCOUNT:pct={pct},tier={tier}]"

  # Refactored: Discount applied
  - name: refactored_discount
    pattern:
      - text: "["
      - field: timestamp
      - text: "] DISCOUNT_APPLIED discount_pct="
      - field: pct
      - text: " tier="
      - field: tier
      - text: " compute_time_ms="
      - field: time
    output: "[DISCOUNT:pct={pct},tier={tier}]"

  # Legacy: Order subtotal
  - name: legacy_subtotal
    pattern:
      - text: "Order total before tax: $"
      - field: amount
    output: "[SUBTOTAL:{amount}]"

  # Refactored: Order subtotal
  - name: refactored_subtotal
    pattern:
      - text: "["
      - field: timestamp
      - text: "] ORDER_SUBTOTAL amount="
      - field: amount
    output: "[SUBTOTAL:{amount}]"

  # Legacy: Tax calculation
  - name: legacy_tax
    pattern:
      - text: "Tax calculation (State: "
      - field: state
      - text: ", Rate: "
      - field: rate
      - text: "%): Tax=$"
      - field: tax
    output: "[TAX:state={state},amount={tax}]"

  # Refactored: Tax calculated
  - name: refactored_tax
    pattern:
      - text: "["
      - field: timestamp
      - text: "] TAX_CALCULATED state="
      - field: state
      - text: " rate_pct="
      - field: rate
      - text: " tax_amount="
      - field: tax
    output: "[TAX:state={state},amount={tax}]"

  # Legacy: Order total
  - name: legacy_total
    pattern:
      - text: "Order total after tax: $"
      - field: amount
    output: "[TOTAL:{amount}]"

  # Refactored: Order total
  - name: refactored_total
    pattern:
      - text: "["
      - field: timestamp
      - text: "] ORDER_TOTAL amount="
      - field: amount
    output: "[TOTAL:{amount}]"

  # Legacy: Payment processed
  - name: legacy_payment
    pattern:
      - text: "Payment processed via credit card ending in "
      - field: last_four
      - text: " (Transaction ID: "
      - field: txn_id
      - text: ", Processing time: "
      - field: time
      - text: ")"
    output: "[PAYMENT:card_last4={last_four}]"

  # Refactored: Payment processed
  - name: refactored_payment
    pattern:
      - text: "["
      - field: timestamp
      - text: "] PAYMENT_PROCESSED method=credit_card last_four="
      - field: last_four
      - text: " txn_id="
      - field: txn_id
      - text: " duration_ms="
      - field: time
    output: "[PAYMENT:card_last4={last_four}]"

  # Legacy: Inventory updated
  - name: legacy_inventory
    pattern:
      - text: "Inventory updated: "
      - field: sku
      - text: " quantity decreased from "
      - field: old_qty
      - text: " to "
      - field: new_qty
      - text: " (Warehouse: "
      - field: location
      - text: ")"
    output: "[INVENTORY:sku={sku},qty_change={old_qty}->{new_qty}]"

  # Refactored: Inventory updated
  - name: refactored_inventory
    pattern:
      - text: "["
      - field: timestamp
      - text: "] INVENTORY_UPDATED sku="
      - field: sku
      - text: " prev_qty="
      - field: old_qty
      - text: " new_qty="
      - field: new_qty
      - text: " location="
      - field: location
    output: "[INVENTORY:sku={sku},qty_change={old_qty}->{new_qty}]"

  # Legacy: Shipping label
  - name: legacy_shipping
    pattern:
      - text: "Shipping label generated: Tracking #"
      - field: tracking
      - text: " via "
      - field: carrier
      - text: " "
      - field: service
      - text: " (Weight: "
      - field: weight
      - text: ", Estimated delivery: "
      - field: delivery
      - text: ")"
    output: "[SHIPPING:carrier={carrier},tracking={tracking}]"

  # Refactored: Shipping label
  - name: refactored_shipping
    pattern:
      - text: "["
      - field: timestamp
      - text: "] SHIPPING_LABEL_CREATED tracking="
      - field: tracking
      - text: " carrier="
      - field: carrier
      - text: " service="
      - field: service
      - text: " weight_lbs="
      - field: weight
      - text: " est_delivery="
      - field: delivery
    output: "[SHIPPING:carrier={carrier},tracking={tracking}]"

  # Legacy: Order completed
  - name: legacy_order_complete
    pattern:
      - text: "Order #"
      - field: order_id
      - text: " completed successfully (Total processing time: "
      - field: time
      - text: ", Server: "
      - field: server
      - text: ")"
    output: "[ORDER_COMPLETE:id={order_id}]"

  # Refactored: Order completed
  - name: refactored_order_complete
    pattern:
      - text: "["
      - field: timestamp
      - text: "] ORDER_COMPLETED order_id="
      - field: order_id
      - text: " duration_ms="
      - field: time
      - text: " server="
      - field: server
    output: "[ORDER_COMPLETE:id={order_id}]"

  # Legacy: Email sent
  - name: legacy_email
    pattern:
      - text: "Email confirmation sent to "
      - field: recipient
      - text: " (Queue: "
      - field: queue
      - text: ", Message ID: "
      - field: msg_id
      - text: ")"
    output: "[EMAIL:to={recipient}]"

  # Refactored: Email queued
  - name: refactored_email
    pattern:
      - text: "["
      - field: timestamp
      - text: "] EMAIL_QUEUED recipient="
      - field: recipient
      - text: " queue="
      - field: queue
      - text: " msg_id="
      - field: msg_id
    output: "[EMAIL:to={recipient}]"

Rules preserve: business events, customer identity (ID and email), order amounts, inventory changes. Rules ignore: timestamps, transaction and message IDs, server names, processing times, number formatting.
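
To build intuition for how a `text`/`field` sequence can match a line, the sketch below compiles one rule into a regular expression. This is an illustrative model only: `compile_rule` is a hypothetical helper, and the actual matching engine may work differently.

```python
import re

def compile_rule(pattern):
    """Turn a list of {'text': ...} / {'field': ...} items into one regex."""
    parts = []
    for index, item in enumerate(pattern):
        if "text" in item:
            parts.append(re.escape(item["text"]))
        elif index == len(pattern) - 1:
            # A trailing field takes the rest of the line.
            parts.append(f"(?P<{item['field']}>.+)")
        else:
            # Other fields capture lazily, up to the next literal text.
            parts.append(f"(?P<{item['field']}>.+?)")
    return re.compile("^" + "".join(parts) + "$")

# The legacy_tax rule from above, written as Python data.
legacy_tax = {
    "pattern": [
        {"text": "Tax calculation (State: "}, {"field": "state"},
        {"text": ", Rate: "}, {"field": "rate"},
        {"text": "%): Tax=$"}, {"field": "tax"},
    ],
    "output": "[TAX:state={state},amount={tax}]",
}

regex = compile_rule(legacy_tax["pattern"])
match = regex.match("Tax calculation (State: CA, Rate: 9.5%): Tax=$105.55")
result = legacy_tax["output"].format(**match.groupdict())
print(result)
```

The `rate` field is captured but never used in the output template, which is how a rule drops a detail from the comparison.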

Implementation

# Capture golden master from legacy system
run-legacy-system --input test-data.json > legacy-golden.log

# Save normalized golden master
patterndb-yaml --rules golden-master-rules.yaml legacy-golden.log \
    --quiet > golden-master.txt

# After refactoring, test new system
run-refactored-system --input test-data.json > refactored-output.log

# Normalize refactored output
patterndb-yaml --rules golden-master-rules.yaml refactored-output.log \
    --quiet > refactored-normalized.txt

# Compare
if diff -q golden-master.txt refactored-normalized.txt; then
    echo "✓ Refactoring preserves behavior"
else
    echo "✗ Behavioral differences detected:"
    diff golden-master.txt refactored-normalized.txt
fi

from contextlib import redirect_stdout
from io import StringIO
from pathlib import Path

from patterndb_yaml import PatterndbYaml

# Send the report to output.txt for testing. The context managers restore
# stdout and close the file even if processing raises.
with open("output.txt", "w") as output_file, redirect_stdout(output_file):
    processor = PatterndbYaml(rules_path=Path("golden-master-rules.yaml"))

    # Process legacy output
    with open("legacy-output.log") as f:
        legacy_input = StringIO(f.read())
        golden_output = StringIO()
        processor.process(legacy_input, golden_output)

    print("Golden master captured")
    print(f"  Events: {len(golden_output.getvalue().splitlines())}")

    # Process refactored output
    with open("refactored-output.log") as f:
        refactored_input = StringIO(f.read())
        refactored_output = StringIO()
        processor.process(refactored_input, refactored_output)

    # Compare as sorted lists, so event order is ignored
    golden_lines = sorted(golden_output.getvalue().strip().split("\n"))
    refactored_lines = sorted(refactored_output.getvalue().strip().split("\n"))

    if golden_lines == refactored_lines:
        print("\n✓ Refactoring preserves behavior")
    else:
        print("\n✗ Behavioral differences detected:")

        # Find differences
        golden_set = set(golden_lines)
        refactored_set = set(refactored_lines)

        missing = golden_set - refactored_set
        added = refactored_set - golden_set

        if missing:
            print("\nMissing in refactored (regressions):")
            for line in sorted(missing):
                print(f"  - {line}")

        if added:
            print("\nAdded in refactored (new behavior):")
            for line in sorted(added):
                print(f"  + {line}")
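
One limitation of the set-based report above: if the refactored system emits an event twice, the sorted lists differ, but both set differences are empty and nothing is listed. A possible refinement, sketched here with hypothetical sample data, counts occurrences with `collections.Counter` and uses `difflib` for an order-sensitive view.

```python
from collections import Counter
import difflib

def compare_events(golden, refactored):
    """Return (missing, added) as Counters, so duplicated events are counted."""
    golden_counts, refactored_counts = Counter(golden), Counter(refactored)
    return golden_counts - refactored_counts, refactored_counts - golden_counts

golden = ["[PAYMENT:card_last4=4532]", "[EMAIL:to=alice@example.com]"]
refactored = [
    "[PAYMENT:card_last4=4532]",
    "[PAYMENT:card_last4=4532]",  # payment recorded twice
    "[EMAIL:to=alice@example.com]",
]

missing, added = compare_events(golden, refactored)
print("missing:", dict(missing))
print("added:", dict(added))

# Order-sensitive view of the same difference.
for line in difflib.unified_diff(golden, refactored, "golden", "refactored", lineterm=""):
    print(line)
```

Whether event order should count as behavior depends on the system; the sorted comparison treats it as cosmetic, while the unified diff does not.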

Expected Output

Normalized Output (Both Systems)
[CUSTOMER:id=1001,email=alice@example.com]
[DISCOUNT:pct=10,tier=GOLD]
[SUBTOTAL:1111.10]
[TAX:state=CA,amount=105.55]
[TOTAL:1216.65]
[PAYMENT:card_last4=4532]
[INVENTORY:sku=SKU-ABC-123,qty_change=50->47]
[SHIPPING:carrier=UPS,tracking=1Z999AA10123456784]
[ORDER_COMPLETE:id=ORD-5001]
[EMAIL:to=alice@example.com]

Both legacy and refactored systems produce identical normalized behavior.

Note: Minor formatting differences (e.g., "1,111.10" vs "1111.10") are normalized away so that the comparison focuses on business-logic equivalence.
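
The rules listed earlier copy `{amount}` into the output unchanged, so the thousands separator has to be removed somewhere for the two outputs to match. How the tool handles this is not shown here. As a sketch, assuming a separate post-processing pass over the normalized lines is acceptable, a hypothetical `strip_separators` helper could look like this:

```python
import re

# A number written with comma thousands separators, e.g. 1,111.10
# (illustrative pattern; adjust for other locales).
GROUPED_NUMBER = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?")

def strip_separators(line: str) -> str:
    """Remove thousands separators from every grouped number in the line."""
    return GROUPED_NUMBER.sub(lambda m: m.group(0).replace(",", ""), line)

print(strip_separators("[SUBTOTAL:1,111.10]"))
print(strip_separators("[CUSTOMER:id=1001,email=alice@example.com]"))
```

The pattern leaves the comma in `id=1001,email=...` alone because no three digits follow it. A line where a comma-separated value happens to be followed by exactly three digits would be misread, so applying the helper only to amount fields is the safer choice.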

Practical Workflows

1. Initial Golden Master Creation

Capture a comprehensive golden master by running every test case through the legacy system:

#!/bin/bash
# Run comprehensive test suite against legacy system
echo "Capturing golden master from legacy system..."

# Make sure the output directories exist
mkdir -p output golden

for test_case in tests/data/*.json; do
    name=$(basename "$test_case" .json)
    echo "  Processing $name..."

    # Run legacy system
    run-legacy-system --input "$test_case" > "output/legacy-$name.log"

    # Normalize
    patterndb-yaml --rules golden-master-rules.yaml \
        "output/legacy-$name.log" \
        --quiet > "golden/$name.txt"
done

echo "Golden master created for $(ls tests/data/*.json | wc -l) test cases"
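
Recording a golden file and later checking against it can share one helper. The sketch below is a hypothetical `check_or_approve` function, not part of any tool named here. It assumes one normalized text file per test case, records the file on the first run, and overwrites it only when a reviewer passes `approve=True`.

```python
from pathlib import Path
import tempfile

def check_or_approve(name: str, actual: str, golden_dir: Path, approve: bool = False) -> bool:
    """Compare `actual` with <golden_dir>/<name>.txt.

    Records the golden file when it is missing or when approve=True.
    Returns True when the output is recorded or matches, False on a mismatch.
    """
    golden_file = golden_dir / f"{name}.txt"
    if approve or not golden_file.exists():
        golden_dir.mkdir(parents=True, exist_ok=True)
        golden_file.write_text(actual, encoding="utf-8")
        return True
    return golden_file.read_text(encoding="utf-8") == actual

with tempfile.TemporaryDirectory() as tmp:
    golden_dir = Path(tmp) / "golden"
    print(check_or_approve("order-5001", "[TOTAL:1216.65]\n", golden_dir))  # first run records
    print(check_or_approve("order-5001", "[TOTAL:1216.65]\n", golden_dir))  # same output
    print(check_or_approve("order-5001", "[TOTAL:1300.00]\n", golden_dir))  # changed output
```

Recording automatically on the first run is convenient, but it also means a missing golden file never fails a test; stricter setups may prefer to fail instead.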

5. Regression Detection

Detect unintended behavior changes:

#!/bin/bash
# Continuous testing against golden master

echo "Running regression tests..."

# Track results
passed=0
failed=0

# Remove diffs left over from earlier runs so the report lists only current failures
rm -f output/diff-*.txt

for test_case in tests/data/*.json; do
    name=$(basename "$test_case" .json)

    # Run refactored system
    run-refactored-system --input "$test_case" > "output/current-$name.log" 2>&1

    # Normalize
    patterndb-yaml --rules golden-master-rules.yaml \
        "output/current-$name.log" --quiet > "output/current-$name.txt"

    # Compare with golden master
    if diff -q "golden/$name.txt" "output/current-$name.txt" > /dev/null; then
        echo "  ✓ $name"
        passed=$((passed + 1))
    else
        echo "  ✗ $name - REGRESSION DETECTED"
        failed=$((failed + 1))

        # Save diff for review
        diff "golden/$name.txt" "output/current-$name.txt" > "output/diff-$name.txt"
    fi
done

# Report
echo ""
echo "Results: $passed passed, $failed failed"

if [ $failed -gt 0 ]; then
    echo ""
    echo "Regressions detected in:"
    ls output/diff-*.txt 2>/dev/null | while read diff_file; do
        name=$(basename "$diff_file" | sed 's/diff-\(.*\)\.txt/\1/')
        echo "  - $name (see output/diff-$name.txt)"
    done
    exit 1
fi

echo "✓ All regression tests passed"

Key Benefits

  • Safe refactoring: Verify behavior preservation without existing tests
  • Characterize legacy code: Document current behavior as executable specification
  • Catch regressions: Detect unintended changes immediately
  • Approval testing: Human-in-the-loop for intentional behavior changes
  • Incremental improvement: Refactor with confidence, one step at a time