Testing: Golden Master Testing
Overview
Golden Master Testing (also called Characterization Testing) captures the current behavior of legacy code as a baseline, enabling safe refactoring. The challenge is that output formats, timestamps, and generated IDs change even when behavior is identical. Normalizing outputs lets you compare behavior while ignoring cosmetic differences.
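As a concrete illustration of why raw comparison breaks down, the sketch below compares two runs of the same payment event. The second log line is a hypothetical rerun invented for illustration, and the `mask_volatile` helper and its regular expressions are assumptions, not part of any tool described here.

```python
import re

# Hypothetical patterns for values that change on every run.
VOLATILE_PATTERNS = [
    (re.compile(r"\[\d{4}-\d{2}-\d{2}T[\d:.]+Z\]"), "[<TIMESTAMP>]"),
    (re.compile(r"TXN-\d{8}-\d+"), "<TXN_ID>"),
    (re.compile(r"duration_ms=\d+"), "duration_ms=<N>"),
]


def mask_volatile(line: str) -> str:
    """Replace run-specific values with fixed placeholders."""
    for pattern, placeholder in VOLATILE_PATTERNS:
        line = pattern.sub(placeholder, line)
    return line


run_1 = ("[2024-11-15T14:30:22.678Z] PAYMENT_PROCESSED method=credit_card "
         "last_four=4532 txn_id=TXN-20241115-000456 duration_ms=133")
# Hypothetical second run of the same input (not taken from a real log).
run_2 = ("[2024-11-15T16:02:10.114Z] PAYMENT_PROCESSED method=credit_card "
         "last_four=4532 txn_id=TXN-20241115-000789 duration_ms=141")

print(run_1 == run_2)                                # raw comparison
print(mask_volatile(run_1) == mask_volatile(run_2))  # masked comparison
```

The raw comparison is false even though nothing about the payment changed, while the masked comparison is true. The card digits survive masking, so a real change there would still be caught.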
Core Problem Statement
"Legacy code has no tests, but you need to refactor it safely." Direct output comparison fails because timestamps, transaction IDs, and formatting differ between runs. You need to verify that refactored code produces the same business logic results as the original.
Example Scenario
Your e-commerce order processing system is a legacy monolith with verbose, inconsistent logging. You're refactoring it to use structured logging and modern patterns. To verify the refactoring preserves behavior:
- Capture output from legacy system (the "golden master")
- Run same inputs through refactored system
- Normalize both outputs
- Verify they match
Input Data
Legacy System Output
Processing customer record: ID=1001, Name="Alice Johnson", Email=alice@example.com, Balance=$1,234.56 (Timestamp: 2024-11-15 10:00:01.123)
Applying discount: 10% off for loyalty tier GOLD (Calculation time: 0.45ms)
Order total before tax: $1,111.10
Tax calculation (State: CA, Rate: 9.5%): Tax=$105.55
Order total after tax: $1,216.65
Payment processed via credit card ending in 4532 (Transaction ID: TXN-20241115-000123, Processing time: 125ms)
Inventory updated: SKU-ABC-123 quantity decreased from 50 to 47 (Warehouse: Building-A, Aisle-3, Shelf-12)
Shipping label generated: Tracking #1Z999AA10123456784 via UPS Ground (Weight: 2.3 lbs, Estimated delivery: 2024-11-20)
Order #ORD-5001 completed successfully (Total processing time: 342ms, Server: prod-us-east-1a)
Email confirmation sent to alice@example.com (Queue: high-priority, Message ID: MSG-abc123def456)
Legacy system with verbose prose-style logging, commas in numbers, mixed formats.
Refactored System Output
[2024-11-15T14:30:22.456Z] CUSTOMER_PROCESSED customer_id=1001 name="Alice Johnson" email=alice@example.com balance=1234.56
[2024-11-15T14:30:22.489Z] DISCOUNT_APPLIED discount_pct=10 tier=GOLD compute_time_ms=0.38
[2024-11-15T14:30:22.501Z] ORDER_SUBTOTAL amount=1111.10
[2024-11-15T14:30:22.523Z] TAX_CALCULATED state=CA rate_pct=9.5 tax_amount=105.55
[2024-11-15T14:30:22.545Z] ORDER_TOTAL amount=1216.65
[2024-11-15T14:30:22.678Z] PAYMENT_PROCESSED method=credit_card last_four=4532 txn_id=TXN-20241115-000456 duration_ms=133
[2024-11-15T14:30:22.701Z] INVENTORY_UPDATED sku=SKU-ABC-123 prev_qty=50 new_qty=47 location=Building-A/Aisle-3/Shelf-12
[2024-11-15T14:30:22.734Z] SHIPPING_LABEL_CREATED tracking=1Z999AA10123456784 carrier=UPS service=Ground weight_lbs=2.3 est_delivery=2024-11-20
[2024-11-15T14:30:22.756Z] ORDER_COMPLETED order_id=ORD-5001 duration_ms=300 server=prod-us-west-2b
[2024-11-15T14:30:22.789Z] EMAIL_QUEUED recipient=alice@example.com queue=high-priority msg_id=MSG-xyz789ghi012
Refactored system with structured logging, ISO timestamps, consistent key=value format.
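One practical consequence of the consistent key=value format is that a line can be split into fields with standard-library tools alone. The sketch below is only an illustration of that property; `parse_structured_line` is a hypothetical helper, and it assumes every line follows the `[timestamp] EVENT key=value ...` shape shown above.

```python
import shlex


def parse_structured_line(line: str) -> dict:
    """Split a '[timestamp] EVENT key=value ...' line into its parts."""
    # shlex keeps quoted values such as "Alice Johnson" together as one token.
    tokens = shlex.split(line)
    timestamp = tokens[0].strip("[]")
    event = tokens[1]
    fields = dict(token.split("=", 1) for token in tokens[2:])
    return {"timestamp": timestamp, "event": event, "fields": fields}


line = ('[2024-11-15T14:30:22.456Z] CUSTOMER_PROCESSED customer_id=1001 '
        'name="Alice Johnson" email=alice@example.com balance=1234.56')
record = parse_structured_line(line)
print(record["event"], record["fields"]["name"])
```

The legacy prose lines have no such uniform shape, which is why each legacy event needs its own pattern in the rules that follow.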
Normalization Rules
Create rules that extract business logic while ignoring format differences:
Golden Master Normalization Rules
rules:
  # Legacy: Customer record processing
  - name: legacy_customer
    pattern:
      - text: "Processing customer record: ID="
      - field: id
      - text: ", Name="
      - field: name
      - text: ", Email="
      - field: email
      - text: ", Balance=$"
      - field: balance
      - text: " (Timestamp: "
      - field: timestamp
      - text: ")"
    output: "[CUSTOMER:id={id},email={email}]"
  # Refactored: Customer processed
  - name: refactored_customer
    pattern:
      - text: "["
      - field: timestamp
      - text: "] CUSTOMER_PROCESSED customer_id="
      - field: id
      - text: " name="
      - field: name
      - text: " email="
      - field: email
      - text: " balance="
      - field: balance
    output: "[CUSTOMER:id={id},email={email}]"
  # Legacy: Discount applied
  - name: legacy_discount
    pattern:
      - text: "Applying discount: "
      - field: pct
      - text: "% off for loyalty tier "
      - field: tier
      - text: " (Calculation time: "
      - field: time
      - text: ")"
    output: "[DISCOUNT:pct={pct},tier={tier}]"
  # Refactored: Discount applied
  - name: refactored_discount
    pattern:
      - text: "["
      - field: timestamp
      - text: "] DISCOUNT_APPLIED discount_pct="
      - field: pct
      - text: " tier="
      - field: tier
      - text: " compute_time_ms="
      - field: time
    output: "[DISCOUNT:pct={pct},tier={tier}]"
  # Legacy: Order subtotal
  - name: legacy_subtotal
    pattern:
      - text: "Order total before tax: $"
      - field: amount
    output: "[SUBTOTAL:{amount}]"
  # Refactored: Order subtotal
  - name: refactored_subtotal
    pattern:
      - text: "["
      - field: timestamp
      - text: "] ORDER_SUBTOTAL amount="
      - field: amount
    output: "[SUBTOTAL:{amount}]"
  # Legacy: Tax calculation
  - name: legacy_tax
    pattern:
      - text: "Tax calculation (State: "
      - field: state
      - text: ", Rate: "
      - field: rate
      - text: "%): Tax=$"
      - field: tax
    output: "[TAX:state={state},amount={tax}]"
  # Refactored: Tax calculated
  - name: refactored_tax
    pattern:
      - text: "["
      - field: timestamp
      - text: "] TAX_CALCULATED state="
      - field: state
      - text: " rate_pct="
      - field: rate
      - text: " tax_amount="
      - field: tax
    output: "[TAX:state={state},amount={tax}]"
  # Legacy: Order total
  - name: legacy_total
    pattern:
      - text: "Order total after tax: $"
      - field: amount
    output: "[TOTAL:{amount}]"
  # Refactored: Order total
  - name: refactored_total
    pattern:
      - text: "["
      - field: timestamp
      - text: "] ORDER_TOTAL amount="
      - field: amount
    output: "[TOTAL:{amount}]"
  # Legacy: Payment processed
  - name: legacy_payment
    pattern:
      - text: "Payment processed via credit card ending in "
      - field: last_four
      - text: " (Transaction ID: "
      - field: txn_id
      - text: ", Processing time: "
      - field: time
      - text: ")"
    output: "[PAYMENT:card_last4={last_four}]"
  # Refactored: Payment processed
  - name: refactored_payment
    pattern:
      - text: "["
      - field: timestamp
      - text: "] PAYMENT_PROCESSED method=credit_card last_four="
      - field: last_four
      - text: " txn_id="
      - field: txn_id
      - text: " duration_ms="
      - field: time
    output: "[PAYMENT:card_last4={last_four}]"
  # Legacy: Inventory updated
  - name: legacy_inventory
    pattern:
      - text: "Inventory updated: "
      - field: sku
      - text: " quantity decreased from "
      - field: old_qty
      - text: " to "
      - field: new_qty
      - text: " (Warehouse: "
      - field: location
      - text: ")"
    output: "[INVENTORY:sku={sku},qty_change={old_qty}->{new_qty}]"
  # Refactored: Inventory updated
  - name: refactored_inventory
    pattern:
      - text: "["
      - field: timestamp
      - text: "] INVENTORY_UPDATED sku="
      - field: sku
      - text: " prev_qty="
      - field: old_qty
      - text: " new_qty="
      - field: new_qty
      - text: " location="
      - field: location
    output: "[INVENTORY:sku={sku},qty_change={old_qty}->{new_qty}]"
  # Legacy: Shipping label
  - name: legacy_shipping
    pattern:
      - text: "Shipping label generated: Tracking #"
      - field: tracking
      - text: " via "
      - field: carrier
      - text: " "
      - field: service
      - text: " (Weight: "
      - field: weight
      - text: ", Estimated delivery: "
      - field: delivery
      - text: ")"
    output: "[SHIPPING:carrier={carrier},tracking={tracking}]"
  # Refactored: Shipping label
  - name: refactored_shipping
    pattern:
      - text: "["
      - field: timestamp
      - text: "] SHIPPING_LABEL_CREATED tracking="
      - field: tracking
      - text: " carrier="
      - field: carrier
      - text: " service="
      - field: service
      - text: " weight_lbs="
      - field: weight
      - text: " est_delivery="
      - field: delivery
    output: "[SHIPPING:carrier={carrier},tracking={tracking}]"
  # Legacy: Order completed
  - name: legacy_order_complete
    pattern:
      - text: "Order #"
      - field: order_id
      - text: " completed successfully (Total processing time: "
      - field: time
      - text: ", Server: "
      - field: server
      - text: ")"
    output: "[ORDER_COMPLETE:id={order_id}]"
  # Refactored: Order completed
  - name: refactored_order_complete
    pattern:
      - text: "["
      - field: timestamp
      - text: "] ORDER_COMPLETED order_id="
      - field: order_id
      - text: " duration_ms="
      - field: time
      - text: " server="
      - field: server
    output: "[ORDER_COMPLETE:id={order_id}]"
  # Legacy: Email sent
  - name: legacy_email
    pattern:
      - text: "Email confirmation sent to "
      - field: recipient
      - text: " (Queue: "
      - field: queue
      - text: ", Message ID: "
      - field: msg_id
      - text: ")"
    output: "[EMAIL:to={recipient}]"
  # Refactored: Email queued
  - name: refactored_email
    pattern:
      - text: "["
      - field: timestamp
      - text: "] EMAIL_QUEUED recipient="
      - field: recipient
      - text: " queue="
      - field: queue
      - text: " msg_id="
      - field: msg_id
    output: "[EMAIL:to={recipient}]"
Rules preserve: business events, customer data, order amounts, inventory changes. Rules ignore: timestamps, transaction IDs, server names, processing times, number formatting.
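To show how alternating `text` and `field` parts can map two differently formatted lines onto one normalized form, here is a minimal sketch that interprets the two tax rules with regular expressions. It is an illustration only and makes no claim about how patterndb-yaml matches lines internally; `compile_rule` and `normalize` are hypothetical helpers, and the rules are written as Python dicts rather than loaded from YAML.

```python
import re
from typing import Optional

RULES = [
    {"name": "legacy_tax",
     "pattern": [{"text": "Tax calculation (State: "}, {"field": "state"},
                 {"text": ", Rate: "}, {"field": "rate"},
                 {"text": "%): Tax=$"}, {"field": "tax"}],
     "output": "[TAX:state={state},amount={tax}]"},
    {"name": "refactored_tax",
     "pattern": [{"text": "["}, {"field": "timestamp"},
                 {"text": "] TAX_CALCULATED state="}, {"field": "state"},
                 {"text": " rate_pct="}, {"field": "rate"},
                 {"text": " tax_amount="}, {"field": "tax"}],
     "output": "[TAX:state={state},amount={tax}]"},
]


def compile_rule(rule: dict) -> "re.Pattern":
    """Turn alternating text/field parts into one anchored regex."""
    pieces = []
    for part in rule["pattern"]:
        if "text" in part:
            pieces.append(re.escape(part["text"]))       # literal text
        else:
            pieces.append(f"(?P<{part['field']}>.+?)")   # named capture
    return re.compile("^" + "".join(pieces) + "$")


def normalize(line: str, rules: list) -> Optional[str]:
    """Return the output of the first matching rule, or None if none match."""
    for rule in rules:
        match = compile_rule(rule).match(line)
        if match:
            # Fields not named in the output template are simply dropped.
            return rule["output"].format(**match.groupdict())
    return None


legacy = "Tax calculation (State: CA, Rate: 9.5%): Tax=$105.55"
refactored = ("[2024-11-15T14:30:22.523Z] TAX_CALCULATED "
              "state=CA rate_pct=9.5 tax_amount=105.55")
print(normalize(legacy, RULES))
print(normalize(refactored, RULES))
```

Both calls yield `[TAX:state=CA,amount=105.55]`. The timestamp and rate are captured but left out of the output template, which is how a rule "ignores" a value.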
Implementation
# Capture golden master from legacy system
run-legacy-system --input test-data.json > legacy-golden.log

# Save normalized golden master
patterndb-yaml --rules golden-master-rules.yaml legacy-golden.log \
    --quiet > golden-master.txt

# After refactoring, test new system
run-refactored-system --input test-data.json > refactored-output.log

# Normalize refactored output
patterndb-yaml --rules golden-master-rules.yaml refactored-output.log \
    --quiet > refactored-normalized.txt

# Compare
if diff -q golden-master.txt refactored-normalized.txt; then
    echo "✓ Refactoring preserves behavior"
else
    echo "✗ Behavioral differences detected:"
    diff golden-master.txt refactored-normalized.txt
fi
# Python API equivalent of the CLI steps above
import sys
from io import StringIO
from pathlib import Path

from patterndb_yaml import PatterndbYaml

# Redirect stdout to file for testing
_original_stdout = sys.stdout
output_file = open("output.txt", "w")
sys.stdout = output_file

try:
    processor = PatterndbYaml(rules_path=Path("golden-master-rules.yaml"))

    # Process legacy output
    with open("legacy-output.log") as f:
        legacy_input = StringIO(f.read())
    golden_output = StringIO()
    processor.process(legacy_input, golden_output)

    print("Golden master captured")
    print(f"  Events: {len(golden_output.getvalue().splitlines())}")

    # Process refactored output
    with open("refactored-output.log") as f:
        refactored_input = StringIO(f.read())
    refactored_output = StringIO()
    processor.process(refactored_input, refactored_output)

    # Compare as sorted lists, so the order of events is ignored
    golden_lines = sorted(golden_output.getvalue().strip().split('\n'))
    refactored_lines = sorted(refactored_output.getvalue().strip().split('\n'))

    if golden_lines == refactored_lines:
        print("\n✓ Refactoring preserves behavior")
    else:
        print("\n✗ Behavioral differences detected:")

        # Find differences
        golden_set = set(golden_lines)
        refactored_set = set(refactored_lines)

        missing = golden_set - refactored_set
        added = refactored_set - golden_set

        if missing:
            print("\nMissing in refactored (regressions):")
            for line in sorted(missing):
                print(f"  - {line}")

        if added:
            print("\nAdded in refactored (new behavior):")
            for line in sorted(added):
                print(f"  + {line}")
finally:
    # Restore stdout and close output file, even if processing fails
    sys.stdout = _original_stdout
    output_file.close()
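The set-based report above drops line order and surrounding context. Where an ordered, diff-style report is preferred, the standard library's `difflib` can produce one from the same normalized lines. This is a sketch, not part of patterndb-yaml; `render_diff` is a hypothetical helper, and the changed total in the candidate list is an invented regression used only for illustration.

```python
import difflib


def render_diff(golden: list, candidate: list) -> str:
    """Return a unified diff of two lists of normalized lines ('' if equal)."""
    diff = difflib.unified_diff(
        golden,
        candidate,
        fromfile="golden-master.txt",
        tofile="refactored-normalized.txt",
        lineterm="",
    )
    return "\n".join(diff)


golden = [
    "[SUBTOTAL:1111.10]",
    "[TAX:state=CA,amount=105.55]",
    "[TOTAL:1216.65]",
]
# Hypothetical regression: the refactored total is off by one cent.
candidate = [
    "[SUBTOTAL:1111.10]",
    "[TAX:state=CA,amount=105.55]",
    "[TOTAL:1216.66]",
]

report = render_diff(golden, candidate)
print(report or "No differences")
```

Unlike the sorted comparison, this keeps events in their original order, so a reordering of events also shows up as a difference.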
Expected Output
Normalized Output (Both Systems)
[CUSTOMER:id=1001,email=alice@example.com]
[DISCOUNT:pct=10,tier=GOLD]
[SUBTOTAL:1111.10]
[TAX:state=CA,amount=105.55]
[TOTAL:1216.65]
[PAYMENT:card_last4=4532]
[INVENTORY:sku=SKU-ABC-123,qty_change=50->47]
[SHIPPING:carrier=UPS,tracking=1Z999AA10123456784]
[ORDER_COMPLETE:id=ORD-5001]
[EMAIL:to=alice@example.com]
Both legacy and refactored systems produce identical normalized behavior.
Note: Minor formatting differences (e.g., "1,111.10" vs "1111.10") are normalized away, focusing on business logic equivalence.
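The rules shown earlier emit the captured amount field as-is, so whether a thousands separator disappears depends on tool behavior not shown in this section. If the normalized legacy lines do keep the comma, a small post-processing step along the following lines could remove it before comparison. This is an assumption-laden sketch; `strip_thousands_separators` is a hypothetical helper.

```python
import re

# A comma between digits that is followed by exactly three digits.
_THOUSANDS_SEP = re.compile(r"(?<=\d),(?=\d{3}(?!\d))")


def strip_thousands_separators(text: str) -> str:
    """Remove thousands separators such as the comma in '1,111.10'."""
    return _THOUSANDS_SEP.sub("", text)


print(strip_thousands_separators("[SUBTOTAL:1,111.10]"))
print(strip_thousands_separators("[CUSTOMER:id=1001,email=alice@example.com]"))
```

The first line loses its comma and the second is left unchanged, because its comma is not followed by three digits. A comma-separated list of three-digit numbers would be collapsed too, so this step suits amount fields only.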
Practical Workflows
1. Initial Golden Master Creation
Capture a comprehensive golden master from the legacy system:
#!/bin/bash
# Run comprehensive test suite against legacy system
echo "Capturing golden master from legacy system..."

count=0
for test_case in tests/data/*.json; do
    name=$(basename "$test_case" .json)
    echo "  Processing $name.json..."

    # Run legacy system
    run-legacy-system --input "$test_case" > "output/legacy-$name.log"

    # Normalize
    patterndb-yaml --rules golden-master-rules.yaml \
        "output/legacy-$name.log" \
        --quiet > "golden/$name.txt"

    count=$((count + 1))
done

echo "Golden master created for $count test cases"
5. Regression Detection
Detect unintended behavior changes:
#!/bin/bash
# Continuous testing against golden master
echo "Running regression tests..."

# Track results
passed=0
failed=0
failed_names=()

for test_case in tests/data/*.json; do
    name=$(basename "$test_case" .json)

    # Run refactored system (stdout only, matching how the golden master was captured)
    run-refactored-system --input "$test_case" > "output/current-$name.log"

    # Normalize
    patterndb-yaml --rules golden-master-rules.yaml \
        "output/current-$name.log" --quiet > "output/current-$name.txt"

    # Compare with golden master
    if diff -q "golden/$name.txt" "output/current-$name.txt" > /dev/null; then
        echo "  ✓ $name"
        passed=$((passed + 1))
    else
        echo "  ✗ $name - REGRESSION DETECTED"
        failed=$((failed + 1))
        failed_names+=("$name")
        # Save diff for review
        diff "golden/$name.txt" "output/current-$name.txt" > "output/diff-$name.txt"
    fi
done

# Report
echo ""
echo "Results: $passed passed, $failed failed"

if [ "$failed" -gt 0 ]; then
    echo ""
    echo "Regressions detected in:"
    # List only this run's failures; stale diff files from earlier runs are ignored
    for name in "${failed_names[@]}"; do
        echo "  - $name (see output/diff-$name.txt)"
    done
    exit 1
fi

echo "✓ All regression tests passed"
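The comparison step of the script above can also be expressed in Python, for example to call it from an existing test runner. The sketch below assumes the normalized files already exist in the `golden/` and `output/` layout used by the script; `find_regressions` is a hypothetical helper, and the test names in the demo are invented.

```python
import tempfile
from pathlib import Path


def find_regressions(golden_dir: Path, current_dir: Path) -> dict:
    """Map each golden test name to True (matches) or False (differs or missing)."""
    results = {}
    for golden_file in sorted(golden_dir.glob("*.txt")):
        current_file = current_dir / f"current-{golden_file.stem}.txt"
        results[golden_file.stem] = (
            current_file.exists()
            and current_file.read_text() == golden_file.read_text()
        )
    return results


# Demo with throwaway files; the test names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    golden, output = Path(tmp, "golden"), Path(tmp, "output")
    golden.mkdir()
    output.mkdir()
    (golden / "order-basic.txt").write_text("[TOTAL:1216.65]\n")
    (output / "current-order-basic.txt").write_text("[TOTAL:1216.65]\n")
    (golden / "order-refund.txt").write_text("[TOTAL:50.00]\n")
    (output / "current-order-refund.txt").write_text("[TOTAL:49.99]\n")
    results = find_regressions(golden, output)

for name, ok in results.items():
    print(f"  {'✓' if ok else '✗'} {name}")
```

Iterating over the golden files rather than the test inputs means a golden master with no current output is reported as a failure instead of being skipped silently.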
Key Benefits
- Safe refactoring: Verify behavior preservation without existing tests
- Characterize legacy code: Document current behavior as executable specification
- Catch regressions: Detect unintended changes immediately
- Approval testing: Human-in-the-loop for intentional behavior changes
- Incremental improvement: Refactor with confidence, one step at a time
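The approval step mentioned in the list above can be as small as copying a reviewed output over the stored golden master. The following is a minimal sketch under the same `golden/` and `output/current-*.txt` layout as the regression script; `approve` is a hypothetical helper, not a feature of any tool named here.

```python
import shutil
import tempfile
from pathlib import Path


def approve(name: str, golden_dir: Path, current_dir: Path) -> Path:
    """Promote a reviewed output to be the new golden master for one test."""
    current_file = current_dir / f"current-{name}.txt"
    golden_file = golden_dir / f"{name}.txt"
    if not current_file.exists():
        raise FileNotFoundError(f"No current output for test '{name}'")
    golden_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(current_file, golden_file)
    return golden_file


# Demo with throwaway files; the test name and values are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    golden, output = Path(tmp, "golden"), Path(tmp, "output")
    output.mkdir()
    (output / "current-order-basic.txt").write_text("[TOTAL:1300.00]\n")
    approved = approve("order-basic", golden, output)
    approved_text = approved.read_text()

print(approved_text.strip())
```

Keeping approval as an explicit, per-test action means an intentional behavior change is recorded deliberately, while everything else continues to be checked against the old baseline.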
Related Topics
- Rules - Pattern matching and normalization
- Statistics - Measure match coverage
- Explain Mode - Debug pattern matching