Troubleshooting

This guide covers common errors, their causes, and how to resolve them.

Error Code Reference

All errors include a machine-readable code (E001-E008) at the start of the error message. Use these codes for programmatic error handling.
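
Because the code sits at the start of the message, it can be extracted with a small regular expression. The following is a minimal sketch, not part of the library: the helper name error_code is hypothetical, and it works on plain message strings (for a caught exception, pass str(e)).

```python
import re
from typing import Optional

# Matches a code such as "[E001]" at the very start of a message.
_CODE_RE = re.compile(r"^\[(E\d{3})\]")

def error_code(message: str) -> Optional[str]:
    """Return the error code (e.g. "E001"), or None if the message has none."""
    match = _CODE_RE.match(message)
    return match.group(1) if match else None

print(error_code("[E001] JSON parsing failed: ..."))  # E001
print(error_code("some unrelated error"))             # None
```

Branching on the returned code is more robust than searching for substrings anywhere in the message.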

E001: JsonParseError

Message: [E001] JSON parsing failed: ...

Cause: The input string is not valid JSON.

Common triggers:

Missing quotes around keys or values
Trailing commas after the last element
Single quotes instead of double quotes
Unescaped special characters in strings
Incomplete JSON (missing closing braces or brackets)
Passing a file path instead of the file contents

Solution:

# Wrong
result = tools.execute("hello world")          # Not JSON
result = tools.execute("{'key': 'value'}")     # Single quotes
result = tools.execute('{"a": 1,}')            # Trailing comma

# Correct
result = tools.execute('{"key": "value"}')
result = tools.execute({"key": "value"})       # Pass a dict directly
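
To find where a string stops being valid JSON, one option is to run it through Python's standard json module first. This is only a diagnostic sketch: describe_json_error is a hypothetical helper, and the standard parser may differ from the library's parser in edge cases (for example, it accepts NaN and Infinity by default).

```python
import json

def describe_json_error(text: str) -> str:
    """Return "ok" for valid JSON, otherwise the position and reason of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return "ok"

for candidate in ['{"key": "value"}', "{'key': 'value'}", '{"a": 1,}']:
    print(repr(candidate), "->", describe_json_error(candidate))
```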

E002: RegexError

Message: [E002] Regex pattern error: ...

Cause: A key or value replacement pattern failed to compile as a regex.

Common triggers:

Unescaped special regex characters (., *, +, ?, (, ), [, ])
Unclosed groups or character classes
Invalid backreferences

Solution:

# Wrong -- compiles, but the unescaped dot matches any single character
tools.key_replacement("user.name", "username")

# Correct -- escape the dot for literal matching
tools.key_replacement(r"user\.name", "username")

# Or use a simpler pattern that won't be misinterpreted
tools.key_replacement("user_name", "username")

Note: If regex compilation fails, the library automatically falls back to literal string matching. This error only surfaces when the pattern is syntactically broken (e.g., unclosed groups).
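
When a key should be matched literally, escaping it programmatically avoids missing a special character. The sketch below uses Python's re.escape and the hypothetical helper name literal_pattern. It assumes the escaped output is also accepted by the library's regex engine, which holds for common punctuation such as dots and parentheses but should be checked against your own patterns.

```python
import re

def literal_pattern(text: str) -> str:
    """Escape every regex metacharacter so the text only matches itself."""
    return re.escape(text)

print(literal_pattern("user.name"))   # user\.name
print(literal_pattern("price(usd)"))  # price\(usd\)
```

The result can then be passed as the first argument to key_replacement or value_replacement.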

E003: InvalidReplacementPattern

Message: [E003] Invalid replacement pattern: ...

Cause: The replacement pattern configuration is malformed.

Solution: Ensure replacement patterns are provided as (find, replace) pairs:

# Correct usage
tools.key_replacement("find_pattern", "replacement")
tools.value_replacement("old_value", "new_value")

E004: InvalidJsonStructure

Message: [E004] Invalid JSON structure: ...

Cause: The JSON is valid but not compatible with the requested operation.

Common triggers:

Unflattening a JSON array (unflatten requires a flat object)
Unflattening a non-flat object (nested values where flat keys are expected)

Solution:

# Wrong -- unflatten expects a flat object, not an array
result = jt.JSONTools().unflatten().execute('[1, 2, 3]')

# Wrong -- unflatten expects flat keys
result = jt.JSONTools().unflatten().execute('{"a": {"b": 1}}')

# Correct -- flat object with dot-separated keys
result = jt.JSONTools().unflatten().execute('{"a.b": 1, "a.c": 2}')
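
A quick pre-check can catch both triggers before calling unflatten. This sketch uses only the standard json module. looks_flat is a hypothetical helper, and it treats only nested objects as non-flat; how the library handles array values is not covered here, so they are left unchecked.

```python
import json

def looks_flat(text: str) -> bool:
    """True if text is a JSON object and none of its values is itself an object."""
    data = json.loads(text)
    if not isinstance(data, dict):
        return False  # arrays and scalars are not flat objects
    return not any(isinstance(value, dict) for value in data.values())

print(looks_flat('[1, 2, 3]'))             # False
print(looks_flat('{"a": {"b": 1}}'))       # False
print(looks_flat('{"a.b": 1, "a.c": 2}'))  # True
```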

E005: ConfigurationError

Message: [E005] Operation mode not configured: ...

Cause: .execute() was called without first setting an operation mode.

Solution: Always call .flatten(), .unflatten(), or .normal() before .execute():

# Wrong
result = jt.JSONTools().execute(data)

# Correct
result = jt.JSONTools().flatten().execute(data)
result = jt.JSONTools().unflatten().execute(data)
result = jt.JSONTools().normal().execute(data)

This error also occurs if num_threads is set to 0:

# Wrong
tools = jt.JSONTools().flatten().num_threads(0)

# Correct
tools = jt.JSONTools().flatten().num_threads(1)    # At least 1
tools = jt.JSONTools().flatten()                    # Use default (CPU count)

E006: BatchProcessingError

Message: [E006] Batch processing failed at index {N}: ...

Cause: One or more items in a batch failed to process. The error includes the index of the failing item and the underlying error.

Solution: Check the item at the reported index. The inner error (usually E001 or E004) describes what went wrong:

try:
    results = tools.execute(batch_of_json)
except jt.JsonToolsError as e:
    msg = str(e)
    if "[E006]" in msg:
        # Extract the index from the message to find the bad item
        print(f"Batch error: {e}")
        # Fix or filter the problematic items and retry
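
The index extraction mentioned in the comment above could look like the following. It assumes the message follows the template shown under Message literally; the exact wording may vary between versions, and failing_index is a hypothetical helper name. The sample message is made up for illustration.

```python
import re
from typing import Optional

# Assumed format: "[E006] Batch processing failed at index {N}: ..."
_INDEX_RE = re.compile(r"^\[E006\] Batch processing failed at index (\d+):")

def failing_index(message: str) -> Optional[int]:
    """Return the index of the failing batch item, or None if the message does not match."""
    match = _INDEX_RE.match(message)
    return int(match.group(1)) if match else None

sample = "[E006] Batch processing failed at index 3: [E001] JSON parsing failed: ..."
print(failing_index(sample))  # 3
```

With the index in hand, the offending item can be inspected, fixed, or dropped before retrying the batch.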

E007: InputValidationError

Message: [E007] Input validation failed: ...

Cause: The input type is not supported.

Common triggers:

Passing an integer, float, or boolean directly
Passing a list that contains items other than JSON strings or dicts
Using execute_to_output() with a DataFrame or Series (use execute() instead)

Solution:

# Wrong
result = tools.execute(42)
result = tools.execute([1, 2, 3])

# Correct
result = tools.execute('{"value": 42}')
result = tools.execute({"value": 42})
result = tools.execute(['{"a": 1}', '{"b": 2}'])
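
Checking the input type up front gives a clearer failure than waiting for E007. The sketch below mirrors only the rules listed in this section (a str, a dict, or a list of those). is_supported_input is a hypothetical helper, and it deliberately does not cover DataFrame or Series inputs.

```python
from typing import Any

def is_supported_input(value: Any) -> bool:
    """True for a str, a dict, or a list whose items are all str or dict."""
    if isinstance(value, (str, dict)):
        return True
    if isinstance(value, list):
        return all(isinstance(item, (str, dict)) for item in value)
    return False

print(is_supported_input(42))                      # False
print(is_supported_input([1, 2, 3]))               # False
print(is_supported_input(['{"a": 1}', {"b": 2}]))  # True
```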

E008: SerializationError

Message: [E008] JSON serialization failed: ...

Cause: The processed result could not be serialized back to JSON. This is typically an internal error.

Solution: If you encounter this error, please report it as a bug. As a workaround, check that your input does not contain unusual Unicode sequences or extremely large numbers that may not round-trip through JSON.

Common Issues

Empty Separator

The separator must be a non-empty string. Using an empty separator is always a logic error -- it would make keys ambiguous.

# This raises an error
tools = jt.JSONTools().flatten().separator("")

# Use any non-empty string
tools = jt.JSONTools().flatten().separator(".")
tools = jt.JSONTools().flatten().separator("::")
tools = jt.JSONTools().flatten().separator("/")

In Rust, an empty separator causes a panic (via assert!). In Python, it raises a ValueError.

Missing Operation Mode

The most common mistake is forgetting to set a mode:

# This always raises E005
tools = jt.JSONTools()
tools.execute(data)  # Error!

# Set a mode first
tools = jt.JSONTools().flatten()
tools.execute(data)  # OK

Dict vs String Input

Both str and dict inputs are accepted, but the output type mirrors the input type:

# String in -> string out
result = tools.execute('{"a": {"b": 1}}')
assert isinstance(result, str)
# result == '{"a.b":1}'

# Dict in -> dict out
result = tools.execute({"a": {"b": 1}})
assert isinstance(result, dict)
# result == {"a.b": 1}

If you need the raw JSON string output from a dict input, use .execute_to_output():

output = tools.execute_to_output({"a": {"b": 1}})
json_str = output.get_single()  # Returns a JSON string

Regex Patterns in Replacements

Replacement patterns use standard regex syntax. Common pitfalls:

# The dot matches ANY single character -- "user.name" matches "user_name" and "userXname" too
tools.key_replacement("user.name", "id")

# Escape dots for literal matching
tools.key_replacement(r"user\.name", "id")

# Use anchors for precise matching
tools.key_replacement("^user_", "")       # Only at start of key
tools.key_replacement("_suffix$", "")      # Only at end of key
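
The effect of an unescaped dot and of anchors can be checked without the library. The sketch below uses Python's re module as a stand-in; the library's own regex engine is assumed to behave the same way for these basic constructs.

```python
import re

keys = ["user.name", "user_name", "username"]

# Unescaped dot: matches any single character between "user" and "name".
unescaped = [k for k in keys if re.search("user.name", k)]
# Escaped dot: matches only a literal ".".
escaped = [k for k in keys if re.search(r"user\.name", k)]
# Anchor: only the prefix at the start of the key is removed.
anchored = re.sub("^user_", "", "user_id_user_")

print(unescaped)  # ['user.name', 'user_name']
print(escaped)    # ['user.name']
print(anchored)   # id_user_
```

Note that "username" is not matched by either pattern, because the dot must consume exactly one character.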

Performance Tuning

When Parallelism Helps

Parallel processing adds overhead for thread spawning and synchronization. It helps when:

Batch size is large (at least parallel_threshold items, 100 by default) -- the spawning cost is amortized over more work
Individual documents are complex -- deep nesting, many keys, expensive transformations
CPU cores are available -- parallelism on a single-core machine adds only overhead

When Parallelism Hurts

Reduce or disable parallelism when:

Documents are tiny (a few flat keys) -- thread overhead dominates
Batch sizes are small (<50 items) -- raise parallel_threshold so these batches run sequentially
Memory is constrained -- each thread needs its own stack and working set
Running inside a GIL-heavy Python workload -- the GIL is released during Rust processing, but other Python threads may contend

# Disable parallelism for small workloads
tools = jt.JSONTools().flatten().parallel_threshold(999_999)

# Or limit threads
tools = jt.JSONTools().flatten().num_threads(1)

Profiling Tips

Use the built-in benchmark suites to profile your specific workload pattern:

# Profile stress scenarios
cargo bench --profile profiling --bench stress_benchmarks --no-run
samply record --save-only -o /tmp/profile.json -- \
    ./target/profiling/deps/stress_benchmarks-* --bench

For Python profiling, measure wall-clock time since CPU profilers may not capture time spent in Rust:

import time
start = time.perf_counter()
result = tools.execute(data)
elapsed = time.perf_counter() - start
print(f"Processing took {elapsed:.3f}s")
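
A single measurement is noisy, so repeating the call and looking at the median tends to be more informative. The sketch below is a generic timing harness: time_call is a hypothetical helper, and json.loads serves only as a stand-in workload so the example runs without the library.

```python
import json
import statistics
import time
from typing import Callable, List

def time_call(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Run fn `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations

# Stand-in workload; replace the lambda with e.g. `lambda: tools.execute(data)`.
payload = json.dumps({"a": {"b": list(range(1000))}})
samples = time_call(lambda: json.loads(payload))
print(f"median {statistics.median(samples):.6f}s, min {min(samples):.6f}s")
```

Running the same harness with different parallel_threshold or num_threads settings gives a direct comparison for your own data.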

Platform Notes

mimalloc (Rust-only)

The mimalloc global allocator is an optional feature that provides a 5-10% performance improvement. Enable it with features = ["mimalloc"] in your Cargo.toml. It is not included in Python builds because PyO3 manages memory through Python's allocator.

sonic-rs (64-bit only)

The default JSON parser is sonic-rs, which uses SIMD instructions available on 64-bit platforms (x86_64, aarch64). On 32-bit platforms, the library automatically falls back to simd-json. This is transparent -- the API is identical regardless of which parser is active.

macOS Profiling

On macOS, flamegraph requires full Xcode (not just Command Line Tools). Use samply instead:

cargo install samply
samply record --save-only -o profile.json -- ./target/profiling/deps/BENCH_BINARY --bench
samply load profile.json  # Opens Firefox Profiler

Valgrind does not work on modern macOS. Use Instruments (if Xcode is installed) or samply for profiling.
