Python API Reference

import json_tools_rs

JSONTools

The main builder class for all JSON operations. All configuration methods return self for chaining; only .execute() and .execute_to_output() trigger processing.

Construction

tools = json_tools_rs.JSONTools()

Creates a new JSONTools instance with all default settings. The instance is reusable -- you can call .execute() multiple times with different inputs.

Operation Modes

Exactly one mode must be set before calling .execute(). Calling a mode method replaces any previously set mode.

`.flatten()`

tools.flatten() -> JSONTools

Set the operation to flatten nested JSON into single-level keys joined by the separator ("." by default).

import json_tools_rs as jt

result = jt.JSONTools().flatten().execute({"a": {"b": {"c": 1}}})
# {"a.b.c": 1}
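
To make the transformation concrete, here is a minimal pure-Python sketch of dict flattening. It is an illustration only, not the library's implementation: `flatten_sketch` is a hypothetical helper, and it handles nested dicts only (how the library encodes arrays is not covered here).

```python
def flatten_sketch(obj: dict, sep: str = ".", prefix: str = "") -> dict:
    """Hypothetical helper: flatten nested dicts into separator-joined keys."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            # Recurse into non-empty dicts, extending the key path.
            flat.update(flatten_sketch(value, sep, path))
        else:
            # Leaves (and empty dicts) are stored under the full path.
            flat[path] = value
    return flat


print(flatten_sketch({"a": {"b": {"c": 1}}}))     # {'a.b.c': 1}
print(flatten_sketch({"a": {"b": 1}}, sep="::"))  # {'a::b': 1}
```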

`.unflatten()`

tools.unflatten() -> JSONTools

Set the operation to reconstruct nested JSON from flat, separator-delimited keys.

result = jt.JSONTools().unflatten().execute({"a.b.c": 1})
# {"a": {"b": {"c": 1}}}
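
The inverse direction can be sketched the same way. This is a rough pure-Python illustration under stated assumptions, not the library's code: `unflatten_sketch` is a hypothetical helper, every key segment is treated as an object key (no array reconstruction), and conflicting paths such as "a" and "a.b" are not handled.

```python
def unflatten_sketch(flat: dict, sep: str = ".") -> dict:
    """Hypothetical helper: rebuild nested dicts from separator-delimited keys."""
    nested = {}
    for path, value in flat.items():
        *parents, leaf = path.split(sep)
        node = nested
        for part in parents:
            # Create intermediate dicts on demand while walking the path.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


print(unflatten_sketch({"a.b.c": 1}))  # {'a': {'b': {'c': 1}}}
```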

`.normal()`

tools.normal() -> JSONTools

Set the operation to apply transformations (filtering, replacements, type conversion) without changing the nesting structure.

result = jt.JSONTools().normal().lowercase_keys(True).execute({"Name": "Alice"})
# {"name": "Alice"}

Configuration Methods

All configuration methods return self for chaining.

`.separator(sep)`

tools.separator(sep: str) -> JSONTools

Set the key separator for flatten/unflatten operations.

Parameter	Type	Default	Description
`sep`	`str`	`"."`	Non-empty string used to join/split nested keys

Raises: ValueError if sep is an empty string.

result = jt.JSONTools().flatten().separator("::").execute({"a": {"b": 1}})
# {"a::b": 1}

`.lowercase_keys(flag)`

tools.lowercase_keys(flag: bool) -> JSONTools

Convert all keys to lowercase after processing.

Parameter	Type	Default	Description
`flag`	`bool`	`False`	Enable or disable lowercase key conversion

result = jt.JSONTools().flatten().lowercase_keys(True).execute({"User": {"Name": "Alice"}})
# {"user.name": "Alice"}

`.remove_empty_strings(flag)`

tools.remove_empty_strings(flag: bool) -> JSONTools

Remove key-value pairs where the value is an empty string "".

Parameter	Type	Default	Description
`flag`	`bool`	`False`	Enable or disable empty string removal

result = jt.JSONTools().flatten().remove_empty_strings(True).execute({"a": "", "b": "hello"})
# {"b": "hello"}

`.remove_nulls(flag)`

tools.remove_nulls(flag: bool) -> JSONTools

Remove key-value pairs where the value is None / null.

Parameter	Type	Default	Description
`flag`	`bool`	`False`	Enable or disable null removal

result = jt.JSONTools().flatten().remove_nulls(True).execute({"a": None, "b": 1})
# {"b": 1}

`.remove_empty_objects(flag)`

tools.remove_empty_objects(flag: bool) -> JSONTools

Remove key-value pairs where the value is an empty object {}.

Parameter	Type	Default	Description
`flag`	`bool`	`False`	Enable or disable empty object removal

`.remove_empty_arrays(flag)`

tools.remove_empty_arrays(flag: bool) -> JSONTools

Remove key-value pairs where the value is an empty array [].

Parameter	Type	Default	Description
`flag`	`bool`	`False`	Enable or disable empty array removal
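
The four removal flags share one idea: a key-value pair is dropped when its value matches an enabled "empty" category. The sketch below illustrates that idea in plain Python for a single-level dict. `drop_empty_sketch` and its keyword names are hypothetical, and how the library treats nested values is not assumed here.

```python
def drop_empty_sketch(data: dict, *, strings=False, nulls=False,
                      objects=False, arrays=False) -> dict:
    """Hypothetical helper: drop pairs whose value is "", None, {} or []."""

    def is_dropped(value) -> bool:
        return (
            (strings and isinstance(value, str) and value == "")
            or (nulls and value is None)
            or (objects and isinstance(value, dict) and not value)
            or (arrays and isinstance(value, list) and not value)
        )

    return {key: value for key, value in data.items() if not is_dropped(value)}


data = {"a": "", "b": None, "c": {}, "d": [], "e": 0}
print(drop_empty_sketch(data, objects=True, arrays=True))
# {'a': '', 'b': None, 'e': 0}
```

Note that falsy values such as 0 or False are kept; only the four explicit categories are removed.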

`.key_replacement(find, replace)`

tools.key_replacement(find: str, replace: str) -> JSONTools

Add a key replacement pattern. Patterns use standard regex syntax. If the regex fails to compile, it falls back to literal string replacement. Multiple replacements can be chained.

Parameter	Type	Description
`find`	`str`	Regex pattern (or literal string) to match in keys
`replace`	`str`	Replacement string (supports regex capture groups like `$1`)

result = (jt.JSONTools()
    .flatten()
    .key_replacement("^user_", "")
    .key_replacement("name$", "id")
    .execute({"user_name": "Alice"}))
# {"id": "Alice"}
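
The "regex first, literal fallback" behavior described above can be sketched with Python's standard `re` module. This is an illustration of the idea only: `replace_sketch` is a hypothetical helper, and Python's `re` writes capture-group references as `\1` where the documented syntax uses `$1`.

```python
import re


def replace_sketch(text: str, find: str, replace: str) -> str:
    """Hypothetical helper: regex replacement with a literal fallback."""
    try:
        pattern = re.compile(find)
    except re.error:
        # Not a valid regex: treat the pattern as a literal substring.
        return text.replace(find, replace)
    return pattern.sub(replace, text)


print(replace_sketch("user_name", "^user_", ""))                # name
print(replace_sketch("items[0", "[", "("))                      # items(0
print(replace_sketch("first_last", r"(\w+)_(\w+)", r"\2_\1"))   # last_first
```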

`.value_replacement(find, replace)`

tools.value_replacement(find: str, replace: str) -> JSONTools

Add a value replacement pattern. Works the same as key replacements but applies to string values.

Parameter	Type	Description
`find`	`str`	Regex pattern (or literal string) to match in values
`replace`	`str`	Replacement string

result = (jt.JSONTools()
    .flatten()
    .value_replacement("@example\\.com", "@company.org")
    .execute({"email": "user@example.com"}))
# {"email": "user@company.org"}

`.handle_key_collision(flag)`

tools.handle_key_collision(flag: bool) -> JSONTools

When enabled, keys that would collide after transformations (e.g., after lowercasing) are collected into arrays instead of overwriting each other.

Parameter	Type	Default	Description
`flag`	`bool`	`False`	Enable collision handling

result = (jt.JSONTools()
    .flatten()
    .lowercase_keys(True)
    .handle_key_collision(True)
    .execute({"Name": "Alice", "name": "Bob"}))
# {"name": ["Alice", "Bob"]}
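
As a rough mental model, collision handling amounts to merging values under the transformed key instead of overwriting them. The pure-Python sketch below shows one way to do that. `merge_collisions_sketch` is a hypothetical helper and the ordering of collected values is an assumption.

```python
def merge_collisions_sketch(pairs) -> dict:
    """Hypothetical helper: collect values of repeated keys into lists."""
    merged, collided = {}, set()
    for key, value in pairs:
        if key not in merged:
            merged[key] = value
        elif key in collided:
            # Third or later occurrence: extend the existing list.
            merged[key].append(value)
        else:
            # Second occurrence: start a list with both values.
            merged[key] = [merged[key], value]
            collided.add(key)
    return merged


source = {"Name": "Alice", "name": "Bob", "Age": 30}
print(merge_collisions_sketch((k.lower(), v) for k, v in source.items()))
# {'name': ['Alice', 'Bob'], 'age': 30}
```

Tracking collided keys separately keeps a value that was already a list from being mistaken for a collision list.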

`.auto_convert_types(flag)`

tools.auto_convert_types(flag: bool) -> JSONTools

Automatically convert string values to their native types:

Numbers: "123" -> 123, "1,234.56" -> 1234.56, "$99.99" -> 99.99, "1e5" -> 100000
Booleans: "true" / "TRUE" / "True" -> true, "false" / "FALSE" / "False" -> false
Nulls: "null" / "None" -> null

If conversion fails, the original string is kept. No errors are raised on conversion failure.

Parameter	Type	Default	Description
`flag`	`bool`	`False`	Enable automatic type conversion

result = (jt.JSONTools()
    .flatten()
    .auto_convert_types(True)
    .execute({"id": "123", "price": "1,234.56", "active": "true"}))
# {"id": 123, "price": 1234.56, "active": True}
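
The conversion rules listed above can be approximated in a few lines of plain Python. This sketch only mirrors the documented examples: `convert_sketch` is a hypothetical helper, and the library may accept more (or fewer) formats than this regular expression does.

```python
import re

# Plain decimal numbers with an optional fraction and exponent.
NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def convert_sketch(text: str):
    """Hypothetical helper: best-effort string -> bool/None/number conversion."""
    stripped = text.strip()
    if stripped in ("true", "TRUE", "True"):
        return True
    if stripped in ("false", "FALSE", "False"):
        return False
    if stripped in ("null", "None"):
        return None
    # Drop a leading currency sign and thousands separators before parsing.
    cleaned = stripped.lstrip("$").replace(",", "")
    if NUMBER.fullmatch(cleaned):
        is_integer = cleaned.lstrip("+-").isdecimal()
        return int(cleaned) if is_integer else float(cleaned)
    return text  # Conversion failed: keep the original string.


for raw in ["123", "1,234.56", "$99.99", "TRUE", "None", "hello"]:
    print(repr(raw), "->", repr(convert_sketch(raw)))
```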

`.parallel_threshold(n)`

tools.parallel_threshold(n: int) -> JSONTools

Set the minimum batch size to trigger parallel processing. Batches smaller than this are processed sequentially to avoid thread-spawning overhead.

Parameter	Type	Default	Description
`n`	`int`	`100`	Minimum batch size for parallelism

The default can be overridden with the JSON_TOOLS_PARALLEL_THRESHOLD environment variable.

tools = jt.JSONTools().flatten().parallel_threshold(50)
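
The threshold decision itself is simple, and the sketch below shows it in plain Python. `process_batch_sketch` is a hypothetical helper, not the library's scheduler. Python threads are used here only to illustrate the control flow; the library presumably does its parallel work natively.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def process_batch_sketch(items, func, threshold=None, num_threads=None):
    """Hypothetical helper: run func over items, in parallel only for large batches."""
    if threshold is None:
        # Mirror the documented environment override, defaulting to 100.
        threshold = int(os.environ.get("JSON_TOOLS_PARALLEL_THRESHOLD", "100"))
    if len(items) < threshold:
        # Small batch: a plain loop avoids thread start-up overhead.
        return [func(item) for item in items]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Large batch: spread the work across workers; map keeps input order.
        return list(pool.map(func, items))


small = process_batch_sketch(list(range(10)), lambda x: x * 2, threshold=50)
large = process_batch_sketch(list(range(200)), lambda x: x * 2, threshold=50, num_threads=4)
print(len(small), len(large))  # 10 200
```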

`.num_threads(n)`

tools.num_threads(n: int) -> JSONTools

Set the number of threads used for parallel processing.

Parameter	Type	Default	Description
`n`	`int`	CPU count	Number of worker threads

The default can be overridden with the JSON_TOOLS_NUM_THREADS environment variable.

tools = jt.JSONTools().flatten().num_threads(4)

`.nested_parallel_threshold(n)`

tools.nested_parallel_threshold(n: int) -> JSONTools

Set the minimum number of keys/items within a single JSON document to trigger nested (intra-document) parallelism. Only objects or arrays exceeding this count are parallelized internally.

Parameter	Type	Default	Description
`n`	`int`	`100`	Minimum keys/items for nested parallelism

The default can be overridden with the JSON_TOOLS_NESTED_PARALLEL_THRESHOLD environment variable.

tools = jt.JSONTools().flatten().nested_parallel_threshold(200)

`.max_array_index(n)`

tools.max_array_index(n: int) -> JSONTools

Set the maximum array index allowed during unflattening. This is a DoS protection: a malicious key like "items.999999999" would otherwise allocate a massive array.

Parameter	Type	Default	Description
`n`	`int`	`100000`	Maximum array index

The default can be overridden with the JSON_TOOLS_MAX_ARRAY_INDEX environment variable.
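
The kind of guard this setting implies can be sketched as follows. This is an assumption-laden illustration: `check_array_index_sketch` is a hypothetical helper, it treats every all-digit key segment as an array index, and it raises ValueError, whereas the library's actual reaction to an oversized index is not specified here.

```python
def check_array_index_sketch(key: str, sep: str = ".", max_index: int = 100_000) -> None:
    """Hypothetical helper: reject flat keys whose numeric segments exceed max_index."""
    for part in key.split(sep):
        # Assumption: an all-digit segment denotes an array index.
        if part.isdecimal() and int(part) > max_index:
            raise ValueError(
                f"array index {part} in {key!r} exceeds limit {max_index}"
            )


check_array_index_sketch("items.3.name")  # within the limit: no error
try:
    check_array_index_sketch("items.999999999")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Checking the index before allocating is what prevents a single short key from forcing a very large array into memory.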

Execution Methods

`.execute(input)`

tools.execute(input) -> str | dict | list[str] | list[dict] | DataFrame | Series

Execute the configured operation. The return type mirrors the input type:

Input Type	Output Type
`str`	`str` (JSON string)
`dict`	`dict` (Python dictionary)
`list[str]`	`list[str]`
`list[dict]`	`list[dict]`
`pandas.DataFrame`	`pandas.DataFrame`
`pandas.Series`	`pandas.Series`
`polars.DataFrame`	`polars.DataFrame`
`polars.Series`	`polars.Series`
`pyarrow.Table`	`pyarrow.Table`
`pyarrow.ChunkedArray`	`pyarrow.ChunkedArray`
`pyspark.sql.DataFrame`	`pyspark.sql.DataFrame`

Raises: JsonToolsError if no mode is set, input is invalid, or processing fails.

# String input -> string output
result = jt.JSONTools().flatten().execute('{"a": {"b": 1}}')
assert isinstance(result, str)

# Dict input -> dict output
result = jt.JSONTools().flatten().execute({"a": {"b": 1}})
assert isinstance(result, dict)

# Batch string input -> batch string output
results = jt.JSONTools().flatten().execute(['{"a": 1}', '{"b": 2}'])
assert isinstance(results, list) and isinstance(results[0], str)

# Batch dict input -> batch dict output
results = jt.JSONTools().flatten().execute([{"a": {"b": 1}}, {"c": {"d": 2}}])
assert isinstance(results, list) and isinstance(results[0], dict)
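
The "output mirrors input" rule for the built-in types can be pictured as a small dispatcher. The sketch below is a plain-Python illustration, not the library's code: `execute_sketch` and `lower_keys` are hypothetical, and DataFrame-like inputs are left out.

```python
import json


def execute_sketch(data, transform):
    """Hypothetical dispatcher: apply transform(dict) -> dict, mirroring the input type."""
    if isinstance(data, str):
        # JSON string in, JSON string out.
        return json.dumps(transform(json.loads(data)))
    if isinstance(data, dict):
        return transform(data)
    if isinstance(data, list):
        # Batches are handled item by item, so list[str] stays list[str].
        return [execute_sketch(item, transform) for item in data]
    raise TypeError(f"unsupported input type: {type(data).__name__}")


def lower_keys(obj: dict) -> dict:
    return {key.lower(): value for key, value in obj.items()}


print(execute_sketch('{"A": 1}', lower_keys))          # {"a": 1}
print(execute_sketch({"A": 1}, lower_keys))            # {'a': 1}
print(execute_sketch(['{"A": 1}', '{"B": 2}'], lower_keys))
```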

`.execute_to_output(input)`

tools.execute_to_output(input) -> JsonOutput

Execute the operation but return a JsonOutput wrapper instead of native Python types. Useful when you need to inspect whether the result is single or multiple before extracting.

Note: DataFrame and Series inputs are not supported with execute_to_output(). Use .execute() for those types.

Parameter	Type	Description
`input`	`str`, `dict`, `list[str]`, `list[dict]`	JSON data to process

output = jt.JSONTools().flatten().execute_to_output('{"a": {"b": 1}}')
if output.is_single:
    print(output.get_single())
elif output.is_multiple:
    for item in output.get_multiple():
        print(item)

JsonOutput

Output wrapper returned by .execute_to_output(). Provides typed access to results.

Properties

Property	Type	Description
`.is_single`	`bool`	`True` if the result contains a single JSON string
`.is_multiple`	`bool`	`True` if the result contains multiple JSON strings

Methods

`.get_single()`

output.get_single() -> str

Extract the single JSON string result.

Raises: ValueError if the result is multiple.

`.get_multiple()`

output.get_multiple() -> list[str]

Extract the list of JSON string results.

Raises: ValueError if the result is single.

`.to_python()`

output.to_python() -> str | list[str]

Convert to native Python type: returns str for single results, list[str] for multiple results.
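
The wrapper's contract can be summarized with a small stand-in class. This is a sketch for reasoning about the interface only: `JsonOutputSketch` is hypothetical and is not how the real JsonOutput is implemented.

```python
class JsonOutputSketch:
    """Hypothetical stand-in mirroring the documented JsonOutput interface."""

    def __init__(self, value):
        self._value = value  # either a str or a list[str]

    @property
    def is_single(self) -> bool:
        return isinstance(self._value, str)

    @property
    def is_multiple(self) -> bool:
        return isinstance(self._value, list)

    def get_single(self) -> str:
        if not self.is_single:
            raise ValueError("result is multiple; use get_multiple()")
        return self._value

    def get_multiple(self) -> list:
        if not self.is_multiple:
            raise ValueError("result is single; use get_single()")
        return self._value

    def to_python(self):
        # No check needed: returns whichever shape is stored.
        return self._value


output = JsonOutputSketch(['{"a": 1}', '{"b": 2}'])
print(output.is_multiple, output.to_python())
```

Calling `.to_python()` avoids the ValueError branches when the caller can handle either shape.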

Pandas DataFrame

import pandas as pd
import json_tools_rs as jt

df = pd.DataFrame({"json_col": [
    '{"user": {"name": "Alice", "age": 30}}',
    '{"user": {"name": "Bob", "age": 25}}',
]})

tools = jt.JSONTools().flatten().separator(".")

# Process column containing JSON strings
result_df = tools.execute(df)
# Returns a DataFrame with flattened columns: "user.name", "user.age"

Pandas Series

series = pd.Series([
    '{"a": {"b": 1}}',
    '{"a": {"b": 2}}',
])

result_series = jt.JSONTools().flatten().execute(series)
# Returns a Series of flattened JSON strings

Polars DataFrame

import polars as pl

df = pl.DataFrame({"json_col": [
    '{"user": {"name": "Alice"}}',
    '{"user": {"name": "Bob"}}',
]})

result_df = jt.JSONTools().flatten().execute(df)

Polars Series

series = pl.Series("data", [
    '{"a": {"b": 1}}',
    '{"a": {"b": 2}}',
])

result_series = jt.JSONTools().flatten().execute(series)

PyArrow Table

import pyarrow as pa

table = pa.table({"json_col": [
    '{"user": {"name": "Alice"}}',
    '{"user": {"name": "Bob"}}',
]})

result_table = jt.JSONTools().flatten().execute(table)

PySpark DataFrame

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([
    ('{"user": {"name": "Alice"}}',),
    ('{"user": {"name": "Bob"}}',),
], ["json_col"])

result_df = jt.JSONTools().flatten().execute(df)

JsonToolsError

Exception class for all errors raised by JSON Tools RS.

import json_tools_rs as jt

try:
    result = jt.JSONTools().flatten().execute("not valid json")
except jt.JsonToolsError as e:
    print(f"Error: {e}")
    # Error: [E001] JSON parsing failed: ...

Error messages include a machine-readable code (E001-E008) at the start of the message. See Error Codes for the full reference.

Error Codes Quick Reference

Code	Name	Common Cause
`E001`	`JsonParseError`	Invalid JSON input
`E002`	`RegexError`	Bad regex in key/value replacement
`E003`	`InvalidReplacementPattern`	Malformed replacement pair
`E004`	`InvalidJsonStructure`	Wrong JSON shape for the operation
`E005`	`ConfigurationError`	No mode set before `.execute()`
`E006`	`BatchProcessingError`	Error in one item during batch processing
`E007`	`InputValidationError`	Unsupported input type
`E008`	`SerializationError`	Internal serialization failure

Handling Specific Errors

import json_tools_rs as jt

try:
    result = jt.JSONTools().execute({"a": 1})  # No mode set
except jt.JsonToolsError as e:
    msg = str(e)
    if "[E005]" in msg:
        print("Forgot to call .flatten(), .unflatten(), or .normal()")
    elif "[E001]" in msg:
        print("Invalid JSON input")
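
Because the code sits at the start of the message, it can also be extracted once and looked up instead of testing substrings one by one. The sketch below assumes the "[E001] ..." message format shown earlier. `error_code_sketch` and `ERROR_NAMES` are hypothetical names, and the sample message is made up for illustration.

```python
import re

# Names copied from the quick-reference table above.
ERROR_NAMES = {
    "E001": "JsonParseError",
    "E002": "RegexError",
    "E003": "InvalidReplacementPattern",
    "E004": "InvalidJsonStructure",
    "E005": "ConfigurationError",
    "E006": "BatchProcessingError",
    "E007": "InputValidationError",
    "E008": "SerializationError",
}

CODE_PATTERN = re.compile(r"\[(E\d{3})\]")


def error_code_sketch(message: str):
    """Hypothetical helper: return the leading error code, or None if absent."""
    match = CODE_PATTERN.match(message)  # match() anchors at the start
    return match.group(1) if match else None


code = error_code_sketch("[E005] example message")
print(code, ERROR_NAMES.get(code))  # E005 ConfigurationError
```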

Complete Example

import json_tools_rs as jt

# Build once, reuse many times
tools = (jt.JSONTools()
    .flatten()
    .separator("::")
    .lowercase_keys(True)
    .remove_nulls(True)
    .remove_empty_strings(True)
    .key_replacement("^user_", "")
    .auto_convert_types(True)
    .parallel_threshold(50)
    .num_threads(4)
)

# Single dict
result = tools.execute({"User_Name": "Alice", "User_Age": "30"})
# {"name": "Alice", "age": 30}

# Batch of dicts (processed in parallel if >= 50 items)
results = tools.execute([{"data": str(i)} for i in range(1000)])

# JSON string
result = tools.execute('{"User_Name": "Alice", "nested": {"User_Age": "30"}}')

# DataFrame
import pandas as pd
df = pd.DataFrame({"json": [
    '{"User_Name": "Alice", "nested": {"User_Age": "30"}}',
    '{"User_Name": "Bob", "nested": {"User_Age": "25"}}',
]})
df_result = tools.execute(df)
