
Models and Introspection

Prediction

  • predict(...) returns hard predictions
  • predict_proba(...) returns class probabilities for classification models

Classification prediction returns hard labels, while predict_proba(...) exposes the underlying class distributions when supported by the model.
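
A minimal usage sketch of that split, assuming a classifier trained elsewhere (the `clf` object and the feature names are placeholders; only predict(...) and predict_proba(...) are the documented calls):

```python
import polars as pl

# Placeholder input batch; only predict(...) and predict_proba(...) are documented.
X = pl.DataFrame({"f0": [0.1, 0.9, 0.4], "f1": [1.0, 0.0, 0.5]})

labels = clf.predict(X)        # hard class labels
proba = clf.predict_proba(X)   # per-class distributions, when the model supports them
```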

That distinction is important because ForestFire does not treat “probability prediction” as an afterthought. For ensembles, especially forests and boosted classifiers, the probability path is the more semantically faithful one:

  • forests aggregate per-tree probabilities and then derive hard labels
  • optimized runtimes preserve those leaf distributions rather than collapsing everything to winner-take-all labels
  • introspection and dataframe export expose the same learned leaf payloads that drive both predict(...) and predict_proba(...)
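
As a conceptual illustration of that aggregation path (plain NumPy, not the library's internals):

```python
import numpy as np

# Two trees, two rows, two classes: average per-tree class distributions,
# then derive hard labels from the aggregated probabilities.
per_tree_proba = np.array([
    [[0.9, 0.1], [0.4, 0.6]],   # tree 0
    [[0.7, 0.3], [0.2, 0.8]],   # tree 1
])
forest_proba = per_tree_proba.mean(axis=0)  # the predict_proba-style output
hard_labels = forest_proba.argmax(axis=1)   # the predict-style output
print(forest_proba)   # [[0.8 0.2] [0.3 0.7]]
print(hard_labels)    # [0 1]
```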

Optimized inference

optimize_inference(...) lowers a trained model into a runtime-oriented representation for faster prediction.

It preserves model semantics while changing the execution layout.

That includes missing-value semantics: optimized models keep the learned missing-value routing for the features you ask them to preserve, while the canonical semantic model remains the source of truth for serialization and inspection.
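
A minimal usage sketch, assuming a trained `model` and an input batch `X` (optimize_inference(...) is the documented entry point; the call shape shown here is an assumption):

```python
# optimize_inference(...) is documented; the objects and call shape are assumed.
optimized = optimize_inference(model)   # one-time lowering to the runtime layout
preds = optimized.predict(X)            # same learned function, faster batch scoring
```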

This split between training representation and runtime representation is one of the core design decisions in the project.

Why do it this way:

  • the best structure for training is not automatically the best structure for scoring
  • training nodes carry bookkeeping that is useful for inspection and serialization but wasteful on the hot prediction path
  • runtime lowering lets the project keep one semantic model while still specializing execution for batch prediction
  • optimized runtimes can compact the active feature space without changing what inputs the semantic model expects

The important design rule is:

  • Model is the canonical semantic object
  • OptimizedModel is a derived execution object

That distinction is what lets ForestFire keep introspection, serialization, and optimized inference aligned instead of forcing them to compete.

What it changes internally

  • prediction-only node layouts drop training-only fields
  • compiled CART-style trees use a fallthrough layout
  • multiway classifier nodes use dense bin lookup tables
  • oblivious trees become compact level arrays plus a leaf table
  • optimized runtimes project inputs down to the union of features the model actually uses
  • forests and boosted ensembles reorder trees by simple feature-locality keys before lowering
  • batch preprocessing happens ahead of traversal
  • compiled runtimes use compact u8/u16 column-major batch matrices
  • polars.LazyFrame inputs are collected and scored in batches of about 10_000 rows
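
To make one of those layouts concrete, here is a minimal sketch of oblivious-tree evaluation over level arrays plus a leaf table (illustrative only; the values and structs are made up):

```python
import numpy as np

# One (feature, threshold) pair per level; a 2**depth leaf table indexed by
# the bit pattern of the per-level comparisons.
level_features = [2, 0, 1]                # split feature per level (made up)
level_thresholds = [0.5, 1.0, -0.2]       # split threshold per level (made up)
leaf_table = np.arange(8, dtype=float)    # 2**3 leaf values (made up)

def predict_row(x):
    idx = 0
    for f, t in zip(level_features, level_thresholds):
        idx = (idx << 1) | int(x[f] > t)  # each level contributes one bit
    return leaf_table[idx]

print(predict_row(np.array([0.0, 0.0, 1.0])))  # 5.0: bit pattern (1, 0, 1)
```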

Why those changes help:

  • fallthrough layouts reduce branch-heavy pointer chasing in binary trees
  • dense lookup tables replace repeated branch scans in multiway nodes
  • oblivious trees are naturally amenable to regular array-based execution
  • feature projection avoids materializing and binning columns that are never touched by the trained model
  • locality-oriented tree ordering gives ensembles a better chance of reusing hot feature columns and top-level metadata
  • compact batch matrices reduce bandwidth and improve cache density
  • batched preprocessing amortizes input conversion over many rows instead of repeating it per traversal
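
For the dense lookup point specifically, the replaced pattern looks roughly like this (conceptual, with made-up values):

```python
# A multiway node as a dense table: a precomputed bin id indexes straight into
# a child array, replacing a scan over branch conditions with a single read.
children_by_bin = [3, 3, 7, 9, 9, 9]   # child node id per feature bin (made up)
bin_id = 2                             # this row's bin for the node's feature
next_node = children_by_bin[bin_id]    # -> 7, no per-branch comparisons
print(next_node)
```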

Another way to say this is that optimized inference changes three things at once:

  1. which data structures represent the model
  2. which feature columns are materialized at prediction time
  3. which execution strategy is used for row batches

The optimized model is therefore not just “the same predictor with a faster tree walk”. It is the result of a full runtime lowering pass.

Feature projection

Optimized models still accept the full semantic input schema, but they no longer preprocess every feature eagerly.

Instead, ForestFire:

  • inspects the semantic model
  • computes the sorted union of all feature indices that appear in splits
  • remaps the optimized runtime into that compact feature space
  • preprocesses only those projected columns during optimized inference
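
A conceptual sketch of those steps (plain Python, not the internal code; the split indices are made up):

```python
# Per-tree split feature indices -> sorted union -> compact remapping.
split_features_per_tree = [[7, 2, 7], [41, 2]]
used = sorted({f for tree in split_features_per_tree for f in tree})
remap = {semantic: local for local, semantic in enumerate(used)}
print(used)    # [2, 7, 41] -> only these columns get preprocessed
print(remap)   # {2: 0, 7: 1, 41: 2}
```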

This matters most when:

  • upstream pipelines emit wide tables but each tree only touches a small subset
  • forests or boosted ensembles repeatedly reuse a narrow set of strong predictors
  • batch preprocessing cost is large enough to matter, not just traversal cost

Both Model and OptimizedModel expose:

  • used_feature_indices()
  • used_feature_count()

That makes it easy to inspect whether a trained model is genuinely sparse in feature usage before relying on the optimized path.
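
A short usage sketch (both methods are documented above; the `model` object and the input table `X` are assumed):

```python
idx = model.used_feature_indices()   # sorted semantic indices that appear in splits
n = model.used_feature_count()
print(f"{n} of {X.width} input features are actually used: {idx}")  # X.width assumed
```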

This is especially useful for ensembles, because the semantic feature count and the effective runtime feature count can differ substantially:

  • the semantic model may have been trained on a wide table
  • each tree may only use a small subset
  • the optimized ensemble can then project to the union of all actually-used features

That is one of the main reasons optimized inference can improve not just traversal speed but total end-to-end scoring cost.

Compiled optimized models

OptimizedModel can also be serialized into a compiled artifact.

That artifact keeps:

  • the semantic IR
  • the lowered runtime layout
  • the feature projection metadata

The reason to keep all three is that they solve different problems:

  • the semantic IR preserves meaning
  • the lowered runtime preserves load-time work
  • the projection metadata preserves how runtime-local feature indices map back to the semantic schema

So a compiled optimized model is best understood as a deployment cache for the optimized runtime, not as a new canonical model definition.

For categorical models, the compiled artifact also carries the categorical transform metadata needed to convert raw mixed inputs into the encoded feature space expected by the lowered runtime.
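
A hypothetical round-trip sketch (the method names `save_compiled` and `load_compiled` are placeholders, not the documented API; what the artifact carries is what is documented):

```python
# Placeholder names: the compiled artifact persists the semantic IR, the
# lowered runtime layout, and the feature projection metadata.
optimized.save_compiled("model.ffc")                  # cache the one-time lowering
reloaded = OptimizedModel.load_compiled("model.ffc")  # load without re-lowering
```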

Where the impact is largest:

  • large batches
  • deep or ensemble-heavy models
  • repeated scoring of already-trained models
  • workflows where the same model serves many predictions and the one-time lowering cost is amortized

Serialization

Available export paths:

  • JSON model serialization
  • JSON IR export
  • compiled optimized runtime serialization
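
Sketched with placeholder method names (`to_json`, `export_ir_json`, and `save_compiled` are assumptions; the three export paths themselves are what is documented):

```python
model_json = model.to_json()          # canonical semantic serialization
ir_json = model.export_ir_json()      # explicit IR for portability and inspection
optimized.save_compiled("model.ffc")  # compiled optimized runtime artifact
```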

Compiled optimized artifacts retain both:

  • the semantic IR
  • the runtime-specific feature projection and lowered execution layout

That means reloading a compiled optimized model skips the lowering step without changing the model’s semantic serialization.

In other words, ForestFire deliberately has two layers of artifacts:

  • canonical semantic artifacts for portability and inspection
  • optional compiled runtime artifacts for faster load and predict paths

That split is what keeps the project flexible as runtime layouts evolve.

IR

The IR exists to make inference semantics explicit and portable.

It includes:

  • algorithm, task, tree type, and criterion
  • explicit node_tree and oblivious_levels representations
  • training-time numeric bin boundaries
  • categorical transform metadata when categorical strategies were used
  • leaf payloads for classification and regression
  • node and leaf stats like sample counts, impurity, gain, class counts, and variance when relevant
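
Illustrative shape only, written as a Python literal: the grouping follows the list above, but this is not the concrete IR schema and every value is made up:

```python
ir = {
    "algorithm": "random_forest",
    "task": "classification",
    "tree_type": "oblivious",
    "criterion": "gini",
    "oblivious_levels": [{"feature": 2, "threshold": 0.5}],
    "numeric_bins": {"2": [0.1, 0.5, 0.9]},   # training-time bin boundaries
    "leaves": [{"class_counts": [40, 9], "n_samples": 49, "impurity": 0.26}],
}
```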

The rationale for the IR is broader than export alone:

  • it is the semantic contract between training and optimized inference
  • it keeps serialization honest by forcing preprocessing and leaf semantics to be represented explicitly
  • it makes introspection possible without inventing a second ad hoc debug format

In other words, the IR is not a side artifact. It is the shared meaning layer for the project.

That is why Model and OptimizedModel export the same IR JSON:

  • both objects represent the same learned function
  • only one of them stores it in a runtime-specialized way
  • the IR must therefore describe the common semantics, not the optimization strategy

For categorical models, that common semantic layer now includes:

  • the raw input schema
  • the categorical strategy configuration
  • the serialized transform state needed to reproduce encoded inference before tree evaluation

Tree introspection

All tree-backed models expose:

  • tree structure summaries
  • node/level/leaf inspection
  • prediction-value statistics
  • dataframe export

Typical use cases:

  • understanding realized tree size and depth
  • inspecting learned splits and leaf payloads
  • comparing optimized and non-optimized views
  • inspecting one tree at a time inside forests and boosted ensembles

The introspection API exists because “tree model” users often need to answer questions that pure prediction APIs cannot:

  • did the learner actually grow the shape I expected?
  • which feature did a particular node split on?
  • how deep did the ensemble trees become in practice?
  • are the leaf values concentrated or spread out?

The design goal here is to expose the trained structure without making users decode the raw IR by hand.

There is also a practical runtime reason to keep introspection semantic:

  • optimized runtimes may reorder trees or remap feature indices internally
  • users generally want to inspect the trained meaning, not the lowered execution cache

So introspection stays anchored to the semantic model even when optimized inference is available.

to_dataframe(...)

to_dataframe(...) is a tabular export of the tree structure:

  • standard trees include split rows, leaf rows, and unmatched fallback leaves where relevant
  • oblivious trees include one row per level and one row per leaf
  • forests and boosted ensembles include tree_index

This method exists for interoperability more than aesthetics. A dataframe-like export is the easiest way to:

  • join tree structure with downstream analysis code
  • compare models programmatically
  • inspect many trees at once in forests and boosting
  • build custom reports without re-walking nested JSON structures
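
A short usage sketch (to_dataframe(...) and the tree_index column are documented above; the polars return type and the `forest` object are assumptions):

```python
import polars as pl

df = forest.to_dataframe()                         # one row per split/level/leaf
first_tree = df.filter(pl.col("tree_index") == 0)  # inspect one tree at a time
print(first_tree)
```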