
Data Normalization

In practical machine learning workflows, raw input features are rarely fed directly into a model. Normalization and scaling are usually required to ensure numerical stability, faster convergence, and consistent behavior across different datasets. Brain4J addresses this explicitly through the concept of feature scalers.

A feature scaler is a stateful object that learns statistics from data and applies a deterministic transformation to input tensors. This is formalized by the FeatureScaler interface.

The separation between fit and transform is intentional. A scaler must be fitted only on training data, and then reused unchanged for validation, testing, and inference. This avoids data leakage and makes preprocessing behavior explicit and reproducible.

public interface FeatureScaler extends JsonAdapter {
    void fit(List<Tensor> tensors);
    Tensor transform(Tensor tensor);
}

Because feature scalers implement JsonAdapter, their internal state (such as mean and standard deviation) can be serialized and restored later.
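Setting Brain4J's JsonAdapter machinery aside, the underlying idea is simply that the learned statistics survive a round-trip through a serialized form. A minimal hand-rolled sketch (all names illustrative, and a real implementation would use a JSON library):

```java
public class ScalerStateSketch {
    // Encode the fitted statistics as a tiny JSON string.
    static String save(double mean, double std) {
        return String.format("{\"mean\": %s, \"std\": %s}", mean, std);
    }

    // Decode the statistics back from the JSON string.
    static double[] load(String json) {
        String[] parts = json.replaceAll("[{}\"]", "").split(",");
        double mean = Double.parseDouble(parts[0].split(":")[1].trim());
        double std = Double.parseDouble(parts[1].split(":")[1].trim());
        return new double[] { mean, std };
    }

    public static void main(String[] args) {
        String saved = save(4.0, 2.0);
        double[] restored = load(saved);
        // The restored scaler applies exactly the same transformation.
        System.out.println((6.0 - restored[0]) / restored[1]);
    }
}
```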

Z-Score Normalization

A typical implementation is ZScoreScaler, which standardizes features using the mean and standard deviation computed from the training set:

x' = (x - mean) / std

During fit, the scaler computes and stores these statistics. During transform, it applies the same normalization to every incoming tensor.

List<Tensor> trainData = getTrainData(); // example
List<Tensor> testData = getTestData();   // example

ZScoreScaler scaler = new ZScoreScaler();
scaler.fit(trainData); // statistics come from the training set only

// transform operates on a single tensor, so apply it to each element
List<Tensor> transformed = testData.stream()
    .map(scaler::transform)
    .toList();
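To make the arithmetic behind fit and transform concrete, here is a minimal, library-independent Java sketch of z-score normalization (the class and method names are illustrative, not part of Brain4J):

```java
import java.util.Arrays;

public class ZScoreExample {
    // "fit": compute mean and standard deviation from the training values
    static double[] fit(double[] train) {
        double mean = Arrays.stream(train).average().orElse(0.0);
        double variance = Arrays.stream(train)
            .map(x -> (x - mean) * (x - mean))
            .average().orElse(0.0);
        return new double[] { mean, Math.sqrt(variance) };
    }

    // "transform": apply the stored statistics to any incoming values
    static double[] transform(double[] values, double mean, double std) {
        return Arrays.stream(values).map(x -> (x - mean) / std).toArray();
    }

    public static void main(String[] args) {
        double[] train = { 2.0, 4.0, 6.0 };
        double[] stats = fit(train);
        // The same statistics are reused for any later data, never refitted.
        System.out.println(Arrays.toString(transform(train, stats[0], stats[1])));
    }
}
```

Note that the statistics are computed once during fit and then frozen; validation or test values are standardized with the training-set mean and standard deviation, which is exactly what prevents data leakage.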

Integrating Scaling into the Model Pipeline

Brain4J allows feature scaling to be integrated directly into the model itself through ScalingLayer.

A ScalingLayer wraps a FeatureScaler and applies it during the forward pass.

Conceptually, this makes preprocessing part of the network definition rather than an external transformation applied by user code.
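As an illustration of this idea (a self-contained sketch, not Brain4J's actual classes), a layer that wraps a fitted scaler and applies it in its forward pass might look like this:

```java
import java.util.function.DoubleUnaryOperator;

public class ScalingLayerSketch {
    // Stand-in for a fitted FeatureScaler: a fixed element-wise transform.
    private final DoubleUnaryOperator scaler;

    public ScalingLayerSketch(DoubleUnaryOperator scaler) {
        this.scaler = scaler;
    }

    // The "forward pass" applies the scaler to every element;
    // the layer itself has no trainable parameters.
    public double[] forward(double[] input) {
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = scaler.applyAsDouble(input[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        // Statistics assumed to come from a previously fitted scaler.
        double mean = 4.0, std = 2.0;
        ScalingLayerSketch layer = new ScalingLayerSketch(x -> (x - mean) / std);
        System.out.println(java.util.Arrays.toString(
            layer.forward(new double[]{2.0, 4.0, 6.0})));
    }
}
```

Because the layer owns the scaler, normalization travels with the model: any code that runs the forward pass gets the correct preprocessing for free.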

There are several important implications:

  1. Scaling becomes part of the serialized model: when the model is saved, the scaler configuration and learned statistics are saved with it. Reloading the model automatically restores the exact preprocessing behavior used during training.

  2. Training and inference pipelines become identical: the same model object can be used on raw input tensors without having to manually reapply normalization logic.

  3. The scaling operation is transparent to autograd: the ScalingLayer preserves the autograd context of its input tensors, ensuring that gradients flow correctly through the network, even though the scaler itself has no trainable parameters.

  4. All inputs are normalized by default: if no second argument is specified, every input is transformed. To restrict this, pass a Set<Integer> containing the indices of the inputs that should be preprocessed.
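The index-selective behavior described in point 4 can be sketched in plain Java as follows (a standalone illustration; the mean and standard deviation are placeholder values, not Brain4J's API):

```java
import java.util.Set;

public class SelectiveScalingSketch {
    // Normalize only the features whose indices are in `selected`;
    // all other features pass through unchanged.
    static double[] transform(double[] input, Set<Integer> selected,
                              double mean, double std) {
        double[] out = input.clone();
        for (int i : selected) {
            out[i] = (input[i] - mean) / std;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] x = { 10.0, 4.0, 6.0 };
        // Only indices 1 and 2 are standardized; index 0 is left untouched.
        double[] y = transform(x, Set.of(1, 2), 4.0, 2.0);
        System.out.println(java.util.Arrays.toString(y));
    }
}
```

This is useful when some inputs, such as one-hot or categorical features, should not be standardized alongside continuous features.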
