Tensors
A tensor is a multi-dimensional mathematical object that generalizes scalars (numbers), vectors (1D arrays), and matrices (2D arrays) to higher dimensions.
Tensors are the fundamental data structure in Brain4J. All computations, models, and data transformations operate on tensors, which provide an efficient way to represent and manipulate high-dimensional data such as vectors, matrices, sequences, images, and text.
A tensor in Brain4J is defined by the following properties:
Order (Rank): The number of indices required to uniquely identify an element. For example, scalars have order 0, vectors have order 1, and matrices have order 2. Note: in Brain4J, scalars have rank 1 (see below).
Shape: The number of elements along each dimension (axis) of the tensor. For example, an RGB image can be represented by a tensor with shape [color_channels, height, width].
Stride: The number of elements to skip in the flat data buffer in order to move to the next element along a given dimension.
Strides: A list containing the stride value for each dimension of the tensor.
In Brain4J, scalars are represented as tensors with shape [1] (1D vector). As a result, scalars have rank 1 and Brain4J does not support rank-0 tensors.
Device placement
In Brain4J, every tensor is associated with a specific execution device. A device defines where the tensor’s data is stored and where operations on that tensor are executed.
Currently, Brain4J supports the following devices:
CPU (default)
GPU (via OpenCL)
Creating tensors on a device
Tensors are created on the CPU by default. A tensor can be explicitly moved to another device when required.
Device device = Brain4J.firstDevice(); // gets the first device available
Tensor t = Tensors.random(1024, 1024); // CPU tensor
// moves the tensor to the GPU
Tensor gpuTensor = t.to(device);

Whenever a tensor is located on the GPU, operations can be batched into a single command queue and scheduled for execution. Note: the data() method is not recommended when a tensor is hosted on the GPU, as it typically forces a device-to-host transfer.
Memory Layout
Internally, Brain4J tensors store their data in a flat, contiguous memory buffer, typically a one-dimensional float[] (or a device buffer in the case of GPU tensors). Multi-dimensional structure is not represented through nested arrays, but through shape and stride metadata.
This design choice is fundamental for performance, interoperability, and flexibility. By default, Brain4J uses a row-major (C-style) memory layout, meaning that the last dimension is contiguous in memory.
For example, a tensor with shape [2, 3] is laid out in memory as:

[t(0,0), t(0,1), t(0,2), t(1,0), t(1,1), t(1,2)]

This corresponds to strides [3, 1]: advancing along the first dimension skips 3 elements in the buffer, while advancing along the last dimension skips 1.
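The mapping from multi-dimensional indices to flat buffer positions can be sketched in plain Java. This is a conceptual illustration of row-major strides, not Brain4J's actual internals:

```java
// Computes row-major (C-style) strides and flat offsets for a given shape.
public class RowMajor {
    // The stride of the last dimension is 1; each earlier stride is the
    // product of all later dimension sizes.
    static int[] strides(int[] shape) {
        int[] strides = new int[shape.length];
        int acc = 1;
        for (int i = shape.length - 1; i >= 0; i--) {
            strides[i] = acc;
            acc *= shape[i];
        }
        return strides;
    }

    // Flat offset = sum over all dimensions of index[i] * stride[i].
    static int offset(int[] strides, int[] index) {
        int off = 0;
        for (int i = 0; i < index.length; i++) {
            off += index[i] * strides[i];
        }
        return off;
    }
}
```

For a shape of [2, 3], strides(...) yields [3, 1], and the element at index (1, 2) lives at flat position 1*3 + 2*1 = 5, matching the layout shown above.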
Broadcasting
Broadcasting allows tensors with compatible shapes to participate in the same operation without explicitly duplicating data.
A tensor can be broadcast along a dimension if its size in that dimension is either:
equal to the target size, or
equal to 1.
Broadcasting is applied per-dimension, from the trailing axes.
Important notes
Broadcasting does not allocate new memory for the broadcasted tensor.
If shapes are not compatible, the operation fails with an error.
Not all operations support broadcasting.
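These rules can be sketched as a shape-compatibility check that aligns shapes from their trailing axes. This is a conceptual sketch, not Brain4J's actual implementation:

```java
// Computes the broadcast result shape of two shapes, or throws if they
// are incompatible. Shapes are aligned from the trailing (rightmost)
// axes; a pair of sizes is compatible if they match or one of them is 1.
public class Broadcast {
    static int[] resultShape(int[] a, int[] b) {
        int rank = Math.max(a.length, b.length);
        int[] out = new int[rank];
        for (int i = 0; i < rank; i++) {
            // Read from the end; missing leading dimensions count as 1.
            int da = i < a.length ? a[a.length - 1 - i] : 1;
            int db = i < b.length ? b[b.length - 1 - i] : 1;
            if (da != db && da != 1 && db != 1) {
                throw new IllegalArgumentException(
                    "Incompatible sizes: " + da + " vs " + db);
            }
            out[rank - 1 - i] = Math.max(da, db);
        }
        return out;
    }
}
```

For example, shapes [4, 3] and [3] broadcast to [4, 3], while [3] and [4] fail with an error.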
In-place operations & Mutability
Brain4J tensors are mutable objects. Depending on the operation, a method may:
modify the tensor data in-place
allocate a new data buffer (deep copy)
return a view that shares the same underlying memory
This distinction is critical. Failing to understand it can easily introduce silent and hard-to-debug errors, especially when combining views, in-place mutations, and autograd.
For this reason, Brain4J does not hide mutability behind implicit copies. Instead, the API exposes it explicitly, and users are expected to reason about it.
In-place operations
In-place operations directly modify the internal data buffer of the tensor. No new memory is allocated, and the original values are overwritten.
Typical examples include:
Direct writes such as set(...)
Element-wise mutations such as map(...)
Fill operations like fill(...)
Normalization methods such as layerNorm(...)
These operations return this (or otherwise operate on the same instance) and permanently change the tensor contents.
Methods like layerNorm(...) are semantically destructive. Despite their functional-looking name, they overwrite the original data. This behavior is intentional but must be understood clearly by the user.
If you need to preserve the original tensor, you must explicitly create a copy using clone() before calling these methods.
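The copy-before-mutating pattern can be illustrated with a plain float[] standing in for a tensor's flat data buffer. scaleInPlace here is a hypothetical destructive operation, analogous to layerNorm(...):

```java
public class InPlaceDemo {
    // A hypothetical destructive operation: overwrites the buffer
    // in-place, just as layerNorm(...) overwrites the original data.
    static void scaleInPlace(float[] data, float factor) {
        for (int i = 0; i < data.length; i++) {
            data[i] *= factor;
        }
    }
}
```

Cloning before the call preserves the original values: after `float[] backup = data.clone(); InPlaceDemo.scaleInPlace(data, 2f);` the backup still holds the pre-mutation data, because the clone owns its own buffer.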
Object Mutability vs Data Mutability
Some methods mutate the tensor object without modifying the numerical data itself. This mainly concerns autograd-related state:
enabling or disabling gradient tracking
resetting accumulated gradients
updating the autograd context
These operations still mutate the tensor instance and therefore must be treated as stateful, even though the underlying data buffer is unchanged.
Operations That Allocate New Memory
Many tensor operations allocate a new data buffer, producing a tensor that is fully independent from the original one.
This category includes:
Explicit copies such as clone()
Shape-altering operations that require materialization (e.g. slice, concat)
Reductions and aggregations (sum, mean, variance)
Non-linear transformations such as softmax
Mathematical operations like matmul and convolve
In all these cases, the resulting tensor owns its own memory and can be mutated safely without affecting the original tensor.
Cloning
Tensors support cloning through the clone() method. A cloned tensor is always a deep copy and never shares the underlying data buffer with the original tensor.
The cloning behavior is defined as follows:
The tensor values are fully copied into a new contiguous buffer.
The shape array is always copied.
If the source tensor is contiguous, its strides are preserved.
If the source tensor is non-contiguous (e.g. after transpose), the clone is materialized into a contiguous layout with standard strides.
The cloned tensor has no associated autograd context.
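The materialization rule can be sketched for the 2D case: a transpose view swaps shape and strides without touching the data, and cloning it copies the elements into a fresh contiguous row-major buffer. This is a conceptual sketch, not Brain4J's implementation:

```java
// Materializes a 2D strided view (buffer + shape + strides) into a new
// contiguous row-major buffer, as clone() does for non-contiguous tensors.
public class Materialize {
    static float[] toContiguous(float[] data, int rows, int cols,
                                int rowStride, int colStride) {
        float[] out = new float[rows * cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Read through the view's strides, write row-major.
                out[r * cols + c] = data[r * rowStride + c * colStride];
            }
        }
        return out;
    }
}
```

For a 2x3 buffer [1, 2, 3, 4, 5, 6], its 3x2 transpose view has strides [1, 3]; materializing that view produces the contiguous buffer [1, 4, 2, 5, 3, 6].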
Views and Aliasing (No Data Copy)
Some operations return a view of the original tensor. Views do not allocate new memory and instead share the same underlying data buffer, while exposing a different shape and/or stride configuration.
Common view-producing operations include:
reshape(...), flatten(), transpose(...), squeeze(...), and unsqueeze(...)
All these operations:
do not copy data
share the same memory buffer
reflect mutations across all aliases
As a consequence, modifying one view (for example via set(...) or map(...)) will also affect every other tensor that shares the same data.
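Aliasing can be demonstrated with a minimal strided view over a shared buffer. This is a conceptual sketch; Brain4J's view mechanism is assumed to behave analogously:

```java
// A minimal strided view over a shared float buffer. Two views created
// over the same buffer alias each other: writes through one view are
// visible through the other, because the data is never copied.
public class View {
    final float[] data;   // shared, never copied
    final int offset;
    final int stride;

    View(float[] data, int offset, int stride) {
        this.data = data;
        this.offset = offset;
        this.stride = stride;
    }

    float get(int i) { return data[offset + i * stride]; }

    void set(int i, float v) { data[offset + i * stride] = v; }
}
```

Creating a full view and a strided view over the same buffer, then writing through the strided one, makes the change visible through the full view and through the raw buffer alike.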
The transpose(...) method behaves differently depending on the chosen backend. If SIMD acceleration is active, a full transposition is performed by allocating a new data array and copying the original values; otherwise, a lazy transposition (a stride-swapping view) is used.
This behavior is subject to change in future releases.
Autograd-Safe Operations (*Grad Methods)
Operations with a *Grad suffix follow a strict rule:
If gradient tracking is disabled, they delegate to the non-gradient version
If gradient tracking is enabled, they never operate in-place
In the presence of autograd, these methods always create a new tensor to ensure correctness of the computation graph.
This guarantees that in-place data corruption cannot occur during gradient-based training.
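The dispatch rule can be sketched as follows. All names here are hypothetical; the real Brain4J implementation differs:

```java
// Sketch of the *Grad dispatch rule: with gradient tracking disabled,
// the in-place variant is used; with tracking enabled, a fresh buffer
// is allocated so the computation graph never sees mutated inputs.
public class GradDispatch {
    float[] data;
    boolean requiresGrad;

    GradDispatch(float[] data, boolean requiresGrad) {
        this.data = data;
        this.requiresGrad = requiresGrad;
    }

    // Hypothetical in-place op (stands in for a non-*Grad method).
    GradDispatch addInPlace(float v) {
        for (int i = 0; i < data.length; i++) data[i] += v;
        return this;
    }

    // Hypothetical *Grad variant: never mutates when tracking is on.
    GradDispatch addGrad(float v) {
        if (!requiresGrad) {
            return addInPlace(v); // delegate to the non-gradient version
        }
        float[] out = data.clone(); // new tensor; original left intact
        for (int i = 0; i < out.length; i++) out[i] += v;
        return new GradDispatch(out, true);
    }
}
```

With tracking disabled, addGrad mutates and returns the same instance; with tracking enabled, it returns a new instance and the original data is untouched.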
Operations that do not carry the *Grad suffix do not preserve the autograd context.
The *Grad API will be removed in a future release and unified into a single set of implementations.
Examples