Kernel

class probnum.randprocs.kernels.Kernel(input_shape, output_shape=())

Bases: ABC

(Cross-)covariance function(s)

Abstract base class representing one or multiple (cross-)covariance function(s), also known as kernels. A cross-covariance function

\begin{equation} k_{fg} \colon \mathcal{X}^{d_\text{in}} \times \mathcal{X}^{d_\text{in}} \to \mathbb{R} \end{equation}

is a function of two arguments \(x_0\) and \(x_1\), which represents the covariance between two evaluations \(f(x_0)\) and \(g(x_1)\) of two scalar-valued random processes \(f\) and \(g\) on a common probability space (or, equivalently, two outputs \(h_i(x_0)\) and \(h_j(x_1)\) of a vector-valued random process \(h\)). If \(f = g\), then the cross-covariance function is also referred to as a covariance function, in which case it must be symmetric and positive (semi-)definite.

An instance of a Kernel can compute multiple different (cross-)covariance functions on the same pair of inputs simultaneously. For instance, it can be used to compute the full covariance matrix

\begin{equation} C^f \colon \mathcal{X}^{d_\text{in}} \times \mathcal{X}^{d_\text{in}} \to \mathbb{R}^{d_\text{out} \times d_\text{out}}, \quad C^f_{ij}(x_0, x_1) := k_{f_i f_j}(x_0, x_1) \end{equation}

of the vector-valued random process \(f\). To this end, we understand any Kernel as a tensor whose shape is given by output_shape, which contains different (cross-)covariance functions as its entries.

Parameters
  • input_shape (ShapeLike) – Shape of the Kernel’s input.

  • output_shape (ShapeLike) –

    Shape of the Kernel’s output.

    If output_shape is set to (), the Kernel instance represents a single (cross-)covariance function. Otherwise, i.e. if output_shape is a non-empty tuple, the Kernel instance represents a tensor of (cross-)covariance functions whose shape is given by output_shape.

Examples

>>> from probnum.randprocs.kernels import Linear
>>> D = 3
>>> k = Linear(input_shape=D)
>>> k.input_shape
(3,)
>>> k.output_shape
()

Generate some input data.

>>> import numpy as np
>>> N = 4
>>> xs = np.linspace(0, 1, N * D).reshape(N, D)
>>> xs.shape
(4, 3)
>>> xs
array([[0.        , 0.09090909, 0.18181818],
       [0.27272727, 0.36363636, 0.45454545],
       [0.54545455, 0.63636364, 0.72727273],
       [0.81818182, 0.90909091, 1.        ]])

We can compute kernel matrices like so.

>>> k.matrix(xs)
array([[0.04132231, 0.11570248, 0.19008264, 0.26446281],
       [0.11570248, 0.41322314, 0.7107438 , 1.00826446],
       [0.19008264, 0.7107438 , 1.23140496, 1.75206612],
       [0.26446281, 1.00826446, 1.75206612, 2.49586777]])

Kernel.__call__() is vectorized over the “batch shapes” of the inputs, applying standard NumPy broadcasting.

>>> k(xs[:, None], xs[None, :])  # same as `.matrix`
array([[0.04132231, 0.11570248, 0.19008264, 0.26446281],
       [0.11570248, 0.41322314, 0.7107438 , 1.00826446],
       [0.19008264, 0.7107438 , 1.23140496, 1.75206612],
       [0.26446281, 1.00826446, 1.75206612, 2.49586777]])

No broadcasting is applied if both inputs have the same shape. For instance, one can efficiently compute just the diagonal of the kernel matrix via

>>> k(xs, xs)
array([0.04132231, 0.41322314, 1.23140496, 2.49586777])
>>> k(xs, None)  # x1 = None is an efficient way to set x1 == x0
array([0.04132231, 0.41322314, 1.23140496, 2.49586777])

and the diagonal above the main diagonal of the kernel matrix is retrieved through

>>> k(xs[:-1, :], xs[1:, :])
array([0.11570248, 0.7107438 , 1.75206612])
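The efficiency gain of same-shape evaluation can be made concrete with a plain-NumPy stand-in for the linear kernel (a sketch assuming \(k(x_0, x_1) = x_0^\top x_1\), which is consistent with the values above, but not the library implementation):

```python
import numpy as np

def linear_kernel(x0, x1):
    # Dot product over the input dimension (last axis);
    # any leading batch axes broadcast as in standard NumPy.
    return np.sum(np.asarray(x0) * np.asarray(x1), axis=-1)

xs = np.linspace(0, 1, 12).reshape(4, 3)

K = linear_kernel(xs[:, None], xs[None, :])  # full (4, 4) kernel matrix
d = linear_kernel(xs, xs)                    # only the diagonal, shape (4,)
```

Evaluating the diagonal this way touches each input pair once, i.e. O(N) kernel evaluations instead of the O(N²) needed for the full matrix.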

Kernels support basic arithmetic operations. For example we can add noise to the kernel in the following fashion.

>>> from probnum.randprocs.kernels import WhiteNoise
>>> k_noise = k + 0.1 * WhiteNoise(input_shape=D)
>>> k_noise.matrix(xs)
array([[0.14132231, 0.11570248, 0.19008264, 0.26446281],
       [0.11570248, 0.51322314, 0.7107438 , 1.00826446],
       [0.19008264, 0.7107438 , 1.33140496, 1.75206612],
       [0.26446281, 1.00826446, 1.75206612, 2.59586777]])
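The effect of the sum kernel above can be reproduced with a plain-NumPy stand-in, modeling the white-noise term as a scaled indicator that both inputs coincide exactly (a sketch under that assumption, not probnum's implementation):

```python
import numpy as np

def noisy_linear(x0, x1, noise_var=0.1):
    x0, x1 = np.asarray(x0), np.asarray(x1)
    lin = np.sum(x0 * x1, axis=-1)                 # linear kernel part
    noise = noise_var * np.all(x0 == x1, axis=-1)  # nonzero only on exact matches
    return lin + noise

xs = np.linspace(0, 1, 12).reshape(4, 3)
K_noise = noisy_linear(xs[:, None], xs[None, :])  # noise_var added on the diagonal
```

Since the rows of xs are pairwise distinct, the noise term contributes only to the diagonal of the resulting matrix.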

Attributes Summary

input_ndim

Syntactic sugar for len(input_shape).

input_shape

Shape of single, i.e. non-batched, arguments of the covariance function.

output_ndim

Syntactic sugar for len(output_shape).

output_shape

Shape of single, i.e. non-batched, return values of the covariance function.

Methods Summary

__call__(x0, x1)

Evaluate the (cross-)covariance function(s).

matrix(x0[, x1])

A convenience function for computing a kernel matrix between two sets of inputs.

Attributes Documentation

input_ndim

Syntactic sugar for len(input_shape).

input_shape

Shape of single, i.e. non-batched, arguments of the covariance function.

output_ndim

Syntactic sugar for len(output_shape).

output_shape

Shape of single, i.e. non-batched, return values of the covariance function.

If output_shape is (), the Kernel instance represents a single (cross-)covariance function. Otherwise, i.e. if output_shape is non-empty, the Kernel instance represents a tensor of (cross-)covariance functions whose shape is given by output_shape.

Methods Documentation

__call__(x0, x1)[source]

Evaluate the (cross-)covariance function(s).

The evaluation of the (cross-)covariance function(s) is vectorized over the batch shapes of the arguments, applying standard NumPy broadcasting.

Parameters
  • x0 (ArrayLike) – shape= batch_shape_0 + input_shape – (Batch of) input(s) for the first argument of the Kernel.

  • x1 (Optional[ArrayLike]) – shape= batch_shape_1 + input_shape – (Batch of) input(s) for the second argument of the Kernel. Can also be set to None, in which case the function will behave as if x1 = x0 (but it is implemented more efficiently).

Returns

shape= bcast_batch_shape + output_shape – The (cross-)covariance function(s) evaluated at (x0, x1). Since the function is vectorized over the batch shapes of the inputs, the output array contains the following entries:

k_x0_x1[batch_idx + output_idx] = k[output_idx](
    x0[batch_idx, ...],
    x1[batch_idx, ...],
)

where we assume that x0 and x1 have been broadcast to a common shape bcast_batch_shape + input_shape, and where output_idx and batch_idx are indices compatible with output_shape and bcast_batch_shape, respectively. By k[output_idx] we refer to the covariance function at index output_idx in the tensor of covariance functions represented by the Kernel instance.
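The indexing identity can be checked numerically with a scalar stand-in kernel (a dot-product kernel used purely for illustration, not the library implementation):

```python
import numpy as np

def k(x0, x1):
    # Scalar stand-in kernel: dot product over the input dimension.
    return np.sum(np.asarray(x0) * np.asarray(x1), axis=-1)

xs = np.linspace(0, 1, 12).reshape(4, 3)
x0, x1 = xs[:, None], xs[None, :]        # batch shapes (4, 1) and (1, 4)
k_x0_x1 = k(x0, x1)                      # bcast_batch_shape == (4, 4)

# k_x0_x1[batch_idx] equals the kernel applied to the broadcast inputs:
b0, b1 = np.broadcast_arrays(x0, x1)     # both now have shape (4, 4, 3)
i, j = 1, 2
assert np.isclose(k_x0_x1[i, j], k(b0[i, j], b1[i, j]))
```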

Return type

k_x0_x1

Raises
  • ValueError – If one of the input shapes is not of the form batch_shape_{0,1} + input_shape.

  • ValueError – If the inputs cannot be broadcast to a common shape.

See also

matrix

Convenience function to compute a kernel matrix, i.e. a matrix of pairwise evaluations of the kernel on two sets of points.

Examples

See documentation of class Kernel.

matrix(x0, x1=None)[source]

A convenience function for computing a kernel matrix between two sets of inputs.

This is syntactic sugar for k(x0[:, None], x1[None, :]). Hence, it computes the matrix (stack) of pairwise covariances between two sets of input points. If k represents a single covariance function, then the resulting matrix will be symmetric and positive (semi-)definite for x0 == x1.

Parameters
  • x0 (ArrayLike) – shape= (M,) + input_shape or input_shape – Stack of inputs for the first argument of the Kernel.

  • x1 (Optional[ArrayLike]) – shape= (N,) + input_shape or input_shape – (Optional) stack of inputs for the second argument of the Kernel. If x1 is not specified, the function behaves as if x1 = x0 (but it is implemented more efficiently).

Returns

shape= batch_shape + output_shape – The matrix / stack of matrices containing the pairwise evaluations of the (cross-)covariance function(s) on x0 and x1. Depending on the shape of the inputs, batch_shape is either (M, N), (M,), (N,), or ().
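The possible batch_shape cases can be illustrated with a scalar dot-product stand-in kernel (a sketch of the broadcasting behavior, assuming input_shape == (3,); not the library implementation):

```python
import numpy as np

def k(x0, x1):
    # Scalar stand-in kernel: dot product over the input dimension.
    return np.sum(np.asarray(x0) * np.asarray(x1), axis=-1)

xs = np.linspace(0, 1, 12).reshape(4, 3)   # M = 4 stacked inputs
x = xs[0]                                  # a single, unbatched input

K_MN = k(xs[:, None], xs[None, :])         # batch_shape (M, N) == (4, 4)
K_M = k(xs, x)                             # batch_shape (M,)   -- x broadcasts
K_0 = k(x, x)                              # batch_shape ()     -- a scalar
```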

Return type

kernmat

Raises

ValueError – If the shapes of the inputs don’t match the specification.

See also

__call__

Evaluate the kernel more flexibly.

Examples

See documentation of class Kernel.