Kernel¶
- class probnum.randprocs.kernels.Kernel(input_shape, output_shape=())¶
Bases: ABC
(Cross-)covariance function(s)
Abstract base class representing one or multiple (cross-)covariance function(s), also known as kernels. A cross-covariance function

\begin{equation} k_{fg} \colon \mathcal{X}^{d_\text{in}} \times \mathcal{X}^{d_\text{in}} \to \mathbb{R} \end{equation}

is a function of two arguments \(x_0\) and \(x_1\), which represents the covariance between two evaluations \(f(x_0)\) and \(g(x_1)\) of two scalar-valued random processes \(f\) and \(g\) on a common probability space (or, equivalently, two outputs \(h_i(x_0)\) and \(h_j(x_1)\) of a vector-valued random process \(h\)). If \(f = g\), then the cross-covariance function is also referred to as a covariance function, in which case it must be symmetric and positive (semi-)definite.
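The symmetry and positive semi-definiteness requirements can be checked numerically. The RBF covariance function below is hand-rolled in plain NumPy purely for illustration; it is not probnum's implementation.

```python
import numpy as np

# Hypothetical pure-NumPy RBF covariance function, used here only to
# illustrate the requirements on a covariance function (f = g case).
def rbf(x0, x1, lengthscale=1.0):
    sqdist = np.sum((x0 - x1) ** 2)
    return np.exp(-0.5 * sqdist / lengthscale**2)

rng = np.random.default_rng(0)
xs = rng.standard_normal((5, 3))

# Gram matrix of pairwise evaluations k(x_i, x_j)
K = np.array([[rbf(a, b) for b in xs] for a in xs])

assert np.allclose(K, K.T)                      # symmetric
assert np.all(np.linalg.eigvalsh(K) >= -1e-12)  # positive semi-definite
```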
An instance of a Kernel can compute multiple different (cross-)covariance functions on the same pair of inputs simultaneously. For instance, it can be used to compute the full covariance matrix

\begin{equation} C^f \colon \mathcal{X}^{d_\text{in}} \times \mathcal{X}^{d_\text{in}} \to \mathbb{R}^{d_\text{out} \times d_\text{out}}, \quad C^f_{i j}(x_0, x_1) := k_{f_i f_j}(x_0, x_1) \end{equation}

of the vector-valued random process \(f\). To this end, we understand any Kernel as a tensor whose shape is given by output_shape, which contains different (cross-)covariance functions as its entries.

- Parameters
  - input_shape (ShapeLike) – Shape of the Kernel’s input.
  - output_shape (ShapeLike) – Shape of the Kernel’s output. If output_shape is set to (), the Kernel instance represents a single (cross-)covariance function. Otherwise, i.e. if output_shape is a non-empty tuple, the Kernel instance represents a tensor of (cross-)covariance functions whose shape is given by output_shape.
Examples
>>> import numpy as np
>>> from probnum.randprocs.kernels import Linear
>>> D = 3
>>> k = Linear(input_shape=D)
>>> k.input_shape
(3,)
>>> k.output_shape
()
Generate some input data.
>>> N = 4
>>> xs = np.linspace(0, 1, N * D).reshape(N, D)
>>> xs.shape
(4, 3)
>>> xs
array([[0.        , 0.09090909, 0.18181818],
       [0.27272727, 0.36363636, 0.45454545],
       [0.54545455, 0.63636364, 0.72727273],
       [0.81818182, 0.90909091, 1.        ]])
We can compute kernel matrices like so.
>>> k.matrix(xs)
array([[0.04132231, 0.11570248, 0.19008264, 0.26446281],
       [0.11570248, 0.41322314, 0.7107438 , 1.00826446],
       [0.19008264, 0.7107438 , 1.23140496, 1.75206612],
       [0.26446281, 1.00826446, 1.75206612, 2.49586777]])
The Kernel.__call__() evaluations are vectorized over the “batch shapes” of the inputs, applying standard NumPy broadcasting.

>>> k(xs[:, None], xs[None, :])  # same as `.matrix`
array([[0.04132231, 0.11570248, 0.19008264, 0.26446281],
       [0.11570248, 0.41322314, 0.7107438 , 1.00826446],
       [0.19008264, 0.7107438 , 1.23140496, 1.75206612],
       [0.26446281, 1.00826446, 1.75206612, 2.49586777]])
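The broadcasting mechanics can be reproduced in plain NumPy. The linear kernel below is hand-rolled for illustration and stands in for probnum's Linear class.

```python
import numpy as np

xs = np.linspace(0, 1, 12).reshape(4, 3)  # batch of 4 inputs in R^3

# Hand-rolled linear kernel k(x0, x1) = x0 . x1, vectorized over
# arbitrary leading batch axes via the last-axis reduction.
def k(x0, x1):
    return np.sum(x0 * x1, axis=-1)

# xs[:, None] has shape (4, 1, 3) and xs[None, :] has shape (1, 4, 3);
# broadcasting the batch shapes (4, 1) and (1, 4) yields a (4, 4) matrix
# of all pairwise evaluations.
K = k(xs[:, None], xs[None, :])
assert K.shape == (4, 4)
assert np.allclose(K, xs @ xs.T)
```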
No broadcasting is applied if both inputs have the same shape. For instance, one can efficiently compute just the diagonal of the kernel matrix via
>>> k(xs, xs)
array([0.04132231, 0.41322314, 1.23140496, 2.49586777])
>>> k(xs, None)  # x1 = None is an efficient way to set x1 == x0
array([0.04132231, 0.41322314, 1.23140496, 2.49586777])
and the diagonal above the main diagonal of the kernel matrix is retrieved through
>>> k(xs[:-1, :], xs[1:, :])
array([0.11570248, 0.7107438 , 1.75206612])
Kernels support basic arithmetic operations. For example, we can add noise to the kernel in the following fashion.
>>> from probnum.randprocs.kernels import WhiteNoise
>>> k_noise = k + 0.1 * WhiteNoise(input_shape=D)
>>> k_noise.matrix(xs)
array([[0.14132231, 0.11570248, 0.19008264, 0.26446281],
       [0.11570248, 0.51322314, 0.7107438 , 1.00826446],
       [0.19008264, 0.7107438 , 1.33140496, 1.75206612],
       [0.26446281, 1.00826446, 1.75206612, 2.59586777]])
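The effect of this sum kernel can be mimicked in plain NumPy. The linear and white-noise kernels below are hand-rolled stand-ins for illustration, not probnum's classes.

```python
import numpy as np

xs = np.linspace(0, 1, 12).reshape(4, 3)

def linear(x0, x1):
    return np.sum(x0 * x1, axis=-1)

def white_noise(x0, x1):
    # Nonzero only where both inputs coincide exactly.
    return np.all(x0 == x1, axis=-1).astype(float)

def k_noise(x0, x1):
    # Sum kernel: pointwise sum of the two covariance functions.
    return linear(x0, x1) + 0.1 * white_noise(x0, x1)

K = k_noise(xs[:, None], xs[None, :])
# Adding scaled white noise shifts only the diagonal of the kernel matrix.
assert np.allclose(K, xs @ xs.T + 0.1 * np.eye(4))
```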
Attributes Summary

- input_ndim – Syntactic sugar for len(input_shape).
- input_shape – Shape of single, i.e. non-batched, arguments of the covariance function.
- output_ndim – Syntactic sugar for len(output_shape).
- output_shape – Shape of single, i.e. non-batched, return values of the covariance function.
Methods Summary

- __call__(x0, x1) – Evaluate the (cross-)covariance function(s).
- matrix(x0[, x1]) – A convenience function for computing a kernel matrix for two sets of inputs.
Attributes Documentation
- input_ndim¶
  Syntactic sugar for len(input_shape).
- input_shape¶
  Shape of single, i.e. non-batched, arguments of the covariance function.
- output_ndim¶
  Syntactic sugar for len(output_shape).
- output_shape¶
  Shape of single, i.e. non-batched, return values of the covariance function.
  If output_shape is (), the Kernel instance represents a single (cross-)covariance function. Otherwise, i.e. if output_shape is non-empty, the Kernel instance represents a tensor of (cross-)covariance functions whose shape is given by output_shape.
Methods Documentation
- __call__(x0, x1)[source]¶
Evaluate the (cross-)covariance function(s).
The evaluation of the (cross-covariance) function(s) is vectorized over the batch shapes of the arguments, applying standard NumPy broadcasting.
- Parameters
  - x0 (ArrayLike) – shape= batch_shape_0 + input_shape – (Batch of) input(s) for the first argument of the Kernel.
  - x1 (Optional[ArrayLike]) – shape= batch_shape_1 + input_shape – (Batch of) input(s) for the second argument of the Kernel. Can also be set to None, in which case the function will behave as if x1 = x0 (but it is implemented more efficiently).
- Returns
  k_x0_x1 – shape= bcast_batch_shape + output_shape – The (cross-)covariance function(s) evaluated at (x0, x1). Since the function is vectorized over the batch shapes of the inputs, the output array contains the following entries:

      k_x0_x1[batch_idx + output_idx] = k[output_idx](
          x0[batch_idx, ...],
          x1[batch_idx, ...],
      )

  where we assume that x0 and x1 have been broadcast to a common shape bcast_batch_shape + input_shape, and where output_idx and batch_idx are indices compatible with output_shape and bcast_batch_shape, respectively. By k[output_idx] we refer to the covariance function at index output_idx in the tensor of covariance functions represented by the Kernel instance.
- Return type
  k_x0_x1
- Raises
  - ValueError – If one of the input shapes is not of the form batch_shape_{0,1} + input_shape.
  - ValueError – If the inputs can not be broadcast to a common shape.
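The indexing contract for the return value can be illustrated with a hand-rolled stand-in for a Kernel with output_shape == (2,), stacking a linear and an RBF covariance function. This is an assumed example, not probnum's implementation.

```python
import numpy as np

# Hypothetical stand-in for a Kernel with output_shape == (2,):
# entry 0 is a linear kernel, entry 1 is an RBF kernel.
def k(x0, x1):
    lin = np.sum(x0 * x1, axis=-1)
    rbf = np.exp(-0.5 * np.sum((x0 - x1) ** 2, axis=-1))
    return np.stack([lin, rbf], axis=-1)  # bcast_batch_shape + (2,)

rng = np.random.default_rng(1)
x0 = rng.standard_normal((4, 3))
x1 = rng.standard_normal((4, 3))

k_x0_x1 = k(x0, x1)
assert k_x0_x1.shape == (4, 2)  # bcast_batch_shape + output_shape

# k_x0_x1[batch_idx + output_idx]
#     == k[output_idx](x0[batch_idx, ...], x1[batch_idx, ...])
batch_idx, output_idx = (2,), (1,)
expected = np.exp(-0.5 * np.sum((x0[2] - x1[2]) ** 2))
assert np.isclose(k_x0_x1[batch_idx + output_idx], expected)
```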
See also

matrix – Convenience function to compute a kernel matrix, i.e. a matrix of pairwise evaluations of the kernel on two sets of points.
Examples
See documentation of class Kernel.
- matrix(x0, x1=None)[source]¶
A convenience function for computing a kernel matrix for two sets of inputs.
This is syntactic sugar for k(x0[:, None], x1[None, :]). Hence, it computes the matrix (stack) of pairwise covariances between two sets of input points. If k represents a single covariance function, then the resulting matrix will be symmetric positive-(semi)definite for x0 == x1.

- Parameters
  - x0 (ArrayLike) – shape= (M,) + input_shape or input_shape – Stack of inputs for the first argument of the Kernel.
  - x1 (Optional[ArrayLike]) – shape= (N,) + input_shape or input_shape – (Optional) stack of inputs for the second argument of the Kernel. If x1 is not specified, the function behaves as if x1 = x0 (but it is implemented more efficiently).
- Returns
  kernmat – shape= batch_shape + output_shape – The matrix / stack of matrices containing the pairwise evaluations of the (cross-)covariance function(s) on x0 and x1. Depending on the shape of the inputs, batch_shape is either (M, N), (M,), (N,), or ().
- Return type
  kernmat
- Raises
ValueError – If the shapes of the inputs don’t match the specification.
See also

__call__ – Evaluate the kernel more flexibly.
Examples
See documentation of class Kernel.
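The "syntactic sugar" relationship between matrix and __call__ can be sketched in plain NumPy. The linear kernel and the matrix wrapper below are hand-rolled for illustration; they are not probnum's implementation.

```python
import numpy as np

xs = np.linspace(0, 1, 12).reshape(4, 3)  # M = 4 inputs in R^3
ys = np.linspace(0, 1, 6).reshape(2, 3)   # N = 2 inputs in R^3

# Hand-rolled linear kernel standing in for a Kernel instance.
def k(x0, x1):
    return np.sum(x0 * x1, axis=-1)

def matrix(x0, x1=None):
    # Syntactic sugar for k(x0[:, None], x1[None, :]); x1=None reuses x0.
    if x1 is None:
        x1 = x0
    return k(x0[:, None], x1[None, :])

assert matrix(xs).shape == (4, 4)       # batch_shape == (M, M)
assert matrix(xs, ys).shape == (4, 2)   # batch_shape == (M, N)
assert np.allclose(matrix(xs, ys), xs @ ys.T)
```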