Kernel
- class probnum.randprocs.kernels.Kernel(input_dim, shape=())
Bases: abc.ABC
(Cross-)covariance function(s)
Abstract base class representing one or multiple (cross-)covariance function(s), also known as kernels. A cross-covariance function
\begin{equation} k_{fg} \colon \mathcal{X}^{d_\text{in}} \times \mathcal{X}^{d_\text{in}} \to \mathbb{R} \end{equation}
is a function of two arguments \(x_0\) and \(x_1\), which represents the covariance between two evaluations \(f(x_0)\) and \(g(x_1)\) of two scalar-valued random processes \(f\) and \(g\) (or, equivalently, two outputs \(h_i(x_0)\) and \(h_j(x_1)\) of a vector-valued random process \(h\)). If \(f = g\), then the cross-covariance function is also referred to as a covariance function, in which case it must be symmetric and positive (semi-)definite.
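To illustrate the symmetry and positive (semi-)definiteness requirement, here is a minimal sketch in plain NumPy. It assumes the linear covariance function k(x0, x1) = <x0, x1>; the helper linear_kernel_matrix is hypothetical and not part of ProbNum.

```python
import numpy as np


def linear_kernel_matrix(xs):
    """Hypothetical helper: matrix of pairwise inner products k(x_i, x_j) = <x_i, x_j>."""
    xs = np.asarray(xs, dtype=float)
    return xs @ xs.T


# Four points in three dimensions, each with equal coordinates.
xs = np.repeat(np.linspace(0, 1, 4)[:, None], 3, axis=-1)
K = linear_kernel_matrix(xs)

# A covariance function must yield symmetric matrices ...
is_symmetric = np.allclose(K, K.T)

# ... whose eigenvalues are non-negative (up to floating-point round-off).
min_eigenvalue = np.linalg.eigvalsh(K).min()
is_psd = min_eigenvalue >= -1e-12
```

A cross-covariance function between two different processes need not satisfy either property, which is why the check only makes sense for \(f = g\).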
An instance of a Kernel can compute multiple different (cross-)covariance functions on the same pair of inputs simultaneously. For instance, it can be used to compute the full covariance matrix
\begin{equation} C^f \colon \mathcal{X}^{d_\text{in}} \times \mathcal{X}^{d_\text{in}} \to \mathbb{R}^{d_\text{out} \times d_\text{out}}, \quad C^f_{i j}(x_0, x_1) := k_{f_i f_j}(x_0, x_1) \end{equation}
of the vector-valued random process \(f\). To this end, we understand any Kernel with a non-empty shape as a tensor with the given shape, which contains different (cross-)covariance functions as its entries.
- Parameters
input_dim (Union[int, Integral, integer]) – Input dimension of the kernel.
shape (Union[int, Integral, integer, Iterable[Union[int, Integral, integer]]]) – If shape is set to (), the Kernel instance represents a single (cross-)covariance function. Otherwise, i.e. if shape is non-empty, the Kernel instance represents a tensor of (cross-)covariance functions whose shape is given by shape.
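To make the idea of a tensor of (cross-)covariance functions concrete, the following sketch uses plain NumPy rather than ProbNum. It assumes a vector-valued process f = a * g, where g is a scalar process with linear covariance function and a is a fixed weight vector, so that k_{f_i f_j}(x0, x1) = a_i a_j k_g(x0, x1). The names linear_kernel and scaled_process_kernel are hypothetical. The kernel's shape axes come first in the output, following the return-value description of __call__.

```python
import numpy as np


def linear_kernel(x0, x1):
    """Hypothetical scalar covariance k_g(x0, x1) = <x0, x1>, elementwise over leading axes."""
    return np.sum(np.asarray(x0, dtype=float) * np.asarray(x1, dtype=float), axis=-1)


def scaled_process_kernel(a, x0, x1):
    """Hypothetical tensor of cross-covariances C[i, j] = a_i * a_j * k_g(x0, x1)."""
    a = np.asarray(a, dtype=float)
    k = linear_kernel(x0, x1)  # shape (N,)
    # Outer product puts the (d_out, d_out) kernel axes in front of the batch axis.
    return np.multiply.outer(np.outer(a, a), k)  # shape (d_out, d_out, N)


xs = np.repeat(np.linspace(0, 1, 4)[:, None], 3, axis=-1)
C = scaled_process_kernel([1.0, 2.0], xs, xs)
```

Here C has shape (2, 2, 4), which corresponds to a kernel with shape (2, 2) evaluated elementwise on four input pairs.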
Examples
>>> import numpy as np
>>> import probnum as pn
>>> D = 3
>>> k = pn.randprocs.kernels.Linear(input_dim=D)
Generate some input data.
>>> xs = np.repeat(np.linspace(0, 1, 4)[:, None], D, axis=-1)
>>> xs.shape
(4, 3)
>>> xs
array([[0.        , 0.        , 0.        ],
       [0.33333333, 0.33333333, 0.33333333],
       [0.66666667, 0.66666667, 0.66666667],
       [1.        , 1.        , 1.        ]])
We can compute kernel matrices like so.
>>> k.matrix(xs)
array([[0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.33333333, 0.66666667, 1.        ],
       [0.        , 0.66666667, 1.33333333, 2.        ],
       [0.        , 1.        , 2.        , 3.        ]])
Inputs to Kernel.__call__() are broadcast according to the “kernel broadcasting” rules detailed in the “Notes” section of the Kernel.__call__() documentation.
>>> k(xs[:, None, :], xs[None, :, :])  # same as `.matrix`
array([[0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.33333333, 0.66666667, 1.        ],
       [0.        , 0.66666667, 1.33333333, 2.        ],
       [0.        , 1.        , 2.        , 3.        ]])
A shape of 1 along the last axis is broadcast to input_dim.
>>> xs_d1 = xs[:, [0]]
>>> xs_d1.shape
(4, 1)
>>> xs_d1
array([[0.        ],
       [0.33333333],
       [0.66666667],
       [1.        ]])
>>> k(xs_d1[:, None, :], xs_d1[None, :, :])  # same as `.matrix`
array([[0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.33333333, 0.66666667, 1.        ],
       [0.        , 0.66666667, 1.33333333, 2.        ],
       [0.        , 1.        , 2.        , 3.        ]])
>>> k(xs[:, None, :], xs_d1[None, :, :])  # same as `.matrix`
array([[0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.33333333, 0.66666667, 1.        ],
       [0.        , 0.66666667, 1.33333333, 2.        ],
       [0.        , 1.        , 2.        , 3.        ]])
No broadcasting is applied if both inputs have the same shape. For instance, one can efficiently compute just the diagonal of the kernel matrix via
>>> k(xs, xs)
array([0.        , 0.33333333, 1.33333333, 3.        ])
>>> k(xs, None)  # x1 = None is an efficient way to set x1 == x0
array([0.        , 0.33333333, 1.33333333, 3.        ])
and the diagonal above the main diagonal of the kernel matrix is retrieved through
>>> k(xs[:-1, :], xs[1:, :])
array([0.        , 0.66666667, 2.        ])
Attributes Summary
input_dim – Dimension of arguments of the covariance function.
shape – If shape is (), the Kernel instance represents a single (cross-)covariance function.
Methods Summary
__call__(x0, x1) – Evaluate the (cross-)covariance function(s).
matrix(x0[, x1]) – A convenience function for computing a kernel matrix for two sets of inputs.
Attributes Documentation
- shape
If shape is (), the Kernel instance represents a single (cross-)covariance function. Otherwise, i.e. if shape is non-empty, the Kernel instance represents a tensor of (cross-)covariance functions whose shape is given by shape.
Methods Documentation
- __call__(x0, x1)
Evaluate the (cross-)covariance function(s).
The inputs are broadcast to a common shape following the “kernel broadcasting” rules outlined in the “Notes” section.
- Parameters
x0 (array-like) – An array of shape () or (Nn, ..., N2, N1, D_in), where D_in is either 1 or input_dim, whose entries will be passed to the first argument of the kernel.
x1 (array-like) – An array of shape () or (Mm, ..., M2, M1, D_in), where D_in is either 1 or input_dim, whose entries will be passed to the second argument of the kernel. Can also be set to None, in which case the function will behave as if x1 = x0.
- Returns
k_x0_x1 – The (cross-)covariance function(s) evaluated at x0 and x1. If shape is (), this method returns an array of shape (Lk, ..., L2, L1) whose entry at index (ik, ..., i2, i1) contains the evaluation of the (cross-)covariance function at the inputs x0[ik, ..., i2, i1, :] and x1[ik, ..., i2, i1, :]. For any non-empty shape, it returns an array of shape (Sl, ..., S2, S1, Lk, ..., L2, L1), where S is shape, whose entry at index (sl, ..., s2, s1, ik, ..., i2, i1) contains the evaluation of the (cross-)covariance function at index (sl, ..., s2, s1) at the inputs x0[ik, ..., i2, i1, :] and x1[ik, ..., i2, i1, :]. Above, we assume that x0 and x1 have been broadcast according to the rules described in the “Notes” section.
- Return type
- Raises
ValueError – If the inputs cannot be “kernel broadcast” to a common shape.
See also
matrix
Convenience function to compute a kernel matrix, i.e. a matrix of pairwise evaluations of the kernel on two sets of points.
Notes
A Kernel operates on its two inputs by a slightly modified version of NumPy’s broadcasting rules. First of all, the operation of the kernel is vectorized over all but the last dimension, applying standard broadcasting rules. An input with shape () is promoted to an input with shape (1,). Additionally, a length of 1 along the last axis of an input is interpreted as a (set of) point(s) with equal coordinates in all input dimensions, i.e. the inputs are broadcast to input_dim dimensions along the last axis. We refer to this modified set of broadcasting rules as “kernel broadcasting”.
Examples
See documentation of class Kernel.
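The “kernel broadcasting” rules from the Notes can be sketched in plain NumPy as follows. This is only an illustration under the rules as stated above, not ProbNum's actual implementation; the function kernel_broadcast is hypothetical, and np.broadcast_shapes requires NumPy 1.20 or newer.

```python
import numpy as np


def kernel_broadcast(x0, x1, input_dim):
    """Hypothetical sketch: broadcast two inputs to a common shape (..., input_dim)."""
    # An input with shape () is promoted to shape (1,).
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    x1 = np.atleast_1d(np.asarray(x1, dtype=float))

    # The last axis must have length 1 or input_dim.
    for x in (x0, x1):
        if x.shape[-1] not in (1, input_dim):
            raise ValueError(
                f"Last axis has length {x.shape[-1]}, expected 1 or {input_dim}."
            )

    # Standard broadcasting over all but the last axis
    # (raises ValueError if the batch shapes are incompatible).
    batch_shape = np.broadcast_shapes(x0.shape[:-1], x1.shape[:-1])

    # A length of 1 along the last axis is expanded to input_dim.
    target_shape = batch_shape + (input_dim,)
    return np.broadcast_to(x0, target_shape), np.broadcast_to(x1, target_shape)


xs = np.repeat(np.linspace(0, 1, 4)[:, None], 3, axis=-1)  # shape (4, 3)
xs_d1 = xs[:, [0]]                                          # shape (4, 1)
b0, b1 = kernel_broadcast(xs[:, None, :], xs_d1[None, :, :], input_dim=3)
s0, s1 = kernel_broadcast(0.5, 2.0, input_dim=3)
```

In this sketch both b0 and b1 end up with shape (4, 4, 3), and the two scalar inputs are expanded to points of shape (3,) with equal coordinates.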
- matrix(x0, x1=None)
A convenience function for computing a kernel matrix for two sets of inputs.
This is syntactic sugar for k(x0[:, None, :], x1[None, :, :]). Hence, it computes the matrix of pairwise covariances between two sets of input points. If k represents a covariance function, then the resulting matrix will be symmetric positive (semi-)definite for x0 == x1.
- Parameters
x0 (array-like) – First set of inputs to the (cross-)covariance function as an array of shape (M, D), where D is either 1 or input_dim.
x1 (array-like) – Optional second set of inputs to the (cross-)covariance function as an array of shape (N, D), where D is either 1 or input_dim. If x1 is not specified, the function behaves as if x1 = x0.
- Returns
kernmat – The matrix / stack of matrices containing the pairwise evaluations of the (cross-)covariance function(s) on x0 and x1, as an array of shape (M, N) if shape is (), or of shape (S[l - 1], ..., S[1], S[0], M, N), where S is shape, if shape is non-empty.
- Return type
- Raises
ValueError – If the shapes of the inputs don’t match the specification.
See also
__call__
Evaluate the kernel more flexibly.
Examples
See documentation of class Kernel.
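The relationship between matrix and __call__ can be sketched with a plain NumPy stand-in. The functions linear_kernel and kernel_matrix below are hypothetical and only mimic the described behavior for a single covariance function (shape equal to ()); the sketch also assumes that both inputs already have D equal to input_dim.

```python
import numpy as np


def linear_kernel(x0, x1):
    """Hypothetical elementwise kernel: inner product along the last axis."""
    return np.sum(np.asarray(x0, dtype=float) * np.asarray(x1, dtype=float), axis=-1)


def kernel_matrix(kernel_fn, x0, x1=None):
    """Hypothetical sketch of `matrix`: pairwise evaluations via added singleton axes."""
    x0 = np.asarray(x0, dtype=float)
    x1 = x0 if x1 is None else np.asarray(x1, dtype=float)
    if x0.ndim != 2 or x1.ndim != 2:
        raise ValueError("Both inputs must be arrays of shape (M, D) and (N, D).")
    # (M, 1, D) against (1, N, D) broadcasts to all M * N pairs.
    return kernel_fn(x0[:, None, :], x1[None, :, :])


xs = np.repeat(np.linspace(0, 1, 4)[:, None], 3, axis=-1)
K = kernel_matrix(linear_kernel, xs)               # x1 defaults to x0
K_cross = kernel_matrix(linear_kernel, xs, xs[:2])  # rectangular cross-covariance
```

With x1 omitted, the result K is a square, symmetric matrix; with two different input sets, K_cross has shape (M, N) and is in general not symmetric.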