杂项操作<a class="headerlink" href="#miscellaneous-ops" title="Permalink to this headline">¶

newton_step_1d(loss, x, trust_radius=None)[source]¶

执行牛顿更新步骤，以最小化批量一维变量上的损失，可选择正则化以约束在信任区域内。

详见 newton_step()。

参数

loss (torch.Tensor) – 要最小化的 x 的标量函数。
x (torch.Tensor) – 最右边维度大小为 1 的因变量。
trust_radius (float) – 可选的信任区域半径。此函数的更新值 mode 将在输入 x 的 trust_radius 范围内。

返回

一对 (mode, cov)，其中 mode 是与原始值 x 具有相同形状的更新张量，cov 是一个 1x1 协方差矩阵的估计，其形状为 cov.shape == x.shape[:-1] + (1,1)。

返回类型

newton_step_2d(loss, x, trust_radius=None)[source]¶

执行牛顿更新步骤，以最小化批量二维变量上的损失，可选择正则化以约束在信任区域内。

详见 newton_step()。

参数

loss (torch.Tensor) – 要最小化的 x 的标量函数。
x (torch.Tensor) – 最右边维度大小为 2 的因变量。
trust_radius (float) – 可选的信任区域半径。此函数的更新值 mode 将在输入 x 的 trust_radius 范围内。

返回

一对 (mode, cov)，其中 mode 是与原始值 x 具有相同形状的更新张量，cov 是一个 2x2 协方差矩阵的估计，其形状为 cov.shape == x.shape[:-1] + (2,2)。

返回类型

newton_step_3d(loss, x, trust_radius=None)[source]¶

执行牛顿更新步骤，以最小化批量三维变量上的损失，可选择正则化以约束在信任区域内。

详见 newton_step()。

参数

loss (torch.Tensor) – 要最小化的 x 的标量函数。
x (torch.Tensor) – 最右边维度大小为 2 的因变量。
trust_radius (float) – 可选的信任区域半径。此函数的更新值 mode 将在输入 x 的 trust_radius 范围内。

返回

一对 (mode, cov)，其中 mode 是与原始值 x 具有相同形状的更新张量，cov 是一个 3x3 协方差矩阵的估计，其形状为 cov.shape == x.shape[:-1] + (3,3)。

返回类型

特殊函数¶

safe_log(x)[source]¶: 类似于 torch.log()，但通过将其梯度限制在最大 1 / finfo.eps 来避免在 log(0) 处出现无限梯度。

log_beta(x, y, tol=0.0)[source]¶

计算对数 Beta 函数。

当 tol >= 0.02 时，此函数使用对数 Beta 函数的移位斯特林近似。该近似法调整了对数 Gamma 函数的斯特林近似

lgamma(z) ≈ (z - 1/2) * log(z) - z + log(2 * pi) / 2

来近似对数 Beta 函数

log_beta(x, y) ≈ ((x-1/2) * log(x) + (y-1/2) * log(y)
                  - (x+y-1/2) * log(x+y) + log(2*pi)/2)

该近似法通过使用递归迭代地移动对数 Gamma 近似来额外提高接近零时的准确性

lgamma(x) = lgamma(x + 1) - log(x)

如果此递归应用 n 次，则绝对误差上限为 error < 0.082 / n < tol，因此我们根据用户提供的 tol 选择 n。

参数

x (torch.Tensor) – 一个正张量。
y (torch.Tensor) – 一个正张量。
tol (float) – 最大绝对误差上限。默认为 0.1。对于非常小的 tol，此函数直接委托给 log_beta()。

返回类型

log_binomial(n, k, tol=0.0)[source]¶

计算对数二项式系数。

当 tol >= 0.02 时，此函数通过 log_beta() 使用对数 Beta 函数的移位斯特林近似。

参数

n (torch.Tensor) – 一个非负整数张量。
k (torch.Tensor) – 一个范围在 [0, n] 的整数张量。

返回类型

log_I1(orders: int, value: torch.Tensor, terms=250)[source]¶

计算前 n 个第一类修正贝塞尔函数的对数 .. math

\log(I_v(z)) = v*\log(z/2) + \log(\sum_{k=0}^\inf \exp\left[2*k*\log(z/2) - \sum_kk^k log(kk)
- \lgamma(v + k + 1)\right])

参数

orders – 对数修正贝塞尔函数的阶数。
value – 用于计算修正贝塞尔函数的值。
terms – 求和截断项数。

返回

0 到 orders 的修正贝塞尔函数

get_quad_rule(num_quad, prototype_tensor)[source]¶

获取具有指定数量积分点的高斯-埃尔米特积分规则的积分点及其对应的对数权重。

示例用法

quad_points, log_weights = get_quad_rule(32, prototype_tensor)
# transform to N(0, 4.0) Normal distribution
quad_points *= 4.0
# compute variance integral in log-space using logsumexp and exponentiate
variance = torch.logsumexp(quad_points.pow(2.0).log() + log_weights, axis=0).exp()
assert (variance - 16.0).abs().item() < 1.0e-6

参数

num_quad (int) – 积分点数量。
prototype_tensor (torch.Tensor) – 用于确定返回张量的 dtype 和 device。

返回

形式为 (quad_points, log_weights) 的 torch.Tensor`s 元组

sparse_multinomial_likelihood(total_count, nonzero_logits, nonzero_value)[source]¶

以下是等价的

# Version 1. dense
log_prob = Multinomial(logits=logits).log_prob(value).sum()

# Version 2. sparse
nnz = value.nonzero(as_tuple=True)
log_prob = sparse_multinomial_likelihood(
    value.sum(-1),
    (logits - logits.logsumexp(-1))[nnz],
    value[nnz],
)

张量工具¶

as_complex(x)[source]¶: 类似于 torch.view_as_complex()，但会在步长不是二的倍数时复制数据。

block_diag_embed(mat)[source]¶

输入形状为 (…, B, M, N) 的张量，返回形状为 (…, B x M, B x N) 的块对角张量。

参数: mat (torch.Tensor) – 一个具有 3 个或更多维度的输入张量
返回 torch.Tensor: 一个维度为 m.dim() - 1 的块对角张量

block_diagonal(mat, block_size)[source]¶

输入形状为 (…, B x M, B x N) 的块对角张量，返回形状为 (…, B, M, N) 的张量。

参数

mat (torch.Tensor) – 一个具有 2 个或更多维度的输入张量
block_size (int) – 块 B 的数量。

返回 torch.Tensor

一个维度为 mat.dim() + 1 的张量

periodic_repeat(tensor, size, dim)[source]¶

将周期大小为 period 的张量重复到给定大小 size。例如

>>> x = torch.tensor([[1, 2, 3], [4, 5, 6]])
>>> periodic_repeat(x, 4, 0)
tensor([[1, 2, 3],
        [4, 5, 6],
        [1, 2, 3],
        [4, 5, 6]])
>>> periodic_repeat(x, 4, 1)
tensor([[1, 2, 3, 1],
        [4, 5, 6, 4]])

这对于在时间序列模型中计算静态季节性非常有用。

参数

tensor (torch.Tensor) – 差值张量。
size (int) – 结果沿维度 dim 的所需大小。
dim (int) – 沿其重复的张量维度。

periodic_cumsum(tensor, period, dim)[source]¶

计算沿给定维度的周期性累积和。例如，如果 dim=0

for t in range(period):
    assert result[t] == tensor[t]
for t in range(period, len(tensor)):
    assert result[t] == tensor[t] + result[t - period]

这对于在时间序列模型中计算漂移季节性非常有用。

参数

tensor (torch.Tensor) – 差值张量。
period (int) – 重复周期。
dim (int) – 沿其累积的张量维度。

periodic_features(duration, max_period=None, min_period=None, **options)[source]¶

创建从 max_period 到 min_period 的周期性 (sin,cos) 特征。

这在时间序列模型中非常有用，其中长期的不规则季节性可以通过回归处理。当仅指定 max_period 时，此函数会生成所有长度尺度的周期性特征。当也指定 min_period 时，此函数会生成大长度尺度的周期性特征，但忽略高频特征。这对于将长期季节性的回归与 periodic_repeat() 和 periodic_cumsum() 等技术结合用于短期时间尺度非常有用。例如，要将年度季节性回归到一周的尺度，可以设置 max_period=365.25 和 min_period=7。

参数

duration (int) – 离散时间步长数量。
max_period (float) – 可选的最大周期，默认为 duration。
min_period (float) – 可选的最小周期（不包含），默认为 2 = 奈奎斯特截止频率。
**options – 张量构造选项，例如 dtype 和 device。

返回

一个形状为 (duration, 2 * ceil(max_period / min_period) - 2) 的特征张量，归一化至 [-1,1] 范围内。

返回类型

Tensor

next_fast_len(size)[source]¶

返回大于等于 size 的下一个最大数字 n，其素因数仅包含 2、3 或 5。这些大小对于快速傅里叶变换是高效的。等价于 scipy.fftpack.next_fast_len()。

参数: size (int) – 一个正数。
返回: 一个可能更大的数字。
Rtype int

convolve(signal, kernel, mode='full')[source]¶

使用 FFT 计算 signal 和 kernel 的一维卷积。这两个参数的最右侧维度应该相同，但除此之外可以任意广播。

参数

signal (torch.Tensor) – 用于卷积的信号。
kernel (torch.Tensor) – 卷积核。
mode (str) – 以下之一：'full'、'valid'、'same'。

返回

具有广播形状的张量。设 m = signal.size(-1) 且 n = kernel.size(-1)，结果的最右侧大小将是：如果 mode 为 'full' 则为 m + n - 1；如果 mode 为 'valid' 则为 max(m, n) - min(m, n) + 1；如果 mode 为 'same' 则为 max(m, n)。

Rtype torch.Tensor

repeated_matmul(M, n)[source]¶

输入一批矩阵 M，返回进行 n 次矩阵乘法 \(M\), \(M^2\), …, \(M^n\) 的堆叠结果。并行成本与 n 的对数成正比。

参数

M (torch.Tensor) – 形状为 (…, N, N) 的一批方张量。
n (int) – 最大乘积 \(M^n\) 的阶数。

返回 torch.Tensor

形状为 (n, …, N, N) 的一批方张量

dct(x, dim=- 1)[source]¶

第二类离散余弦变换，归一化为正交。

这是 idct_ii() 的逆变换，等价于 scipy.fftpack.dct() 并设置 norm="ortho"。

参数

x (Tensor) – 输入信号。
dim (int) – 沿其计算 DCT 的维度。

返回类型

Tensor

idct(x, dim=- 1)[source]¶

第二类逆离散余弦变换，归一化为正交。

这是 dct_ii() 的逆变换，等价于 scipy.fftpack.idct() 并设置 norm="ortho"。

参数

x (Tensor) – 输入信号。
dim (int) – 沿其计算 DCT 的维度。

返回类型

Tensor

haar_transform(x)[source]¶

离散哈尔变换。

沿最后一个维度执行哈尔变换。这是 inverse_haar_transform() 的逆变换。

参数: x (Tensor) – 输入信号。
返回类型: Tensor

inverse_haar_transform(x)[source]¶

沿最后一个维度执行逆哈尔变换。这是 haar_transform() 的逆变换。

参数: x (Tensor) – 输入信号。
返回类型: Tensor

safe_cholesky(x)[source]¶

cholesky_solve(x, y)[source]¶

matmul(x, y)[source]¶

matvecmul(x, y)[source]¶

triangular_solve(x, y, upper=False, transpose=False)[source]¶

precision_to_scale_tril(P)[source]¶

safe_normalize(x, *, p=2)[source]¶

相对于 p-范数安全地将向量投影到球面上。这通过将零映射到向量 [1, 0, 0, ..., 0] 来避免零点的奇异性。

参数

x (torch.Tensor) – 一个向量
p (float) – 范数指数，默认为 2，即欧几里得范数。

返回

一个归一化版本 x / ||x||_p。

返回类型

Tensor

张量索引¶

index(tensor, args)[source]¶

使用嵌套元组进行索引。

另请参阅方便的包装器 Index。

这对于编写兼容多种解释的索引代码非常有用，例如标量求值、向量化求值或形状重塑。

例如，假设 x 是一个参数，其 x.dim() == 2，并且我们希望将表达式 x[..., t] 泛化，其中 t 可以是以下任意一种

一个标量 t=1，如 x[..., 1]；
一个切片 t=slice(None)，等价于 x[..., :]；或
一个形状重塑操作 t=(Ellipsis, None)，等价于 x.unsqueeze(-1)。

虽然朴素索引对前两个示例有效，但第三个示例会导致嵌套元组 (Ellipsis, (Ellipsis, None))。此辅助函数会展平该嵌套元组并合并连续的 Ellipsis。

参数

tensor (torch.Tensor) – 要被索引的张量。
args (tuple) – 一个索引，作为传递给 __getitem__ 的参数。

返回

对 tensor[args] 的扁平化解释。

返回类型

class Index(tensor)[source]¶

基类: object

index() 的方便包装器。

以下是等价的

Index(x)[..., i, j, :]
index(x, (Ellipsis, i, j, slice(None)))

参数: tensor (torch.Tensor) – 要被索引的张量。
返回: 具有特殊方法 __getitem__() 的对象。

vindex(tensor, args)[source]¶

具有广播语义的向量化高级索引。

另请参阅方便的包装器 Vindex。

这对于编写兼容批处理和枚举的索引代码非常有用，特别是适用于使用离散随机变量选择混合成分。

例如，假设 x 是一个参数，其 x.dim() == 3，我们希望将表达式 x[i, :, j] 从整数 i,j 泛化到具有批处理维度和枚举维度（但没有事件维度）的张量 i,j。然后我们可以使用 Vindex 编写泛化版本

xij = Vindex(x)[i, :, j]

batch_shape = broadcast_shape(i.shape, j.shape)
event_shape = (x.size(1),)
assert xij.shape == batch_shape + event_shape

为了处理 x 也可能包含批处理维度的情况（例如，如果 x 在分板上下文中使用向量化粒子进行采样），vindex() 使用特殊约定，Ellipsis 表示批处理维度（因此 ... 只能出现在左侧，不能出现在中间或右侧）。假设 x 的事件维度为 3。然后我们可以写

old_batch_shape = x.shape[:-3]
old_event_shape = x.shape[-3:]

xij = Vindex(x)[..., i, :, j]   # The ... denotes unknown batch shape.

new_batch_shape = broadcast_shape(old_batch_shape, i.shape, j.shape)
new_event_shape = (x.size(1),)
assert xij.shape = new_batch_shape + new_event_shape

请注意，这种对 Ellipsis 的特殊处理与 NEP [1] 不同。

形式上，此函数假定

每个参数要么是 Ellipsis、slice(None)、一个整数，要么是一个批处理的 torch.LongTensor（即具有空的事件形状）。此函数不支持非平凡切片或 torch.BoolTensor 掩码。Ellipsis 只能作为 args[0] 出现在左侧。
如果 args[0] is not Ellipsis 为真，则 tensor 未进行批处理，其事件维度等于 len(args)。
如果 args[0] is Ellipsis 为真，则 tensor 已进行批处理，其事件维度等于 len(args[1:])。tensor 在事件维度左侧的维度被视为批处理维度，并将与张量参数的维度进行广播。

注意，如果所有参数中没有一个张量的 .dim() > 0 为真，则此函数 behaves like standard indexing

if not any(isinstance(a, torch.Tensor) and a.dim() for a in args):
    assert Vindex(x)[args] == x[args]

参考文献

[1] https://numpy.com.cn/neps/nep-0021-advanced-indexing.html: 引入 vindex 作为向量化索引的辅助函数。Pyro 实现类似于提议的记法 x.vindex[]，只是对 Ellipsis 的处理略有不同。

参数

tensor (torch.Tensor) – 要被索引的张量。
args (tuple) – 一个索引，作为传递给 __getitem__ 的参数。

返回

对 tensor[args] 的非标准解释。

返回类型

class Vindex(tensor)[source]¶

基类: object

vindex() 的方便包装器。

以下是等价的

Vindex(x)[..., i, j, :]
vindex(x, (Ellipsis, i, j, slice(None)))

参数: tensor (torch.Tensor) – 要被索引的张量。
返回: 具有特殊方法 __getitem__() 的对象。

张量收缩¶

contract(equation, *operands, **kwargs)[source]¶

对 opt_einsum.contract() 的包装器，可选择使用 Pyro 的廉价优化器并可选择缓存收缩路径。

参数: cache_path (bool) – 是否缓存收缩路径。默认为 True。

contract_expression(equation, *shapes, **kwargs)[source]¶

对 opt_einsum.contract_expression() 的包装器，可选择使用 Pyro 的廉价优化器并可选择缓存收缩路径。

参数: cache_path (bool) – 是否缓存收缩路径。默认为 True。

einsum(equation, *operands, **kwargs)[source]¶

通过张量变量消除实现的泛化分板求和-乘积算法。

这在两个方面泛化了 contract()

允许多个输出，并且可以共享中间结果。
输入和输出可以沿 plates 中给定的符号进行分板；沿 plates 的缩减是乘积缩减。

理解此函数的最佳方法是尝试下面的示例，其中展示了如何将 einsum() 调用实现为对 contract() 的多次调用（通常更昂贵）。

为了说明多输出，请注意以下是等价的

z1, z2, z3 = einsum('ab,bc->a,b,c', x, y)  # multiple outputs

z1 = contract('ab,bc->a', x, y)
z2 = contract('ab,bc->b', x, y)
z3 = contract('ab,bc->c', x, y)

为了说明分板输入，请注意以下是等价的

assert len(x) == 3 and len(y) == 3
z = einsum('ab,ai,bi->b', w, x, y, plates='i')

z = contract('ab,a,a,a,b,b,b->b', w, *x, *y)

当求和维度 a 总是与分板维度 i 一起出现时，a 对应于 a 的每个切片的一个独特符号。因此以下是等价的

assert len(x) == 3 and len(y) == 3
z = einsum('ai,ai->', x, y, plates='i')

z = contract('a,b,c,a,b,c->', *x, *y)

当这样的求和维度出现在输出中时，它必须伴随其所有分板维度，例如以下是等价的

assert len(x) == 3 and len(y) == 3
z = einsum('abi,abi->bi', x, y, plates='i')

z0 = contract('ab,ac,ad,ab,ac,ad->b', *x, *y)
z1 = contract('ab,ac,ad,ab,ac,ad->c', *x, *y)
z2 = contract('ab,ac,ad,ab,ac,ad->d', *x, *y)
z = torch.stack([z0, z1, z2])

注意，通过输出的每个分板切片在通过所有输入的任何分板切片中都是多线性的，因此例如批处理矩阵乘法将在没有 plates 的情况下实现，所以以下都是等价的

xy = einsum('abc,acd->abd', x, y, plates='')
xy = torch.stack([xa.mm(ya) for xa, ya in zip(x, y)])
xy = torch.bmm(x, y)

在所有有效等式中，一些计算的成本与输入张量大小呈多项式关系，另一些计算的成本与输入张量大小呈指数关系。当计算成本呈指数关系时，此函数会引发 NotImplementedError。

参数

equation (str) – 一个 einsum 等式，可选包含多个输出。
operands (torch.Tensor) – 张量集合。
plates (str) – 可选的分板符号字符串。
backend (str) – 可选的 einsum 后端，默认为 'torch'。
cache (dict) – 可选的 shared_intermediates() 缓存。
modulo_total (bool) – 可选允许 einsum 任意缩放每个结果分板，这可以显著减少计算量。当每个结果分板表示一个非归一化概率分布，且其总和并不重要时，设置此参数是安全的。

返回

请求形状的张量元组，每个输出一个条目。

返回类型

抛出

ValueError – 如果张量大小不匹配或某个输出请求了分板维度但没有该维度的分板。
NotImplementedError – 如果收缩成本相对于任何输入张量的大小呈指数关系。

ubersum(equation, *operands, **kwargs)[source]¶: 已弃用，请改用 einsum()。

高斯收缩¶

class Gaussian(log_normalizer: torch.Tensor, info_vec: torch.Tensor, precision: torch.Tensor)[source]¶

基类: object

非归一化高斯分布。

这表示任意半正定二次函数，可以解释为降秩的缩放高斯分布。精度矩阵可能具有零特征值，因此可能无法直接使用协方差矩阵。

参数

log_normalizer (torch.Tensor) – 一个归一化常数，主要用于在收缩过程中跟踪归一化项。
info_vec (torch.Tensor) – 信息向量，它是均值的缩放版本 info_vec = precision @ mean。我们使用此表示法使高斯收缩快速稳定。
precision (torch.Tensor) – 此高斯分布的精度矩阵。

dim()[source]¶

property batch_shape¶

expand(batch_shape) → pyro.ops.gaussian.Gaussian[source]¶

reshape(batch_shape) → pyro.ops.gaussian.Gaussian[source]¶

__getitem__(index) → pyro.ops.gaussian.Gaussian[source]¶: 对高斯分布的 batch_shape 进行索引。

static cat(parts, dim=0) → pyro.ops.gaussian.Gaussian[source]¶: 沿给定批处理维度连接高斯分布列表。

event_pad(left=0, right=0) → pyro.ops.gaussian.Gaussian[source]¶: 沿事件维度进行填充。

event_permute(perm) → pyro.ops.gaussian.Gaussian[source]¶: 沿事件维度进行置换。

__add__(other: Union[pyro.ops.gaussian.Gaussian, int, float, torch.Tensor]) → pyro.ops.gaussian.Gaussian[source]¶: 在对数密度空间中添加两个高斯分布。

log_density(value: torch.Tensor) → torch.Tensor[source]¶

在点 value 处评估此高斯分布的对数密度。

-0.5 * value.T @ precision @ value + value.T @ info_vec + log_normalizer

这主要用于测试。

rsample(sample_shape=torch.Size([]), noise: Optional[torch.Tensor] = None) → torch.Tensor[source]¶: 重参数化采样器。

condition(value: torch.Tensor) → pyro.ops.gaussian.Gaussian[source]¶

在其状态的尾部子集上对该高斯分布进行条件化。这应该满足

g.condition(y).dim() == g.dim() - y.size(-1)

注意，由于这是一个非归一化高斯分布，我们将 y 的密度包含在结果中。因此 condition() 类似于 functools.partial 的参数绑定

left = x[..., :n]
right = x[..., n:]
g.log_density(x) == g.condition(right).log_density(left)

left_condition(value: torch.Tensor) → pyro.ops.gaussian.Gaussian[source]¶

在其状态的头部子集上对该高斯分布进行条件化。这应该满足

g.condition(y).dim() == g.dim() - y.size(-1)

注意，由于这是一个非归一化高斯分布，我们将 y 的密度包含在结果中。因此 condition() 类似于 functools.partial 的参数绑定

left = x[..., :n]
right = x[..., n:]
g.log_density(x) == g.left_condition(left).log_density(right)

marginalize(left=0, right=0) → pyro.ops.gaussian.Gaussian[source]¶

边缘化事件维度两侧的变量

g.marginalize(left=n).event_logsumexp() = g.logsumexp()
g.marginalize(right=n).event_logsumexp() = g.logsumexp()

对于数据 x

g.condition(x).event_logsumexp()
= g.marginalize(left=g.dim() - x.size(-1)).log_density(x)

event_logsumexp() → torch.Tensor[source]¶: 积分消除所有潜在状态（即在事件维度上操作）。

class AffineNormal(matrix, loc, scale)[source]¶

基类: object

表示随机变量 Y 的条件对角正态分布，其均值是随机变量 X 的仿射函数。因此 X 的似然为

AffineNormal(matrix, loc, scale).condition(y).log_density(x)

等价于

Normal(x @ matrix + loc, scale).to_event(1).log_prob(y)

参数

matrix (torch.Tensor) – 从 X 到 Y 的变换。最右侧形状应为 (x_dim, y_dim)。
loc (torch.Tensor) – Y 均值的常数偏移。最右侧形状应为 (y_dim,)。
scale (torch.Tensor) – Y 的标准差。最右侧形状应为 (y_dim,)。

property batch_shape¶

condition(value)[source]¶

left_condition(value)[source]¶: 如果 value.size(-1) == x_dim，则返回 event_dim=1 的正态分布。应用此方法后，抽取样本的成本是 O(y_dim) 而不是 O(y_dim ** 3)。

rsample(sample_shape=torch.Size([]), noise: Optional[torch.Tensor] = None) → torch.Tensor[source]¶: 重参数化采样器。

to_gaussian()[source]¶

expand(batch_shape)[source]¶

reshape(batch_shape)[source]¶

__getitem__(index)[source]¶

event_permute(perm)[source]¶

__add__(other)[source]¶

marginalize(left=0, right=0)[source]¶

mvn_to_gaussian(mvn)[source]¶

将多变量正态分布转换为高斯分布。

参数: mvn (MultivariateNormal) – 一个多变量正态分布。
返回: 一个等价的高斯对象。
返回类型: 高斯

matrix_and_gaussian_to_gaussian(matrix: torch.Tensor, y_gaussian: pyro.ops.gaussian.Gaussian) → pyro.ops.gaussian.Gaussian[source]¶

构造一个 p(y|x) 的条件高斯分布，其中 y - x @ matrix ~ y_gaussian。

参数

matrix (torch.Tensor) – 一个右乘的变换矩阵。
y_gaussian (Gaussian) – 关于 y - x@matrix 噪声的分布。

返回类型

高斯

matrix_and_mvn_to_gaussian(matrix, mvn)[source]¶

将含噪声的仿射函数转换为高斯分布。含噪声的仿射函数定义为

y = x @ matrix + mvn.sample()

参数

matrix (张量) – 最右侧形状为 (x_dim, y_dim) 的矩阵。
mvn (MultivariateNormal) – 一个多变量正态分布。

返回

一个批量形状广播且 .dim() == x_dim + y_dim 的高斯分布。

返回类型

高斯

gaussian_tensordot(x: pyro.ops.gaussian.Gaussian, y: pyro.ops.gaussian.Gaussian, dims: int = 0) → pyro.ops.gaussian.Gaussian[source]¶

计算两个高斯分布的积分

(x @ y)(a,c) = log(integral(exp(x(a,b) + y(b,c)), b)),

其中 x 是关于变量 (a,b) 的高斯分布，y 是关于变量 (b,c) 的高斯分布，(a,b,c) 各自可以是零个或多个变量的集合，dims 是 b 的大小。

参数

x – 一个 Gaussian 实例
y – 一个 Gaussian 实例
dims – 要收缩的变量数量

sequential_gaussian_tensordot(gaussian: pyro.ops.gaussian.Gaussian) → pyro.ops.gaussian.Gaussian[source]¶

对一个右侧批量维度表示时间的高斯分布 x 进行积分，计算

x[..., 0] @ x[..., 1] @ ... @ x[..., T-1]

参数: gaussian (高斯分布) – 一个最右侧维度为时间的批处理高斯分布。
返回: 高斯分布沿着其时间维度进行马尔可夫乘积。
返回类型: 高斯

sequential_gaussian_filter_sample(init: pyro.ops.gaussian.Gaussian, trans: pyro.ops.gaussian.Gaussian, sample_shape: Tuple[int, ...] = (), noise: Optional[torch.Tensor] = None) → torch.Tensor[source]¶

通过并行扫描（前向滤波后向采样）从高斯分布的马尔可夫乘积中绘制重参数化样本。

参数

init (高斯分布) – 表示初始状态的高斯分布。
trans (高斯分布) – 表示一系列状态转移的高斯分布，其中时间是最右侧的批量维度。此分布的事件维度必须是 init 的两倍：trans.dim() == 2 * init.dim()。
sample_shape (元组) – 要绘制的样本的可选额外形状。
noise (torch.Tensor) – 可选的标准白噪声张量，形状为 sample_shape + batch_shape + (duration, state_dim)，其中 duration = 1 + trans.batch_shape[-1] 是要采样的时间点数量，state_dim = init.dim() 是状态维度。这对于计算均值（传递零张量）、改变温度（传递缩放的噪声）和对偶采样（传递 cat([z,-z])）很有用。

返回

重参数化的样本，形状为 sample_shape + batch_shape + (duration, state_dim)。

返回类型

统计工具¶

gelman_rubin(input, chain_dim=0, sample_dim=1)[source]¶

计算样本链上的 R-hat。要求 input.size(sample_dim) >= 2 且 input.size(chain_dim) >= 2。

参数

input (torch.Tensor) – 输入张量。
chain_dim (int) – 链维度。
sample_dim (int) – 样本维度。

返回 torch.Tensor

input 的 R-hat。

split_gelman_rubin(input, chain_dim=0, sample_dim=1)[source]¶

计算样本链上的 R-hat。要求 input.size(sample_dim) >= 4。

参数

input (torch.Tensor) – 输入张量。
chain_dim (int) – 链维度。
sample_dim (int) – 样本维度。

返回 torch.Tensor

input 的 split R-hat。

autocorrelation(input, dim=0)[source]¶

计算维度 dim 上样本的自相关性。

参考：https://en.wikipedia.org/wiki/Autocorrelation#Efficient_computation

参数

input (torch.Tensor) – 输入张量。
dim (int) – 计算自相关性的维度。

返回 torch.Tensor

input 的自相关性。

autocovariance(input, dim=0)[source]¶

计算维度 dim 上样本的自协方差。

参数

input (torch.Tensor) – 输入张量。
dim (int) – 计算自相关性的维度。

返回 torch.Tensor

input 的自相关性。

effective_sample_size(input, chain_dim=0, sample_dim=1)[source]¶

计算输入的有效样本量。

参考

[1] 《马尔可夫链蒙特卡洛方法导论》,: Charles J. Geyer
[2] 《Stan 参考手册版本 2.18》,: Stan 开发团队

参数

input (torch.Tensor) – 输入张量。
chain_dim (int) – 链维度。
sample_dim (int) – 样本维度。

返回 torch.Tensor

input 的有效样本量。

resample(input, num_samples, dim=0, replacement=False)[source]¶

从 input 中在维度 dim 上抽取 num_samples 个样本。

参数

input (torch.Tensor) – 输入张量。
num_samples (int) – 从 input 中抽取的样本数量。
dim (int) – 从 input 中抽取的维度。

返回 torch.Tensor

从 input 中随机抽取的样本。

quantile(input, probs, dim=0)[source]¶

计算 input 在 probs 处的分位数。如果 probs 是标量，输出将在 dim 维度被压缩。

参数

input (torch.Tensor) – 输入张量。
probs (list) – 分位数位置。
dim (int) – 从 input 中计算分位数的维度。

返回 torch.Tensor

input 在 probs 处的分位数。

weighed_quantile(input: torch.Tensor, probs: Union[List[float], Tuple[float, ...], torch.Tensor], log_weights: torch.Tensor, dim: int = 0) → torch.Tensor[source]¶

计算加权 input 样本在 probs 处的分位数。

参数

input (torch.Tensor) – 输入张量。
probs (list) – 分位数位置。
log_weights (torch.Tensor) – 样本权重张量。
dim (int) – 从 input 中计算分位数的维度。

返回 torch.Tensor

input 在 probs 处的分位数。

示例

>>> from pyro.ops.stats import weighed_quantile
>>> import torch
>>> input = torch.Tensor([[10, 50, 40], [20, 30, 0]])
>>> probs = torch.Tensor([0.2, 0.8])
>>> log_weights = torch.Tensor([0.4, 0.5, 0.1]).log()
>>> result = weighed_quantile(input, probs, log_weights, -1)
>>> torch.testing.assert_close(result, torch.Tensor([[40.4, 47.6], [9.0, 26.4]]))

pi(input, prob, dim=0)[source]¶

计算百分位数区间，该区间将相等的概率质量分配给区间的每个尾部。

参数

input (torch.Tensor) – 输入张量。
prob (float) – 区间内样本的概率质量。
dim (int) – 从 input 中计算百分位数区间的维度。

返回 torch.Tensor

input 在 probs 处的分位数。

hpdi(input, prob, dim=0)[source]¶

计算“最高后验密度区间”（HPDI），这是具有概率质量 prob 的最窄区间。

参数

input (torch.Tensor) – 输入张量。
prob (float) – 区间内样本的概率质量。
dim (int) – 从 input 中计算百分位数区间的维度。

返回 torch.Tensor

input 在 probs 处的分位数。

waic(input, log_weights=None, pointwise=False, dim=0)[source]¶

计算“广泛适用/渡边-赤池信息准则”（WAIC）及其对应的有效参数数量。

参考

[1] 《Stan 中的 WAIC 和交叉验证》，Aki Vehtari, Andrew Gelman

参数

input (torch.Tensor) – 输入张量，即模型的对数似然。
log_weights (torch.Tensor) – 沿 dim 维度的样本权重。
dim (int) – input 的样本维度。

返回元组

WAIC 和有效参数数量的元组。

fit_generalized_pareto(X)[source]¶

给定一个假设从广义帕累托分布中抽取的数据集 X，使用参考 [1] 中描述的技术的变体，如参考 [2] 中所述，估计分布参数 k 和 sigma。

参考资料 [1] 《广义帕累托分布的一种新的高效估计方法》。Zhang, J. and Stephens, M.A. (2009)。[2] 《帕累托平滑重要性抽样》。Aki Vehtari, Andrew Gelman, Jonah Gabry

参数: torch.Tensor – 输入数据 X
返回元组: 对应拟合参数的浮点数元组 (k, sigma)

crps_empirical(pred, truth)[source]¶

计算样本集 pred 与真实数据 truth 之间的负连续排名概率评分 CRPS* [1]。这使用了一种 n log(n) 时间算法来计算一个量，而朴素地计算该量的复杂度与样本数量 n 的平方成正比。

CRPS* = E|pred - truth| - 1/2 E|pred - pred'|
      = (pred - truth).abs().mean(0)
      - (pred - pred.unsqueeze(1)).abs().mean([0, 1]) / 2

注意，对于单个样本，这等同于绝对误差。

参考文献

[1] Tilmann Gneiting, Adrian E. Raftery (2007): 《严格正确的评分规则、预测和估计》 https://www.stat.washington.edu/raftery/Research/PDF/Gneiting2007jasa.pdf

参数

pred (torch.Tensor) – 在最右侧维度上进行批处理的一组样本预测。其形状应为 (num_samples,) + truth.shape。
truth (torch.Tensor) – 真实观测值的张量。

返回

形状为 truth.shape 的张量。

返回类型

energy_score_empirical(pred: torch.Tensor, truth: torch.Tensor) → torch.Tensor[source]¶

计算一组多元样本 pred 与真实数据向量 truth 之间的负能量评分 ES* (参见 [1] 中公式 22)。运行时间与样本数量 n 的平方成正比。在单变量样本的情况下，输出与 CRPS 一致。

ES* = E|pred - truth| - 1/2 E|pred - pred'|

注意，对于单个样本，这等同于样本 pred 与 truth 之间差值的欧几里得范数。

这是一个严格正确的评分，因此对于服从分布 \(P\) 的 pred 和服从分布 \(Q\) 的 truth，我们有 \(ES^{*}(P,Q) \ge ES^{*}(Q,Q)\)，当且仅当 \(P=Q\) 时等号成立，即如果 \(P\) 和 \(Q\) 具有相同的多元分布（仅有相同的边缘分布不足以使等号成立）。

参考文献

[1] Tilmann Gneiting, Adrian E. Raftery (2007): 《严格正确的评分规则、预测和估计》 https://www.stat.washington.edu/raftery/Research/PDF/Gneiting2007jasa.pdf

参数

pred (torch.Tensor) – 在左起第二个维度上进行批处理的一组样本预测。最左边的维度是多元样本的维度。
truth (torch.Tensor) – 真实观测值的张量，其形状与 pred 相同，但左起第二个维度可以有任意值或被省略。

返回

形状为 truth.shape 的张量。

返回类型

流式统计¶

class StreamingStats[source]¶

基类： abc.ABC

张量树流式统计的抽象基类。

派生类必须实现 update()、merge() 和 get() 方法。

abstract update(sample) → None[source]¶

从单个样本更新状态。

这会改变 self 并返回 None。更新应该与顺序无关，即样本应该是可交换的。

参数: sample – 一个样本值，它是嵌套的字典，叶子节点是 torch.Tensor。它可以具有任意的嵌套和形状，但假定形状在每次调用 .update() 时是常量。

abstract merge(other) → pyro.ops.streaming.StreamingStats[source]¶

合并两个聚合统计信息，例如来自不同 MCMC 链的统计信息。

这是一个纯函数：它返回一个新的 StreamingStats 对象，并且不会修改 self 或 other。

参数: other – 另一个相同类型的流式统计实例。

abstract get() → Any[source]¶: 返回聚合统计信息。

class StatsOfDict(types: Dict[Hashable, Callable[[], pyro.ops.streaming.StreamingStats]] = {}, default: Callable[[], pyro.ops.streaming.StreamingStats] = <class 'pyro.ops.streaming.CountStats'>)[source]¶

字典样本的统计信息，其中键集是固定的。

例如，以下是等效的

# Version 1. Hand encode statistics.
>>> a_stats = CountStats()
>>> b_stats = CountMeanStats()
>>> a_stats.update(torch.tensor(0.))
>>> b_stats.update(torch.tensor([1., 2.]))
>>> summary = {"a": a_stats.get(), "b": b_stats.get()}

# Version 2. Collect samples into dictionaries.
>>> stats = StatsOfDict({"a": CountStats, "b": CountMeanStats})
>>> stats.update({"a": torch.tensor(0.), "b": torch.tensor([1., 2.])})
>>> summary = stats.get()
>>> summary
{'a': {'count': 1}, 'b': {'count': 1, 'mean': tensor([1., 2.])}}

参数

default – 字典值的默认统计类型。默认为开销较小的 CountStats。
types (dict) – 将键映射到应记录该键对应值的统计类型的字典。

update(sample: Dict[Hashable, Any]) → None[source]¶

merge(other: pyro.ops.streaming.StatsOfDict) → pyro.ops.streaming.StatsOfDict[source]¶

get() → Dict[Hashable, Any][source]¶

返回: 统计信息字典。此字典的键与更新此对象的样本的键相同。
返回类型: dict

class StackStats[source]¶

将张量流收集到单个堆叠张量中的统计信息。

update(sample: torch.Tensor) → None[source]¶

merge(other: pyro.ops.streaming.StackStats) → pyro.ops.streaming.StackStats[source]¶

get() → Dict[str, Union[int, torch.Tensor]][source]¶

返回: 一个字典，包含键 count: int 和（如果已收集到任何样本）samples: torch.Tensor。
返回类型: dict

class CountStats[source]¶

仅跟踪样本数量的统计信息。

例如

>>> stats = CountStats()
>>> stats.update(torch.randn(3, 3))
>>> stats.get()
{'count': 1}

update(sample) → None[source]¶

merge(other: pyro.ops.streaming.CountStats) → pyro.ops.streaming.CountStats[source]¶

get() → Dict[str, int][source]¶

返回: 一个包含键 count: int 的字典。
返回类型: dict

class CountMeanStats[source]¶

跟踪单个 torch.Tensor 的数量和平均值的统计信息。

update(sample: torch.Tensor) → None[source]¶

merge(other: pyro.ops.streaming.CountMeanStats) → pyro.ops.streaming.CountMeanStats[source]¶

get() → Dict[str, Union[int, torch.Tensor]][source]¶

返回: 一个字典，包含键 count: int 和（如果已收集到任何样本）mean: torch.Tensor。
返回类型: dict

class CountMeanVarianceStats[source]¶