brails.processors.FoundationClassifier.csail_segmentation_tool.csail_seg.lib.nn.modules.batchnorm module

class brails.processors.FoundationClassifier.csail_segmentation_tool.csail_seg.lib.nn.modules.batchnorm.SynchronizedBatchNorm1d(num_features, eps=1e-05, momentum=0.001, affine=True)

Bases: _SynchronizedBatchNorm

Applies Synchronized Batch Normalization over a 2d or 3d input that is seen as a mini-batch.

\[y = \frac{x - \mathrm{mean}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta\]

This module differs from the built-in PyTorch BatchNorm1d in that the mean and standard deviation are reduced across all devices during training.

For example, when nn.DataParallel is used to wrap the network during training, PyTorch's implementation normalizes the tensor on each device using only the statistics computed on that device. This accelerates computation and is easy to implement, but the statistics might be inaccurate. Instead, in this synchronized version, the statistics are computed over all training samples distributed across multiple devices.

Note that, for the one-GPU or CPU-only case, this module behaves exactly the same as the built-in PyTorch implementation.
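As a minimal sketch of that single-device equivalence (assuming torch is imported and the class is available under the name used in the examples further below), the synchronized layer should agree numerically with the built-in nn.BatchNorm1d when run on one device:

>>> import torch
>>> import torch.nn as nn
>>> sync_bn = SynchronizedBatchNorm1d(100, affine=False)
>>> plain_bn = nn.BatchNorm1d(100, affine=False)
>>> x = torch.randn(20, 100)
>>> # In training mode both layers normalize with the statistics of this
>>> # single batch, so on one device the outputs should agree.
>>> torch.allclose(sync_bn(x), plain_bn(x), atol=1e-5)
True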

The mean and standard deviation are calculated per dimension over the mini-batches, and gamma and beta are learnable parameter vectors of size C (where C is the input size).

During training, this layer keeps a running estimate of its computed mean and variance. The running estimates are kept with a default momentum of 0.001.

During evaluation, this running mean/variance is used for normalization.

Because the BatchNorm is done over the C dimension, computing statistics on (N, L) slices, it is common terminology to call this Temporal BatchNorm.

Args:
  num_features: num_features from an expected input of size
    batch_size x num_features [x width]
  eps: a value added to the denominator for numerical stability.
    Default: 1e-5
  momentum: the value used for the running_mean and running_var
    computation. Default: 0.001
  affine: a boolean value that, when set to True, gives the layer
    learnable affine parameters. Default: True

Shape:
  • Input: \((N, C)\) or \((N, C, L)\)

  • Output: \((N, C)\) or \((N, C, L)\) (same shape as input)

Examples:
>>> # With Learnable Parameters
>>> m = SynchronizedBatchNorm1d(100)
>>> # Without Learnable Parameters
>>> m = SynchronizedBatchNorm1d(100, affine=False)
>>> input = torch.randn(20, 100)
>>> output = m(input)
class brails.processors.FoundationClassifier.csail_segmentation_tool.csail_seg.lib.nn.modules.batchnorm.SynchronizedBatchNorm2d(num_features, eps=1e-05, momentum=0.001, affine=True)

Bases: _SynchronizedBatchNorm

Applies Batch Normalization over a 4d input that is seen as a mini-batch of 3d inputs.

\[y = \frac{x - \mathrm{mean}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta\]

This module differs from the built-in PyTorch BatchNorm2d in that the mean and standard deviation are reduced across all devices during training.

For example, when nn.DataParallel is used to wrap the network during training, PyTorch's implementation normalizes the tensor on each device using only the statistics computed on that device. This accelerates computation and is easy to implement, but the statistics might be inaccurate. Instead, in this synchronized version, the statistics are computed over all training samples distributed across multiple devices.
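A hedged sketch of such multi-device wiring follows; DataParallelWithCallback is assumed to come from the sibling replicate module, as in the upstream CSAIL implementation, and may be named or located differently in this vendored copy:

>>> # Illustrative only: DataParallelWithCallback is an assumed helper from the
>>> # sibling replicate module; two visible GPUs are assumed for device_ids=[0, 1].
>>> import torch.nn as nn
>>> net = nn.Sequential(
...     nn.Conv2d(3, 16, 3, padding=1),
...     SynchronizedBatchNorm2d(16),
...     nn.ReLU(inplace=True),
... ).cuda()
>>> net = DataParallelWithCallback(net, device_ids=[0, 1])
>>> # During training, the batch statistics of SynchronizedBatchNorm2d are now
>>> # reduced across both GPUs instead of being computed per device.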

Note that, for the one-GPU or CPU-only case, this module behaves exactly the same as the built-in PyTorch implementation.

The mean and standard deviation are calculated per dimension over the mini-batches, and gamma and beta are learnable parameter vectors of size C (where C is the input size).

During training, this layer keeps a running estimate of its computed mean and variance. The running estimates are kept with a default momentum of 0.001.

During evaluation, this running mean/variance is used for normalization.

Because the BatchNorm is done over the C dimension, computing statistics on (N, H, W) slices, it is common terminology to call this Spatial BatchNorm.

Args:
  num_features: num_features from an expected input of size
    batch_size x num_features x height x width
  eps: a value added to the denominator for numerical stability.
    Default: 1e-5
  momentum: the value used for the running_mean and running_var
    computation. Default: 0.001
  affine: a boolean value that, when set to True, gives the layer
    learnable affine parameters. Default: True

Shape:
  • Input: \((N, C, H, W)\)

  • Output: \((N, C, H, W)\) (same shape as input)

Examples:
>>> # With Learnable Parameters
>>> m = SynchronizedBatchNorm2d(100)
>>> # Without Learnable Parameters
>>> m = SynchronizedBatchNorm2d(100, affine=False)
>>> input = torch.randn(20, 100, 35, 45)
>>> output = m(input)
class brails.processors.FoundationClassifier.csail_segmentation_tool.csail_seg.lib.nn.modules.batchnorm.SynchronizedBatchNorm3d(num_features, eps=1e-05, momentum=0.001, affine=True)

Bases: _SynchronizedBatchNorm

Applies Batch Normalization over a 5d input that is seen as a mini-batch of 4d inputs.

\[y = \frac{x - \mathrm{mean}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta\]

This module differs from the built-in PyTorch BatchNorm3d in that the mean and standard deviation are reduced across all devices during training.

For example, when nn.DataParallel is used to wrap the network during training, PyTorch's implementation normalizes the tensor on each device using only the statistics computed on that device. This accelerates computation and is easy to implement, but the statistics might be inaccurate. Instead, in this synchronized version, the statistics are computed over all training samples distributed across multiple devices.

Note that, for the one-GPU or CPU-only case, this module behaves exactly the same as the built-in PyTorch implementation.

The mean and standard deviation are calculated per dimension over the mini-batches, and gamma and beta are learnable parameter vectors of size C (where C is the input size).

During training, this layer keeps a running estimate of its computed mean and variance. The running estimates are kept with a default momentum of 0.001.

During evaluation, this running mean/variance is used for normalization.

Because the BatchNorm is done over the C dimension, computing statistics on (N, D, H, W) slices, it is common terminology to call this Volumetric BatchNorm or Spatio-temporal BatchNorm.
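To make the (N, D, H, W) statement concrete, the following sketch (training mode, affine=False, a single device, and the biased batch variance are assumed) reproduces the formula above with per-channel statistics over the batch and the spatio-temporal dimensions:

>>> import torch
>>> m = SynchronizedBatchNorm3d(4, affine=False)
>>> x = torch.randn(2, 4, 3, 5, 5)
>>> # Per-channel statistics over the (N, D, H, W) slices, as in the formula.
>>> mean = x.mean(dim=(0, 2, 3, 4), keepdim=True)
>>> var = x.var(dim=(0, 2, 3, 4), unbiased=False, keepdim=True)
>>> manual = (x - mean) / torch.sqrt(var + m.eps)
>>> torch.allclose(m(x), manual, atol=1e-5)
True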

Args:
  num_features: num_features from an expected input of size
    batch_size x num_features x depth x height x width
  eps: a value added to the denominator for numerical stability.
    Default: 1e-5
  momentum: the value used for the running_mean and running_var
    computation. Default: 0.001
  affine: a boolean value that, when set to True, gives the layer
    learnable affine parameters. Default: True

Shape:
  • Input: \((N, C, D, H, W)\)

  • Output: \((N, C, D, H, W)\) (same shape as input)

Examples:
>>> # With Learnable Parameters
>>> m = SynchronizedBatchNorm3d(100)
>>> # Without Learnable Parameters
>>> m = SynchronizedBatchNorm3d(100, affine=False)
>>> input = torch.randn(20, 100, 35, 45, 10)
>>> output = m(input)