compressai.layers#
MaskedConv2d#
- class compressai.layers.MaskedConv2d(*args: Any, mask_type: str = 'A', **kwargs: Any)[source]#
Masked 2D convolution implementation that masks future “unseen” pixels. Useful for building auto-regressive network components.
Introduced in “Conditional Image Generation with PixelCNN Decoders”.
Inherits the same arguments as nn.Conv2d. Use mask_type=’A’ for the first layer (which also masks the “current pixel”), and mask_type=’B’ for the following layers.
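As an illustration, a minimal sketch of using the layer as an autoregressive context model over quantized latents (the channel count and kernel size below are illustrative, not prescribed by the layer):

    import torch
    from compressai.layers import MaskedConv2d

    M = 192  # illustrative number of latent channels
    # Type-'A' mask: the kernel weights at the current position and at "future"
    # positions (in raster-scan order) are zeroed, so each output location only
    # depends on already-decoded positions of the input.
    context_prediction = MaskedConv2d(M, 2 * M, kernel_size=5, padding=2, stride=1, mask_type="A")

    y_hat = torch.rand(1, M, 16, 16)   # e.g. quantized latents
    ctx = context_prediction(y_hat)    # shape: (1, 2 * M, 16, 16)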
GDN#
- class compressai.layers.GDN(in_channels: int, inverse: bool = False, beta_min: float = 1e-06, gamma_init: float = 0.1)[source]#
Generalized Divisive Normalization layer.
Introduced in “Density Modeling of Images Using a Generalized Normalization Transformation”, by Johannes Ballé, Valero Laparra, and Eero P. Simoncelli (2016).
\[y[i] = \frac{x[i]}{\sqrt{\beta[i] + \sum_j(\gamma[j, i] * x[j]^2)}}\]
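As a plain-PyTorch sketch of the equation above (an illustration only, not the library's internal implementation), the sum over channels j can be computed with a 1×1 convolution of the squared activations:

    import torch
    import torch.nn.functional as F

    def gdn_reference(x, beta, gamma):
        # x: (N, C, H, W), beta: (C,), gamma: (C, C), where gamma[j, i]
        # weights x[j]^2 in the pool that normalizes channel i.
        weight = gamma.t().unsqueeze(-1).unsqueeze(-1)  # (C, C, 1, 1)
        norm = F.conv2d(x * x, weight, beta)            # beta[i] + sum_j gamma[j, i] * x[j]^2
        return x / torch.sqrt(norm)

    # Illustrative parameter values (not the layer's learned parameters):
    C = 4
    y = gdn_reference(torch.rand(1, C, 8, 8), torch.ones(C), 0.1 * torch.eye(C))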
GDN1#
- class compressai.layers.GDN1(in_channels: int, inverse: bool = False, beta_min: float = 1e-06, gamma_init: float = 0.1)[source]#
Simplified GDN layer.
Introduced in “Computationally Efficient Neural Image Compression”, by Nick Johnston, Elad Eban, Ariel Gordon, and Johannes Ballé (2019).
\[y[i] = \frac{x[i]}{\beta[i] + \sum_j(\gamma[j, i] * |x[j]|)}\]
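The simplification mirrors the GDN sketch above, with the square root and the squared activations replaced by absolute values (again only an illustration of the equation, not the library code):

    import torch
    import torch.nn.functional as F

    def gdn1_reference(x, beta, gamma):
        # Same tensor layout as gdn_reference above; the square root and the
        # squared activations are replaced by absolute values.
        weight = gamma.t().unsqueeze(-1).unsqueeze(-1)  # (C, C, 1, 1)
        return x / F.conv2d(x.abs(), weight, beta)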
ResidualBlock#
- class compressai.layers.ResidualBlock(in_ch: int, out_ch: int)[source]#
Simple residual block with two 3x3 convolutions.
- Parameters:
in_ch (int) – number of input channels
out_ch (int) – number of output channels
ResidualBlockWithStride#
- class compressai.layers.ResidualBlockWithStride(in_ch: int, out_ch: int, stride: int = 2)[source]#
Residual block with a stride on the first convolution.
- Parameters:
in_ch (int) – number of input channels
out_ch (int) – number of output channels
stride (int) – stride value (default: 2)
ResidualBlockUpsample#
- class compressai.layers.ResidualBlockUpsample(in_ch: int, out_ch: int, upsample: int = 2)[source]#
Residual block with sub-pixel upsampling on the last convolution.
- Parameters:
in_ch (int) – number of input channels
out_ch (int) – number of output channels
upsample (int) – upsampling factor (default: 2)
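A small usage sketch (channel counts are illustrative); the spatial dimensions of the output are expected to be scaled by the upsample factor:

    import torch
    from compressai.layers import ResidualBlockUpsample

    block = ResidualBlockUpsample(in_ch=192, out_ch=192, upsample=2)
    x = torch.rand(1, 192, 16, 16)
    y = block(x)  # expected shape: (1, 192, 32, 32), i.e. spatial dims scaled by `upsample`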
AttentionBlock#
- class compressai.layers.AttentionBlock(N: int)[source]#
Self attention block.
Simplified variant from “Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules”, by Zhengxue Cheng, Heming Sun, Masaru Takeuchi, Jiro Katto.
- Parameters:
N (int) – Number of channels
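A minimal usage sketch; the block is expected to preserve the input shape, with N matching the number of input channels (values below are illustrative):

    import torch
    from compressai.layers import AttentionBlock

    attn = AttentionBlock(N=192)
    x = torch.rand(1, 192, 16, 16)  # N must match the channel dimension
    y = attn(x)                     # expected to keep the same shape as x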
QReLU#
- class compressai.layers.QReLU(*args, **kwargs)[source]#
Clamps the input to the range given by the bit depth. The input is assumed to be integer-valued (as produced by an integer network); otherwise, values of any precision are simply clamped, without rounding.
A pre-computed scale based on the gamma function is used for the backward computation.
More details can be found in “Integer networks for data compression with latent-variable models”, by Johannes Ballé, Nick Johnston, and David Minnen, ICLR 2019.
- Parameters:
input – input tensor
bit_depth – source bit-depth (used for clamping)
beta – a parameter for modeling the gradient during backward computation
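A usage sketch, assuming QReLU is applied as a torch.autograd.Function (as the input/bit_depth/beta parameters above suggest); the bit_depth and beta values are illustrative, not recommended defaults:

    import torch
    from compressai.layers import QReLU

    x = 300 * torch.randn(1, 64, 8, 8)
    x.requires_grad_(True)
    y = QReLU.apply(x, 8, 100)  # clamp to the 8-bit range; beta shapes the backward pass
    y.sum().backward()          # gradients use the pre-computed gamma-based scale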