compressai_vision.pipelines.fo_vcm.pipeline#
base#
- class compressai_vision.pipelines.fo_vcm.pipeline.base.EncoderDecoder[source]#
NOTE: virtual class that you need to subclass
An instance of this class encodes an image, calculates the number of bits and decodes the encoded image, resulting in a “transformed” image.
The transformed image is similar to the original image, while the encoding+decoding process may have introduced some distortion.
The instance may have internal state (say, an H.266 video encoder+decoder) or may not (say, a JPEG encoder+decoder).
- class compressai_vision.pipelines.fo_vcm.pipeline.base.VoidEncoderDecoder[source]#
Does no encoding/decoding whatsoever. Use for debugging.
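A minimal sketch of a concrete subclass (hypothetical; it assumes the interface centers on the BGR method documented for the subclasses below, taking a BGR image and returning a (nbits, transformed_image) tuple):

import numpy as np
from compressai_vision.pipelines.fo_vcm.pipeline.base import EncoderDecoder

class IdentityEncoderDecoder(EncoderDecoder):
    """Hypothetical codec: passes the image through untouched."""

    def BGR(self, bgr_image: np.ndarray, tag=None) -> tuple:
        # a real codec would encode the image here, count the bits
        # of the bitstream and decode it back to a transformed image
        nbits = 0
        return nbits, bgr_image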
compressai#
- class compressai_vision.pipelines.fo_vcm.pipeline.compressai.CompressAIEncoderDecoder(net, device='cpu', dump=False, m: int = 64, ffmpeg='ffmpeg', scale: int | None = None, half=False)[source]#
EncoderDecoder class for CompressAI
- Parameters:
net – compressai network, for example:
net = bmshj2018_factorized(quality=2, pretrained=True).eval().to(device)
device – “cpu” or “cuda”
dump – debugging option: dump input, intermediate and output images to disk in the local directory. Default: False
m – image dimensions should be multiples of this number; if not, padding is applied before passing the image to compressai. Default: 64
ffmpeg – ffmpeg command used for padding/scaling (as defined by the VCM working group). Default: “ffmpeg”
scale – enable the VCM working group defined padding/scaling pre- & post-processing steps. Possible values: 100 (default), 75, 50, 25. Special value: None = ffmpeg scaling. 100 corresponds to a simple padding operation
This class uses the CompressAI model API’s compress and decompress methods, so if your model has them, it is compatible with this particular EncoderDecoder class. In detail:

# CompressAI model API:
# compression:
out_enc = self.net.compress(x)
bitstream = out_enc["strings"][0][0]  # compressed bitstream
# decompression:
out_dec = self.net.decompress(out_enc["strings"], out_enc["shape"])
x_hat = out_dec["x_hat"]  # reconstructed image
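A hedged usage sketch (the import follows the module path shown above; any pretrained CompressAI model exposing compress/decompress should work the same way):

import cv2
from compressai.zoo import bmshj2018_factorized
from compressai_vision.pipelines.fo_vcm.pipeline.compressai import CompressAIEncoderDecoder

device = "cpu"
net = bmshj2018_factorized(quality=2, pretrained=True).eval().to(device)
encdec = CompressAIEncoderDecoder(net, device=device)
# nbits = size of the compressed bitstream in bits,
# bgr_hat = image after encoding+decoding
nbits, bgr_hat = encdec.BGR(cv2.imread("fname.png"))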
- BGR(bgr_image: array, tag=None) tuple [source]#
Return the transformed image and the number of bits (nbits) for a BGR image
- Parameters:
bgr_image – numpy BGR image (y,x,3)
tag – a string that can be used to identify & cache images (optional)
Returns the number of bits and a transformed BGR image that has gone through compressai encoding+decoding:

1. Scales the image, if scaling is requested [with ffmpeg]
2. Pads the image for CompressAI [with ffmpeg - feel free to switch to torch if you want]
3. Runs the image through the CompressAI model
4. Removes the padding added in step 2 [with ffmpeg]
5. Backscales to undo step 1 [with ffmpeg]

The padding necessary for compressai is added and removed on-the-fly.
- toByte = ConvertImageDtype()#
- toFloat = ConvertImageDtype()#
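The pad-to-multiple rule behind the m parameter can be sketched as follows (an illustration only; the class itself performs the padding with ffmpeg, as noted above):

import math

def padded_dims(height: int, width: int, m: int = 64) -> tuple:
    # smallest multiples of m that contain the image; CompressAI
    # models expect input dimensions to be multiples of m
    return math.ceil(height / m) * m, math.ceil(width / m) * m

# e.g. padded_dims(375, 500) -> (384, 512)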
vtm#
- class compressai_vision.pipelines.fo_vcm.pipeline.vtm.VTMEncoderDecoder(encoderApp=None, decoderApp=None, ffmpeg='ffmpeg', vtm_cfg=None, qp=47, scale=100, save=False, base_path='/dev/shm', cache=None, dump=False, skip=False, keep=False, warn=False)[source]#
EncoderDecoder class for VTM encoder
- Parameters:
encoderApp – VTM encoder command
decoderApp – VTM decoder command
vtm_cfg – path of encoder cfg file
ffmpeg – ffmpeg command used for padding/scaling
qp – the default quantization parameter of the instance. Integer from 0 to 63. Default: 47
scale – enable the VCM working group defined padding/scaling pre- & post-processing steps. Possible values: 100 (default), 75, 50, 25. Special value: None = ffmpeg scaling. 100 corresponds to a simple padding operation
save – save intermediate steps into member saved (for debugging). Default: False
cache – (optional) define a directory where all encoded bitstreams are cached. NOTE: if scale is defined, “scale/qp/” is appended to the cache path; if no scale is defined, the appended path is “0/qp/”
dump – debugging option: dump input, intermediate and output images to disk in the local directory
skip – if the bitstream is found in the cache, do absolutely nothing. Good for restarting the bitstream generation. Default: False. When enabled, method BGR returns (0, None). NOTE: do not use if you want to verify the bitstream files
warn – always warn when a bitstream is generated. Default: False
This class always tries to use cached bitstreams if they are available (for this you need to define a cache directory, see above). If the bitstream is available in the cache, it is used and the encoding step is skipped; otherwise the encoder is run to produce the bitstream.
Example:
import cv2, os, logging
from compressai_vision.evaluation.pipeline import VTMEncoderDecoder
from compressai_vision.pipelines.fo_vcm.tools import getDataFile

path = "/path/to/VVCSoftware_VTM/bin"
encoderApp = os.path.join(path, "EncoderAppStatic")
decoderApp = os.path.join(path, "DecoderAppStatic")

# enable debugging log to see explicitly all the steps
loglev = logging.DEBUG
quickLog("VTMEncoderDecoder", loglev)

encdec = VTMEncoderDecoder(
    encoderApp=encoderApp,
    decoderApp=decoderApp,
    ffmpeg="ffmpeg",
    vtm_cfg=getDataFile("encoder_intra_vtm_1.cfg"),
    qp=47,
)
nbits, img_hat = encdec.BGR(cv2.imread("fname.png"))
You can enable caching and avoid re-encoding of images:
encdec = VTMEncoderDecoder(
    encoderApp=encoderApp,
    decoderApp=decoderApp,
    ffmpeg="ffmpeg",
    vtm_cfg=getDataFile("encoder_intra_vtm_1.cfg"),
    qp=47,
    cache="/tmp/kokkelis",
)
nbits, img_hat = encdec.BGR(cv2.imread("fname.png"), tag="a_unique_tag")
Cache can be inspected with:
encdec.dump()
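The cache layout described under the cache parameter can be sketched like this (the helper name is an assumption for illustration):

from pathlib import Path

def cache_dir(cache: str, qp: int, scale=None) -> Path:
    # per the NOTE above: "scale/qp/" is appended when scale is
    # defined, "0/qp/" otherwise
    return Path(cache) / str(scale if scale is not None else 0) / str(qp)

# cache="/tmp/kokkelis", scale=100, qp=47 -> /tmp/kokkelis/100/47
# bitstreams are then looked up there, keyed by the tag argument of BGR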
- BGR(bgr_image, tag=None) tuple [source]#
- Parameters:
bgr_image – numpy BGR image (y,x,3)
tag – a string that can be used to identify & cache images (optional). Necessary if you’re using caching
Returns a tuple (nbits, transformed_bgr_image): a BGR image that has gone through the VTM encoding+decoding process and all other operations defined by MPEG/VCM.
This method is somewhat complex: in addition to performing the necessary image transformations, it handles caching of bitstreams, checks whether bitstreams already exist, etc. Error conditions from ffmpeg and/or from VTMEncoder/Decoder must be taken correctly into account.
VCM working group ops:

padded_hgt = math.ceil(height/2)*2
padded_wdt = math.ceil(width/2)*2

1. ffmpeg -i {input_tmp_path} -vf <filter> {input_padded_tmp_path}, where the filter depends on the scale:
for 100%: -vf "pad=ceil(iw/2)*2:ceil(ih/2)*2" # NOTE: simply padding
for 75%: -vf "scale=ceil(iw*3/8)*2:ceil(ih*3/8)*2"
for 50%: -vf "scale=ceil(iw/4)*2:ceil(ih/4)*2"
for 25%: -vf "scale=ceil(iw/8)*2:ceil(ih/8)*2"
2. ffmpeg -i {input_padded_tmp_path} -f rawvideo -pix_fmt yuv420p -dst_range 1 {yuv_image_path}
3. {VTM_encoder_path} -c {VTM_AI_cfg} -i {yuv_image_path} -b {bin_image_path} -o {temp_yuv_path} -fr 1 -f 1 -wdt {padded_wdt} -hgt {padded_hgt} -q {qp} --ConformanceWindowMode=1 --InternalBitDepth=10
4. {VTM_decoder_path} -b {bin_image_path} -o {rec_yuv_path}
5. ffmpeg -y -f rawvideo -pix_fmt yuv420p10le -s {padded_wdt}x{padded_hgt} -src_range 1 -i {rec_yuv_path} -frames 1 -pix_fmt rgb24 {rec_png_path}
6. ffmpeg -y -i {rec_png_path} -vf "crop={width}:{height}" {rec_image_path} # NOTE: this can be done only if scale=100%, i.e. to remove padding
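A condensed sketch of steps 2-4 with subprocess (file names and the VTM binary paths are placeholders; the real implementation also covers the scaling step, the conversion of the 10-bit YUV back to PNG, and error checking):

import math
import subprocess

def vtm_encode_decode(encoder_app, decoder_app, vtm_cfg, png_in, qp, width, height):
    padded_wdt = math.ceil(width / 2) * 2
    padded_hgt = math.ceil(height / 2) * 2
    # step 2: padded PNG -> full-range raw yuv420p
    subprocess.run(
        ["ffmpeg", "-y", "-i", png_in, "-f", "rawvideo",
         "-pix_fmt", "yuv420p", "-dst_range", "1", "in.yuv"],
        check=True)
    # step 3: VTM all-intra encode of the single frame
    subprocess.run(
        [encoder_app, "-c", vtm_cfg, "-i", "in.yuv", "-b", "out.bin",
         "-o", "tmp.yuv", "-fr", "1", "-f", "1",
         "-wdt", str(padded_wdt), "-hgt", str(padded_hgt),
         "-q", str(qp), "--ConformanceWindowMode=1", "--InternalBitDepth=10"],
        check=True)
    # step 4: VTM decode back to 10-bit YUV
    subprocess.run([decoder_app, "-b", "out.bin", "-o", "rec.yuv"], check=True)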