This repo provides the official implementation of "Wavelet-Based Learned Scalable Video Coding".
The overall framework of WL-SVC is shown in the figure below. It consists of two parts: traditional modules (blue) and neural network modules (orange). The traditional modules come from Interframe EZBC-JP2K, which is implemented in C; the neural network modules are implemented by us in Python.
The code is implemented in PyTorch and requires torch version >= 1.6.
The temporal subband coding module and the inverse wavelet transform module are trainable; the training code is in the train folder.
The inputs are the original frames, motion vectors, and masks. You can obtain the motion vectors and masks by modifying Interframe EZBC-JP2K yourself, or refer to my modifications in the Interframe EZBC folder.
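For reference, the sketch below shows one way the three training inputs could be packed into tensors before being fed to the trainable modules. The file names (frames.yuv, mv.npy, mask.npy), resolution, and array shapes are assumptions for illustration only; the actual loading logic lives in the train folder.

```python
import numpy as np
import torch

# Assumed example layout: frames stored as raw 8-bit YUV 4:2:0, motion vectors
# and masks exported from the modified Interframe EZBC-JP2K as NumPy arrays.
# All names, shapes, and the resolution below are illustrative, not the
# interface of the training code.
W, H, N = 832, 480, 8                        # width, height, number of frames

def load_y_frames(path, w=W, h=H, n=N):
    """Read the Y (luma) plane of n frames from a raw YUV 4:2:0 file."""
    frame_size = w * h * 3 // 2              # Y + U + V bytes per 4:2:0 frame
    frames = []
    with open(path, "rb") as f:
        for _ in range(n):
            buf = np.frombuffer(f.read(frame_size), dtype=np.uint8)
            y = buf[: w * h].reshape(h, w).astype(np.float32) / 255.0
            frames.append(torch.from_numpy(y))
    return torch.stack(frames).unsqueeze(1)   # (N, 1, H, W)

frames = load_y_frames("frames.yuv")                    # original frames
mv = torch.from_numpy(np.load("mv.npy")).float()        # e.g. (N-1, 2, H, W)
mask = torch.from_numpy(np.load("mask.npy")).float()    # e.g. (N-1, 1, H, W)
print(frames.shape, mv.shape, mask.shape)
```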
The test consists of two steps:
- Use the modified Interframe EZBC-JP2K code to obtain the motion vectors and the mask of the reference frame (i.e., the index of the reference frame), and use Interframe EZBC-JP2K to encode them into a code stream. (You can modify Interframe EZBC-JP2K yourself or refer to my modifications in the Interframe EZBC folder.)
- Use the test.py code in the test folder to encode a video. The inputs are the YUV components of the original video, the motion vectors, and the mask of the reference frame. Example input data is stored on the network disk; after downloading, place it in the test folder. The trained models are also stored on the network disk, including the entropy coding model (model_all_encode.pth) and the wavelet inverse transform model (wave_post.pth); after downloading, place them in the test folder. (The code is implemented in PyTorch and requires torch version >= 1.6.) A quick checkpoint sanity check is sketched after this list.
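The snippet below is a minimal, assumed sanity check that the downloaded checkpoints can be deserialized with the required torch version before running test.py. It is not the official test entry point, and the way test.py actually builds and restores its networks may differ.

```python
import torch

# Assumed sanity check: confirm the downloaded checkpoints in the test folder
# can be loaded with torch >= 1.6 and peek at their top-level keys.
for name in ("model_all_encode.pth", "wave_post.pth"):
    ckpt = torch.load(name, map_location="cpu")
    if isinstance(ckpt, dict):
        print(f"{name}: {len(ckpt)} top-level entries")
        for key in list(ckpt)[:5]:        # show only the first few keys
            print("   ", key)
    else:
        print(f"{name}: loaded object of type {type(ckpt).__name__}")
```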