Low-complexity Image and Video Coding Based on an Approximate Discrete Tchebichef Transform
Linear transformations are highly relevant to data decorrelation applications such as image and video compression. In this context, the discrete Tchebichef transform (DTT) possesses useful coding and decorrelation properties. The DTT kernel does not depend on the input data, and fast algorithms can be developed for real-time applications. However, the DTT fast algorithms presented in the literature possess high computational complexity. In this work, we introduce a new low-complexity approximation for the DTT. The fast algorithm of the proposed transform is multiplication-free and requires a reduced number of additions and bit-shifting operations. Image and video compression simulations in popular standards show good performance of the proposed transform. Regarding hardware resource consumption, the FPGA realization shows a 43.1% reduction in configurable logic blocks, and the ASIC place-and-route realization shows a 57.7% reduction in the area-time figure, when compared with the 2-D version of the exact DTT.
Approximate transforms, discrete Tchebichef transform, fast algorithms, image and video coding
Discrete variable orthogonal polynomials emerge as solutions of several hypergeometric difference equations [Nikiforov1991poly_discrete]. Classic applications of this class of orthogonal polynomials include functional analysis [Dragnev1997func_analysis] and graphs [Camara2009graphs]. Additionally, such polynomials are employed in the computation of moment functions [Zhu2010orth_moments], which are widely used in image processing [Goshtasby1985moment1, Heywood1995moment2, Markandey1992moment3]. For instance, the discrete Tchebichef moments [DTM], which are derived from the discrete Tchebichef polynomials, form a set of orthogonal moment functions. Such functions are not discrete approximations of continuous functions; they are naturally orthogonal over the discrete domain.
The Tchebichef moments have been used for quantifying image block artifacts [Leida2014artifact_tchebichef], image recognition [Rose2009shape_tchebichef, Zhang2010img_recog, Li2011recognition_tchebichef], blind integrity verification [Roux2012blind], and image compression [Swamy2008DTT, Swamy2013ITT, Mukundan2010img_compress, Li2012cs_tchebichef, Senapati2014listlessDTT]. In the data compression context, bi-dimensional (2-D) moments are computed by means of the 2-D discrete Tchebichef transform (DTT). In fact, the 8-point DTT can achieve better performance than the discrete cosine transform (DCT) [Ahmed1974DCT] in terms of average bit length, as reported in [Ernawan2013quantization_tchebichef, Mukundan2010img_compress, Senapati2011DTT_coding]. Moreover, the 8-point DTT-based embedded encoder proposed in [Senapati2014listlessDTT] shows improved image quality and reduced encoding/decoding time in comparison with state-of-the-art DCT-based embedded coders. The 8-point DTT has also been employed in blind forensics, as a tool to determine the integrity of medical imagery subject to filtering and compression [Roux2012blind].
However, the exact DTT possesses high arithmetic complexity, due to its significant number of additions and floating-point multiplications. Such multiplications are known to be more demanding computational structures than additions or fixed-point multiplications, both in software and hardware. Thus, the high computational complexity of the DTT precludes its application in low-power systems [Ernawan2011mobile_tchebichef, Li2008WSN] and/or real-time processing, such as video streaming [Meng2005realtime_video, Friedman2013video_low_power]. Therefore, fast algorithms for the DTT could improve its computational efficiency. A comprehensive literature search reveals only two fast algorithms for the 4-point DTT [Mukundan2007FDTT, Swamy2008DTT] and one for the 8-point DTT [Swamy2013ITT]. Although these fast algorithms possess lower arithmetic complexities than the direct DTT calculation, they still require a significant number of additions and bit-shifting operations.
In a comparable scenario, the computation of DCT-based transforms, which have been employed in several popular coding schemes such as JPEG [Wallace1992JPEG], MPEG-2 [MPEG-2], H.261 [H.261], H.263 [H.263], H.264 [H264_book], HEVC [Sullivan2012HEVC, Bossen2012HEVC_impementation], and VP9 [VP9], has profited from matrix approximation theory [Haweel2001SDCT, CB2011RDCT, CB2012MRDCT, cintra2014dct_aprox, BAS2008_EL, BAS2011, BAS2013]. In this context, discrete transforms are not calculated exactly; instead, an approximate, low-cost computation is performed. The approximations are designed to retain similar spectral and coding characteristics while offering lower arithmetic complexity. Usually, approximations are multiplierless, requiring only addition and bit-shifting operations for their computation. In [Oliveira2015Tchebichef], a multiplierless approximation for the 8-point DTT is proposed. To the best of our knowledge, this is the only DTT approximation archived in the literature.
The aim of this work is to introduce an efficient low-complexity approximation for the 8-point DTT capable of outperforming [Oliveira2015Tchebichef]. To derive a multiplierless approximate DTT matrix, a multicriteria optimization problem is set up, combining two coding metrics: coding gain and transform efficiency. Additionally, a fast algorithm for efficient computation of the sought approximation is also pursued. For coding performance evaluation, we propose two computational experiments: (i) a JPEG image compression simulation and (ii) a video coding experiment which consists of embedding the sought approximation into the H.264/AVC standard.
The paper unfolds as follows. Section 2 reviews the mathematical background of the DTT. Section LABEL:sec:approx introduces a parametrization of the DTT to derive a family of DTT approximations and sets up an optimization problem to identify optimal approximations. In Section LABEL:sec:eval, we assess the obtained approximation in terms of coding performance, proximity to the exact transform, and computational cost. Moreover, a fast algorithm for the proposed approximate DTT is introduced. Section LABEL:sec:img_compress shows the results of the image and video compression simulations. Section LABEL:section-hardware compares the hardware resource consumption with that of the exact DTT for both FPGA and ASIC realizations. A discussion and final remarks are presented in Section LABEL:section-conclusion.
2 Discrete Tchebichef Transform
2.1 Discrete Tchebichef Polynomials
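The discrete Tchebichef polynomials are a set of discrete variable orthogonal polynomials [HTF53]. The $k$th order discrete Tchebichef polynomials are given by the following closed-form expression [Swamy2008DTT]:
$$t_k[n] = (1-N)_k \cdot {}_3F_2(-k, -n, 1+k; 1, 1-N; 1),$$
where $n = 0, 1, \ldots, N-1$, ${}_3F_2(a_1, a_2, a_3; b_1, b_2; z) = \sum_{i=0}^{\infty} \frac{(a_1)_i (a_2)_i (a_3)_i}{(b_1)_i (b_2)_i} \frac{z^i}{i!}$ is the generalized hypergeometric function and $(a)_k = a(a+1)\cdots(a+k-1)$ is the Pochhammer symbol (rising factorial). Tchebichef polynomials can be obtained according to the following recursion [Swamy2008DTT]:
$$t_k[n] = \frac{(2k-1)\, t_1[n]\, t_{k-1}[n] - (k-1)\left(N^2 - (k-1)^2\right) t_{k-2}[n]}{k},$$
for $k \geq 2$, with $t_0[n] = 1$ and $t_1[n] = 2n - N + 1$. Indeed, the set $\{t_k[n]\}$, $k = 0, 1, \ldots, N-1$, is an orthogonal basis with respect to the unit weight. Consequently, the discrete Tchebichef polynomials satisfy the following mathematical relation:
$$\sum_{n=0}^{N-1} t_i[n]\, t_j[n] = \rho(i, N) \cdot \delta_{i,j},$$
where $\rho(k, N) = \frac{(N+k)!}{(2k+1)\,(N-k-1)!}$ and $\delta_{i,j}$ is the Kronecker delta function, which yields $\delta_{i,j} = 1$ if $i = j$, and $\delta_{i,j} = 0$ otherwise.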
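The recursion and the orthogonality relation can be checked numerically. The following is a minimal sketch, not the authors' implementation; it assumes the standard three-term recursion $k\,t_k[n] = (2k-1)\,t_1[n]\,t_{k-1}[n] - (k-1)(N^2-(k-1)^2)\,t_{k-2}[n]$ with $t_0[n] = 1$, $t_1[n] = 2n - N + 1$, and the squared norm $\rho(k,N) = (N+k)!/((2k+1)(N-k-1)!)$. The function names `tchebichef_polys`, `rho`, and `inner` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import factorial


def tchebichef_polys(N):
    """Return rows t_0, ..., t_{N-1}, each sampled at n = 0, ..., N-1.

    Exact rational arithmetic is used so that orthogonality can be
    checked with equality instead of a floating-point tolerance.
    """
    t = [[Fraction(1)] * N, [Fraction(2 * n - N + 1) for n in range(N)]]
    for k in range(2, N):
        a = 2 * k - 1
        b = (k - 1) * (N * N - (k - 1) ** 2)
        # Assumed three-term recursion (see lead-in).
        t.append([(a * t[1][n] * t[k - 1][n] - b * t[k - 2][n]) / k
                  for n in range(N)])
    return t[:N]


def rho(k, N):
    """Assumed squared norm of t_k: (N + k)! / ((2k + 1) (N - k - 1)!)."""
    return Fraction(factorial(N + k), (2 * k + 1) * factorial(N - k - 1))


def inner(u, v):
    """Inner product with unit weight over n = 0, ..., N-1."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    N = 4
    t = tchebichef_polys(N)
    for i in range(N):
        for j in range(N):
            expected = rho(i, N) if i == j else 0
            assert inner(t[i], t[j]) == expected
    print([[int(v) for v in row] for row in t])
```

Rational arithmetic is slower than floating point, but for the block sizes of interest here ($N = 4$ and $N = 8$) the cost is negligible and the check is exact.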
2.2 2-D Discrete Tchebichef Transform
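Let $f[x,y]$, $x, y = 0, 1, \ldots, N-1$, be an intensity distribution from a discrete image of size $N \times N$ pixels. The 2-D DTT of $f[x,y]$, denoted by $M[p,q]$, $p, q = 0, 1, \ldots, N-1$, is given by [DTM, Swamy2008DTT]:
$$M[p,q] = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} \tilde{t}_p[x] \cdot \tilde{t}_q[y] \cdot f[x,y],$$
where $\tilde{t}_k[n]$, $k = 0, 1, \ldots, N-1$, are the orthonormalized discrete Tchebichef polynomials given by $\tilde{t}_k[n] = t_k[n] / \sqrt{\rho(k,N)}$.
Note that the transform kernel in the 2-D DTT definition above is separable. Hence, the following relation holds true:
$$M[p,q] = \sum_{x=0}^{N-1} \tilde{t}_p[x] \cdot \sum_{y=0}^{N-1} \tilde{t}_q[y] \cdot f[x,y],$$
for $p, q = 0, 1, \ldots, N-1$. Therefore, the transform-domain coefficients of $f[x,y]$ can be calculated by the following matrix operation:
$$\mathbf{M} = \mathbf{T} \cdot \mathbf{F} \cdot \mathbf{T}^\top,$$
where $\mathbf{F}$ and $\mathbf{M}$ are the $N \times N$ matrices with entries $f[x,y]$ and $M[p,q]$, respectively, and $\mathbf{T}$ is the $N$-point unidimensional DTT matrix given by
$$\mathbf{T} = \begin{bmatrix} \tilde{t}_0[0] & \tilde{t}_0[1] & \cdots & \tilde{t}_0[N-1] \\ \tilde{t}_1[0] & \tilde{t}_1[1] & \cdots & \tilde{t}_1[N-1] \\ \vdots & \vdots & \ddots & \vdots \\ \tilde{t}_{N-1}[0] & \tilde{t}_{N-1}[1] & \cdots & \tilde{t}_{N-1}[N-1] \end{bmatrix}.$$
The matrix operation above represents the 2-D DTT. Because of the kernel separation property, the 2-D DTT can be calculated by means of successive applications of the 1-D DTT to the rows of $\mathbf{F}$ and then to the columns of the resulting intermediate matrix. The original intensity distribution can be recovered by the inverse procedure:
$$\mathbf{F} = \mathbf{T}^{-1} \cdot \mathbf{M} \cdot (\mathbf{T}^{-1})^\top = \mathbf{T}^\top \cdot \mathbf{M} \cdot \mathbf{T}.$$
The last equality above stems from the DTT orthogonality property: $\mathbf{T}^{-1} = \mathbf{T}^\top$ [Swamy2008DTT]. Therefore, the same structure can be used for both the forward and the inverse transforms.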
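The forward and inverse matrix operations described above can be sketched in a few lines. This is an illustrative floating-point sketch under stated assumptions, not the fast algorithm discussed in this paper: it builds the $N$-point DTT matrix from the standard recursion and norm $\rho(k,N) = (N+k)!/((2k+1)(N-k-1)!)$, then computes the forward transform as $\mathbf{T}\mathbf{F}\mathbf{T}^\top$ and the inverse as $\mathbf{T}^\top\mathbf{M}\mathbf{T}$. The helper names (`dtt_matrix`, `matmul`, `transpose`, `dtt2`, `idtt2`) and the sample block are hypothetical.

```python
from math import factorial, sqrt


def dtt_matrix(N):
    """N-point DTT matrix: row k holds t_k[n] / sqrt(rho(k, N))."""
    t = [[1.0] * N, [float(2 * n - N + 1) for n in range(N)]]
    for k in range(2, N):
        a, b = 2 * k - 1, (k - 1) * (N * N - (k - 1) ** 2)
        # Assumed three-term recursion for the unnormalized polynomials.
        t.append([(a * t[1][n] * t[k - 1][n] - b * t[k - 2][n]) / k
                  for n in range(N)])
    T = []
    for k in range(N):
        norm = sqrt(factorial(N + k) / ((2 * k + 1) * factorial(N - k - 1)))
        T.append([v / norm for v in t[k]])
    return T


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def dtt2(F):
    """Forward 2-D DTT of a square block: T F T^T."""
    T = dtt_matrix(len(F))
    return matmul(matmul(T, F), transpose(T))


def idtt2(M):
    """Inverse 2-D DTT, using T^{-1} = T^T: T^T M T."""
    T = dtt_matrix(len(M))
    return matmul(matmul(transpose(T), M), T)


if __name__ == "__main__":
    # Hypothetical 4x4 block of pixel intensities.
    F = [[52, 55, 61, 66],
         [70, 61, 64, 73],
         [63, 59, 55, 90],
         [67, 61, 68, 104]]
    M = dtt2(F)
    R = idtt2(M)
    err = max(abs(R[i][j] - F[i][j]) for i in range(4) for j in range(4))
    print("max reconstruction error:", err)
```

This direct evaluation costs two dense matrix products per block and relies on floating-point multiplications, which is precisely the cost that fast algorithms and multiplierless approximations aim to reduce.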
For $N = 4$ and $N = 8$, we have the particular cases of interest in the context of image and video coding. Thus, the 4- and 8-point DTT matrices are, respectively, furnished by: