Fast DCT-II Algorithms - Sparse Fast Trigonometric Transforms

(ii) The cosine and sine matrices of type IV satisfy

S^IV_N = D_N C^IV_N J_N and S^IV_N = J_N C^IV_N D_N.

A proof of (i) can be found in [Jai79], Section III A. For a proof of (ii) see [Wan84], Section IV, equation (56).
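As a numerical sanity check of (ii), the following Python sketch compares both products entrywise with S^IV_N. It assumes the orthonormal definitions C^IV_N = √(2/N) (cos((2j+1)(2k+1)π/(4N))) and S^IV_N = √(2/N) (sin((2j+1)(2k+1)π/(4N))) for j, k = 0, ..., N−1, as well as D_N = diag((−1)^k); these definitions are not restated in this excerpt, and all function names are purely illustrative.

```python
import math

def cos4_matrix(n):
    """Assumed orthonormal cosine matrix of type IV."""
    scale = math.sqrt(2.0 / n)
    return [[scale * math.cos((2 * j + 1) * (2 * k + 1) * math.pi / (4 * n))
             for k in range(n)] for j in range(n)]

def sin4_matrix(n):
    """Assumed orthonormal sine matrix of type IV."""
    scale = math.sqrt(2.0 / n)
    return [[scale * math.sin((2 * j + 1) * (2 * k + 1) * math.pi / (4 * n))
             for k in range(n)] for j in range(n)]

def max_deviation(n):
    """Largest entrywise gap between S^IV_N and the two products in (ii)."""
    c, s = cos4_matrix(n), sin4_matrix(n)
    dev = 0.0
    for j in range(n):
        for k in range(n):
            dcj = (-1) ** j * c[j][n - 1 - k]  # entry (j, k) of D_N C^IV_N J_N
            jcd = c[n - 1 - j][k] * (-1) ** k  # entry (j, k) of J_N C^IV_N D_N
            dev = max(dev, abs(s[j][k] - dcj), abs(s[j][k] - jcd))
    return dev

deviation = max_deviation(8)
```

Under these assumptions the deviation should be of the order of the machine precision for any N.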

4.2 Fast DCT-II Algorithms

Computing the DCT of a vector x ∈ R^N via the matrix-vector multiplications from Definition 4.1 has a runtime of O(N²). Fortunately, as for the DFT, there exist more efficient techniques for computing DCTs, which can achieve runtimes of O(N log N).
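To make the quadratic cost concrete, the following Python sketch evaluates the DCT-II by direct summation, using two nested loops over N indices each. It assumes the orthonormal scaling C^II_N = √(2/N) (ε_N(j) cos(j(2k+1)π/(2N))) with ε_N(0) = 1/√2 and ε_N(j) = 1 otherwise; the function name dct2_naive is hypothetical.

```python
import math

def dct2_naive(x):
    """Orthonormal DCT-II by direct summation; needs O(N^2) operations."""
    n = len(x)
    out = []
    for j in range(n):
        # The factor eps rescales row j = 0 so that the matrix is orthogonal.
        eps = 1.0 / math.sqrt(2.0) if j == 0 else 1.0
        total = sum(x[k] * math.cos(j * (2 * k + 1) * math.pi / (2 * n))
                    for k in range(n))
        out.append(math.sqrt(2.0 / n) * eps * total)
    return out

y = dct2_naive([1.0, 2.0, 3.0, 4.0])
```

For the input (1, 2, 3, 4) the first coefficient is the scaled sum of the entries, i.e., approximately 5, and the Euclidean norm of the input is preserved because the matrix is orthogonal.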

Some of these methods use a divide-and-conquer approach like the one we explained in Section 1.1.1, some only employ real arithmetic, whereas others are based on existing FFT algorithms. For more details on an efficient algorithm for computing the DCT-II via FFTs see Section 5.2 of this thesis; for FFT-based algorithms for the other types of the DCT see, e.g., [PPST19], Section 6.3.1. It can also be shown that the order O(N log N) for the runtime of the fast DCT is optimal for arbitrary input vectors of length N.

As the existence of fast algorithms for the DCT-II is integral to the methods we will present in Chapter 6, we now briefly sketch a fast algorithm for the DCT-II. This section, in which we present a method based on orthogonal matrix factorizations that only employs real arithmetic, is based on [PPST19], Section 6.3.2.

Let N ∈ N be even with N ≥ 4. Our aim is to factorize the matrix C^II_N into a product of sparse orthogonal matrices that allow for divide-and-conquer steps. For the DFT the factorizations of F_N corresponding to the radix-2 algorithms which we sketched in Section 1.1.1 mainly depend on F_{N/2}, a permutation matrix and a simple diagonal matrix, see, e.g., [PPST19], Section 5.2.3. However, the factorization of C^II_N we want to employ does not only depend on C^II_{N/2}, but also on C^IV_{N/2}. Hence, we also require a factorization of C^IV_{N/2} into orthogonal matrices.

The following factorization of C^II_N was proved in [PT05], Lemma 2.2 (i). See also [PPST19], Theorem 6.32 (i), for more details.

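Lemma 4.5 Let N ∈ N be even, N ≥ 4, and let P_N ∈ R^{N×N} be the even-odd permutation matrix. Further, define

T_N := \frac{1}{\sqrt{2}} \begin{pmatrix} I_{N/2} & J_{N/2} \\ I_{N/2} & -J_{N/2} \end{pmatrix},

where I_{N/2} denotes the identity matrix of size N/2 × N/2 and J_{N/2} the counter identity matrix of size N/2 × N/2 from Theorem 4.4. Then C^II_N satisfies the following factorization,

C^II_N = P_N^T \begin{pmatrix} C^II_{N/2} & \\ & C^IV_{N/2} \end{pmatrix} T_N.

Remark 4.6 The even-odd permutation matrix P_N satisfies

P_N x = (x_0, x_2, …, x_{N−2}, x_1, x_3, …, x_{N−1})^T for all x ∈ R^N,

i.e., multiplying P_N by a vector x returns the evenly indexed entries of x in the first half and the oddly indexed ones in the second half. ♦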
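The action of the even-odd permutation matrix described above, and of its transpose, can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes vectors of even length stored as lists.

```python
def even_odd_permute(x):
    """Apply P_N: evenly indexed entries first, then the oddly indexed ones."""
    return list(x[0::2]) + list(x[1::2])

def even_odd_unpermute(y):
    """Apply P_N^T, the inverse of P_N: interleave the two halves again."""
    half = len(y) // 2
    x = [None] * len(y)
    x[0::2] = y[:half]   # first half goes back to the even positions
    x[1::2] = y[half:]   # second half goes back to the odd positions
    return x

permuted = even_odd_permute([0, 1, 2, 3, 4, 5])
restored = even_odd_unpermute(permuted)
```

Since a permutation matrix is orthogonal, applying the transpose after the permutation recovers the original vector.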

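Proof of Lemma 4.5. Let us consider the matrix P_N C^II_N. By Remark 4.6 we obtain that the top half of P_N C^II_N consists of the evenly indexed rows of C^II_N and the bottom half of the oddly indexed rows. Writing P_N C^II_N as a block matrix consisting of four submatrices indexed from 0 to N/2 − 1 yields

P_N C^II_N = \sqrt{\frac{2}{N}} \begin{pmatrix} ( \varepsilon_N(2j) \cos\frac{2j(2k+1)\pi}{2N} )_{j,k=0}^{N/2−1} & ( \varepsilon_N(2j) \cos\frac{2j(N+2k+1)\pi}{2N} )_{j,k=0}^{N/2−1} \\ ( \cos\frac{(2j+1)(2k+1)\pi}{2N} )_{j,k=0}^{N/2−1} & ( \cos\frac{(2j+1)(N+2k+1)\pi}{2N} )_{j,k=0}^{N/2−1} \end{pmatrix}, (4.4)

where \varepsilon_N(0) = 1/\sqrt{2} and \varepsilon_N(j) = 1 for j ≥ 1 denote the row scaling factors of C^II_N. Since \varepsilon_N(2j) = \varepsilon_{N/2}(j), the top-left and bottom-left quadrants are \frac{1}{\sqrt{2}} C^II_{N/2} and \frac{1}{\sqrt{2}} C^IV_{N/2}, respectively. Using \cos(j\pi + t) = \cos(j\pi − t), we find for the top-right quadrant of the matrix in (4.4) that

\cos\frac{2j(N+2k+1)\pi}{2N} = \cos\frac{2j(N−2k−1)\pi}{2N} = \cos\frac{j(2(N/2−1−k)+1)\pi}{N}, (4.5)

so this quadrant equals \frac{1}{\sqrt{2}} C^II_{N/2} J_{N/2}. Similarly, using \cos(\frac{(2j+1)\pi}{2} + t) = −\cos(\frac{(2j+1)\pi}{2} − t), the bottom-right quadrant of the matrix in (4.4) can be written as

\cos\frac{(2j+1)(N+2k+1)\pi}{2N} = −\cos\frac{(2j+1)(N−2k−1)\pi}{2N} = −\cos\frac{(2j+1)(2(N/2−1−k)+1)\pi}{2N}, (4.6)

so this quadrant equals −\frac{1}{\sqrt{2}} C^IV_{N/2} J_{N/2}. Combining (4.5) and (4.6) with (4.4), we obtain that

P_N C^II_N = \frac{1}{\sqrt{2}} \begin{pmatrix} C^II_{N/2} & C^II_{N/2} J_{N/2} \\ C^IV_{N/2} & −C^IV_{N/2} J_{N/2} \end{pmatrix} = \begin{pmatrix} C^II_{N/2} & \\ & C^IV_{N/2} \end{pmatrix} T_N.

Multiplying by P_N^T = P_N^{−1} from the left yields the claim.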

Note that both P_N, as a permutation matrix, and T_N are orthogonal, which implies that the factorization from Lemma 4.5 is indeed one into real orthogonal sparse matrices.

This factorization now provides us with the necessary tools for a first divide-and-conquer step. Let x ∈ R^N, where N ≥ 4 is even. Let us denote by x₍₀₎ and x₍₁₎ the first and second half of x, respectively, i.e.,

x₍₀₎ := (x_k)_{k=0}^{N/2−1} and x₍₁₎ := (x_k)_{k=N/2}^{N−1}.

Then Lemma 4.5 implies that

C^II_N x = \frac{1}{\sqrt{2}} P_N^T \begin{pmatrix} C^II_{N/2} (x₍₀₎ + J_{N/2} x₍₁₎) \\ C^IV_{N/2} (x₍₀₎ − J_{N/2} x₍₁₎) \end{pmatrix}.

Consequently, the first subproblem of half size is to compute the DCT-II of the vector x₍₀₎ + J_{N/2} x₍₁₎ ∈ R^{N/2}, and the second subproblem of half size is to compute the DCT-IV of the vector x₍₀₎ − J_{N/2} x₍₁₎ ∈ R^{N/2}. Thus, we also require a factorization of the matrix C^IV_{N/2} into real orthogonal sparse matrices such that we can apply the divide-and-conquer paradigm.
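The splitting into a DCT-II and a DCT-IV subproblem of half size can be checked numerically. The Python sketch below performs a single such step and compares the result with the direct matrix-vector product. It assumes the orthonormal definitions of C^II_N and C^IV_N, solves both half-size subproblems by plain matrix-vector products instead of recursing, and uses hypothetical helper names throughout.

```python
import math

def cos2_matrix(n):
    """Assumed orthonormal cosine matrix of type II."""
    scale = math.sqrt(2.0 / n)
    return [[scale * (1.0 / math.sqrt(2.0) if j == 0 else 1.0)
             * math.cos(j * (2 * k + 1) * math.pi / (2 * n))
             for k in range(n)] for j in range(n)]

def cos4_matrix(n):
    """Assumed orthonormal cosine matrix of type IV."""
    scale = math.sqrt(2.0 / n)
    return [[scale * math.cos((2 * j + 1) * (2 * k + 1) * math.pi / (4 * n))
             for k in range(n)] for j in range(n)]

def matvec(a, x):
    return [sum(row[k] * x[k] for k in range(len(x))) for row in a]

def dct2_one_split(x):
    """One divide step: C^II_N x from a DCT-II and a DCT-IV of length N/2."""
    n = len(x)
    half = n // 2
    x0, x1_reversed = x[:half], x[half:][::-1]  # x_(0) and J x_(1)
    # Multiplication by T_N: scaled sum and difference of the two halves.
    u = [(a + b) / math.sqrt(2.0) for a, b in zip(x0, x1_reversed)]
    v = [(a - b) / math.sqrt(2.0) for a, b in zip(x0, x1_reversed)]
    even_part = matvec(cos2_matrix(half), u)  # DCT-II of half size
    odd_part = matvec(cos4_matrix(half), v)   # DCT-IV of half size
    # Multiplication by P_N^T: interleave the two half-size results.
    y = [0.0] * n
    y[0::2] = even_part
    y[1::2] = odd_part
    return y

x = [1.0, -2.0, 3.5, 0.0, 4.0, -1.0, 2.0, 0.5]
direct = matvec(cos2_matrix(len(x)), x)
split = dct2_one_split(x)
max_error = max(abs(a - b) for a, b in zip(direct, split))
```

In a fast algorithm the two matrix-vector products of half size would of course be replaced by recursive calls; the sketch only illustrates that the splitting itself is exact up to rounding errors.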

The following factorization of C^IV_N was shown in [PT05], Lemma 2.4. See [PPST19], Section 6.3.2, Theorem 6.33, for more details.

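Lemma 4.7 Let N ∈ N be even, N ≥ 4, and let A_N and T_N(1) denote the sparse N × N matrices of the factorization in [PT05], Lemma 2.4. Then C^IV_N satisfies the following factorization,

C^IV_N = P_N^T A_N \begin{pmatrix} C^II_{N/2} & \\ & C^II_{N/2} \end{pmatrix} T_N(1).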

Note that A_N and T_N(1) are indeed orthogonal matrices. With the factorization from Lemma 4.7, the subproblem of size N/2 of computing the DCT-IV of the vector x₍₀₎ − J_{N/2} x₍₁₎ can be reduced to two subproblems of size N/4 of essentially computing DCT-IIs of length N/4.

These subproblems of size N/4 can again be split into a DCT-II and a DCT-IV computation of length N/8 if 8 divides N. Continuing these splitting steps until the vectors in the subproblems have length 2, for which we compute the DCT-II and DCT-IV directly, yields fast algorithms for the DCT-II and DCT-IV. Careful consideration of the matrix factorizations given by Lemmas 4.5 and 4.7 shows that, for a vector x ∈ R^N, N = 2^J, the fast DCT-II algorithm performs

\frac{4}{3} N J − \frac{8}{9} N − \frac{1}{9} (−1)^J + 1 = O(N J) = O(N log₂ N)

real additions and

N J − \frac{4}{3} N + \frac{1}{3} (−1)^J + 1 = O(N log₂ N)

real multiplications. The fast DCT-IV algorithm also has a runtime of O(N log₂ N), with similarly small constants. See [PPST19], Section 6.3.2, Theorem 6.39, for a proof of these runtime complexities.
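To get a feeling for the size of these constants, the two operation counts stated above can be evaluated exactly for small J. The following Python sketch uses rational arithmetic to avoid rounding; the function names are hypothetical.

```python
from fractions import Fraction

def dct2_additions(J):
    """Number of additions of the fast DCT-II for N = 2^J."""
    N = 2 ** J
    return Fraction(4, 3) * N * J - Fraction(8, 9) * N - Fraction(1, 9) * (-1) ** J + 1

def dct2_multiplications(J):
    """Number of multiplications of the fast DCT-II for N = 2^J."""
    N = 2 ** J
    return N * J - Fraction(4, 3) * N + Fraction(1, 3) * (-1) ** J + 1

counts = {2 ** J: (dct2_additions(J), dct2_multiplications(J)) for J in range(2, 11)}
```

For N = 8, i.e., J = 3, the formulas give 26 additions and 14 multiplications, and for N = 1024 they give 12744 additions and 8876 multiplications, compared to roughly N² = 1048576 multiplications for the direct matrix-vector product.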

Remark 4.8 Since C^III_N = (C^II_N)^T = (C^II_N)^{−1}, Lemma 4.5 also provides us with an orthogonal factorization of the cosine matrix of type III. Thus, we directly obtain a fast algorithm with runtime O(N log₂ N) for the DCT-III, which is the same as the IDCT-II.

It can be shown that for the DCT-I there also exist fast algorithms with a runtime of O(N log₂ N). For an overview of several fast methods for the different DCT and DST types see, e.g., [BYR06], Section 4.4.

Besides the already mentioned possibility of computing the DCT of a vector via the DFT, which will be explained for the DCT-II in detail in Section 5.2, there also exist fast DCT algorithms based on Chebyshev polynomials. These methods use factorizations of the cosine matrices which are not orthogonal, thus leading to less stable algorithms, but also only require real arithmetic, see, e.g., [Fei90, FW92, PM03, Ste92, ST91]. Further, the DCT-II can be computed via the Walsh-Hadamard transform, see, e.g., [AR75] and [BYR06], Section 4.4.3.3. There also exist split-radix methods for the DCT-II, see, e.g., [BYR06], Section 4.4.3.4. Other algorithms include, for example, [SH86, Wan84, Wan83, CSF77]. All of these methods have a runtime of O(N log N) for arbitrary vectors of length N. ♦
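The identity between the DCT-III and the inverse DCT-II from the first part of Remark 4.8 can be illustrated numerically. The Python sketch below assumes the orthonormal definition of C^II_N and checks that its transpose, taken here as C^III_N, acts as the inverse; the helper names are hypothetical.

```python
import math

def cos2_matrix(n):
    """Assumed orthonormal cosine matrix of type II."""
    scale = math.sqrt(2.0 / n)
    return [[scale * (1.0 / math.sqrt(2.0) if j == 0 else 1.0)
             * math.cos(j * (2 * k + 1) * math.pi / (2 * n))
             for k in range(n)] for j in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

n = 6
c2 = cos2_matrix(n)
c3 = transpose(c2)        # cosine matrix of type III
product = matmul(c3, c2)  # should be close to the identity matrix
identity_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                     for i in range(n) for j in range(n))
```

Up to rounding errors the product is the identity matrix, so applying the DCT-III to the DCT-II of a vector recovers the vector.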

4.3 2-Dimensional Discrete Cosine Transform

Some of the main areas of application for discrete cosine and sine transforms are digital image or video processing and compression, and transform-based coding applications.

All of these problems are at least 2-dimensional, so there has also been extensive research regarding the development of fast 2-dimensional DCT and DST algorithms, with particular focus on the DCT-II.

We now define the 2-dimensional discrete cosine transforms of types II and IV, see, e.g., [RY90], Chapter 5 and [BYR06], Section 4.5.

Definition 4.9 (2-Dimensional DCT-II and DCT-IV) Let A ∈ R^{M×N}. Then the 2-dimensional discrete cosine transforms of types II and IV of A are defined as

A^{\widehat{II}} := C^II_M A (C^II_N)^T and A^{\widehat{IV}} := C^IV_M A C^IV_N.
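As a small illustration of the 2-dimensional DCT-II, the following Python sketch evaluates the product C^II_M A (C^II_N)^T directly for a constant 2 × 3 matrix. It assumes the orthonormal definition of the cosine matrix of type II, makes no attempt at efficiency, and uses hypothetical helper names.

```python
import math

def cos2_matrix(n):
    """Assumed orthonormal cosine matrix of type II."""
    scale = math.sqrt(2.0 / n)
    return [[scale * (1.0 / math.sqrt(2.0) if j == 0 else 1.0)
             * math.cos(j * (2 * k + 1) * math.pi / (2 * n))
             for k in range(n)] for j in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dct2_2d(a):
    """2-dimensional DCT-II of an M x N matrix via two matrix products."""
    m, n = len(a), len(a[0])
    return matmul(matmul(cos2_matrix(m), a), transpose(cos2_matrix(n)))

constant = [[1.0] * 3 for _ in range(2)]
coeffs = dct2_2d(constant)
```

For a constant matrix all of the energy is concentrated in the entry with index (0, 0), which here equals √(MN) = √6, while all other entries vanish up to rounding errors. This energy compaction for smooth data is one reason for the popularity of the DCT-II in image compression.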

The other 2-dimensional discrete trigonometric transforms are defined analogously. As for the 1-dimensional DCT, the 2-dimensional DCT of a real M ×N matrix can be calculated by applying a 2-dimensional DFT, see, e.g., [RY90], Section 5.4. However, there also exist direct approaches for computing 2-dimensional discrete trigonometric transforms. The first method, the so-called row-column method, is based on the ability
