
Fast DCT-II Algorithms

From the document Sparse Fast Trigonometric Transforms (pages 131-137)

(ii) The cosine and sine matrices of type IV satisfy

$$S^{\mathrm{IV}}_N = D_N C^{\mathrm{IV}}_N J_N \quad \text{and} \quad S^{\mathrm{IV}}_N = J_N C^{\mathrm{IV}}_N D_N.$$

A proof of (i) can be found in [Jai79], Section III A. For a proof of (ii) see [Wan84], Section IV, equation (56).
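Identity (ii) can be checked numerically for a small size. The following is a minimal sketch, not taken from the thesis; it assumes the orthogonal normalization $\sqrt{2/N}$ for both type IV matrices and $D_N = \mathrm{diag}((-1)^k)_{k=0}^{N-1}$, neither of which is restated in this section. The helper names `cos4` and `sin4` are hypothetical.

```python
import math

def cos4(n):
    """Cosine matrix of type IV (assumed normalization sqrt(2/n))."""
    s = math.sqrt(2.0 / n)
    return [[s * math.cos((2 * j + 1) * (2 * k + 1) * math.pi / (4 * n))
             for k in range(n)] for j in range(n)]

def sin4(n):
    """Sine matrix of type IV (assumed normalization sqrt(2/n))."""
    s = math.sqrt(2.0 / n)
    return [[s * math.sin((2 * j + 1) * (2 * k + 1) * math.pi / (4 * n))
             for k in range(n)] for j in range(n)]

n = 6
c, s = cos4(n), sin4(n)

# D_N C J_N: scale row j by (-1)^j (assumed D_N) and reverse the column order.
dcj = [[(-1) ** j * c[j][n - 1 - k] for k in range(n)] for j in range(n)]
# J_N C D_N: reverse the row order and scale column k by (-1)^k.
jcd = [[c[n - 1 - j][k] * (-1) ** k for k in range(n)] for j in range(n)]

err_dcj = max(abs(s[j][k] - dcj[j][k]) for j in range(n) for k in range(n))
err_jcd = max(abs(s[j][k] - jcd[j][k]) for j in range(n) for k in range(n))
print(err_dcj, err_jcd)
```

Both deviations should be of the order of the machine precision.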

4.2 Fast DCT-II Algorithms

Computing the DCT of a vector $x \in \mathbb{R}^N$ via the matrix-vector multiplications from Definition 4.1 has a runtime of $O(N^2)$. Fortunately, as for the DFT, there exist more efficient techniques for computing DCTs, which can achieve runtimes of $O(N \log N)$.
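To make the quadratic cost concrete, the following sketch computes the DCT-II by an explicit matrix-vector product. It is only an illustration and assumes that Definition 4.1 uses the orthogonal normalization $C^{\mathrm{II}}_N = \sqrt{2/N}\,\big(\varepsilon_N(j) \cos\frac{j(2k+1)\pi}{2N}\big)_{j,k=0}^{N-1}$ with $\varepsilon_N(0) = 1/\sqrt{2}$; the function names are hypothetical.

```python
import math

def dct2_matrix(n):
    """Cosine matrix of type II (assumed orthogonal normalization)."""
    scale = math.sqrt(2.0 / n)
    return [[scale * (1.0 / math.sqrt(2.0) if j == 0 else 1.0)
             * math.cos(j * (2 * k + 1) * math.pi / (2 * n))
             for k in range(n)] for j in range(n)]

def dct2_direct(x):
    """DCT-II as a plain matrix-vector product: N^2 multiplications."""
    c = dct2_matrix(len(x))
    return [sum(c_jk * x_k for c_jk, x_k in zip(row, x)) for row in c]

x = [1.0, 2.0, 3.0, 4.0]
y = dct2_direct(x)
print(y)
```

Under the assumed normalization the transform is orthogonal, so the Euclidean norm of `y` equals that of `x`, and `y[0]` is the sum of the entries of `x` divided by $\sqrt{N}$.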

Some of these methods use a divide-and-conquer approach like the one we explained in Section 1.1.1, some only employ real arithmetic, whereas others are based on existing FFT algorithms. For more details on an efficient algorithm for computing the DCT-II via FFTs, see Section 5.2 of this thesis; for FFT-based algorithms for the other types of the DCT see, e.g., [PPST19], Section 6.3.1. It can also be shown that the order $O(N \log N)$ for the runtime of the fast DCT is optimal for arbitrary input vectors of length $N$.

As the existence of fast algorithms for the DCT-II is integral to the methods we will present in Chapter 6, we now briefly sketch a fast algorithm for the DCT-II. This section, in which we present a method based on orthogonal matrix factorizations that only employs real arithmetic, is based on [PPST19], Section 6.3.2.

Let $N \in \mathbb{N}$ be even with $N \geq 4$. Our aim is to factorize the matrix $C^{\mathrm{II}}_N$ into a product of sparse orthogonal matrices that allow for divide-and-conquer steps. For the DFT, the factorizations of $F_N$ corresponding to the radix-2 algorithms which we sketched in Section 1.1.1 mainly depend on $F_{N/2}$, a permutation matrix and a simple diagonal matrix, see, e.g., [PPST19], Section 5.2.3. However, the factorization of $C^{\mathrm{II}}_N$ we want to employ does not only depend on $C^{\mathrm{II}}_{N/2}$, but also on $C^{\mathrm{IV}}_{N/2}$. Hence, we also require a factorization of $C^{\mathrm{IV}}_{N/2}$ into orthogonal matrices.

The following factorization of $C^{\mathrm{II}}_N$ was proved in [PT05], Lemma 2.2 (i). See also [PPST19], Theorem 6.32 (i), for more details.

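Lemma 4.5 Let $N \in \mathbb{N}$ be even, $N \geq 4$, and let $P_N \in \mathbb{R}^{N \times N}$ be the even-odd permutation matrix. Further, define

$$T_N := \frac{1}{\sqrt{2}} \begin{pmatrix} I_{N/2} & J_{N/2} \\ I_{N/2} & -J_{N/2} \end{pmatrix},$$

where $I_{N/2}$ denotes the identity matrix of size $\frac{N}{2} \times \frac{N}{2}$ and $J_{N/2}$ the counter identity matrix of size $\frac{N}{2} \times \frac{N}{2}$ from Theorem 4.4. Then $C^{\mathrm{II}}_N$ satisfies the following factorization,

$$C^{\mathrm{II}}_N = P_N^T \begin{pmatrix} C^{\mathrm{II}}_{N/2} & \\ & C^{\mathrm{IV}}_{N/2} \end{pmatrix} T_N.$$

Remark 4.6 The even-odd permutation matrix $P_N$ satisfies

$$P_N x = (x_0, x_2, \dots, x_{N-2}, x_1, x_3, \dots, x_{N-1})^T$$

for all $x = (x_k)_{k=0}^{N-1} \in \mathbb{R}^N$, i.e., multiplying $P_N$ by a vector $x$ returns the evenly indexed entries of $x$ in the first half and the oddly indexed ones in the second half. ♦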

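Proof of Lemma 4.5. Let us consider the matrix $P_N C^{\mathrm{II}}_N$. By Remark 4.6 we obtain that

$$P_N C^{\mathrm{II}}_N = \sqrt{\frac{2}{N}} \begin{pmatrix} \left( \varepsilon_N(2j) \cos\frac{2j(2k+1)\pi}{2N} \right)_{j,k=0}^{\frac{N}{2}-1,\, N-1} \\ \left( \cos\frac{(2j+1)(2k+1)\pi}{2N} \right)_{j,k=0}^{\frac{N}{2}-1,\, N-1} \end{pmatrix},$$

where $\varepsilon_N(0) = \frac{1}{\sqrt{2}}$ and $\varepsilon_N(j) = 1$ otherwise. Since $\sqrt{\frac{2}{N}} = \frac{1}{\sqrt{2}} \sqrt{\frac{4}{N}}$ and $\varepsilon_N(2j) = \varepsilon_{N/2}(j)$, writing this as a block matrix consisting of four submatrices indexed from $0$ to $\frac{N}{2} - 1$ yields

$$P_N C^{\mathrm{II}}_N = \frac{1}{\sqrt{2}} \begin{pmatrix} C^{\mathrm{II}}_{N/2} & \sqrt{\frac{4}{N}} \left( \varepsilon_{N/2}(j) \cos\frac{j(N+2k+1)\pi}{N} \right)_{j,k=0}^{\frac{N}{2}-1} \\ C^{\mathrm{IV}}_{N/2} & \sqrt{\frac{4}{N}} \left( \cos\frac{(2j+1)(N+2k+1)\pi}{2N} \right)_{j,k=0}^{\frac{N}{2}-1} \end{pmatrix}. \tag{4.4}$$

Since

$$\cos\frac{j(N+2k+1)\pi}{N} = \cos\left( 2j\pi - \frac{j(N-2k-1)\pi}{N} \right) = \cos\frac{j\left(2\left(\frac{N}{2}-1-k\right)+1\right)\pi}{N},$$

we find for the top-right quadrant of the matrix in (4.4) that

$$\sqrt{\frac{4}{N}} \left( \varepsilon_{N/2}(j) \cos\frac{j(N+2k+1)\pi}{N} \right)_{j,k=0}^{\frac{N}{2}-1} = C^{\mathrm{II}}_{N/2} J_{N/2}. \tag{4.5}$$

Analogously, since

$$\cos\frac{(2j+1)(N+2k+1)\pi}{2N} = \cos\left( (2j+1)\pi - \frac{(2j+1)(N-2k-1)\pi}{2N} \right) = -\cos\frac{(2j+1)\left(2\left(\frac{N}{2}-1-k\right)+1\right)\pi}{2N},$$

the bottom-right quadrant of the matrix in (4.4) can be written as

$$\sqrt{\frac{4}{N}} \left( \cos\frac{(2j+1)(N+2k+1)\pi}{2N} \right)_{j,k=0}^{\frac{N}{2}-1} = -C^{\mathrm{IV}}_{N/2} J_{N/2}. \tag{4.6}$$

Combining (4.5) and (4.6) with (4.4), we obtain that

$$P_N C^{\mathrm{II}}_N = \frac{1}{\sqrt{2}} \begin{pmatrix} C^{\mathrm{II}}_{N/2} & C^{\mathrm{II}}_{N/2} J_{N/2} \\ C^{\mathrm{IV}}_{N/2} & -C^{\mathrm{IV}}_{N/2} J_{N/2} \end{pmatrix} = \begin{pmatrix} C^{\mathrm{II}}_{N/2} & \\ & C^{\mathrm{IV}}_{N/2} \end{pmatrix} T_N.$$

As $P_N^{-1} = P_N^T$, the claim follows.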

Note that both $P_N$, as a permutation matrix, and $T_N$ are orthogonal, which implies that the factorization from Lemma 4.5 is indeed one into real orthogonal sparse matrices.
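The factorization and the orthogonality of $T_N$ can be checked numerically for a small $N$. The sketch below is an illustration rather than part of the thesis. It assumes $T_N = \frac{1}{\sqrt{2}} \begin{pmatrix} I_{N/2} & J_{N/2} \\ I_{N/2} & -J_{N/2} \end{pmatrix}$ and the orthogonal normalizations of $C^{\mathrm{II}}_N$ and $C^{\mathrm{IV}}_N$; all function names are hypothetical.

```python
import math

def cos2(n):
    """Cosine matrix of type II (assumed orthogonal normalization)."""
    s = math.sqrt(2.0 / n)
    return [[s * (1.0 / math.sqrt(2.0) if j == 0 else 1.0)
             * math.cos(j * (2 * k + 1) * math.pi / (2 * n))
             for k in range(n)] for j in range(n)]

def cos4(n):
    """Cosine matrix of type IV (assumed orthogonal normalization)."""
    s = math.sqrt(2.0 / n)
    return [[s * math.cos((2 * j + 1) * (2 * k + 1) * math.pi / (4 * n))
             for k in range(n)] for j in range(n)]

def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def even_odd_perm(n):
    """P_N: row i selects the even indices first, then the odd ones."""
    order = list(range(0, n, 2)) + list(range(1, n, 2))
    return [[1.0 if k == src else 0.0 for k in range(n)] for src in order]

def butterfly(n):
    """T_N = 2^{-1/2} [[I, J], [I, -J]] (assumed form)."""
    h, s = n // 2, 1.0 / math.sqrt(2.0)
    t = [[0.0] * n for _ in range(n)]
    for i in range(h):
        t[i][i], t[i][n - 1 - i] = s, s
        t[h + i][i], t[h + i][n - 1 - i] = s, -s
    return t

def block_diag(a, b):
    h = len(a)
    out = [[0.0] * (2 * h) for _ in range(2 * h)]
    for i in range(h):
        for j in range(h):
            out[i][j], out[h + i][h + j] = a[i][j], b[i][j]
    return out

def max_abs_diff(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

n = 8
p, t = even_odd_perm(n), butterfly(n)
rhs = matmul(matmul(transpose(p), block_diag(cos2(n // 2), cos4(n // 2))), t)
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

err_factorization = max_abs_diff(cos2(n), rhs)
err_orthogonal = max_abs_diff(matmul(t, transpose(t)), identity)
print(err_factorization, err_orthogonal)
```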

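This factorization now provides us with the necessary tools for a first divide-and-conquer step. Let $x \in \mathbb{R}^N$, where $N \geq 4$ is even. Let us denote by $x^{(0)}$ and $x^{(1)}$ the first and second half of $x$, respectively, i.e.,

$$x^{(0)} := (x_k)_{k=0}^{\frac{N}{2}-1} \quad \text{and} \quad x^{(1)} := (x_k)_{k=\frac{N}{2}}^{N-1}.$$

Then Lemma 4.5 yields

$$C^{\mathrm{II}}_N x = P_N^T \begin{pmatrix} C^{\mathrm{II}}_{N/2} & \\ & C^{\mathrm{IV}}_{N/2} \end{pmatrix} T_N x = \frac{1}{\sqrt{2}} P_N^T \begin{pmatrix} C^{\mathrm{II}}_{N/2} \left( x^{(0)} + J_{N/2} x^{(1)} \right) \\ C^{\mathrm{IV}}_{N/2} \left( x^{(0)} - J_{N/2} x^{(1)} \right) \end{pmatrix}.$$

Consequently, the first subproblem of half size is to compute the DCT-II of the vector $x^{(0)} + J_{N/2} x^{(1)} \in \mathbb{R}^{N/2}$, and the second subproblem of half size is to compute the DCT-IV of the vector $x^{(0)} - J_{N/2} x^{(1)} \in \mathbb{R}^{N/2}$. Thus, we also require a factorization of the matrix $C^{\mathrm{IV}}_{N/2}$ into real orthogonal sparse matrices such that we can apply the divide-and-conquer paradigm.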
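The splitting step can be sketched as follows. This is not the fast algorithm itself: the DCT-II half is split recursively, but the DCT-IV half is evaluated by a direct matrix-vector product, because the type IV factorization is not used here. The sketch assumes the orthogonal normalizations of both cosine matrices and the factor $\frac{1}{\sqrt{2}}$ in $T_N$; the function names are hypothetical.

```python
import math

def dct2_direct(x):
    """DCT-II by direct summation (assumed orthogonal normalization)."""
    n = len(x)
    s = math.sqrt(2.0 / n)
    return [s * (1.0 / math.sqrt(2.0) if j == 0 else 1.0)
            * sum(x[k] * math.cos(j * (2 * k + 1) * math.pi / (2 * n))
                  for k in range(n)) for j in range(n)]

def dct4_direct(x):
    """DCT-IV by direct summation (assumed orthogonal normalization)."""
    n = len(x)
    s = math.sqrt(2.0 / n)
    return [s * sum(x[k] * math.cos((2 * j + 1) * (2 * k + 1) * math.pi / (4 * n))
                    for k in range(n)) for j in range(n)]

def dct2_split(x):
    """One divide-and-conquer step per level: DCT-II half recursive, DCT-IV half direct."""
    n = len(x)
    if n < 4 or n % 2:
        return dct2_direct(x)
    h = n // 2
    x0, x1_reversed = x[:h], x[h:][::-1]       # J_{N/2} x^(1) reverses the second half
    s = 1.0 / math.sqrt(2.0)
    u = [s * (a + b) for a, b in zip(x0, x1_reversed)]   # input of the DCT-II subproblem
    v = [s * (a - b) for a, b in zip(x0, x1_reversed)]   # input of the DCT-IV subproblem
    y = [0.0] * n
    y[0::2] = dct2_split(u)    # P_N^T places the DCT-II output at the even indices
    y[1::2] = dct4_direct(v)   # and the DCT-IV output at the odd indices
    return y

x = [float(k * k % 7) for k in range(8)]
deviation = max(abs(a - b) for a, b in zip(dct2_split(x), dct2_direct(x)))
print(deviation)
```

Replacing `dct4_direct` by a recursive routine built on a sparse factorization of the type IV matrix is what turns this sketch into an $O(N \log N)$ method.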

The following factorization of $C^{\mathrm{IV}}_N$ was shown in [PT05], Lemma 2.4. See [PPST19], Section 6.3.2, Theorem 6.33, for more details.

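Lemma 4.7 Let $N \in \mathbb{N}$ be even, $N \geq 4$, and let the sparse matrices $A_N$ and $T_N(1)$ be defined as in [PT05], Lemma 2.4. Then $C^{\mathrm{IV}}_N$ satisfies the following factorization,

$$C^{\mathrm{IV}}_N = P_N^T A_N \begin{pmatrix} C^{\mathrm{II}}_{N/2} & \\ & C^{\mathrm{II}}_{N/2} \end{pmatrix} T_N(1).$$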

Note that $A_N$ and $T_N(1)$ are indeed orthogonal matrices. With the factorization from Lemma 4.7, the subproblem of size $\frac{N}{2}$ of computing the DCT-IV of the vector $x^{(0)} - J_{N/2} x^{(1)}$ can be reduced to two subproblems of size $\frac{N}{4}$ of essentially computing DCT-IIs of length $\frac{N}{4}$.

These subproblems of size $\frac{N}{4}$ can again be split into a DCT-II and a DCT-IV computation of length $\frac{N}{8}$ if 8 divides $N$. Continuing these splitting steps until the vectors in the subproblems have length 2, for which we compute the DCT-II and DCT-IV directly, yields fast algorithms for the DCT-II and DCT-IV. Careful consideration of the matrix factorizations given by Lemmas 4.5 and 4.7 shows that, for a vector $x \in \mathbb{R}^N$, $N = 2^J$, the fast DCT-II algorithm performs

$$\frac{4}{3} N J - \frac{8}{9} N - \frac{1}{9} (-1)^J + 1 = O(N J) = O(N \log_2 N)$$

real additions and

$$N J - \frac{4}{3} N + \frac{1}{3} (-1)^J + 1 = O(N \log_2 N)$$

real multiplications. The fast DCT-IV algorithm also has a runtime of $O(N \log_2 N)$, with similarly small constants. See [PPST19], Section 6.3.2, Theorem 6.39, for a proof of these runtime complexities.
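To get a feeling for these constants, the two operation counts can be evaluated exactly for small $J$ and compared with the $N^2$ multiplications of a direct matrix-vector product. The function names in this sketch are hypothetical; the formulas are the ones stated above.

```python
from fractions import Fraction

def dct2_additions(j):
    """Addition count of the fast DCT-II for N = 2^j, evaluated exactly."""
    n = 2 ** j
    return Fraction(4, 3) * n * j - Fraction(8, 9) * n - Fraction(1, 9) * (-1) ** j + 1

def dct2_multiplications(j):
    """Multiplication count of the fast DCT-II for N = 2^j, evaluated exactly."""
    n = 2 ** j
    return n * j - Fraction(4, 3) * n + Fraction(1, 3) * (-1) ** j + 1

for j in range(1, 7):
    n = 2 ** j
    print(n, dct2_additions(j), dct2_multiplications(j), n * n)
```

For $N = 8$, for example, the formulas give 26 additions and 14 multiplications, compared with 64 multiplications for the direct product.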

Remark 4.8 Since $C^{\mathrm{III}}_N = (C^{\mathrm{II}}_N)^T = (C^{\mathrm{II}}_N)^{-1}$, Lemma 4.5 also provides us with an orthogonal factorization of the cosine matrix of type III. Thus, we directly obtain a fast algorithm with runtime $O(N \log_2 N)$ for the DCT-III, which is the same as the IDCT-II.

It can be shown that for the DCT-I there also exist fast algorithms with a runtime of $O(N \log_2 N)$. For an overview of several fast methods for the different DCT and DST types see, e.g., [BYR06], Section 4.4.

Besides the already mentioned possibility of computing the DCT of a vector via the DFT, which will be explained for the DCT-II in detail in Section 5.2, there also exist fast DCT algorithms based on Chebyshev polynomials. These methods use factorizations of the cosine matrices which are not orthogonal, thus leading to less stable algorithms, but also only require real arithmetic, see, e.g., [Fei90, FW92, PM03, Ste92, ST91]. Further, the DCT-II can be computed via the Walsh-Hadamard transform, see, e.g., [AR75] and [BYR06], Section 4.4.3.3. There also exist split-radix methods for the DCT-II, see, e.g., [BYR06], Section 4.4.3.4. Other algorithms include, for example, [SH86, Wan84, Wan83, CSF77]. All of these methods have a runtime of $O(N \log N)$ for arbitrary vectors of length $N$. ♦
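The relation between the DCT-III and the inverse DCT-II from Remark 4.8 can be illustrated with a direct (non-fast) implementation. The sketch assumes the orthogonal normalization of the type II matrix; the function names are hypothetical.

```python
import math

def cos2(n):
    """Cosine matrix of type II (assumed orthogonal normalization)."""
    s = math.sqrt(2.0 / n)
    return [[s * (1.0 / math.sqrt(2.0) if j == 0 else 1.0)
             * math.cos(j * (2 * k + 1) * math.pi / (2 * n))
             for k in range(n)] for j in range(n)]

def apply(mat, x):
    return [sum(a * b for a, b in zip(row, x)) for row in mat]

def dct2(x):
    return apply(cos2(len(x)), x)

def dct3(y):
    """DCT-III as multiplication with the transposed type II matrix."""
    transposed = [list(col) for col in zip(*cos2(len(y)))]
    return apply(transposed, y)

x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0]
roundtrip = dct3(dct2(x))
roundtrip_error = max(abs(a - b) for a, b in zip(x, roundtrip))
print(roundtrip_error)
```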

4.3 2-Dimensional Discrete Cosine Transform

Some of the main areas of application for discrete cosine and sine transforms are digital image or video processing and compression, and transform-based coding applications.

All of these problems are at least 2-dimensional, so there has also been extensive research regarding the development of fast 2-dimensional DCT and DST algorithms, with particular focus on the DCT-II.

We now define the 2-dimensional discrete cosine transforms of types II and IV, see, e.g., [RY90], Chapter 5 and [BYR06], Section 4.5.

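Definition 4.9 (2-Dimensional DCT-II and DCT-IV) Let $A \in \mathbb{R}^{M \times N}$. Then the 2-dimensional discrete cosine transforms of types II and IV of $A$ are defined as

$$A^{\widehat{\mathrm{II}}} := C^{\mathrm{II}}_M A \, (C^{\mathrm{II}}_N)^T \quad \text{and} \quad A^{\widehat{\mathrm{IV}}} := C^{\mathrm{IV}}_M A \, C^{\mathrm{IV}}_N.$$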
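The definition of the 2-dimensional DCT-II can be evaluated literally as two matrix products. The following sketch does exactly that, without any fast algorithm, and assumes the orthogonal normalization of the type II matrix; the function names are hypothetical.

```python
import math

def cos2(n):
    """Cosine matrix of type II (assumed orthogonal normalization)."""
    s = math.sqrt(2.0 / n)
    return [[s * (1.0 / math.sqrt(2.0) if j == 0 else 1.0)
             * math.cos(j * (2 * k + 1) * math.pi / (2 * n))
             for k in range(n)] for j in range(n)]

def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2_2d(a):
    """2-dimensional DCT-II of an M x N matrix: C_M^II  A  (C_N^II)^T."""
    m, n = len(a), len(a[0])
    return matmul(matmul(cos2(m), a), transpose(cos2(n)))

a = [[1.0] * 4 for _ in range(2)]   # constant 2 x 4 matrix
a_hat = dct2_2d(a)
print(a_hat)
```

For a constant input only the entry with index $(0, 0)$ is nonzero; under the assumed normalization it equals $\sqrt{MN}$ times the constant.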

The other 2-dimensional discrete trigonometric transforms are defined analogously. As for the 1-dimensional DCT, the 2-dimensional DCT of a real $M \times N$ matrix can be calculated by applying a 2-dimensional DFT, see, e.g., [RY90], Section 5.4. However, there also exist direct approaches for computing 2-dimensional discrete trigonometric transforms. The first method, the so-called row-column method, is based on the ability
