
4.5 Simulation results

In this section, we present simulation results that verify the theoretical analysis. We use a synthetic room impulse response h(n) based on a statistical reverberation model, which generates a room impulse response as a realization of a nonstationary stochastic process h(n) = u(n)β(n)e^{−αn}, where u(n) is a step function, β(n) is zero-mean white Gaussian noise, and α is related to the reverberation time T60 (the time for the reverberant sound energy to drop by 60 dB from its original value). In the following simulations, the length of the impulse response is set to 16 ms, the sampling rate is 16 kHz, α corresponds to T60 = 50 ms, and β(n) is unit-variance zero-mean white Gaussian noise. We use a Hamming synthesis window with 50% overlap (L = 0.5N), and a corresponding minimum-energy analysis window which satisfies the completeness condition [72]. The signals x(n) and ξ(n) are uncorrelated zero-mean white Gaussian. Figure 4.2 shows the mse curves, both in theory and in simulation, as a function of the ratio between the analysis window length and the impulse response length. Figure 4.2(a) shows the mse curves for SNR values of −10, 0, and 10 dB, obtained with a signal length of 3 seconds (corresponding to Nx = 48,000), and Fig. 4.2(b) shows the mse curves for signal lengths of 3 and 15 sec, obtained with a −10 dB SNR. The experimental results are obtained by averaging over 100 independent runs. The theoretical analysis accurately describes the mse performance achievable by using the MTF approximation. As the SNR or the signal length increases, a lower mse can be achieved by using a longer analysis window. Accordingly, as the power of the input signal increases or as the time variations in the system become slower (which enables the use of a longer input signal), a longer analysis window should be used to make the MTF approximation appropriate for system identification in the STFT domain.
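The synthetic impulse response described above can be sketched as follows. This is a minimal illustration, not the code used for the reported results. The text only states that α is "related to" T60; the mapping used here (the energy envelope e^{−2αn} falls by 60 dB after T60 seconds) is an assumption, and the function name `synthetic_rir` and the fixed seed are hypothetical.

```python
import numpy as np

def synthetic_rir(duration_s=0.016, fs=16000, t60_s=0.05, seed=0):
    """Draw one realization of h(n) = u(n) * beta(n) * exp(-alpha * n)."""
    # Sample indices n >= 0, so the step function u(n) equals 1 throughout.
    n = np.arange(int(round(duration_s * fs)))
    # Assumed mapping: energy envelope exp(-2*alpha*n) drops 60 dB at n = T60 * fs.
    alpha = 3.0 * np.log(10.0) / (t60_s * fs)
    # beta(n): unit-variance zero-mean white Gaussian noise.
    beta = np.random.default_rng(seed).standard_normal(n.size)
    return beta * np.exp(-alpha * n), alpha

h, alpha = synthetic_rir()
```

With the parameters of this section (16 ms at 16 kHz), the response has 256 taps. Averaging over independent runs, as in the experiments, would correspond to calling the function with different seeds.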

4.6 Conclusions

We have derived explicit relations between the mmse and the analysis window length, for a system identifier implemented in the STFT domain and relying on the MTF approximation. We showed that the mmse does not necessarily decrease as the window length increases, due to the finite length of the input signal. The optimal window length that achieves the mmse depends on the SNR and length of the input signal.

It is worth noting that the stationarity of the input signal should also be taken into account when determining the appropriate window length. For nonstationary input signals it may be necessary to use a shorter analysis window for more efficient representation in the STFT domain. Furthermore, the performance analysis is based on a normalized mse in the STFT domain. One may also be interested in analyzing the mse in the time domain, which is a topic for further research.


[Figure 4.2, panels (a) and (b): ε [dB] versus N/Nh, simulation and theory curves. Panel (a): SNR = −10 dB, 0 dB, and 10 dB. Panel (b): Nx = 3 sec and Nx = 15 sec.]

Figure 4.2: Comparison of simulation (solid) and theoretical (dashed) mse curves as a function of the ratio between the analysis window length (N) and the impulse response length (Nh). (a) Comparison for several SNR values (input signal length is 3 seconds); (b) Comparison for several signal lengths (SNR is −10 dB).

96 CHAPTER 4. MTF APPROXIMATION IN THE STFT DOMAIN

Chapter 5

Adaptive System Identification in the STFT Domain Using Cross-MTF Approximation¹

In this chapter, we introduce cross-multiplicative transfer function (CMTF) approximation for modeling linear systems in the short-time Fourier transform (STFT) domain. We assume that the transfer function can be represented by cross-multiplicative terms between distinct subbands. We investigate the influence of cross-terms on a system identifier implemented in the STFT domain, and derive analytical relations between the noise level, data length, and number of cross-multiplicative terms, which are useful for system identification. As more data becomes available or as the noise level decreases, additional cross-terms should be considered and estimated to attain the minimal mean-square error (mse). A substantial improvement in performance is then achieved over the conventional multiplicative transfer function (MTF) approximation. Furthermore, we derive explicit expressions for the transient and steady-state mse performances obtained by adaptively estimating the cross-terms. As more cross-terms are estimated, a lower steady-state mse is achieved, but the algorithm then suffers from slower convergence. Experimental results validate the theoretical derivations and demonstrate the effectiveness of the proposed approach as applied to acoustic echo cancellation.

¹ This chapter is based on [99].


98 CHAPTER 5. ADAPTIVE IDENTIFICATION USING CMTF

5.1 Introduction

Identifying linear time-invariant (LTI) systems in the short-time Fourier transform (STFT) domain has been studied extensively, and is of major importance in many applications [3,19,21,22,35,65,100]. LTI system representation in the STFT domain generally requires crossband filters between subbands [16,65]. To avoid the crossband filters, a multiplicative transfer function (MTF) approximation is often employed (e.g., [3,35]). This approximation relies on the assumption that the support of the STFT analysis window is sufficiently large compared to the duration of the system impulse response, and that the transfer function of the system can be modeled as multiplicative. As the length of the analysis window increases, the MTF approximation becomes more accurate. However, the length of the input signal that can be employed for the system identification is usually finite to enable tracking during time variations of the system. Hence, as the length of the analysis window increases, fewer observations in each frequency bin become available.
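The last point can be made concrete with a short count of the STFT frames that fit into a fixed-length signal. This is a minimal sketch under stated assumptions: only full frames are counted (no padding at the signal edges), the hop is half the window length as in the 50% overlap setting of Section 4.5, and the helper name `frames_per_bin` is hypothetical.

```python
def frames_per_bin(nx, n, overlap=0.5):
    """Number of full length-n frames in nx samples with hop (1 - overlap) * n."""
    hop = int(n * (1 - overlap))
    # Each frame contributes one observation to every frequency bin.
    return 0 if nx < n else (nx - n) // hop + 1

nx = 48000  # 3 s at 16 kHz, the signal length used in Section 4.5
counts = {n: frames_per_bin(nx, n) for n in (256, 1024, 4096)}
print(counts)
```

For a fixed Nx, the count falls roughly in inverse proportion to the window length N, which is the source of the second error term discussed in the next paragraph.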

Recently, we have investigated the influence of the analysis window length on the performance of a system identifier that relies on the MTF approximation [98]. We showed that the minimum mean-square error (mse) attainable under this approximation can be decomposed into two error terms. The first term, attributable to using a finite-support analysis window, is monotonically decreasing as a function of the window length, while the second term is a consequence of restricting the length of the input signal, and is monotonically increasing as a function of the window length. Therefore, system identification performance does not necessarily improve by increasing the length of the analysis window. The signal-to-noise ratio (SNR) and the input signal length determine the optimal length of the window. We showed that as the SNR or input signal length decreases, a shorter analysis window should be used.

In this chapter, we introduce cross-multiplicative transfer function (CMTF) approximation in the STFT domain. The transfer function of the system is represented by cross-multiplicative terms between distinct subbands, and data from adjacent frequency bins is used for the system identification. Two identification schemes are introduced: One is an off-line scheme in the STFT domain based on the least-squares (LS) criterion for estimating the CMTF coefficients. In the second scheme, the cross-terms are estimated adaptively using the least-mean-square (LMS) algorithm [10]. We analyze the performances of both schemes and derive explicit expressions for the obtainable minimum mse (mmse). The analysis reveals important relations between the noise level, data length, and number of cross-multiplicative terms, which are useful for system identification. As more data becomes available or as the noise level decreases, additional cross-terms should be considered and estimated to attain the mmse. In this case, a substantial improvement in performance is achieved over the conventional MTF approximation. For every data length and noise level there exists an optimal number of useful cross-multiplicative terms, so increasing the number of estimated cross-terms does not necessarily imply a lower mse. Note that similar results have been obtained in the context of system identification with crossband filters [65].
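As a rough illustration of the off-line scheme, the sketch below estimates cross-terms for a single frequency bin by least squares. The exact CMTF model is defined in Section 5.2 and is not reproduced here; the sketch assumes that the output in bin k is a linear combination of the input in the 2K+1 adjacent bins (wrapped circularly), which is an assumption made for illustration. The function name `ls_cross_terms` and its arguments are hypothetical.

```python
import numpy as np

def ls_cross_terms(X, d_k, k, K):
    """LS estimate of 2K+1 cross-terms for bin k (assumed model, see lead-in).

    X   : (P, N) complex STFT of the input signal (P frames, N bins)
    d_k : (P,) complex STFT of the system output in bin k
    """
    N = X.shape[1]
    # Adjacent bins k-K, ..., k+K, wrapped circularly (an assumption).
    bins = (k + np.arange(-K, K + 1)) % N
    A = X[:, bins]  # (P, 2K+1) regressor matrix
    # Minimize ||d_k - A c||^2 over the complex coefficient vector c.
    coeffs, *_ = np.linalg.lstsq(A, d_k, rcond=None)
    return bins, coeffs
```

Under this assumed model, K = 0 uses only bin k itself and thus reduces to a single multiplicative coefficient per bin, in the spirit of the MTF approach. Larger K adds parameters that must be estimated from the same P frames, which is consistent with the observation that more cross-terms do not necessarily lower the mse.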

The main contribution of this work is an explicit convergence analysis of the CMTF approximation, which includes the MTF approach as a special case. We derive explicit expressions for the transient and steady-state mse in frequency bins for white Gaussian processes. At the beginning of the adaptation process, the length of the data is short, and only a few cross-terms should be estimated, whereas as more data becomes available more cross-terms can be used to achieve the mmse. Consequently, the MTF approach is associated with faster convergence, but suffers from higher steady-state mse. Estimation of additional cross-terms results in a lower convergence rate, but improves the steady-state mse with a small increase in computational cost. Experimental results with white Gaussian signals and real speech signals validate the theoretical results derived in this work, and demonstrate the relations between the number of useful cross-terms and transient and steady-state mse.
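The adaptive scheme can be sketched with a standard complex LMS recursion applied to the cross-terms of one bin. This is a generic LMS update under an assumed model (output in bin k as a linear combination of the input in 2K+1 circularly adjacent bins), not necessarily the exact algorithm analyzed in Section 5.4. The function name `lms_cross_terms` and the step size `mu` are hypothetical.

```python
import numpy as np

def lms_cross_terms(X, d_k, k, K, mu):
    """Adapt 2K+1 cross-terms for bin k with a complex LMS update.

    X   : (P, N) complex STFT of the input signal (P frames, N bins)
    d_k : (P,) complex STFT of the system output in bin k
    mu  : step size (assumed small enough for stable adaptation)
    """
    N = X.shape[1]
    bins = (k + np.arange(-K, K + 1)) % N  # adjacent bins, wrapped circularly
    w = np.zeros(2 * K + 1, dtype=complex)
    err = np.empty(X.shape[0], dtype=complex)
    for p in range(X.shape[0]):
        x = X[p, bins]
        err[p] = d_k[p] - x @ w               # a priori error for frame p
        w = w + mu * err[p] * np.conj(x)      # stochastic-gradient step
    return w, err
```

Tracking `abs(err) ** 2` over frames for several values of K gives an empirical view of the trade-off described above: fewer cross-terms tend to converge faster, while more cross-terms can reach a lower steady-state error when the assumed model holds.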

The chapter is organized as follows. In Section 5.2, we introduce the CMTF approximation for system identification in the STFT domain. In Section 5.3, we consider off-line estimation of the cross-terms, and derive an explicit expression for the attainable mmse. In Section 5.4, we present an adaptive implementation of the CMTF estimation, and analyze the transient and steady-state mse in subbands. Finally, in Section 5.5, we present experimental results which verify the theoretical derivations.
