2.4 Methods for non-negative dimensionality reduction

2.4.3 Splitting approach and non-linear methods

As already indicated, the splitting approach is not suitable for non-linear methods, for several reasons. Even though the cost functionals for Isomap (2.16), LLE (2.18) and LE (2.21) are rotationally invariant (mostly due to the cyclic invariance of the trace), the sufficient condition from Theorem 2.33 is not applicable since these methods are in general not angle-preserving. The alternative condition from Theorem 2.38 does not help either, since we are not aware of appropriate approximate left-inverses for these methods, nor are any given explicitly in the literature.

Another aspect concerns the formulation of the non-linear dimensionality reduction problem of LE and LLE. In Section 2.2.4 we have seen that these problems could not be formulated as an optimization problem on the map P but only on the representation Y. In this setting, imposing the additional constraint Y ≥ 0 is very restrictive, since the combination of $Y D Y^T = \mathrm{Id}_d$ for a diagonal matrix D and Y ≥ 0 forces Y to have at most one non-vanishing entry per column. Thus, the solution is sparse, which is not suitable for our applications.
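The sparsity claim can be checked entry by entry. The following short derivation is only a sketch: it assumes that D has strictly positive diagonal entries, written here as d_k, and it writes y_{ik} for the entries of Y; both notations are introduced for illustration.

```latex
\[
  (Y D Y^T)_{ij} \;=\; \sum_{k} d_k \, y_{ik} \, y_{jk} \;=\; 0
  \qquad \text{for } i \neq j .
\]
% Every summand is non-negative because d_k > 0 and Y >= 0,
% so the sum can only vanish if each summand vanishes:
\[
  y_{ik} \, y_{jk} \;=\; 0
  \qquad \text{for all } k \text{ and all } i \neq j .
\]
```

Hence, under this assumption, no column of Y can contain two non-zero entries.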

Nevertheless, several authors have addressed problems with these constraints. In [78, 127, 132] they have been discussed for NNLE and NNLLE, and the corresponding optimization problems were solved by update algorithms.

Many audio-related applications take advantage of the ability to separate sources from a mixture without prior knowledge about the mixing process. Thus, the analysis and separation of audio signals into their source components is an important tool for the extraction of metadata from audio data, for example when separating musical instruments from a polyphonic ensemble, restoring music or extracting speech from a noisy background. In all these situations, an efficient method to analyze the auditory scene in order to extract essential information is needed. This concept is known as blind signal separation (BSS) and has been the topic of many recent research projects, as already discussed in the introduction of this thesis.

In the case of detection or separation of certain sources from a mixture of signals, time-frequency information about the data is collected and used to decompose the signal into different components, each corresponding to one of the source signals. This decomposition is based on the assumption that the different source signals can be characterized by their frequency distribution. Different methods are available for the decomposition of time-frequency data, e.g. independent component analysis (ICA) or non-negative matrix factorization (NNMF).
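To make the decomposition idea concrete, the following is a minimal sketch of NNMF using the standard multiplicative updates of Lee and Seung for the Frobenius norm. It is a generic textbook variant and not necessarily the algorithm used later in this thesis; the function name `nnmf` and its parameters are hypothetical.

```python
import numpy as np

def nnmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix X (frequencies x frames) as X ~ W @ H.

    Columns of W act as spectral templates, rows of H as their activations
    over time. Uses Lee-Seung multiplicative updates (an assumed, generic choice).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Random strictly positive initialization keeps all iterates non-negative.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((20, 2)) @ rng.random((2, 30))  # synthetic rank-2 data
    W, H = nnmf(X, rank=2)
    print("relative error:", np.linalg.norm(X - W @ H) / np.linalg.norm(X))
```

Each of the `rank` components is then the rank-one matrix formed by one column of W and the matching row of H, which is the part attributed to one source.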

Time-frequency data is typically given by a spectrogram obtained from a signal transform, such as the short-time Fourier transform (STFT). Of course, other transforms can be used for computing a time-frequency representation, but we will stick to the classical STFT. For high-energy signals, the time-frequency data is characterized by the high dimensionality of the Euclidean space in which the data is embedded. More precisely, the dimension of this space is determined by the frequency range of the original signal and the size of the signal transform. A standard value for the frequency range would be 256.
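As an illustration of how such time-frequency data can be generated, here is a minimal sketch of an amplitude spectrogram computed with a windowed FFT. The function name `magnitude_spectrogram`, the Hann window, and the values of `window_size` and `hop` are assumptions made for this example, not the settings used in the thesis.

```python
import numpy as np

def magnitude_spectrogram(signal, window_size=512, hop=128):
    """Return |STFT| of a 1-D signal as a (frequency bins x frames) array."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < window_size:
        raise ValueError("signal is shorter than one window")
    window = np.hanning(window_size)
    n_frames = 1 + (len(signal) - window_size) // hop
    # Cut the signal into overlapping, windowed frames (one frame per column).
    frames = np.stack(
        [signal[i * hop : i * hop + window_size] * window for i in range(n_frames)],
        axis=1,
    )
    # rfft keeps only the non-redundant bins of a real signal: window_size // 2 + 1.
    return np.abs(np.fft.rfft(frames, axis=0))

if __name__ == "__main__":
    sample_rate = 8000
    t = np.arange(sample_rate) / sample_rate
    S = magnitude_spectrogram(np.sin(2 * np.pi * 1000 * t))
    print(S.shape)  # rows are frequency bins, columns are time frames
```

Each column of the result is one data point, so the number of frequency bins is the dimension of the space in which the data is embedded.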

Therefore, it is natural to expect that a reduction of the data’s dimensionality might improve the method and speed up the computation of the data analysis. We observe that in many cases not all information contained in the data points is relevant for understanding the underlying characteristics or properties of the data. Many signals can be sufficiently described by a few dominant frequencies. Also, low-dimensional data sets are much easier to operate with in view of classification, visualization or decomposition. As a consequence, we would like to reduce the dimensionality of the given data in a preprocessing step before we apply a decomposition method. Thus, we focus on the interaction of dimensionality reduction and decomposition methods such as ICA or NNMF. The idea of combining dimensionality reduction and ICA is not a new concept (see [36, 48, 69, 117]).

But to improve these strategies, a better mathematical understanding of these procedures is needed. Also, the substitution of ICA by a non-statistical method such as NNMF could improve the results.

An important aspect we have to take into account is non-negativity. The amplitude spectrogram, output from the STFT, is non-negative. In this context, non-negativity refers to the fact that the data matrix is entry-wise non-negative. This fits very well with the decomposition by non-negative matrix factorization which requires, as the name suggests, non-negative input data. But the application of an intermediate dimensionality reduction step might cause negative entries in the low-dimensional representation. Thus, there is a need for reduction methods which are able to preserve non-negativity. To this end, we have developed an approach for non-negative dimensionality reduction (compare Chapter 2).
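The loss of non-negativity is easy to observe numerically. The sketch below applies ordinary PCA (with mean centering, computed via an SVD) to random non-negative data; it illustrates the problem only and is not the non-negative reduction developed in Chapter 2. The function name `pca_project` and the synthetic data are assumptions for this example.

```python
import numpy as np

def pca_project(X, d):
    """Project the columns of X onto the first d principal axes after centering."""
    mean = X.mean(axis=1, keepdims=True)
    centered = X - mean
    # Left singular vectors of the centered data are the principal axes.
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :d].T @ centered

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((256, 40))  # entry-wise non-negative stand-in for a spectrogram
    Y = pca_project(X, 3)
    print("min of input:  ", X.min())
    print("min of reduced:", Y.min())
```

Because each row of the centered projection sums to zero, any non-trivial row must contain negative entries, so such a representation cannot be passed to NNMF without further treatment.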

In the present section, we will introduce a signal separation procedure which includes this dimensionality reduction step. We will combine different techniques and discuss several numerical examples to illustrate the algorithm’s applications. We will focus on the separation of single-channel drum and percussion tracks, which are typically high-energy signals. Some research has been done in this direction (see [36, 117]), but so far the combination of non-negative dimensionality reduction and NNMF has not been considered by other authors.

It will turn out that the combination of PCA or our NNPCA with ICA yields the best separation. However, our new NNPCA combined with NNMF performs almost as well as PCA and ICA, whereas Isomap followed by either of the two decomposition methods shows very poor separation quality. In the latter case, we used a naive kernel approach (compare [86]) for approximating the inverse reduction map.

In Section 3.1, we introduce the concept of signal detection and separation and review the involved methods in several subsections. Section 3.1.1 is dedicated to the generation of time-frequency data, where we introduce the short-time Fourier transform. In Section 3.1.2, we discuss some difficulties which arise when it comes to dimensionality reduction in signal separation. Thereafter, we study decomposition techniques focusing on independent component analysis and non-negative matrix factorization (Section 3.1.3).

In Section 3.2, we will discuss three examples. We will use the algorithm described above to separate different mixtures of single-channel audio recordings. The examples are introduced in Section 3.2.1 and the results are discussed in Section 3.2.2.