
vectors. Stacking these vectors next to each other leads to the so called factor matrices. E.g., the first factor matrix of the tensor described in (5.17) is given by $U_1 = [u_1, u_2, \cdots, u_{R_U}]$. Factor matrices along the other dimensions can be described in a similar way. The tensor $U$ can also be represented as a matrix or as a vector; these processes are referred to as the matricization and the vectorization of the tensor, respectively. For a three dimensional tensor, we have the following three matrix unfoldings, i.e. the tensor data expanded into matrices along the individual dimensions,

$$U_{(1)} = A\,(C \odot B)^T, \qquad U_{(2)} = B\,(C \odot A)^T, \qquad U_{(3)} = C\,(B \odot A)^T$$

where $A$, $B$ and $C$ are the three factor matrices, and $\odot$ is the Khatri-Rao product between two matrices, defined as,

$$A \odot B = [\,a_1 \otimes b_1 \,|\, a_2 \otimes b_2 \,|\, \cdots \,|\, a_N \otimes b_N\,] \qquad (5.9)$$

Here $\otimes$ refers to the Kronecker product between two matrices, such that

$$a_1 \otimes b_1 = [\,a_1^{(1)} b_1 \,\Vert\, a_1^{(2)} b_1 \,\Vert\, \cdots \,\Vert\, a_1^{(P_1)} b_1\,] \qquad (5.10)$$

where $|$ and $\Vert$ denote horizontal and vertical concatenation, respectively. Also, $a_i^{(j)}$ refers to the $j$th element of the vector $a_i$. The Kronecker product between two matrices $A \in \mathbb{R}^{P_1 \times n}$ and $B \in \mathbb{R}^{m \times P_2}$ is given by $C \in \mathbb{R}^{m P_1 \times n P_2}$ and is defined in a similar way such that,

$$A \otimes B = \begin{bmatrix} a_{1,1} B & \cdots & a_{1,n} B \\ \vdots & \ddots & \vdots \\ a_{P_1,1} B & \cdots & a_{P_1,n} B \end{bmatrix} \qquad (5.11)$$
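To make these definitions concrete, here is a minimal Python/numpy sketch (our own illustration, not from the source; the helper name khatri_rao and all data are assumptions) that builds a small tensor from three factor matrices and verifies the mode-1 unfolding identity $U_{(1)} = A(C \odot B)^T$ numerically.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R), as in (5.9)."""
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 3
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

# Assemble the full tensor as a sum of R rank-1 outer products
U = np.einsum('ir,jr,kr->ijk', A, B, C)

# Mode-1 unfolding: rows indexed by i, columns by (k, j) with j varying fastest,
# which matches the column ordering of A (C ⊙ B)^T
U1 = U.transpose(0, 2, 1).reshape(I, K * J)
assert np.allclose(U1, A @ khatri_rao(C, B).T)
```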

For the case where a rank-1 tensor represents data in an $N$-dimensional space, there will be $N$ such matrices and vectors that the tensor $U$ can be unfolded into. The exact relationships are given below,

$$U_{(i)} = U_i \,(U_N \odot U_{N-1} \odot \cdots \odot U_{i+1} \odot U_{i-1} \odot \cdots \odot U_1)^T \qquad (5.12)$$

The following identity can be used for the further unfolding of a matrix into a vector,

$$\mathrm{vec}(ABC) = (C^T \otimes A)\,\mathrm{vec}(B) \qquad (5.13)$$
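This identity is easy to verify numerically. A quick illustrative check (arbitrary matrix sizes; the column-stacking, i.e. Fortran-order, vectorization is assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((5, 2))

vec = lambda M: M.reshape(-1, order='F')   # column-stacking vectorization
assert np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B))
```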

5.4 Preliminaries for the Tensor Decomposition based analysis

Owing to their size, tensors in high dimensions are not easy to work with. Hence, alternative representations are sought. One such representation is based on the concept of separation of variables. This will become the foundation of the unified framework for the solution of the FPE in discretized form, as described later in this chapter. Therefore, before undertaking that task, we would like to explain some key concepts.


5.4.1 Variable separation

Representing a function in more than two dimensions gets increasingly complex in terms of computation. This is because the number of discretized points making up the function in a hypercubic space grows geometrically with the dimension $d$, i.e. $N = n_1 \times n_2 \times \cdots \times n_d$. In the context of numerics, this phenomenon, as in Sequential Monte Carlo, is referred to as the curse of dimensionality. The separation of variables representation has been noted as one of the key factors in effectively handling this issue when solving high dimensional discretized problems. A very important contribution in this regard is [BM05], where the authors have developed a framework to efficiently represent functions and operators in such scenarios, and have applied it to compute the wavefunction of the multiparticle Schrödinger equation with one-dimensional particles and simplified potentials. In our work, we will borrow heavily from [BM05], as it is required to solve the discretized Fokker-Planck equation in higher dimensional spaces.

We start with a very basic separated representation for a $d$ dimensional function $u(x_1, x_2, \cdots, x_d): \mathbb{R}^d \to \mathbb{R}$,

$$u(x_1, x_2, \cdots, x_d) \approx u_1(x_1)\, u_2(x_2) \cdots u_d(x_d) \qquad (5.14)$$

As can be seen, the function is separated into its constituent one dimensional functions. However, this is a very simple expression and in most cases it is not sufficient for an accurate approximation. It is natural to extend the expression to a sum of $R_U$ such terms,

$$u(x_1, x_2, \cdots, x_d) \;\approx\; \sum_{j=1}^{R_U} \theta_j\, u_1^j(x_1)\, u_2^j(x_2) \cdots u_d^j(x_d) \;=\; \sum_{j=1}^{R_U} \theta_j \prod_{i=1}^{d} u_i^j(x_i) \qquad (5.15)$$

This expression is more general and, depending on the approximation rank $R_U$, can approximate an arbitrary function much better. By having the function $u(x_1, x_2, \cdots, x_d)$ in the form above, many linear operations in $d$ dimensional space decompose into the corresponding operations in one dimensional spaces. The main task is to keep the approximation rank as low as possible.
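The benefit is easiest to see by counting storage. A back-of-the-envelope comparison (with hypothetical values for $d$, $n$ and $R_U$) between the full discretized grid and the separated form (5.15):

```python
# Full grid: prod(n_i) values; separated form: R_U terms, each holding d
# loading vectors of length n plus one coefficient theta_j.
d, n, R_U = 10, 100, 20            # hypothetical dimension, grid size, rank
full_grid = n ** d                 # 100^10 = 1e20 entries -> intractable
separated = R_U * (d * n + 1)      # 20 * 1001 = 20020 entries
print(f"full: {full_grid:.1e}, separated: {separated}")
```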

5.4.2 Tensor decomposition

Once the function has been approximated in the above form, the next step is to discretize it,

$$u\big(x_1(\eta_1), x_2(\eta_2), \cdots, x_d(\eta_d)\big) = \sum_{j=1}^{R_U} \theta_j\, u_1^j\big(x_1(\eta_1)\big)\, u_2^j\big(x_2(\eta_2)\big) \cdots u_d^j\big(x_d(\eta_d)\big) \qquad (5.16)$$

where $\{\eta_i\}_{i=1}^{d}$ are the indices of the data points along the $d$ dimensions. Each dimension is discretized using $n_i$ points. Since the axes are orthogonal, the discretized representation of the function in (5.15) becomes a $d$-dimensional tensor,

$$U = \sum_{j=1}^{R_U} \theta_j\, u_1^j \circledast u_2^j \circledast \cdots \circledast u_d^j \qquad (5.17)$$

The discretized functions $u_i^j$ become the loading/basis vectors. Therefore, the tensor $U$ is written as a sum of $R_U$ rank-1 tensors, each composed of the outer product of $d$ loading vectors.

In [BM05], the expression in (5.17) is also termed a vector in dimension $d$. Representing a tensor in 2D (an array) by a summation of $R_U$ rank-1 terms bears a special name: the singular value decomposition. For a real matrix $U \in \mathbb{R}^{P_1 \times P_2}$, the vectors $a_i \in \mathbb{R}^{P_1 \times 1}$ and $b_i \in \mathbb{R}^{P_2 \times 1}$ can be called the left and right singular vectors if there exists a non-negative number $\sigma_i$ such that $U b_i = \sigma_i a_i$ and $U^T a_i = \sigma_i b_i$. In that case the matrix $U$ may be expressed as,

$$U = A \Sigma B^T = \sum_{i=1}^{R_U} \sigma_i\, a_i b_i^T = \sum_{i=1}^{R_U} \tilde{a}_i b_i^T$$

where $R_U$ is the decomposition rank and $\tilde{a}_i = \sigma_i a_i$. If such a decomposition exists, it is referred to as the singular value decomposition (SVD) of the matrix $U$. The matrices $A$ and $B$ then hold the left and right singular vectors of $U$, each forming a set of orthonormal basis vectors.

Figure 5.1: An SVD representation for 2D arrays: $U \approx a_1 b_1^T + a_2 b_2^T + \cdots + a_{R_U} b_{R_U}^T$.
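The rank-1 sum sketched in Figure 5.1 can be reproduced with any standard SVD routine. A short numpy illustration (our own example, random data) reconstructing a matrix from its rank-1 terms $\sigma_i a_i b_i^T$:

```python
import numpy as np

rng = np.random.default_rng(2)
U = rng.standard_normal((6, 4))
A, s, Bt = np.linalg.svd(U, full_matrices=False)   # U = A diag(s) B^T

# Accumulate the rank-1 terms sigma_i * a_i b_i^T
U_rebuilt = sum(s[i] * np.outer(A[:, i], Bt[i, :]) for i in range(len(s)))
assert np.allclose(U, U_rebuilt)
```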

For a three or higher dimensional tensor, the analogous decomposition is referred to as the Canonical decomposition or the Parallel factor decomposition (CANDECOMP/PARAFAC decomposition, or CPD) and is defined by equation (5.17). The CP decomposition has some interesting properties. First, the approximation rank, defined as the smallest number of rank-1 tensors that exactly generate the full tensor, can in general only be known within certain bounds, assuming such a sum actually exists in the first place. Secondly, as opposed to the SVD, the CPD can be uniquely determined under a much milder set of assumptions, up to the inherent permutation and scaling indeterminacies. Typically, the CPD is computed using the Alternating Least Squares (ALS) algorithm; an illustrative sketch is given below. For a very lucid description of tensor representations, decompositions and related topics, please refer to [KB09]. A typical CPD representation of a three dimensional tensor is shown in Figure 5.2.
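The following compact sketch illustrates the ALS idea for a 3D tensor (our own illustration under simplifying assumptions: plain pseudo-inverse updates, a fixed iteration count, no convergence test; khatri_rao and cp_als are hypothetical helper names, not from the source). Each factor matrix is the solution of a linear least-squares problem built from the corresponding unfolding, while the other two factors are held fixed:

```python
import numpy as np

def khatri_rao(A, B):
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def cp_als(T, R, n_iter=200, seed=0):
    """Rank-R CP decomposition of a 3D tensor T by alternating least squares."""
    I, J, K = T.shape
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((I, R))
    B = rng.standard_normal((J, R))
    C = rng.standard_normal((K, R))
    # Unfoldings with column orderings matching A(C ⊙ B)^T, B(C ⊙ A)^T, C(B ⊙ A)^T
    T1 = T.transpose(0, 2, 1).reshape(I, K * J)
    T2 = T.transpose(1, 2, 0).reshape(J, K * I)
    T3 = T.transpose(2, 1, 0).reshape(K, J * I)
    for _ in range(n_iter):
        A = T1 @ np.linalg.pinv(khatri_rao(C, B).T)   # fix B, C; solve for A
        B = T2 @ np.linalg.pinv(khatri_rao(C, A).T)   # fix A, C; solve for B
        C = T3 @ np.linalg.pinv(khatri_rao(B, A).T)   # fix A, B; solve for C
    return A, B, C

# Recover a synthetic rank-3 tensor
rng = np.random.default_rng(3)
T = np.einsum('ir,jr,kr->ijk', *(rng.standard_normal((n, 3)) for n in (5, 6, 7)))
A, B, C = cp_als(T, R=3)
print(np.linalg.norm(T - np.einsum('ir,jr,kr->ijk', A, B, C)))
```

On an exactly low-rank synthetic tensor, the residual typically drops to near machine precision, although ALS carries no global guarantee and may stall in local minima; see [KB09] for a discussion.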

Figure 5.2: A CPD representation for a 3D tensor: $U \approx a_1 \circledast b_1 \circledast c_1 + a_2 \circledast b_2 \circledast c_2 + \cdots + a_{R_U} \circledast b_{R_U} \circledast c_{R_U}$.

5.4.3 Tensorization of a linear operator

A linear operator $A$ in dimension $d$ can be defined as a linear map $A: \mathbb{F} \to \mathbb{F}$, where $\mathbb{F}$ is the space of functions in dimension $d$. A matrix $A$ in dimension $d$ is defined to be the discrete representation of a linear operator in dimension $d$. In a way similar to the representation of


the function in dimension $d$, the operator $A$ is first represented with the indices,

$$A = \sum_{j=1}^{N_A} \mu_j\, A_1^j\big(x_1(\eta_1)\big)\, A_2^j\big(x_2(\eta_2)\big) \cdots A_d^j\big(x_d(\eta_d)\big) \qquad (5.18)$$

which is further discretized to yield the following representation,

$$A = \sum_{j=1}^{N_A} \mu_j\, A_1^j \circledast A_2^j \circledast \cdots \circledast A_d^j \qquad (5.19)$$
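As a concrete instance of the form (5.19), consider the standard finite-difference Laplacian in dimension $d$ (our own illustration, not an operator taken from the source): it separates into $N_A = d$ terms, where the $j$th term applies a one-dimensional second-difference matrix $L$ along dimension $j$ and the identity along all others.

```python
import numpy as np
from functools import reduce

n, d = 5, 3                                    # assumed grid size and dimension
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1))            # 1D second-difference matrix
I = np.eye(n)

# Separated form (5.19): factor matrices A_l^j = L if l == j, else identity
terms = [[L if l == j else I for l in range(d)] for j in range(d)]

# Full operator, assembled only for comparison (infeasible for large n, d)
full = sum(reduce(np.kron, t) for t in terms)
print(full.shape)                              # (n**d, n**d) = (125, 125)
```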

Next, two important operations are defined. The first is the inner product between two $d$ dimensional vectors (tensors),

$$\langle U, V \rangle = \sum_{j_1=1}^{N_U} \sum_{j_2=1}^{N_V} \theta_{j_1} \theta_{j_2}\, \langle u_1^{j_1}, v_1^{j_2} \rangle \langle u_2^{j_1}, v_2^{j_2} \rangle \cdots \langle u_d^{j_1}, v_d^{j_2} \rangle \qquad (5.20)$$

where $N_U$ and $N_V$ are the separation ranks of $U$ and $V$, and $\theta_{j_1}$, $\theta_{j_2}$ are the coefficients of their respective terms.
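Equation (5.20) can be checked in a few lines. In the illustrative sketch below (all data random; theta and phi denote the coefficient sets of the two tensors, names of our own choosing), the separated evaluation via one-dimensional dot products is compared against the inner product of the fully assembled tensors:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n, N_U, N_V = 4, 8, 3, 2
theta = rng.standard_normal(N_U)           # coefficients of U
phi = rng.standard_normal(N_V)             # coefficients of V
u = rng.standard_normal((N_U, d, n))       # u[j, i] = loading vector u_i^j
v = rng.standard_normal((N_V, d, n))       # v[j, i] = loading vector v_i^j

# Separated form: sum over term pairs of products of 1D inner products
sep = sum(theta[j1] * phi[j2] * np.prod([u[j1, i] @ v[j2, i] for i in range(d)])
          for j1 in range(N_U) for j2 in range(N_V))

# Reference: assemble both full tensors and take the elementwise inner product
full_u = sum(theta[j] * np.einsum('a,b,c,e->abce', *u[j]) for j in range(N_U))
full_v = sum(phi[j] * np.einsum('a,b,c,e->abce', *v[j]) for j in range(N_V))
assert np.isclose(sep, np.sum(full_u * full_v))
```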

The vectors along each dimension are multiplied such that the result is a scalar, as the check above illustrates. Next, the operation of a tensorized operator on a tensorized function is defined such that,

$$A V = \sum_{i=1}^{N_A} \sum_{j=1}^{N_U} \mu_i \theta_j\, \big(A_1^i v_1^j\big) \circledast \big(A_2^i v_2^j\big) \circledast \cdots \circledast \big(A_d^i v_d^j\big) \qquad (5.21)$$

It can be seen that the result of this product is another tensor, albeit with an increased approximation rank $N_{AV} = N_A \times N_U$. The important thing to note is that the operator has been decomposed into its constituent matrices, each of which operates along its own dimension individually. In other words, a multi-dimensional operator has been decomposed into a sequence of one-dimensional operators. This is a major advantage of using the separated representation for functions and operators. It will be put to full use in the description of a unified framework for the solution of the FPE as per [SK15b], in Section 5.6.
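A small numerical sketch of (5.21) follows (our own illustration with assumed shapes; the dimension is fixed to $d = 3$ so that the full operator can still be assembled via Kronecker products for comparison). Note that the separated application never forms the $n^d \times n^d$ operator; it only multiplies $n \times n$ blocks against length-$n$ vectors:

```python
import numpy as np

rng = np.random.default_rng(5)
d, n, N_A, N_U = 3, 6, 2, 3
mu = rng.standard_normal(N_A)                # operator coefficients
theta = rng.standard_normal(N_U)             # vector coefficients
A_fac = rng.standard_normal((N_A, d, n, n))  # A_fac[i, l] = matrix A_l^i
v_fac = rng.standard_normal((N_U, d, n))     # v_fac[j, l] = vector v_l^j

# Separated application: each of the N_A * N_U result terms is built from
# the one-dimensional products A_l^i v_l^j
coeffs, factors = [], []
for i in range(N_A):
    for j in range(N_U):
        coeffs.append(mu[i] * theta[j])
        factors.append([A_fac[i, l] @ v_fac[j, l] for l in range(d)])

# Reference: assemble the full operator and vector with Kronecker products
kron3 = lambda x, y, z: np.kron(np.kron(x, y), z)
full_A = sum(mu[i] * kron3(*A_fac[i]) for i in range(N_A))
full_v = sum(theta[j] * kron3(*v_fac[j]) for j in range(N_U))
result = sum(c * kron3(*f) for c, f in zip(coeffs, factors))
assert np.allclose(result, full_A @ full_v)
```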