
Modeling of multivariate skewness measure distribution

Margus Pihlak

Tallinn University of Technology (e-mail: margus.pihlak@ttu.ee)

Abstract. In this paper the distribution of a multivariate skewness measure is modeled. First we present some results of matrix algebra that are useful in multivariate statistical analysis. Then we apply the central limit theorem to model the distribution of the multivariate skewness measure introduced in [6].

Keywords: Central limit theorem, Multivariate skewness measure, Skewness measure distribution.

1 Introduction and basic notations

In the first section we introduce the notation used in the paper. The $k$-dimensional zero vector is denoted by $0_k$. The transpose of a matrix $A$ is denoted by $A'$.

Let us have random vectors $X_i = (X_{i1}, X_{i2}, \ldots, X_{ik})'$, where the index $i = 1, 2, \ldots, n$ runs over observations and $k$ denotes the number of variables. These random vectors are independent and identically distributed copies (one copy for each observation) of a random $k$-vector $X$. Let

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

and

$$S = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{x})(X_i - \bar{x})'$$

be the estimators of the mean $E(X) = \mu$ and the covariance matrix $D(X) = \Sigma$ respectively.

Now we present the matrix operations used in this paper. One of the most widely used matrix operations in multivariate statistics is the Kronecker product (or tensor product) $A \otimes B$ of matrices $A: m \times n$ and $B: p \times q$, which is defined as the partitioned matrix

$$A \otimes B = [a_{ij}B], \quad i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n.$$
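As a small illustration of this definition, the following sketch builds a Kronecker product with NumPy's `np.kron` and reads off one block. The matrices `A` and `B` are arbitrary values chosen for the example, not taken from the paper.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])          # m x n = 2 x 2
B = np.array([[0, 1, 2],
              [3, 4, 5]])       # p x q = 2 x 3

# Partitioned matrix [a_ij * B] of size (m*p) x (n*q).
K = np.kron(A, B)

# Block in block-row 0, block-column 1 should equal a_01 * B.
p, q = B.shape
block_01 = K[0:p, q:2 * q]
```

Here `K` has shape 4 × 6 and `block_01` equals `2 * B`, since the corresponding element of `A` is 2.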

3rd SMTDA Conference Proceedings, 11-14 June 2014, Lisbon Portugal C. H. Skiadas (Ed)

© 2014 ISAST

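By means of the Kronecker product we can present the third and the fourth order moments of a random vector $X$:

$$m_3(X) = E(X \otimes X' \otimes X)$$

and

$$m_4(X) = E(X \otimes X' \otimes X \otimes X').$$

The corresponding central moments are

$$\overline{m}_3(X) = E\{(X - \mu) \otimes (X - \mu)' \otimes (X - \mu)\}$$

and

$$\overline{m}_4(X) = E\{(X - \mu) \otimes (X - \mu)' \otimes (X - \mu) \otimes (X - \mu)'\}.$$

The third order moment of a random $k$-vector $X$ is a $k^2 \times k$ matrix and its fourth order moment is a $k^2 \times k^2$ matrix.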
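The dimensions stated above can be checked numerically. The sketch below computes sample versions of the third and fourth order central moments from a simulated data matrix. The function name `sample_moments`, the divisor $n$ and the simulated data are assumptions made for this illustration only.

```python
import numpy as np

def sample_moments(X):
    """Sample third and fourth order moments of the rows of X (n x k).

    Returns m3 (k^2 x k) and m4 (k^2 x k^2), averaging x ⊗ x' ⊗ x and
    x ⊗ x' ⊗ x ⊗ x' over the rows. The divisor n is an assumption.
    """
    n, k = X.shape
    m3 = np.zeros((k * k, k))
    m4 = np.zeros((k * k, k * k))
    for x in X:
        xx = np.outer(x, x)                      # x ⊗ x' = x x'
        m3 += np.kron(xx, x.reshape(-1, 1))      # (x ⊗ x') ⊗ x
        m4 += np.kron(xx, xx)                    # (x ⊗ x') ⊗ (x ⊗ x')
    return m3 / n, m4 / n

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                     # hypothetical sample, k = 3

# Central moments: subtract the sample mean first.
m3, m4 = sample_moments(X - X.mean(axis=0))
```

For $k = 3$ the result `m3` is a 9 × 3 matrix and `m4` is a symmetric 9 × 9 matrix, in line with the dimensions given in the text.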

The operation $\mathrm{vec}(X)$ denotes the $mn$-vector obtained from an $m \times n$ matrix $X$ by stacking its columns one under another in natural order. For the properties of the Kronecker product and the vec-operator the interested reader is referred to [2] or [4]. In the next section a skewness measure will be defined by means of the star product of matrices. The star product was introduced in [7], where some basic properties of the operation were presented and proved.

Definition 1. Let us have a matrix $A: m \times n$ and a partitioned matrix $B: mr \times ns$ consisting of $r \times s$ blocks $B_{ij}$, $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$. Then the star product $A * B$ is the $r \times s$ matrix

$$A * B = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} B_{ij}.$$

The star product is the inverse operation of the Kronecker product in the sense that the Kronecker product increases matrix dimensions while the star product decreases them. One application of the star product is presented in [12]. Let us give an example of how the star product works.

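Example. Let us have a matrix $A: 2 \times 2$ and a partitioned matrix $B: 4 \times 4$ with $2 \times 2$ blocks $B_{11}, \ldots, B_{22}$. Then

$$A * B = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} * \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = a_{11}B_{11} + \ldots + a_{22}B_{22}.$$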
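The example can also be worked through numerically. The following is a minimal sketch of the star product of Definition 1. The function name `star_product` and the numerical values are assumptions made for illustration.

```python
import numpy as np

def star_product(A, B):
    """Star product A * B = sum_ij a_ij * B_ij.

    A is m x n and B is (m*r) x (n*s), partitioned into r x s blocks B_ij.
    """
    m, n = A.shape
    if B.shape[0] % m or B.shape[1] % n:
        raise ValueError("B must have shape (m*r, n*s) for A of shape (m, n)")
    r, s = B.shape[0] // m, B.shape[1] // n
    blocks = B.reshape(m, r, n, s)               # blocks[i, :, j, :] is B_ij
    return np.einsum("ij,irjs->rs", A, blocks)

A = np.array([[1, 2],
              [3, 4]])
B = np.arange(16).reshape(4, 4)                  # 2 x 2 blocks of size 2 x 2

result = star_product(A, B)                      # 1*B11 + 2*B12 + 3*B21 + 4*B22
```

With these values `result` is the matrix with rows (68, 78) and (108, 118). Applying the star product with a matrix of ones to a Kronecker product $A \otimes C$ returns $C$ multiplied by the sum of the elements of $A$, which illustrates how the star product reduces the dimensions that the Kronecker product has increased.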

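We also use the matrix derivative defined following H. Neudecker [10].

Definition 2. Let the elements of a matrix $Y: r \times s$ be functions of a matrix $X: p \times q$. Assume that for all $i = 1, 2, \ldots, p$; $j = 1, 2, \ldots, q$; $k = 1, 2, \ldots, r$ and $l = 1, 2, \ldots, s$ the partial derivatives $\frac{\partial y_{kl}}{\partial x_{ij}}$ exist and are continuous in an open set $A$. Then the matrix $\frac{dY}{dX}$ is called the matrix derivative of the matrix $Y: r \times s$ by the matrix $X: p \times q$ in the set $A$ if

$$\frac{dY}{dX} = \frac{d}{d\,\mathrm{vec}'(X)} \otimes \mathrm{vec}(Y),$$

where

$$\frac{d}{d\,\mathrm{vec}'(X)} = \left( \frac{\partial}{\partial x_{11}} \;\cdots\; \frac{\partial}{\partial x_{p1}} \;\cdots\; \frac{\partial}{\partial x_{1q}} \;\cdots\; \frac{\partial}{\partial x_{pq}} \right).$$

The matrix derivative defined by Definition 2 is called the Neudecker matrix derivative. Over the last 40 years this matrix derivative has been a useful tool in multivariate statistics.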
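To make Definition 2 concrete, the sketch below approximates the Neudecker derivative by central differences and compares it with a known closed form: for $Y = AXB$ the identity $\mathrm{vec}(AXB) = (B' \otimes A)\,\mathrm{vec}(X)$ gives $\frac{dY}{dX} = B' \otimes A$. The helper names `vec` and `neudecker_derivative`, the step size and the test matrices are assumptions made for this illustration.

```python
import numpy as np

def vec(M):
    """Stack the columns of M one under another."""
    return M.reshape(-1, order="F")

def neudecker_derivative(f, X, eps=1e-6):
    """Central-difference approximation of d vec(Y) / d vec'(X).

    The result is an (r*s) x (p*q) matrix whose column for x_ij holds the
    partial derivatives of vec(Y) with respect to x_ij.
    """
    p, q = X.shape
    x0 = vec(X).astype(float)
    cols = []
    for idx in range(p * q):
        step = np.zeros(p * q)
        step[idx] = eps
        X_plus = (x0 + step).reshape(p, q, order="F")
        X_minus = (x0 - step).reshape(p, q, order="F")
        cols.append((vec(f(X_plus)) - vec(f(X_minus))) / (2 * eps))
    return np.column_stack(cols)

rng = np.random.default_rng(1)
A = rng.normal(size=(2, 3))
X = rng.normal(size=(3, 2))
B = rng.normal(size=(2, 4))

D_numeric = neudecker_derivative(lambda M: A @ M @ B, X)
D_exact = np.kron(B.T, A)                        # closed form for Y = A X B
```

Both `D_numeric` and `D_exact` are 8 × 6 matrices and agree up to rounding error, because the map $X \mapsto AXB$ is linear.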

2 Multivariate measures of skewness

In this section we present a multivariate skewness measure by means of the matrix operations described above. A skewness measure in the multivariate case was introduced by Mardia [8]. Mori et al. [9] introduced a skewness measure as a vector. B. Klar [3] gave a thorough overview of the skewness problem; that paper also examines the asymptotic distribution of different skewness characteristics. In Kollo [6] a skewness measure vector is introduced and applied in Independent Component Analysis (ICA).

The skewness measure in the multivariate case is presented through the third order moments as

$$s(X) = E(Y \otimes Y' \otimes Y), \tag{1}$$

where

$$Y = \Sigma^{-1/2}(X - \mu).$$

In Kollo [6] a skewness measure based on (1) is introduced by means of the star product:

$$b(X) = 1_{k \times k} * s(X), \tag{2}$$

where

$$1_{k \times k} = \begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix}.$$

In [5] Mardia's skewness measure is presented through the third order moment:

$$\beta = \mathrm{tr}\bigl(m_3'(Y)\, m_3(Y)\bigr),$$

where tr denotes the trace of a matrix. A sample estimate $\hat{b}(X)$ of the skewness vector (2) can be presented in the form

$$\hat{b}(X) = 1_{k \times k} * \frac{1}{n} \sum_{i=1}^{n} (y_i \otimes y_i' \otimes y_i), \tag{3}$$

where

$$y_i = S^{-1/2}(x_i - \bar{x})$$

and $\bar{x}$ and $S$ are the sample mean vector and the sample covariance matrix of the initial sample $(x_1, x_2, \ldots, x_n)$. The estimator $\hat{b}(X)$ is a random $k$-vector.
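The estimate (3) and a sample version of $\beta$ can be sketched in a few lines of NumPy. The function names, the use of the symmetric inverse square root for $S^{-1/2}$, the averaging over $n$ and the simulated exponential data are assumptions made for this illustration, not details taken from the paper.

```python
import numpy as np

def inv_sqrt_sym(S):
    """Symmetric inverse square root of a positive definite matrix S."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def standardize(X):
    """Rows of the result are y_i' with y_i = S^{-1/2} (x_i - xbar)."""
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)                  # divisor n - 1, as for S above
    return (X - xbar) @ inv_sqrt_sym(S)

def sample_third_moment(Y):
    """(1/n) sum_i y_i ⊗ y_i' ⊗ y_i as a k^2 x k matrix."""
    n, k = Y.shape
    return np.einsum("ia,ib,ic->abc", Y, Y, Y).reshape(k * k, k) / n

def skewness_vector(X):
    """Sample skewness vector: star product of 1_{k x k} with the third moment."""
    Y = standardize(X)
    k = Y.shape[1]
    m3 = sample_third_moment(Y)
    # The k x 1 blocks of m3 are indexed by (i, j); summing them is 1_{k x k} * m3.
    return m3.reshape(k, k, k).sum(axis=(0, 2))

def mardia_beta(X):
    """Sample version of beta = tr(m3' m3) for the standardized sample."""
    m3 = sample_third_moment(standardize(X))
    return np.trace(m3.T @ m3)

rng = np.random.default_rng(2)
X = rng.exponential(size=(200, 3))               # hypothetical skewed sample
b_hat = skewness_vector(X)                       # k-vector
beta_hat = mardia_beta(X)                        # nonnegative scalar
```

For a sample that is symmetric about its mean, for instance a sample stacked together with its mirror image, all third order sample moments vanish and `skewness_vector` returns a vector of zeros up to rounding error.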

3 Modeling the multivariate skewness measure distribution

In this section we model the distribution of the random variable $\hat{b}(X)$ defined by equality (3). It follows from this equality that $\hat{b}(X)$ is a $k$-vector. Let us have a sequence of independent and identically distributed random vectors $\{X_n\}_{n=1}^{\infty}$. Let $E(X_n) = \mu$ and $D(X_n) = \Sigma$. Then according to the central limit theorem the distribution of the random vector $\sqrt{n}(\overline{X}_n - \mu)$, where $\overline{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$, converges to the normal distribution $N(0_k, \Sigma)$, where $0_k$ denotes the $k$-dimensional zero vector.

Let us introduce the $(k^2 + k)$-vector

$$Z_n = \begin{pmatrix} \bar{x} \\ \mathrm{vec}(S) \end{pmatrix}.$$

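Applying the central limit theorem to this random vector we get the following convergence in distribution:

$$\sqrt{n}\,(Z_n - E(Z_n)) \xrightarrow{D} N(0_{k^2+k}, \Pi),$$

where the $(k^2 + k) \times (k^2 + k)$ partitioned matrix

$$\Pi = \begin{pmatrix} \Sigma & \overline{m}_3'(X) \\ \overline{m}_3(X) & \Pi_4 \end{pmatrix}.$$

The $k^2 \times k^2$ block is $\Pi_4 = \overline{m}_4(X) - \mathrm{vec}(\Sigma)\,\mathrm{vec}'(\Sigma)$ ([11]), where $\overline{m}_3(X)$ and $\overline{m}_4(X)$ are the third and fourth order central moments of $X$. This convergence can be generalized by means of Theorem 1 below.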
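A plug-in estimate of $\Pi$ can be assembled block by block from sample central moments. The sketch below is one possible way to do this. The function name `estimate_Pi`, the divisor $n$ in all moment estimates and the simulated data are assumptions, and the moments in $\Pi$ are read as central moments.

```python
import numpy as np

def estimate_Pi(X):
    """Plug-in estimate of the (k^2 + k) x (k^2 + k) matrix Pi.

    Blocks: Sigma (k x k), third central moment m3 (k^2 x k) and
    Pi4 = m4 - vec(Sigma) vec'(Sigma) (k^2 x k^2), all with divisor n.
    """
    n, k = X.shape
    C = X - X.mean(axis=0)                       # centered observations
    Sigma = C.T @ C / n
    m3 = np.einsum("ia,ib,ic->abc", C, C, C).reshape(k * k, k) / n
    CC = np.einsum("ia,ib->iab", C, C).reshape(n, k * k)   # rows (c_i ⊗ c_i)'
    m4 = CC.T @ CC / n
    vec_Sigma = Sigma.reshape(-1, order="F")
    Pi4 = m4 - np.outer(vec_Sigma, vec_Sigma)
    return np.block([[Sigma, m3.T],
                     [m3, Pi4]])

rng = np.random.default_rng(3)
X = rng.gamma(shape=2.0, size=(500, 3))          # hypothetical skewed sample
Pi_hat = estimate_Pi(X)
```

With these choices `Pi_hat` coincides with the sample covariance matrix (divisor $n$) of the stacked vectors formed from $x_i - \bar{x}$ and $(x_i - \bar{x}) \otimes (x_i - \bar{x})$, so it is symmetric and positive semidefinite by construction.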

Theorem 1. Let $\{Z_n\}_{n=1}^{\infty}$ be a sequence of $(k^2 + k)$-component random vectors and let $\nu$ be a fixed vector such that $\sqrt{n}(Z_n - \nu)$ has the limiting distribution $N(0_{k^2+k}, \Pi)$ as $n \to \infty$. Let the function $g: \mathbb{R}^{k^2+k} \to \mathbb{R}^k$ have continuous partial derivatives at $z_n = \nu$. Then the distribution of the random vector $\sqrt{n}(g(Z_n) - g(\nu))$ converges to the normal distribution $N(0_k, g_{z_n}' \Pi g_{z_n})$, where the $(k^2 + k) \times k$ matrix

$$g_{z_n} = \left. \frac{dg(z_n)}{dz_n} \right|_{z_n = \nu}$$

is the Neudecker matrix derivative at $z_n = \nu$.

The proof of Theorem 1 can be found in the book of T. W. Anderson ([1], page 132).
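The covariance matrix in Theorem 1 can be approximated numerically once $\Pi$ and $g$ are available. The sketch below uses central differences for the derivative matrix, arranged as a $d \times k$ matrix with entries $\partial g_j / \partial z_i$ so that the product $g_{z_n}' \Pi g_{z_n}$ is defined. The function names and the toy function `g` are hypothetical and serve only to illustrate the computation.

```python
import numpy as np

def derivative_matrix(g, z, eps=1e-6):
    """Central-difference approximation of the d x k matrix (d g_j / d z_i)."""
    z = np.asarray(z, dtype=float)
    rows = []
    for i in range(z.size):
        step = np.zeros(z.size)
        step[i] = eps
        rows.append((g(z + step) - g(z - step)) / (2 * eps))
    return np.vstack(rows)

def delta_method_cov(g, nu, Pi):
    """Asymptotic covariance G' Pi G of sqrt(n) (g(Z_n) - g(nu))."""
    G = derivative_matrix(g, nu)
    return G.T @ Pi @ G

# Toy example with d = 3 and k = 2 (not the skewness function of the paper).
def g(z):
    return np.array([z[0] * z[1], z[1] + z[2] ** 2])

nu = np.array([1.0, 2.0, 3.0])
Pi = np.eye(3)
cov_g = delta_method_cov(g, nu, Pi)
```

For this toy function the derivative matrix at `nu` has rows (2, 0), (1, 1) and (0, 6), so `cov_g` is the 2 × 2 matrix with rows (5, 1) and (1, 37).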

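In our case the function

$$g(z_n) = g\begin{pmatrix} \bar{x} \\ \mathrm{vec}(S) \end{pmatrix} = \hat{b}(X).$$

Applying Theorem 1 we get the following convergence in distribution:

$$\sqrt{n}\,(\hat{b}(X) - b(X)) \xrightarrow{D} N(0_k, \Sigma_b).$$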

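Here the $k \times k$ matrix

$$\Sigma_b = \left. g_{z_n}' \Pi g_{z_n} \right|_{z_n = (\mu' \;\; \mathrm{vec}'(\Sigma))'},$$

and the derivatives of $g$ with respect to $\bar{x}$ and $S$ are $k \times k$- and $k \times k^2$-dimensional Neudecker matrix derivatives respectively.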

Knowing the distribution of the skewness measure makes it possible to assess the asymmetry of $k$-dimensional data. In particular, we can find an $\alpha$-confidence interval for the skewness vector $b(X)$. The problem of asymmetry is relevant, for example, for environmental data.
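One simple way to turn the asymptotic normality into interval estimates is to build a separate normal-approximation interval for each component of $b(X)$. The sketch below assumes that estimates `b_hat` and `Sigma_b` are already available. The numbers used here are placeholders, and the intervals are marginal ones, not a simultaneous confidence region.

```python
from statistics import NormalDist

import numpy as np

def componentwise_intervals(b_hat, Sigma_b, n, level=0.95):
    """Marginal normal-approximation intervals b_hat_j ± z * sqrt(Sigma_b[j, j] / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)    # two-sided standard normal quantile
    half_width = z * np.sqrt(np.diag(Sigma_b) / n)
    return np.column_stack([b_hat - half_width, b_hat + half_width])

# Placeholder inputs for illustration only.
b_hat = np.array([0.5, -0.2])
Sigma_b = np.array([[4.0, 0.3],
                    [0.3, 1.0]])
intervals = componentwise_intervals(b_hat, Sigma_b, n=100)
```

Each row of `intervals` holds the lower and upper limit for one component. A component whose interval does not contain zero indicates asymmetry in the corresponding direction at the chosen confidence level.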

Acknowledgement

This paper is supported by the Estonian Ministry of Education and Science target financed theme Nr. SF0140011s09.

References

1. Anderson, T. W. (2003) An Introduction to Multivariate Statistical Analysis. Wiley, New York.

2. Harville, D. A. (1997) Matrix Algebra from a Statistician's Perspective. Springer, New York.

3. Klar, B. (2002) A Treatment of Multivariate Skewness, Kurtosis, and Related Statistics. Journal of Multivariate Analysis, 83, 141-165.

4. Kollo, T., von Rosen, D. (2005) Advanced Multivariate Statistics with Matrices. Springer, Dordrecht.

5. Kollo, T., Srivastava, M. S. (2004) Estimation and testing of parameters in multivariate Laplace distribution. Comm. Statist., 33, 2363-2687.

6. Kollo, T. (2008) Multivariate skewness and kurtosis measures with an application in ICA. Journal of Multivariate Analysis, 99, 2328-2338.

7. MacRae, E. C. (1974) Matrix derivatives with an application to an adaptive linear decision problem. The Annals of Statistics, 7, 381-394.

8. Mardia, K. V. (1970) Measures of multivariate skewness and kurtosis with applications. Biometrika, 57, 519-530.

9. Mori, T. F., Rohatgi, V. K., Székely, G. J. (1993) On multivariate skewness and kurtosis. Theory Probab. Appl., 38, 547-551.

10. Neudecker, H. (1969) Some theorems on matrix differentiation with special reference to Kronecker matrix products. J. Amer. Stat. Assoc., 64, 953-963.

11. Parring, A-M. (1979) Estimation asymptotic characteristic function of sample (in Russian). Acta et Commentationes Universitatis Tartuensis de Mathematica, 492, 86-90.

12. Pihlak, M. (2004) Matrix integral. Linear Algebra and Its Applications, 388, 315-325.
