In this appendix, we provide the detailed derivations of the asymptotic results under IRa. The derivations of the asymptotic results under IRb and IRc, as well as the theorems in Section 5, are relegated to the supplement. Throughout the appendix, we use $\bar K$ to denote $K+1$ and $\bar T$ to denote $T-K-1$. To facilitate the analysis, we introduce the following auxiliary identification condition (an intermediate step in the analysis).
AU1 The underlying parameter values $\theta^* = (\Lambda^*, \Gamma^*, F^*, \Phi^*, \Sigma_{ee})$ satisfy: $\frac{1}{N}\Lambda^{*\prime}\Sigma_{ee}^{-1}\Lambda^* = Q^*$, $\frac{1}{T}\sum_{t=1}^{T} f_t^* f_t^{*\prime} = I_{r_1}$ and $\frac{1}{T}\sum_{t=1}^{T} f_t^* g_t' = 0$, where $Q^*$ is a diagonal matrix whose diagonal elements are distinct and arranged in descending order.
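To make AU1 concrete, the following minimal numpy sketch rotates an arbitrary triple $(F, \Lambda, \Sigma_{ee})$, together with observed factors $G$, so that the three restrictions hold in sample. All values below are hypothetical draws standing in for estimates; the eigendecomposition step is what delivers a diagonal $Q^*$ with descending diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, r1, r2 = 200, 50, 3, 2

# Hypothetical unrestricted values; in practice these would come from an estimator.
F = rng.standard_normal((T, r1))      # latent factors f_t, stacked as rows
G = rng.standard_normal((T, r2))      # observed factors g_t
Lam = rng.standard_normal((N, r1))    # loadings Lambda
sig2 = rng.uniform(0.5, 2.0, N)       # diagonal of Sigma_ee

# (i) Enforce (1/T) sum_t f_t g_t' = 0 by partialling g_t out of f_t
#     (in a full treatment, Gamma absorbs the projection coefficients).
F = F - G @ np.linalg.solve(G.T @ G, G.T @ F)

# (ii) Rotate so that (1/T) sum_t f_t f_t' = I_{r1}; Lambda is counter-rotated
#      so that the common component Lambda F' is unchanged.
w, U = np.linalg.eigh(F.T @ F / T)
F = F @ (U @ np.diag(w ** -0.5) @ U.T)
Lam = Lam @ (U @ np.diag(w ** 0.5) @ U.T)

# (iii) Orthogonal rotation making (1/N) Lambda' Sigma_ee^{-1} Lambda diagonal
#       with descending diagonal; orthogonality preserves (i) and (ii).
d, W = np.linalg.eigh((Lam.T / sig2) @ Lam / N)   # ascending eigenvalues
W = W[:, ::-1]                                    # reorder to descending
F, Lam = F @ W, Lam @ W

Q = (Lam.T / sig2) @ Lam / N
assert np.allclose(F.T @ F / T, np.eye(r1))       # sample identity restriction
assert np.allclose(Q, np.diag(np.diag(Q)))        # Q* diagonal, descending
assert np.allclose(G.T @ F, 0)                    # orthogonality to g_t
```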
Appendix A: The asymptotic results of the QMLE
In this appendix, we show that the QMLEs $\tilde\lambda_i$, $\tilde\sigma_i^2$ and $\tilde f_t$ are consistent estimators of $\lambda_i^*$, $\sigma_i^2$ and $f_t^*$ in AU1, respectively.
Proposition A.1 Under Assumptions A-D, together with AU1,
$$\tilde\lambda_i - \lambda_i^* = \Big(\frac{1}{T}\sum_{t=1}^{T} f_t^* f_t^{*\prime}\Big)^{-1}\frac{1}{T}\sum_{t=1}^{T} f_t^* e_{it} + O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}), \qquad (A.1)$$

$$\tilde\gamma_i - \gamma_i^* = \Big(\frac{1}{T}\sum_{t=1}^{T} g_t g_t'\Big)^{-1}\frac{1}{T}\sum_{t=1}^{T} g_t e_{it}, \qquad (A.2)$$

$$\tilde f_t - f_t^* = \Big(\frac{1}{N}\sum_{i=1}^{N} \frac{1}{\sigma_i^2}\lambda_i^*\lambda_i^{*\prime}\Big)^{-1}\frac{1}{N}\sum_{i=1}^{N} \frac{1}{\sigma_i^2}\lambda_i^* e_{it} + O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}), \qquad (A.3)$$

$$\tilde\sigma_i^2 - \sigma_i^2 = \frac{1}{T}\sum_{t=1}^{T}(e_{it}^2 - \sigma_i^2) + O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}). \qquad (A.4)$$
Proof of Proposition A.1. Writing $z_t = \Lambda^* f_t^* + \Gamma^* g_t + e_t$ in matrix form,
$$Z = \Lambda^* F^{*\prime} + \Gamma^* G' + e. \qquad (A.5)$$
Post-multiplying $M_G = I_T - G(G'G)^{-1}G'$ on both sides, together with $F^{*\prime}G = 0$ by AU1, we have
$$ZM_G = \Lambda^* F^{*\prime} + eM_G.$$
Let $Y = ZM_G$ and let $y_t$ denote the $t$-th column of $Y$. The above equation is equivalent to
$$y_t = \Lambda^* f_t^* + e_t - eG(G'G)^{-1}g_t. \qquad (A.6)$$
Bai and Li (2012) derive the asymptotic representations of $\tilde\lambda_i$, $\tilde f_t$, $\tilde\sigma_i^2$ for the case $g_t \equiv 1$. When $g_t$ is a general random variable, as in the present context, the derivation is the same, since the term $eG(G'G)^{-1}g_t$ is asymptotically negligible. Using the arguments of Bai and Li (2012) under IC3, we obtain (A.1), (A.3) and (A.4). Consider (A.2). Substituting $z_{it} = \lambda_i^{*\prime}f_t^* + \gamma_i^{*\prime}g_t + e_{it}$ into $\tilde\gamma_i = (\sum_{t=1}^{T} g_t g_t')^{-1}\sum_{t=1}^{T} g_t(z_{it} - \tilde\lambda_i'\tilde f_t)$, we have
$$\tilde\gamma_i - \gamma_i^* = \Big(\sum_{t=1}^{T} g_t g_t'\Big)^{-1}\sum_{t=1}^{T} g_t e_{it} - \Big(\sum_{t=1}^{T} g_t g_t'\Big)^{-1}\sum_{t=1}^{T} g_t f_t^{*\prime}(\tilde\lambda_i - \lambda_i^*) - \Big(\sum_{t=1}^{T} g_t g_t'\Big)^{-1}\sum_{t=1}^{T} g_t(\tilde f_t - f_t^*)'\tilde\lambda_i.$$
The second term on the right hand side is zero because $\sum_{t=1}^{T} g_t f_t^{*\prime} = G'F^* = 0$. Consider the third term. Notice that
$$\tilde f_t = (\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda)^{-1}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}y_t = (\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda)^{-1}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\Big[z_t - \sum_{s=1}^{T} z_s g_s'\Big(\sum_{s=1}^{T} g_s g_s'\Big)^{-1}g_t\Big]$$
$$= (\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda)^{-1}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\Big[\Lambda^* f_t^* + e_t - \sum_{s=1}^{T} e_s g_s'\Big(\sum_{s=1}^{T} g_s g_s'\Big)^{-1}g_t\Big].$$
Then it follows that
$$\tilde f_t - f_t^* = -A^{*\prime}f_t^* + (\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda)^{-1}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}e_t - (\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda)^{-1}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\sum_{s=1}^{T} e_s g_s'\Big(\sum_{s=1}^{T} g_s g_s'\Big)^{-1}g_t, \qquad (A.7)$$
where $A^* = (\tilde\Lambda - \Lambda^*)'\tilde\Sigma_{ee}^{-1}\tilde\Lambda(\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda)^{-1}$.
Given the above expression, together with $\sum_{t=1}^{T} g_t f_t^{*\prime} = 0$, we have
$$\frac{1}{T}\sum_{t=1}^{T} g_t(\tilde f_t - f_t^*)' = 0. \qquad (A.8)$$
Then (A.2) follows. This completes the proof.
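The exact (not merely asymptotic) orthogonality in (A.8) is easy to confirm numerically. In the sketch below, the loadings and idiosyncratic variances are arbitrary placeholders rather than actual MLEs; the point is that any estimator of the form $\tilde f_t = (\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda)^{-1}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}y_t$ built from $Y = ZM_G$ satisfies $\sum_t g_t\tilde f_t' = 0$, because $M_G G = 0$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, r1, r2 = 30, 120, 2, 2

Z = rng.standard_normal((N, T))          # data matrix, N x T
G = rng.standard_normal((T, r2))         # observed factors, T x r2
M_G = np.eye(T) - G @ np.linalg.solve(G.T @ G, G.T)
Y = Z @ M_G                              # y_t are the columns of Y

# Placeholder values for Lambda-tilde and Sigma-tilde; any values work here.
Lam = rng.standard_normal((N, r1))
sig2 = rng.uniform(0.5, 2.0, N)

# f_tilde_t = (Lam' S^-1 Lam)^-1 Lam' S^-1 y_t, stacked as rows of F_tilde.
A = np.linalg.solve((Lam.T / sig2) @ Lam, (Lam.T / sig2))   # r1 x N
F_tilde = Y.T @ A.T                                         # T x r1

# sum_t g_t f_tilde_t' = G' M_G Z' A' = 0 because M_G G = 0.
print(np.abs(G.T @ F_tilde).max())       # ~ 1e-13, zero up to rounding
```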
Lemma A.1 Under Assumptions A-D,

Proof of Lemma A.1. Consider (a). The first term can be treated similarly to Lemma C.1(e) of Bai and Li (2012). Given these results, the second term of $J_{11}$ is $O_p(N^{-1}) + O_p(T^{-1})$. The last term can be shown to be of the same magnitude by similar arguments. Summarizing these results, we have $J_{11} = O_p(N^{-1}) + O_p(T^{-1})$. Terms $J_{12}$ and $J_{21}$ can be shown to be $O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1})$ similarly to $J_{11}$. Then (a) follows.
Consider (b). The two nonzero terms of the left hand side of (b) are $O_p(N^{-1}) + O_p(T^{-1})$, as shown in (a). Then (b) follows.
Consider (c). Expanding the left hand side of (c) in the same way, it suffices to consider the term $\frac{1}{\bar T}\sum_t\cdots$, which is within $O_p(T^{-1})$ of its population counterpart, similarly to Lemma C.1(e) of Bai and Li (2012). Given these results, together with $A = O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1})$ and $(\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda)^{-1} = O_p(N^{-1})$, we have (c). This completes the proof.
Proposition A.2 Under Assumptions A-D, together with the identification condition AU1, for each $k = 1, 2, \ldots, K$, we have
$$\tilde\Phi_k - \Phi_k^* = \Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T} u_t^*\psi_t^{*\prime}\Big]\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T} \psi_t^*\psi_t^{*\prime}\Big]^{-1}(i_k\otimes I_r) + O_p(N^{-1}) + O_p(T^{-1}).$$

Proof of Proposition A.2. The estimator $\tilde\Phi$ is obtained by running the regression
$$\tilde h_t = \Phi_1\tilde h_{t-1} + \Phi_2\tilde h_{t-2} + \cdots + \Phi_K\tilde h_{t-K} + \mathrm{error}.$$
By Lemma A.1(a) and (b),
$$\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}(\tilde h_t - h_t^*)\tilde\psi_t'\Big]\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}\tilde\psi_t\tilde\psi_t'\Big]^{-1} = O_p(N^{-1}) + O_p(T^{-1}),$$
$$\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}(\tilde\psi_t - \psi_t^*)\tilde\psi_t'\Big]\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}\tilde\psi_t\tilde\psi_t'\Big]^{-1} = O_p(N^{-1}) + O_p(T^{-1}).$$
By Lemma A.1(a) and (c),
$$\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}u_t^*\tilde\psi_t'\Big]\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}\tilde\psi_t\tilde\psi_t'\Big]^{-1} = \Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}u_t^*\psi_t^{*\prime}\Big]\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}\psi_t^*\psi_t^{*\prime}\Big]^{-1} + O_p(N^{-1}) + O_p(T^{-1}).$$
Given these results, we have
$$\tilde\Phi - \Phi^* = \Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}u_t^*\psi_t^{*\prime}\Big]\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}\psi_t^*\psi_t^{*\prime}\Big]^{-1} + O_p(N^{-1}) + O_p(T^{-1}).$$
Post-multiplying $i_k\otimes I_r$ on both sides gives Proposition A.2. This completes the proof.
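The regression step in the proof can be written out in a few lines of numpy. In the sketch below, $\tilde h_t$ is replaced by a simulated stand-in, $\tilde\psi_t$ is read as the stacked lags $(\tilde h_{t-1}', \ldots, \tilde h_{t-K}')'$ (our reading of the notation), and the sample runs over $t = \bar K, \ldots, T$; this is a minimal illustration, not the paper's estimation code.

```python
import numpy as np

rng = np.random.default_rng(2)
T, r, K = 300, 3, 2
H = rng.standard_normal((T, r))   # stand-in for the estimated factors h_tilde_t

# psi_t = (h_{t-1}', ..., h_{t-K}')' for t = Kbar, ..., T, with Kbar = K + 1.
Psi = np.hstack([H[K - k : T - k] for k in range(1, K + 1)])   # (T-K) x rK
Hy = H[K:]                                                      # (T-K) x r

# Phi_tilde = (sum_t h_t psi_t')(sum_t psi_t psi_t')^{-1}: equation-by-equation OLS.
Phi = (Hy.T @ Psi) @ np.linalg.inv(Psi.T @ Psi)                 # r x rK

def phi_k(Phi, k, r):
    """Extract Phi_k; post-multiplying by (i_k kron I_r) selects the k-th block."""
    return Phi[:, (k - 1) * r : k * r]

U = Hy - Psi @ Phi.T                 # residuals u_tilde_t of the VAR regression
Omega = U.T @ U / U.shape[0]         # Omega_tilde = (1/Tbar) sum u u', up to the range
```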
Now we consider the following condition (denoted AU2), in which the loading restrictions are the same as in AU1 but the factor restrictions are imposed on the population.
AU2 The underlying parameter values $\theta^\star = (\Lambda^\star, \Gamma^\star, F^\star, \Phi^\star, \Sigma_{ee})$ satisfy: $\frac{1}{N}\Lambda^{\star\prime}\Sigma_{ee}^{-1}\Lambda^\star = Q^\star$, $E(f_t^\star f_t^{\star\prime}) = I_{r_1}$ and $E(f_t^\star g_t') = 0$, where $Q^\star$ is a diagonal matrix whose diagonal elements are distinct and arranged in descending order.
Note that the superscript stars in $\theta^\star$ and $\theta^*$ are different; different identification restrictions imply different notations. Because AU1 and AU2 are asymptotically the same (the former imposes the sample moment restriction $\frac{1}{T}\sum_t f_t f_t' = I_{r_1}$, the latter the population moment restriction $E(f_t f_t') = I_{r_1}$), $\theta^\star$ and $\theta^*$ are also asymptotically the same. That is why the MLE is also consistent for $\theta^\star$, as will be proved below.
The following lemma is useful for our analysis.

Lemma A.2 Let $Q$ be an $r\times r$ matrix satisfying
$$QQ' = I_r, \qquad Q'VQ = D,$$
where $V$ is an $r\times r$ diagonal matrix with strictly positive and distinct diagonal elements, arranged in decreasing order, and $D$ is also diagonal. Then $Q$ must be a diagonal matrix with diagonal elements either $-1$ or $1$, and $V = D$.
Lemma A.2 is proved in Bai and Li (2012). The following proposition summarizes the asymptotic results under AU2; it shows how the limiting distributions change under AU2.
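A small numerical illustration of Lemma A.2 (not a proof): a signed diagonal $Q$ satisfies both conditions with $D = V$, while a generic rotation, although orthogonal, destroys the diagonality of $Q'VQ$.

```python
import numpy as np

r = 3
V = np.diag([5.0, 3.0, 1.0])             # distinct, descending diagonal

# A signed diagonal Q satisfies both conditions and gives D = V.
Q = np.diag([1.0, -1.0, 1.0])
assert np.allclose(Q @ Q.T, np.eye(r))   # QQ' = I
assert np.allclose(Q.T @ V @ Q, V)       # Q'VQ = D = V

# A generic rotation is orthogonal but fails the diagonality of Q'VQ.
theta = 0.3
R = np.eye(r)
R[:2, :2] = [[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]]
D = R.T @ V @ R
print(np.abs(D - np.diag(np.diag(D))).max())   # clearly nonzero
```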
Proposition A.3 Under Assumptions A-D, together with the identification condition AU2, when N, T → ∞, we have
Proof of Proposition A.3. Notice that
$$\tilde\lambda_i - \lambda_i^\star = (\tilde\lambda_i - \lambda_i^*) + (\lambda_i^* - \lambda_i^\star).$$
We show that $\lambda_i^*$ and $\lambda_i^\star$ are close to each other because AU1 and AU2 are asymptotically the same. Different identification restrictions imply different rotations. Let $R^\star$ be the rotation matrix which transforms $(\lambda_i^{*\prime}, \gamma_i^{*\prime})'$ into $(\lambda_i^{\star\prime}, \gamma_i^{\star\prime})'$. Then we have
As mentioned in the main text, because the factors $g_t$ are observed, the matrix $R_{12}^{\star\prime}$ is fixed at $0$ and the matrix $R_{22}^{\star\prime}$ is fixed at $I_{r_2}$. So equation (A.12) reduces to (A.13). The last equation of (A.13) can also be written as
$$f_t^* = R_{11}^{\star\prime}f_t^\star + R_{21}^{\star\prime}g_t. \qquad (A.14)$$
Post-multiplying $g_t'$ on both sides and taking the summation over $t$, by $\sum_{t=1}^{T}g_tf_t^{*\prime} = 0$, we can express $R_{21}^\star$ in terms of $R_{11}^\star$, and it can further be shown that $R_{11}^\star$ converges in probability to a diagonal matrix whose diagonal elements are either $1$ or $-1$. Since the sign problem is precluded in our analysis, it follows that $R_{11}^{\star-1}\xrightarrow{p}I_{r_1}$. Let $U^\star$ be determined by the equation system (A.20) and (A.21), where $\mathrm{Ndg}\{\cdot\}$ denotes the non-diagonal elements of the argument. Neglecting the terms $U^\star Q^\star U^{\star\prime}$ and $U^{\star\prime}U^\star$, since they are of smaller order than $U^\star$, we can uniquely determine the matrix $U^\star$ by solving (A.20) and (A.21). Let $V^\star$ be the leading term of $U^\star$. It is easy to see that $U^\star = O_p(T^{-1/2})$, $V^\star = O_p(T^{-1/2})$ and $U^\star = V^\star + O_p(T^{-1})$.
Now consider the asymptotic representation of $\tilde\lambda_i - \lambda_i^\star$. Notice that
$$\tilde\lambda_i - \lambda_i^\star = \tilde\lambda_i - R_{11}^\star\lambda_i^* = (\tilde\lambda_i - \lambda_i^*) - (R_{11}^\star - I_{r_1})\lambda_i^*.$$
By (A.1), the above result is equivalent to
$$\tilde\lambda_i - \lambda_i^\star = \Big(\frac{1}{T}\sum_{t=1}^{T}f_t^*f_t^{*\prime}\Big)^{-1}\frac{1}{T}\sum_{t=1}^{T}f_t^*e_{it} - (R_{11}^\star - I_{r_1})\lambda_i^* + O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}).$$
By $R_{11}^{\star\prime-1} = I_{r_1} + U^{\star\prime}$, the above equation yields (A.24). Next, notice that
$$\tilde f_t - f_t^\star = (\tilde f_t - f_t^*) - U^{\star\prime}f_t^* + R_{11}^{\star\prime-1}R_{21}^{\star\prime}g_t.$$
Substituting (A.3) into the above result, we have
$$\tilde f_t - f_t^\star = -U^{\star\prime}f_t^* + R_{11}^{\star\prime-1}R_{21}^{\star\prime}g_t + \Big(\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_i^2}\lambda_i^*\lambda_i^{*\prime}\Big)^{-1}\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_i^2}\lambda_i^*e_{it} + O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}).$$
Given the above results, by (A.26), we have the last expression of Proposition A.3. This completes the proof of Proposition A.3.
Proposition A.4 Under Assumptions A-D, together with the identification condition AU2, we have

Proof of Proposition A.4. By $R_{11}^{\star-1} = I_{r_1} + V^\star + O_p(T^{-1})$ and $R_{21}^\star = -W^\star + O_p(T^{-1})$, together with Proposition A.2, we can rewrite $\tilde\Phi_k - \Phi_k^\star$ as stated in Proposition A.4.
Appendix B: The asymptotic results and their proofs under IRa
As in the main text, we use (Λ,Γ, F) to denote the underlying parameters satisfying IRa.
Let $R$ be the rotation matrix which transforms $(\lambda_i^{\star\prime}, \gamma_i^{\star\prime})'$ into $(\lambda_i', \gamma_i')'$. Then we have (B.2), whose last equation can be written as
$$f_t^\star = R_{11}'f_t + R_{21}'g_t. \qquad (B.3)$$
Note that the rotation matrix $R$ is nonrandom. To see this, observe that both AU2 and IRa impose restrictions only on the loadings and the covariance of $h_t$. So the rotation matrix $R$, which transforms the underlying parameters from AU2 to IRa, involves only the loadings and the covariance of $h_t$; thus it is nonrandom. This contrasts with $R^\star$, which is random since AU1 involves $f_t$.
Post-multiplying $g_t'$ on both sides and taking the expectation, by $E(f_t^\star g_t') = 0$, we have
$$R_{21} = -\Delta_{gg}^{-1}\Delta_{gf}R_{11}.$$
Define $\phi_t = R_{11}^{\prime-1}f_t^\star$. From the above results, $\phi_t$ has the alternative expression
$$\phi_t = f_t - \Delta_{fg}\Delta_{gg}^{-1}g_t. \qquad (B.4)$$
The following lemmas will be used in the subsequent proof.
Lemma B.1 For any compatible matrices $A$ and $B$ and their corresponding estimates $\hat A$ and $\hat B$, we have
$$\hat A\hat B^{-1}\hat A' - AB^{-1}A' = (\hat A - A)B^{-1}A' + AB^{-1}(\hat A - A)' - AB^{-1}(\hat B - B)B^{-1}A' + \mathcal{R},$$
where
$$\mathcal{R} = -(\hat A - A)\hat B^{-1}(\hat B - B)B^{-1}A' + (\hat A - A)\hat B^{-1}(\hat A - A)' + A\hat B^{-1}(\hat B - B)B^{-1}(\hat B - B)B^{-1}A' - A\hat B^{-1}(\hat B - B)B^{-1}(\hat A - A)'.$$
Lemma B.1 can be verified by direct matrix algebra.
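Because the expansion in Lemma B.1 is an exact identity, it can be checked to machine precision. A minimal numpy sketch with arbitrary conformable matrices (all values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
p, q = 4, 3
A = rng.standard_normal((p, q))
dA = 0.1 * rng.standard_normal((p, q))          # A_hat - A
B = np.eye(q) + 0.5 * rng.standard_normal((q, q))
B = B @ B.T + q * np.eye(q)                      # keep B and B_hat invertible
dB = 0.1 * rng.standard_normal((q, q))           # B_hat - B
Ah, Bh = A + dA, B + dB

Bi, Bhi = np.linalg.inv(B), np.linalg.inv(Bh)
lhs = Ah @ Bhi @ Ah.T - A @ Bi @ A.T
R = (-dA @ Bhi @ dB @ Bi @ A.T + dA @ Bhi @ dA.T
     + A @ Bhi @ dB @ Bi @ dB @ Bi @ A.T - A @ Bhi @ dB @ Bi @ dA.T)
rhs = dA @ Bi @ A.T + A @ Bi @ dA.T - A @ Bi @ dB @ Bi @ A.T + R
print(np.abs(lhs - rhs).max())                   # ~ 1e-15: the identity is exact
```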
Lemma B.2 Under Assumptions A-D, we have

(a) $\frac{1}{\bar T}\tilde H'M_{\tilde\Psi}\tilde H - \frac{1}{\bar T}H^{*\prime}M_{\Psi^*}H^* = O_p(N^{-1}) + O_p(T^{-1})$;

(b) $\frac{1}{\bar T}H^{\star\prime}M_{\Psi^\star}H^\star - \frac{1}{\bar T}H^{*\prime}M_{\Psi^*}H^* = B^{\star\prime}\Omega^\star + \Omega^\star B^\star + O_p(T^{-1})$;

(c) $\frac{1}{\bar T}\tilde H'M_{\tilde\Psi}\tilde H - \frac{1}{\bar T}H^{\star\prime}M_{\Psi^\star}H^\star = -B^{\star\prime}\Omega^\star - \Omega^\star B^\star + O_p(N^{-1}) + O_p(T^{-1})$;

where $\frac{1}{\bar T}\tilde H'M_{\tilde\Psi}\tilde H$ is defined as
$$\frac{1}{\bar T}\tilde H'M_{\tilde\Psi}\tilde H = \frac{1}{\bar T}\sum_{t=\bar K}^{T}\tilde h_t\tilde h_t' - \frac{1}{\bar T}\sum_{t=\bar K}^{T}\tilde h_t\tilde\psi_t'\Big(\frac{1}{\bar T}\sum_{t=\bar K}^{T}\tilde\psi_t\tilde\psi_t'\Big)^{-1}\frac{1}{\bar T}\sum_{t=\bar K}^{T}\tilde\psi_t\tilde h_t',$$
and $\frac{1}{\bar T}H^{*\prime}M_{\Psi^*}H^*$ and $\frac{1}{\bar T}H^{\star\prime}M_{\Psi^\star}H^\star$ are defined similarly.
Proof of Lemma B.2. Consider (a). By Lemma A.1(a), we have
$$\frac{1}{\bar T}\sum_{t=\bar K}^{T}\tilde h_t\tilde h_t' - \frac{1}{\bar T}\sum_{t=\bar K}^{T}h_t^*h_t^{*\prime} = O_p(N^{-1}) + O_p(T^{-1}),$$
and similarly for the remaining sample moments. Given the above results, together with Lemma B.1, we have (a).

Consider (b). By $h_t^\star = R^{\star\prime-1}h_t^*$, we have $\psi_t^\star = (I_K\otimes R^{\star\prime-1})\psi_t^*$. This gives (B.6). Substituting (B.6) into (B.5), we have (b).

Result (c) is a direct consequence of (a) and (b). This completes the proof.
Proposition B.1 Under Assumptions A-D, together with the identification condition IRa, we have

(c) $\hat f_t - f_t = \Big(\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_i^2}\lambda_i\lambda_i'\Big)^{-1}\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_i^2}\lambda_ie_{it} - V'f_t - W'g_t + O_p(N^{-1}) + O_p(T^{-1})$,

where $\mathrm{vec}(V) = B_Q^{-1}P_1D_{r_1}^{+}\frac{1}{\bar T}\sum_{t=\bar K}^{T}[\varepsilon_t\otimes\varepsilon_t - \mathrm{vec}(I_{r_1})]$; $\phi_t = f_t - \Delta_{fg}\Delta_{gg}^{-1}g_t$; $\Delta_{\phi\phi} = E(\phi_t\phi_t')$; $W = \Omega_{\upsilon\upsilon}^{-1}\frac{1}{\bar T}\sum_{t=\bar K}^{T}\upsilon_t\varepsilon_t'$; $\eta_t = g_t - \Delta_{gf}\Delta_{ff}^{-1}f_t$; $\Delta_{\eta\eta} = E(\eta_t\eta_t')$.
Proof of Proposition B.1. Consider the VAR expression under AU2:
$$h_t^\star = \Phi_1^\star h_{t-1}^\star + \Phi_2^\star h_{t-2}^\star + \cdots + \Phi_K^\star h_{t-K}^\star + u_t^\star.$$
Pre-multiplying $R'^{-1}$ gives
$$h_t = R'^{-1}\Phi_1^\star R'h_{t-1} + \cdots + R'^{-1}\Phi_K^\star R'h_{t-K} + R'^{-1}u_t^\star.$$
So we have $\Phi_i = R'^{-1}\Phi_i^\star R'$ for $i = 1, 2, \ldots, K$ and $u_t = R'^{-1}u_t^\star$. Then we have
$$\varepsilon_t = R_{11}^{\prime-1}\varepsilon_t^\star - R_{11}^{\prime-1}R_{21}'\upsilon_t^\star, \qquad \upsilon_t = \upsilon_t^\star. \qquad (B.7)$$
Post-multiplying $\upsilon_t'$ on both sides and taking the expectation, by $E(\varepsilon_t\upsilon_t') = 0$, we have
$$R_{21} = \Omega_{\upsilon\upsilon}^{\star-1}\Omega_{\upsilon\varepsilon}^\star. \qquad (B.8)$$
Substituting the preceding result into (B.7), by $E(\varepsilon_t\varepsilon_t') = I_{r_1}$, we have
$$\Omega_{\varepsilon\varepsilon\cdot\upsilon}^\star = \Omega_{\varepsilon\varepsilon}^\star - \Omega_{\varepsilon\upsilon}^\star\Omega_{\upsilon\upsilon}^{\star-1}\Omega_{\upsilon\varepsilon}^\star = R_{11}'R_{11}, \qquad (B.9)$$
where $\Omega_{\varepsilon\varepsilon}^\star = E(\varepsilon_t^\star\varepsilon_t^{\star\prime})$, $\Omega_{\upsilon\upsilon}^\star = E(\upsilon_t^\star\upsilon_t^{\star\prime})$ and $\Omega_{\varepsilon\upsilon}^\star = E(\varepsilon_t^\star\upsilon_t^{\star\prime})$. In addition, the identification condition also requires that
$$Q = \frac{1}{N}\Lambda'\Sigma_{ee}^{-1}\Lambda = R_{11}\Big(\frac{1}{N}\Lambda^{\star\prime}\Sigma_{ee}^{-1}\Lambda^\star\Big)R_{11}'.$$
This is equivalent to
$$Q^\star = \frac{1}{N}\Lambda^{\star\prime}\Sigma_{ee}^{-1}\Lambda^\star = R_{11}^{-1}QR_{11}^{\prime-1}. \qquad (B.10)$$
However, our estimation procedure implies that the estimators of $R_{11}$ and $R_{21}$, denoted by $\hat R_{11}$ and $\hat R_{21}$, satisfy
$$\hat R_{21} = \tilde\Omega_{\upsilon\upsilon}^{-1}\tilde\Omega_{\upsilon\varepsilon}, \qquad (B.11)$$
$$\hat R_{11}'\hat R_{11} = \tilde\Omega_{\varepsilon\varepsilon\cdot\upsilon} = \tilde\Omega_{\varepsilon\varepsilon} - \tilde\Omega_{\varepsilon\upsilon}\tilde\Omega_{\upsilon\upsilon}^{-1}\tilde\Omega_{\upsilon\varepsilon}, \qquad (B.12)$$
$$\hat R_{11}\Big(\frac{1}{N}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda\Big)\hat R_{11}' = \mathrm{diag}, \qquad (B.13)$$
where $\tilde\Omega_{\varepsilon\varepsilon}$, $\tilde\Omega_{\upsilon\upsilon}$, $\tilde\Omega_{\varepsilon\upsilon}$ are submatrices of $\tilde\Omega$, which is defined as
$$\tilde\Omega = \frac{1}{\bar T}\sum_{t=\bar K}^{T}\tilde u_t\tilde u_t',$$
with $\tilde u_t$ being the residuals of the regression $\tilde h_t = \tilde\Phi_1\tilde h_{t-1} + \cdots + \tilde\Phi_K\tilde h_{t-K} + \mathrm{error}$.
The above result can be rewritten as (B.15). By (B.15), we have $\tilde\Omega - \Omega^\star\xrightarrow{p}0$. It then follows that $\tilde\Omega_{\varepsilon\varepsilon\cdot\upsilon} - \Omega_{\varepsilon\varepsilon\cdot\upsilon}^\star\xrightarrow{p}0$, where $\tilde\Omega_{\varepsilon\varepsilon\cdot\upsilon}$ and $\Omega_{\varepsilon\varepsilon\cdot\upsilon}^\star$ are defined in (B.12) and (B.9). Thus
$$\hat R_{11}'\hat R_{11}R_{11}^{-1}R_{11}^{\prime-1}\xrightarrow{p}I_{r_1},$$
which, by the fact that $AB = I$ implies $BA = I$, leads to
$$(\hat R_{11}R_{11}^{-1})'(\hat R_{11}R_{11}^{-1})\xrightarrow{p}I_{r_1}. \qquad (B.19)$$
Furthermore, by (B.13), we have
$$(\hat R_{11}R_{11}^{-1})\Big[R_{11}\frac{1}{N}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda R_{11}'\Big](\hat R_{11}R_{11}^{-1})' = \mathrm{diag}.$$
By $\frac{1}{N}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda - \frac{1}{N}\Lambda^{\star\prime}\Sigma_{ee}^{-1}\Lambda^\star = o_p(1)$ and $R_{11}\frac{1}{N}\Lambda^{\star\prime}\Sigma_{ee}^{-1}\Lambda^\star R_{11}' = \frac{1}{N}\Lambda'\Sigma_{ee}^{-1}\Lambda = Q$, we have
$$(\hat R_{11}R_{11}^{-1})Q(\hat R_{11}R_{11}^{-1})' = \mathrm{diag}. \qquad (B.20)$$
Notice that $Q$ is a diagonal matrix by identification. Applying Lemma A.2 to (B.19) and (B.20), we have that $\hat R_{11}R_{11}^{-1}$ converges to a diagonal matrix whose diagonal elements are either $1$ or $-1$. However, the possibility of $-1$ is precluded by our sign restrictions. Given this result, we have $\hat R_{11} - R_{11}\xrightarrow{p}0$. Henceforth, we use $\widehat{\Delta R}_{11}$ to denote $\hat R_{11} - R_{11}$. Apparently $\widehat{\Delta R}_{11}\xrightarrow{p}0$. By (B.9) and (B.12), we have
$$\hat R_{11}'\hat R_{11} - R_{11}'R_{11} = \tilde\Omega_{\varepsilon\varepsilon} - \Omega_{\varepsilon\varepsilon}^\star - (\tilde\Omega_{\varepsilon\upsilon}\tilde\Omega_{\upsilon\upsilon}^{-1}\tilde\Omega_{\upsilon\varepsilon} - \Omega_{\varepsilon\upsilon}^\star\Omega_{\upsilon\upsilon}^{\star-1}\Omega_{\upsilon\varepsilon}^\star).$$
Substituting (B.16)-(B.18) into the above equation, together with Lemma B.1, we have
$$\widehat{\Delta R}_{11}'R_{11} + R_{11}'\widehat{\Delta R}_{11} + \widehat{\Delta R}_{11}'\widehat{\Delta R}_{11} = -V^{\star\prime}\Omega_{\varepsilon\varepsilon\cdot\upsilon}^\star - \Omega_{\varepsilon\varepsilon\cdot\upsilon}^\star V^\star$$
$$+ \frac{1}{\bar T}\sum_{t=\bar K}^{T}\Big[(\varepsilon_t^\star - \Omega_{\varepsilon\upsilon}^\star\Omega_{\upsilon\upsilon}^{\star-1}\upsilon_t^\star)(\varepsilon_t^\star - \Omega_{\varepsilon\upsilon}^\star\Omega_{\upsilon\upsilon}^{\star-1}\upsilon_t^\star)' - \Omega_{\varepsilon\varepsilon\cdot\upsilon}^\star\Big] + O_p(N^{-1}) + O_p(T^{-1}).$$
However, by (B.7) and (B.8), we have $R_{11}'\varepsilon_t = \varepsilon_t^\star - \Omega_{\varepsilon\upsilon}^\star\Omega_{\upsilon\upsilon}^{\star-1}\upsilon_t^\star$. Given this result, together with (B.9), we have
$$\widehat{\Delta R}_{11}'R_{11} + R_{11}'\widehat{\Delta R}_{11} + \widehat{\Delta R}_{11}'\widehat{\Delta R}_{11} = -V^{\star\prime}R_{11}'R_{11} - R_{11}'R_{11}V^\star + R_{11}'\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}\varepsilon_t\varepsilon_t' - I_{r_1}\Big]R_{11} + O_p(N^{-1}) + O_p(T^{-1}). \qquad (B.21)$$
Pre-multiplying $R_{11}^{\prime-1}$ and post-multiplying $R_{11}^{-1}$ on both sides, and neglecting the smaller-order term $R_{11}^{\prime-1}\widehat{\Delta R}_{11}'\widehat{\Delta R}_{11}R_{11}^{-1}$, we have
$$\big(\widehat{\Delta R}_{11}R_{11}^{-1} + R_{11}V^\star R_{11}^{-1}\big) + \big(\widehat{\Delta R}_{11}R_{11}^{-1} + R_{11}V^\star R_{11}^{-1}\big)' = \frac{1}{\bar T}\sum_{t=\bar K}^{T}(\varepsilon_t\varepsilon_t' - I_{r_1}) + O_p(N^{-1}) + O_p(T^{-1}). \qquad (B.22)$$
Now consider
$$\frac{1}{N}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda - \frac{1}{N}\Lambda^{\star\prime}\Sigma_{ee}^{-1}\Lambda^\star = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{\tilde\sigma_i^2}(\tilde\lambda_i - \lambda_i^\star)\tilde\lambda_i' + \frac{1}{N}\sum_{i=1}^{N}\frac{1}{\tilde\sigma_i^2}\tilde\lambda_i(\tilde\lambda_i - \lambda_i^\star)'$$
$$- \frac{1}{N}\sum_{i=1}^{N}\frac{1}{\tilde\sigma_i^2}(\tilde\lambda_i - \lambda_i^\star)(\tilde\lambda_i - \lambda_i^\star)' + \frac{1}{N}\sum_{i=1}^{N}\lambda_i^\star\lambda_i^{\star\prime}\Big(\frac{1}{\tilde\sigma_i^2} - \frac{1}{\sigma_i^2}\Big).$$
The last term is $O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1})$, as shown in Bai and Li (2012). The third term is $O_p(T^{-1})$. The first two terms are $V^\star Q^\star + Q^\star V^{\star\prime} + O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1})$ by Proposition A.3. Then it follows that
$$\frac{1}{N}\tilde\Lambda'\tilde\Sigma_{ee}^{-1}\tilde\Lambda - \frac{1}{N}\Lambda^{\star\prime}\Sigma_{ee}^{-1}\Lambda^\star = V^\star Q^\star + Q^\star V^{\star\prime} + O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}). \qquad (B.23)$$
Given the above results, (B.13) is equivalent to
$$\mathrm{Ndg}\Big\{\hat R_{11}(Q^\star + V^\star Q^\star + Q^\star V^{\star\prime})\hat R_{11}'\Big\} = O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}).$$
Substituting (B.10) into the preceding equation, we have
$$\mathrm{Ndg}\Big\{\hat R_{11}\big(R_{11}^{-1}QR_{11}^{\prime-1} + V^\star R_{11}^{-1}QR_{11}^{\prime-1} + R_{11}^{-1}QR_{11}^{\prime-1}V^{\star\prime}\big)\hat R_{11}'\Big\} = O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}).$$
Replacing $\hat R_{11} = \widehat{\Delta R}_{11} + R_{11}$, the left hand side is (neglecting the $\mathrm{Ndg}$ operator)
$$Q + \widehat{\Delta R}_{11}R_{11}^{-1}Q + Q(\widehat{\Delta R}_{11}R_{11}^{-1})' + \widehat{\Delta R}_{11}R_{11}^{-1}Q(\widehat{\Delta R}_{11}R_{11}^{-1})' + R_{11}V^\star R_{11}^{-1}Q + \widehat{\Delta R}_{11}V^\star R_{11}^{-1}Q$$
$$+ \hat R_{11}V^\star R_{11}^{-1}Q(\widehat{\Delta R}_{11}R_{11}^{-1})' + QR_{11}^{\prime-1}V^{\star\prime}R_{11}' + QR_{11}^{\prime-1}V^{\star\prime}\widehat{\Delta R}_{11}' + (\widehat{\Delta R}_{11}R_{11}^{-1})QR_{11}^{\prime-1}V^{\star\prime}\hat R_{11}'.$$
By neglecting the terms of smaller magnitude and noticing that $\mathrm{Ndg}(Q) = 0$, we have
$$\mathrm{Ndg}\Big\{\big(\widehat{\Delta R}_{11}R_{11}^{-1} + R_{11}V^\star R_{11}^{-1}\big)Q + Q\big(\widehat{\Delta R}_{11}R_{11}^{-1} + R_{11}V^\star R_{11}^{-1}\big)'\Big\} = O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}). \qquad (B.24)$$
Let $\mathcal V = \widehat{\Delta R}_{11}R_{11}^{-1} + R_{11}V^\star R_{11}^{-1}$. Taking the half-vectorization operator $\mathrm{vech}(\cdot)$, which stacks the elements on and below the diagonal of its argument into a vector, on both sides of (B.22), we get
$$\mathrm{vech}(\mathcal V + \mathcal V') = \mathrm{vech}\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}(\varepsilon_t\varepsilon_t' - I_{r_1})\Big] + O_p(N^{-1}) + O_p(T^{-1}).$$
By the definitions of the duplication matrix $D_{r_1}$, its Moore-Penrose inverse $D_{r_1}^+$, and the symmetrizer matrix $S_{r_1} = (I_{r_1^2} + K_{r_1})/2$, the left hand side of the above equation can be written as
$$\mathrm{vech}(\mathcal V + \mathcal V') = D_{r_1}^+\mathrm{vec}(\mathcal V + \mathcal V') = 2D_{r_1}^+S_{r_1}\mathrm{vec}(\mathcal V) = 2D_{r_1}^+\mathrm{vec}(\mathcal V),$$
where the last equality is due to $D_{r_1}^+S_{r_1} = D_{r_1}^+$. So we have
$$2D_{r_1}^+\mathrm{vec}(\mathcal V) = D_{r_1}^+\mathrm{vec}\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}(\varepsilon_t\varepsilon_t' - I_{r_1})\Big] + O_p(N^{-1}) + O_p(T^{-1}). \qquad (B.25)$$
Let $\mathrm{veck}(M)$ be the operator which stacks the elements below the diagonal of $M$ into a vector.
Let $D_1$ be the matrix such that $\mathrm{veck}(M) = D_1\mathrm{vec}(M)$ for any symmetric matrix $M$. By (B.24), we have
$$\mathrm{veck}(\mathcal VQ + Q\mathcal V') = O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}),$$
implying
$$D_1[Q\otimes I_{r_1} + (I_{r_1}\otimes Q)K_{r_1}]\mathrm{vec}(\mathcal V) = O_p(N^{-1/2}T^{-1/2}) + O_p(T^{-1}). \qquad (B.26)$$
The preceding two equations imply
$$\begin{bmatrix} 2D_{r_1}^+ \\ D_1[Q\otimes I_{r_1} + (I_{r_1}\otimes Q)K_{r_1}] \end{bmatrix}\mathrm{vec}(\mathcal V) = \begin{bmatrix} D_{r_1}^+\mathrm{vec}\big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}(\varepsilon_t\varepsilon_t' - I_{r_1})\big] \\ 0 \end{bmatrix} + O_p(N^{-1}) + O_p(T^{-1}).$$
Let $B_Q$ be the matrix multiplying $\mathrm{vec}(\mathcal V)$ above and $P_1 = [I_p, 0_{p\times q}]'$; then the above result is equivalent to
$$\mathrm{vec}(\mathcal V) = B_Q^{-1}P_1D_{r_1}^+\mathrm{vec}\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}(\varepsilon_t\varepsilon_t' - I_{r_1})\Big] + O_p(N^{-1}) + O_p(T^{-1}).$$
Define $V$ by
$$\mathrm{vec}(V) = B_Q^{-1}P_1D_{r_1}^+\mathrm{vec}\Big[\frac{1}{\bar T}\sum_{t=\bar K}^{T}(\varepsilon_t\varepsilon_t' - I_{r_1})\Big].$$
Then, by the definition of $\mathcal V$,
$$\widehat{\Delta R}_{11}R_{11}^{-1} + R_{11}V^\star R_{11}^{-1} = V + O_p(N^{-1}) + O_p(T^{-1}). \qquad (B.27)$$
Post-multiplying $R_{11}$ on both sides of (B.27), we have
$$\widehat{\Delta R}_{11} = -R_{11}V^\star + VR_{11} + O_p(N^{-1}) + O_p(T^{-1}) = O_p(T^{-1/2}) + O_p(N^{-1}), \qquad (B.28)$$
since $V^\star = O_p(T^{-1/2})$ and $V = O_p(T^{-1/2})$.
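The matrices $D_{r_1}$, $D_{r_1}^+$, $S_{r_1}$ and $K_{r_1}$ used in (B.25)-(B.26) are easy to get wrong in code. The numpy sketch below follows one common convention for vech and the duplication matrix ($D_1$ for $\mathrm{veck}(\cdot)$ can be built analogously) and checks the two identities invoked above.

```python
import numpy as np
from itertools import product

def dup(r):
    """Duplication matrix D_r: D_r vech(M) = vec(M) for symmetric M."""
    D = np.zeros((r * r, r * (r + 1) // 2))
    col = 0
    for j in range(r):
        for i in range(j, r):       # column-major, on/below the diagonal
            D[i + j * r, col] = 1.0
            D[j + i * r, col] = 1.0
            col += 1
    return D

def comm(r):
    """Commutation matrix K_r: K_r vec(M) = vec(M')."""
    K = np.zeros((r * r, r * r))
    for i, j in product(range(r), range(r)):
        K[j + i * r, i + j * r] = 1.0
    return K

r1 = 3
D = dup(r1)
Dp = np.linalg.pinv(D)                   # Moore-Penrose inverse D_r^+
S = (np.eye(r1 * r1) + comm(r1)) / 2     # symmetrizer S_r
assert np.allclose(Dp @ S, Dp)           # D^+ S = D^+

rng = np.random.default_rng(4)
V = rng.standard_normal((r1, r1))
lhs = Dp @ (V + V.T).reshape(-1, order="F")   # vech(V + V') via D^+ vec(.)
rhs = 2 * Dp @ V.reshape(-1, order="F")       # 2 D^+ vec(V)
assert np.allclose(lhs, rhs)                  # vech(V + V') = 2 D^+ vec(V)
```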
Now consider $\hat\lambda_i - \lambda_i$. By $\hat\lambda_i = \hat R_{11}\tilde\lambda_i$ and $\lambda_i = R_{11}\lambda_i^\star$, we have
$$\hat\lambda_i - \lambda_i = \hat R_{11}\tilde\lambda_i - R_{11}\lambda_i^\star = \widehat{\Delta R}_{11}\lambda_i^\star + R_{11}(\tilde\lambda_i - \lambda_i^\star) + \widehat{\Delta R}_{11}(\tilde\lambda_i - \lambda_i^\star).$$
The last term on the right hand side is $O_p(T^{-1}) + O_p(N^{-2})$, by $\tilde\lambda_i - \lambda_i^\star = O_p(T^{-1/2}) + O_p(N^{-1})$ and $\widehat{\Delta R}_{11} = O_p(T^{-1/2}) + O_p(N^{-1})$. By (B.28) and (A.24), together with $\lambda_i = R_{11}\lambda_i^\star$, we have
$$\hat\lambda_i - \lambda_i = V\lambda_i + R_{11}\Big(\frac{1}{T}\sum_{t=1}^{T}f_t^\star f_t^{\star\prime}\Big)^{-1}\frac{1}{T}\sum_{t=1}^{T}f_t^\star e_{it} + O_p(N^{-1}) + O_p(T^{-1}).$$
Using (B.4), the above expression can be rewritten as
$$\hat\lambda_i - \lambda_i = V\lambda_i + \Delta_{\phi\phi}^{-1}\frac{1}{T}\sum_{t=1}^{T}\phi_te_{it} + O_p(N^{-1}) + O_p(T^{-1}).$$
To derive the remaining asymptotic results, we first consider $\widehat{\Delta R}_{21} = \hat R_{21} - R_{21}$. Notice that
$$\hat R_{21} - R_{21} = \tilde\Omega_{\upsilon\upsilon}^{-1}\tilde\Omega_{\upsilon\varepsilon} - \Omega_{\upsilon\upsilon}^{\star-1}\Omega_{\upsilon\varepsilon}^\star = -\Omega_{\upsilon\upsilon}^{\star-1}(\tilde\Omega_{\upsilon\upsilon} - \Omega_{\upsilon\upsilon}^\star)\Omega_{\upsilon\upsilon}^{\star-1}\Omega_{\upsilon\varepsilon}^\star + \Omega_{\upsilon\upsilon}^{\star-1}(\tilde\Omega_{\upsilon\varepsilon} - \Omega_{\upsilon\varepsilon}^\star)$$
$$- (\tilde\Omega_{\upsilon\upsilon}^{-1} - \Omega_{\upsilon\upsilon}^{\star-1})(\tilde\Omega_{\upsilon\upsilon} - \Omega_{\upsilon\upsilon}^\star)\Omega_{\upsilon\upsilon}^{\star-1}\Omega_{\upsilon\varepsilon}^\star + (\tilde\Omega_{\upsilon\upsilon}^{-1} - \Omega_{\upsilon\upsilon}^{\star-1})(\tilde\Omega_{\upsilon\varepsilon} - \Omega_{\upsilon\varepsilon}^\star).$$
The last two terms on the right hand side are $O_p(N^{-2}) + O_p(T^{-1})$. Substituting (B.17) and (B.18) into the above result, we have the expansion (B.29) of $\widehat{\Delta R}_{21}$ in terms of $\Omega_{\upsilon\upsilon}^{\star-1}$. Substituting (A.25), (A.24) and (B.29) into the above result, we have the representation (B.30) of $\hat\gamma_i - \gamma_i$.
Consider the last equation of (B.2). Post-multiplying $g_t'$ on both sides and taking the expectation, by $E(f_t^\star g_t') = 0$, we have
$$R_{21}R_{11}^{-1} = -\Delta_{gg}^{-1}\Delta_{gf}.$$
The preceding two results imply that the third expression of (B.30) is equal to
$$-\Delta_{gg}^{-1}\Delta_{gf}\Delta_{\phi\phi}^{-1}\Big[\frac{1}{T}\sum_{t=1}^{T}\phi_te_{it}\Big] + O_p(T^{-1}).$$
Let $\Xi_t = \Delta_{gf}\Delta_{\phi\phi}^{-1}\phi_t$. Given the above result, the asymptotic representation of $\hat\gamma_i - \gamma_i$ can be rewritten as
$$\hat\gamma_i - \gamma_i = \Delta_{gg}^{-1}\Big[\frac{1}{T}\sum_{t=1}^{T}(g_t - \Xi_t)e_{it}\Big] + W\lambda_i + O_p(N^{-1}) + O_p(T^{-1}). \qquad (B.31)$$
The above asymptotic representation has an alternative expression. First, we define
$$\eta_t = g_t - E(g_tf_t')[E(f_tf_t')]^{-1}f_t = g_t - \Delta_{gf}\Delta_{ff}^{-1}f_t, \qquad (B.32)$$
which implies that
$$\Delta_{\eta\eta} = \Delta_{gg} - \Delta_{gf}\Delta_{ff}^{-1}\Delta_{fg}.$$
By the Woodbury formula, we have
$$\Delta_{gg}^{-1} = \Delta_{\eta\eta}^{-1} - \Delta_{\eta\eta}^{-1}\Delta_{gf}(\Delta_{ff} + \Delta_{fg}\Delta_{\eta\eta}^{-1}\Delta_{gf})^{-1}\Delta_{fg}\Delta_{\eta\eta}^{-1}. \qquad (B.33)$$
With (B.33) and the relation $g_t - \Xi_t = \eta_t + \Delta_{gf}\Delta_{ff}^{-1}f_t - \Delta_{gf}\Delta_{\phi\phi}^{-1}\phi_t$, we can rewrite the first term of the right hand side of (B.31) as
$$\Delta_{gg}^{-1}\Big[\frac{1}{T}\sum_{t=1}^{T}(g_t - \Xi_t)e_{it}\Big] = \Delta_{\eta\eta}^{-1}\frac{1}{T}\sum_{t=1}^{T}\eta_te_{it} + \Delta_{\eta\eta}^{-1}\Delta_{gf}\frac{1}{T}\sum_{t=1}^{T}(\Delta_{ff}^{-1}f_t - \Delta_{\phi\phi}^{-1}\phi_t)e_{it}$$
$$- \Delta_{\eta\eta}^{-1}\Delta_{gf}(\Delta_{ff} + \Delta_{fg}\Delta_{\eta\eta}^{-1}\Delta_{gf})^{-1}\Delta_{fg}\Delta_{\eta\eta}^{-1}\frac{1}{T}\sum_{t=1}^{T}\eta_te_{it}$$
$$- \Delta_{\eta\eta}^{-1}\Delta_{gf}(\Delta_{ff} + \Delta_{fg}\Delta_{\eta\eta}^{-1}\Delta_{gf})^{-1}\Delta_{fg}\Delta_{\eta\eta}^{-1}\Delta_{gf}\frac{1}{T}\sum_{t=1}^{T}(\Delta_{ff}^{-1}f_t - \Delta_{\phi\phi}^{-1}\phi_t)e_{it}.$$
Consider the term $(\Delta_{ff}^{-1}f_t - \Delta_{\phi\phi}^{-1}\phi_t)$. From the definition $\phi_t = f_t - \Delta_{fg}\Delta_{gg}^{-1}g_t$, we have
$$\Delta_{\phi\phi} = \Delta_{ff} - \Delta_{fg}\Delta_{gg}^{-1}\Delta_{gf}, \qquad (B.34)$$
which can be used to derive
$$\phi_t = f_t - \Delta_{fg}\Delta_{gg}^{-1}(\eta_t + \Delta_{gf}\Delta_{ff}^{-1}f_t) = \Delta_{\phi\phi}\Delta_{ff}^{-1}f_t - \Delta_{fg}\Delta_{gg}^{-1}\eta_t.$$
Then
$$\Delta_{ff}^{-1}f_t - \Delta_{\phi\phi}^{-1}\phi_t = \Delta_{\phi\phi}^{-1}\Delta_{fg}\Delta_{gg}^{-1}\eta_t.$$
With the above equation, the first term of the right hand side of (B.31) can be further rewritten as
$$\Delta_{gg}^{-1}\Big[\frac{1}{T}\sum_{t=1}^{T}(g_t - \Xi_t)e_{it}\Big] = \Delta_{\eta\eta}^{-1}\frac{1}{T}\sum_{t=1}^{T}\eta_te_{it} + \Delta_{\eta\eta}^{-1}\Delta_{gf}\Delta_{\phi\phi}^{-1}\Delta_{fg}\Delta_{gg}^{-1}\frac{1}{T}\sum_{t=1}^{T}\eta_te_{it} \qquad (B.35)$$
$$- \Delta_{\eta\eta}^{-1}\Delta_{gf}(\Delta_{ff} + \Delta_{fg}\Delta_{\eta\eta}^{-1}\Delta_{gf})^{-1}\Delta_{fg}\Delta_{\eta\eta}^{-1}\frac{1}{T}\sum_{t=1}^{T}\eta_te_{it}$$
$$- \Delta_{\eta\eta}^{-1}\Delta_{gf}(\Delta_{ff} + \Delta_{fg}\Delta_{\eta\eta}^{-1}\Delta_{gf})^{-1}\Delta_{fg}\Delta_{\eta\eta}^{-1}\Delta_{gf}\Delta_{\phi\phi}^{-1}\Delta_{fg}\Delta_{gg}^{-1}\frac{1}{T}\sum_{t=1}^{T}\eta_te_{it}.$$
From the two basic facts that
$$\Delta_{\phi\phi}^{-1} = \Delta_{ff}^{-1} + \Delta_{ff}^{-1}\Delta_{fg}\Delta_{\eta\eta}^{-1}\Delta_{gf}\Delta_{ff}^{-1} \qquad\text{and}\qquad \Delta_{ff}^{-1}\Delta_{fg}\Delta_{\eta\eta}^{-1} = \Delta_{\phi\phi}^{-1}\Delta_{fg}\Delta_{gg}^{-1},$$
we can rewrite the 2nd, 3rd and 4th terms on the right hand side of (B.35) as
$$\Delta_{\eta\eta}^{-1}\Delta_{gf}\Big[\Delta_{\phi\phi}^{-1} - \Delta_{ff}^{-1} - \Delta_{ff}^{-1}\Delta_{fg}\Delta_{gg}^{-1}\Delta_{gf}\Delta_{\phi\phi}^{-1}\Big]\Delta_{fg}\Delta_{gg}^{-1}\frac{1}{T}\sum_{t=1}^{T}\eta_te_{it},$$
which equals zero by (B.34). So we can alternatively write the asymptotic representation of $\hat\gamma_i - \gamma_i$ as
$$\hat\gamma_i - \gamma_i = \Delta_{\eta\eta}^{-1}\Big[\frac{1}{T}\sum_{t=1}^{T}\eta_te_{it}\Big] + W\lambda_i + O_p(N^{-1}) + O_p(T^{-1}).$$
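The cancellation just used rests on exact matrix identities: the Woodbury formula (B.33), the Schur-complement relation (B.34), and the two basic facts. They can be checked numerically with any positive definite second-moment matrix of $(f_t', g_t')'$, as in this sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
r1, r2 = 3, 2
# Random positive definite covariance of (f_t', g_t')', partitioned in blocks.
X = rng.standard_normal((r1 + r2, 2 * (r1 + r2)))
Delta = X @ X.T / X.shape[1]
Dff, Dfg = Delta[:r1, :r1], Delta[:r1, r1:]
Dgf, Dgg = Delta[r1:, :r1], Delta[r1:, r1:]

inv = np.linalg.inv
Dpp = Dff - Dfg @ inv(Dgg) @ Dgf         # Delta_phiphi, eq. (B.34)
Dee = Dgg - Dgf @ inv(Dff) @ Dfg         # Delta_etaeta

# Woodbury formula (B.33).
w = inv(Dee) - inv(Dee) @ Dgf @ inv(Dff + Dfg @ inv(Dee) @ Dgf) @ Dfg @ inv(Dee)
assert np.allclose(inv(Dgg), w)

# The two "basic facts".
assert np.allclose(inv(Dpp), inv(Dff) + inv(Dff) @ Dfg @ inv(Dee) @ Dgf @ inv(Dff))
assert np.allclose(inv(Dff) @ Dfg @ inv(Dee), inv(Dpp) @ Dfg @ inv(Dgg))
```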
We proceed to consider $\hat f_t - f_t$. Notice that $\hat f_t = \hat R_{11}^{\prime-1}\tilde f_t - \hat R_{11}^{\prime-1}\hat R_{21}'g_t$ and $f_t = R_{11}^{\prime-1}f_t^\star - R_{11}^{\prime-1}R_{21}'g_t$. Then
$$\hat f_t - f_t = \hat R_{11}^{\prime-1}\tilde f_t - \hat R_{11}^{\prime-1}\hat R_{21}'g_t - \big(R_{11}^{\prime-1}f_t^\star - R_{11}^{\prime-1}R_{21}'g_t\big)$$
$$= -R_{11}^{\prime-1}(\hat R_{11}' - R_{11}')R_{11}^{\prime-1}f_t^\star + R_{11}^{\prime-1}(\tilde f_t - f_t^\star) - R_{11}^{\prime-1}(\hat R_{21}' - R_{21}')g_t + R_{11}^{\prime-1}(\hat R_{11}' - R_{11}')R_{11}^{\prime-1}R_{21}'g_t$$
$$- (\hat R_{11}^{\prime-1} - R_{11}^{\prime-1})(\hat R_{11}' - R_{11}')R_{11}^{\prime-1}f_t^\star + (\hat R_{11}^{\prime-1} - R_{11}^{\prime-1})(\tilde f_t - f_t^\star) - (\hat R_{11}^{\prime-1} - R_{11}^{\prime-1})(\hat R_{21}' - R_{21}')g_t$$
$$+ (\hat R_{11}^{\prime-1} - R_{11}^{\prime-1})(\hat R_{11}' - R_{11}')R_{11}^{\prime-1}R_{21}'g_t.$$
The last four terms of the above expression are $O_p(N^{-2}) + O_p(T^{-1})$. Given this result, by (B.28), (B.29) and Proposition A.3, we have
$$\hat f_t - f_t = -V'(R_{11}^{\prime-1}f_t^\star - R_{11}^{\prime-1}R_{21}'g_t) - W'g_t + \Big(\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_i^2}\lambda_i\lambda_i'\Big)^{-1}\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_i^2}\lambda_ie_{it} + O_p(N^{-1}) + O_p(T^{-1}).$$
By $f_t = R_{11}^{\prime-1}f_t^\star - R_{11}^{\prime-1}R_{21}'g_t$, we have
$$\hat f_t - f_t = \Big(\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_i^2}\lambda_i\lambda_i'\Big)^{-1}\frac{1}{N}\sum_{i=1}^{N}\frac{1}{\sigma_i^2}\lambda_ie_{it} - V'f_t - W'g_t + O_p(N^{-1}) + O_p(T^{-1}).$$
This completes the proof.
Proposition B.2 Under Assumptions A-D, together with the identification condition IRa, we have
$$\hat\Phi_k - \Phi_k = \Big(\sum_{t=\bar K}^{T}u_t\psi_t'\Big)\Big(\sum_{t=\bar K}^{T}\psi_t\psi_t'\Big)^{-1}(i_k\otimes I_r) - B'\Phi_k + \Phi_kB' + O_p(N^{-1}) + O_p(T^{-1}).$$
Proof of Proposition B.2. Consider $\hat\Phi_k - \Phi_k$. Notice that $\hat\Phi_k = \hat R'^{-1}\tilde\Phi_k\hat R'$ and $\Phi_k = R'^{-1}\Phi_k^\star R'$. Thus
$$\hat\Phi_k - \Phi_k = \hat R'^{-1}\tilde\Phi_k\hat R' - R'^{-1}\Phi_k^\star R' = R'^{-1}\Phi_k^\star\widehat{\Delta R}' - R'^{-1}\widehat{\Delta R}'R'^{-1}\Phi_k^\star R' + R'^{-1}(\tilde\Phi_k - \Phi_k^\star)R' + \mathcal{E},$$
where
$$\mathcal{E} = (\hat R'^{-1} - R'^{-1})\tilde\Phi_k\widehat{\Delta R}' + (\hat R'^{-1} - R'^{-1})(\tilde\Phi_k - \Phi_k^\star)R' - (\hat R'^{-1} - R'^{-1})\widehat{\Delta R}'R'^{-1}\Phi_k^\star R' + R'^{-1}(\tilde\Phi_k - \Phi_k^\star)\widehat{\Delta R}'.$$
However, notice that
$$\widehat{\Delta R} = \hat R - R = \begin{bmatrix}\widehat{\Delta R}_{11} & 0\\ \widehat{\Delta R}_{21} & 0\end{bmatrix} = \begin{bmatrix}-R_{11}V^\star + VR_{11} & 0\\ WR_{11} - W^\star - R_{21}V^\star & 0\end{bmatrix} + O_p(N^{-1}) + O_p(T^{-1})$$
$$= BR - RB^\star + O_p(N^{-1}) + O_p(T^{-1}), \qquad (B.36)$$
where
$$B = \begin{bmatrix}V & 0\\ W & 0\end{bmatrix}, \qquad B^\star = \begin{bmatrix}V^\star & 0\\ W^\star & 0\end{bmatrix},$$
and $W = \big(\sum_{t=1}^{T}\upsilon_t\upsilon_t'\big)^{-1}\big(\sum_{t=1}^{T}\upsilon_t\varepsilon_t'\big)$. Then $\widehat{\Delta R}$ is $O_p(T^{-1/2})$, since both $B$ and $B^\star$ are $O_p(T^{-1/2})$. This result, together with $\tilde\Phi_k - \Phi_k^\star = O_p(T^{-1/2}) + O_p(N^{-1})$, implies $\mathcal{E} = O_p(N^{-2}) + O_p(T^{-1})$. Given this result, together with $\Phi_k = R'^{-1}\Phi_k^\star R'$, we have
$$\hat\Phi_k - \Phi_k = \Phi_kR'^{-1}\widehat{\Delta R}' - R'^{-1}\widehat{\Delta R}'\Phi_k + R'^{-1}(\tilde\Phi_k - \Phi_k^\star)R' + O_p(N^{-2}) + O_p(T^{-1}). \qquad (B.37)$$
Substituting (B.36) into the above equation, together with $u_t = R'^{-1}u_t^\star$, $h_t = R'^{-1}h_t^\star$ and Proposition A.4, we have
$$\hat\Phi_k - \Phi_k = \Big(\sum_{t=\bar K}^{T}u_t\psi_t'\Big)\Big(\sum_{t=\bar K}^{T}\psi_t\psi_t'\Big)^{-1}(i_k\otimes I_r) - B'\Phi_k + \Phi_kB' + O_p(N^{-1}) + O_p(T^{-1}).$$
This completes the proof.
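The block algebra behind the leading term of (B.36) can also be checked directly. In the numpy sketch below, all blocks are arbitrary stand-ins, and the $O_p$ remainders are of course absent from the exact block identity:

```python
import numpy as np

rng = np.random.default_rng(6)
r1, r2 = 3, 2
R11 = rng.standard_normal((r1, r1))
R21 = rng.standard_normal((r2, r1))
V, Vs = rng.standard_normal((r1, r1)), rng.standard_normal((r1, r1))   # V, V_star
W, Ws = rng.standard_normal((r2, r1)), rng.standard_normal((r2, r1))   # W, W_star

# Block forms of R, B, B_star as in (B.36); R22 is fixed at the identity.
R = np.block([[R11, np.zeros((r1, r2))], [R21, np.eye(r2)]])
B = np.block([[V, np.zeros((r1, r2))], [W, np.zeros((r2, r2))]])
Bs = np.block([[Vs, np.zeros((r1, r2))], [Ws, np.zeros((r2, r2))]])

dR = B @ R - R @ Bs
# Leading blocks: dR11 = V R11 - R11 V_star, dR21 = W R11 - W_star - R21 V_star.
assert np.allclose(dR[:r1, :r1], V @ R11 - R11 @ Vs)
assert np.allclose(dR[r1:, :r1], W @ R11 - Ws - R21 @ Vs)
assert np.allclose(dR[:, r1:], 0)    # the right block column is identically zero
```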
References
Anderson, T. W. (2003) An Introduction to Multivariate Statistical Analysis, John Wiley & Sons.
Anderson, T. W. and H. Rubin (1956) Statistical inference in factor analysis, In Proceedings of the third Berkeley Symposium on mathematical statistics and probability: contributions to the theory of statistics, University of California Press.
Bai, J. (2003) Inferential theory for factor models of large dimensions.Econometrica,71(1), 135–171.
Bai, J. and K. Li (2012a) Statistical analysis of factor models of high dimension, The Annals of Statistics,40:1, 436–465.
Bai, J. and K. Li (2012b) Maximum likelihood estimation and inference for approximate factor models of high dimension, manuscript.
Bai, J. and S. Ng (2002) Determining the number of factors in approximate factor models, Econometrica,70:1, 191–221.
Bai, J. and S. Ng (2013) Principal components estimation and identification of static factors, Journal of Econometrics,176, 18-29.
Bernanke, B. S. and J. Boivin (2003) Monetary policy in a data-rich environment,Journal of Monetary Economics,50:3, 525–546.
Bernanke, B. S., J. Boivin, and P. Eliasz (2005) Measuring the effects of monetary policy: a factor-augmented vector autoregressive (FAVAR) approach, The Quarterly Journal of Economics, 120:1, 387–422.
Bianchi, F., H. Mumtaz, and P. Surico (2009) The great moderation of the term structure of U.K. interest rates, Journal of Monetary Economics, 56, 856–871.
Boivin, J., M.P. Giannoni, and I. Mihov (2009) Sticky prices and monetary policy: evidence from disaggregated US data, American Economic Review,99:1, 350–384.
Chamberlain, G. and M. Rothschild (1983) Arbitrage, factor structure, and mean-variance analysis on large asset markets,Econometrica,51:5, 1281–1304.
Christiano, L. J., M. Eichenbaum and C.L. Evans (1999) Monetary policy shocks: What have we learned and to what end? J. B. Taylor and M. Woodford (ed.), Handbook of Macroeconomics,1:1, 65-148.
Chen, L., J. J. Dolado, and J. Gonzalo (2011): Detecting Big Structural Breaks in Large Factor Models, Manuscript, Universidad Carlos III de Madrid.
Cheng, X., Z. Liao, and F. Schorfheide (2013) Shrinkage estimation of high-dimensional factor models with structural instabilities, Department of Economics, University of Pennsylvania.
Doan, T., R.B. Litterman, and C.A. Sims (1984) Forecasting and policy analysis using realistic prior distributions, Econometric Reviews, 3, 1–100.
Doz, C., D. Giannone, and L. Reichlin (2012) A quasi-maximum likelihood approach for large approximate dynamic factor models, Review of Economics and Statistics, 94(4), 1014–1024.
Doz, C., D. Giannone, and L. Reichlin (2011), A Two-Step estimator for large approximate dynamic factor models based on Kalman filtering, Journal of Econometrics, 164:1, 188–
205.
Fan, J. , Liao, Y., and Mincheva, M. (2011) High Dimensional Covariance Matrix Estima-tion in Approximate Factor Models. The Annals of Statistics, 39, 3320-3356.
Fan, J., Y., Liao and M, Mincheva (2013) Large covariance estimation by thresholding principal orthogonal complements, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75(4), 603–680.
Forni, M. and L. Gambetti (2010) The dynamic effects of monetary policy: A structural factor model approach, Journal of Monetary Economics, 57(2), 203–216.
Forni, M., M. Hallin, M. Lippi and L. Reichlin (2000) The generalized dynamic-factor model: Identification and estimation. Review of Economics and Statistics,82(4), 540-554.
Goncalves, S., and Perron, B. (2014) Bootstrapping factor-augmented regression models, Journal of Econometrics, 182(1), 156–173.
Hamilton, J. (1994) Time Series Analysis, Princeton University Press, Princeton.
Han, X. (2014) Tests for overidentifying restrictions in Factor-Augmented VAR models, Journal of Econometrics, Forthcoming.
Han, X., and A. Inoue (2011): Tests for Parameter Instability in Dynamic Factor Models, Manuscript, North Carolina State University.
Lawley D. N. and A. E. Maxwell (1971)Factor Analysis as a Statistical Method, New York:
American Elsevier Publishing Company.
Leeper, E. M., C. A. Sims, and T. Zha (1996) What does monetary policy do? Brookings Papers on Economic Activity,2, 1-63.
Litterman, R. B. (1986) Forecasting with Bayesian vector autoregressions: five years of experience, Journal of Business and Economic Statistics,4:25-38.
Ludvigson, S. C. and S. Ng (2009) Macro factors in bond risk premia,Review of Financial Studies, 22(12), 5027-5067.
Moench, E. (2008) Forecasting the yield curve in a data-rich environment: A no-arbitrage factor-augmented VAR approach,Journal of Econometrics 146(1), 26-43.
Quah, D. and T. Sargent (1992) A dynamic index model for large cross sections, Federal Reserve Bank of Minneapolis, Discussion Paper 77.
Shintani, M., and Guo, Y. (2011) Finite sample performance of principal components estimators for dynamic factor models: Asymptotic vs. bootstrap approximations, Manuscript, Vanderbilt University
Sims, C. A. (1980) Macroeconomics and Reality,Econometrica,48:1-48.
Sims, C. A. (1992) Interpreting the macroeconomic time series facts: the effects of monetary policy, European Economic Review, 36, 975–1000.
Sims, C. A. (1993) A nine-variable probabilistic macroeconomic forecasting model, J.H.
Stock and M.W. Watson, eds., Business Cycles, Indicators, and Forecasting (University of Chicago Press for the NBER, Chicago), Ch. 7:179-204.
Stock, J. H. and M. W. Watson (2002) Forecasting using principal components from a large number of predictors,Journal of the American Statistical Association ,97, 1167–1179.
Stock, J. H. and M. W. Watson (2005) Implications of Dynamic Factor Models for VAR Analysis,manuscript.
Tsai, H., and R. S. Tsay (2010) Constrained factor models, Journal of the American Statistical Association, 105, 1593–1605.
Watson, M. W. and R. F. Engle (1983) Alternative algorithms for the estimation of dynamic factor, mimic and varying coefficient regression models, Journal of Econometrics, 23(3), 385–400.
Yamamoto, Y. (2011) Bootstrap inference of impulse response functions in factor-augmented vector autoregression, Manuscript.