Supplementary Notes
The derivation of the distribution of $x_{(tr)}^T y_{(tr)}$

We derive the distribution of $x_{(tr)}^T y_{(tr)}$ using the formula for the conditional distribution of two multivariate normal random vectors. If $X \sim N(\mu_X, \Sigma_X)$ and $Y \sim N(\mu_Y, \Sigma_Y)$, and $\mathrm{cov}(X, Y) = \Sigma_{XY}$, then $X \mid Y$ also follows a multivariate normal distribution, with mean and covariance matrix

$$E(X \mid Y = y) = \mu_X + \Sigma_{XY} \Sigma_Y^{-1} (y - \mu_Y)$$
$$\mathrm{Var}(X \mid Y = y) = \Sigma_X - \Sigma_{XY} \Sigma_Y^{-1} \Sigma_{YX}$$

Here, $x^T y = x_{(tr)}^T y_{(tr)} + x_{(v)}^T y_{(v)}$, where $x_{(tr)}^T y_{(tr)}$ and $x_{(v)}^T y_{(v)}$ are computed on the $N - n$ training samples and the $n$ validation samples, respectively, and are independent. Based on this property, since $x^T y \sim N(N E(X^T Y), N \mathrm{Var}(X^T Y))$, $x_{(tr)}^T y_{(tr)} \sim N((N - n) E(X^T Y), (N - n) \mathrm{Var}(X^T Y))$, and the covariance between $x_{(tr)}^T y_{(tr)}$ and $x^T y$ is

$$\mathrm{cov}(x_{(tr)}^T y_{(tr)},\ x^T y) = \mathrm{cov}(x_{(tr)}^T y_{(tr)},\ x_{(tr)}^T y_{(tr)} + x_{(v)}^T y_{(v)}) = (N - n) \mathrm{Var}(X^T Y),$$

we can write the expectation of the conditional distribution of $x_{(tr)}^T y_{(tr)}$ as

$$E(x_{(tr)}^T y_{(tr)} \mid x^T y) = (N - n) E(X^T Y) + \frac{N - n}{N} \mathrm{Var}(X^T Y) \mathrm{Var}(X^T Y)^{-1} (x^T y - N E(X^T Y))$$

Replacing $N E(X^T Y)$ by the observed vector $x^T y$, we get

$$E(x_{(tr)}^T y_{(tr)} \mid x^T y) = \frac{N - n}{N} x^T y$$

And the variance of the conditional distribution is

$$\mathrm{Var}(x_{(tr)}^T y_{(tr)} \mid x^T y) = (N - n) \mathrm{Var}(X^T Y) - \frac{N - n}{N} \mathrm{Var}(X^T Y) \mathrm{Var}(X^T Y)^{-1} (N - n) \mathrm{Var}(X^T Y)$$

Replacing $\mathrm{Var}(X^T Y)$ by the observed covariance matrix $\Sigma$, then

$$\mathrm{Var}(x_{(tr)}^T y_{(tr)} \mid x^T y) = \frac{(N - n) n}{N} \Sigma$$
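As a sanity check (not part of the original derivation), the scalar version of these two results can be verified by Monte Carlo simulation: for i.i.d. summands, the slope of the regression of the training-set sum on the full-sample sum recovers $(N-n)/N$, and the residual variance around that line recovers $(N-n)n\sigma^2/N$. The sample sizes, mean, and $\sigma$ below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 500, 100            # total sample size and validation subset size (illustrative)
mu, sigma = 0.3, 1.5       # mean and SD of each summand x_i * y_i (illustrative)
reps = 50_000              # Monte Carlo replicates

z = rng.normal(mu, sigma, size=(reps, N))   # i.i.d. summands per replicate
s_full = z.sum(axis=1)                      # analogue of x^T y (all N samples)
s_tr = z[:, : N - n].sum(axis=1)            # analogue of x_(tr)^T y_(tr) (N - n samples)

# Conditional mean: E(S_tr | S) = ((N - n)/N) * S, i.e. regression slope ~ 0.8
slope = np.cov(s_tr, s_full)[0, 1] / np.var(s_full)

# Conditional variance: Var(S_tr | S) = (N - n) * n / N * sigma^2 = 180 here
resid_var = np.var(s_tr - slope * s_full)

print(slope)      # close to (N - n) / N = 0.8
print(resid_var)  # close to (N - n) * n / N * sigma**2 = 180
```

The slope and residual variance converge to the derived conditional mean coefficient and conditional variance as the number of replicates grows.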
Supplementary Figures
Fig S1: Comparison of two model-tuning strategies in WTCCC samples under alpha = -2.
(A) PUMAS performance under a causal variant proportion of 0.001. (B) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.001. (C) PUMAS performance under a causal variant proportion of 0.1. (D) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.1. The X-axis shows the log-transformed p-value thresholds. The Y-axis shows the predictive performance quantified by average R² across four folds.
Fig S2: Comparison of two model-tuning strategies in WTCCC samples under alpha = -1. (A) PUMAS performance under a causal variant proportion of 0.001. (B) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.001. (C) PUMAS performance under a causal variant proportion of 0.1. (D) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.1. The X-axis shows the log-transformed p-value thresholds. The Y-axis shows the predictive performance quantified by average R² across four folds.
Fig S3: Comparison of two model-tuning strategies in WTCCC samples under alpha = 1. (A) PUMAS performance under a causal variant proportion of 0.001. (B) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.001. (C) PUMAS performance under a causal variant proportion of 0.1. (D) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.1. The X-axis shows the log-transformed p-value thresholds. The Y-axis shows the predictive performance quantified by average R² across four folds.
Fig S4: Comparison of two model-tuning strategies in WTCCC samples under alpha = 2. (A) PUMAS performance under a causal variant proportion of 0.001. (B) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.001. (C) PUMAS performance under a causal variant proportion of 0.1. (D) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.1. The X-axis shows the log-transformed p-value thresholds. The Y-axis shows the predictive performance quantified by average R² across four folds.
Fig S5: Comparison of two model-tuning strategies for binary traits in WTCCC samples under alpha = -2. (A) PUMAS performance under a causal variant proportion of 0.001. (B) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.001. (C) PUMAS performance under a causal variant proportion of 0.1. (D) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.1. The X-axis shows the log-transformed p-value thresholds. The Y-axis shows the predictive performance quantified by average R² for PUMAS and AUC for repeated learning across four folds.
Fig S6: Comparison of two model-tuning strategies for binary traits in WTCCC samples under alpha = -1. (A) PUMAS performance under a causal variant proportion of 0.001. (B) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.001. (C) PUMAS performance under a causal variant proportion of 0.1. (D) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.1. The X-axis shows the log-transformed p-value thresholds. The Y-axis shows the predictive performance quantified by average R² for PUMAS and AUC for repeated learning across four folds.
Fig S7: Comparison of two model-tuning strategies for binary traits in WTCCC samples under alpha = 0. (A) PUMAS performance under a causal variant proportion of 0.001. (B) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.001. (C) PUMAS performance under a causal variant proportion of 0.1. (D) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.1. The X-axis shows the log-transformed p-value thresholds. The Y-axis shows the predictive performance quantified by average R² for PUMAS and AUC for repeated learning across four folds.
Fig S8: Comparison of two model-tuning strategies for binary traits in WTCCC samples under alpha = 1. (A) PUMAS performance under a causal variant proportion of 0.001. (B) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.001. (C) PUMAS performance under a causal variant proportion of 0.1. (D) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.1. The X-axis shows the log-transformed p-value thresholds. The Y-axis shows the predictive performance quantified by average R² for PUMAS and AUC for repeated learning across four folds.
Fig S9: Comparison of two model-tuning strategies for binary traits in WTCCC samples under alpha = 2. (A) PUMAS performance under a causal variant proportion of 0.001. (B) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.001. (C) PUMAS performance under a causal variant proportion of 0.1. (D) Repeated learning approach with individual-level data as input under a causal variant proportion of 0.1. The X-axis shows the log-transformed p-value thresholds. The Y-axis shows the predictive performance quantified by average R² for PUMAS and AUC for repeated learning across four folds.
Fig S10: PUMAS result using clumped IGAP 2013 AD GWAS as input.
Fig S11: Improvement in predictive R² across the 45 optimized traits. (A) PUMAS's increase in predictive R² compared to PRS at P = 0.01 and P = 1. (B) PUMAS's percentage improvement in predictive R² compared to PRS at P = 0.01 and P = 1. The percentage improvement of RA's predictive performance by PUMAS compared to its PRS at P = 0.01 is truncated at 2000% in panel B.
Fig S12: Computation time for the analysis of 65 GWAS traits. The X-axis shows the number of SNPs in the pruned GWAS. The Y-axis shows the elapsed computation time in seconds.
Fig S13: QQ plot for p-values of LDSC intercept estimates between the non-imaging AD-proxy GWAS and UK Biobank imaging traits. The p-value for the one-sample t-test of the null hypothesis that the mean of the LDSC intercepts equals zero is 0.3191.
Fig S14: QQ plot for associations between breast cancer and UK Biobank neuroimaging traits.
Fig S15: Comparison of PUMAS's approximated Σ and the theoretical Σ in 8 simulation settings. (A-H) Scatter plots of approximated diagonal and off-diagonal elements versus theoretical diagonal and off-diagonal elements in each setting. Details of the simulation settings are discussed in the Methods section.