
Academic year: 2022


(1) Traditional frequentist statistics: bias, variance, point estimation.
We assume that the data is generated by one particular distribution from a family of distributions $F = \{ f_\theta : \theta \in \Theta \}$, for example $F = \{ N(\mu, 1) : \mu \in \mathbb{R} \}$. More generally, $\Theta$ is the parameter space (the space of all possible parameters) and $\theta \in \Theta$ is one particular parameter. The family $F$ is called a statistical model.
We are given a sample $X_1, \dots, X_n \sim f_\theta$, typically iid, but the true underlying $\theta$ is unknown. The goal of point estimation is to estimate the true parameter $\theta$.
Notation: $P_\theta$ and $E_\theta$ refer to the probability and expectation under the distribution (density) $f_\theta$; $\hat\theta$ denotes an estimated parameter.

(2) Def. Given a statistical model $F$ and a sample $X_1, \dots, X_n \sim f_\theta \in F$. A point estimator $\hat\theta_n$ of the parameter $\theta$ is a function $\hat\theta_n = g(X_1, \dots, X_n)$ of the sample.
Def. The bias of such an estimator is defined as $\mathrm{bias}(\hat\theta_n) = E_\theta(\hat\theta_n) - \theta$, where the expectation is taken wrt. the distribution $f_\theta$ of the sample. An estimator is unbiased if its bias is zero.
Def. The standard error of an estimator is defined as the corresponding standard deviation, $\mathrm{se}(\hat\theta_n) = \sqrt{\mathrm{Var}_\theta(\hat\theta_n)}$. Typically the standard error is unknown, but it can be estimated; the estimate is called $\widehat{\mathrm{se}}$.
Example. $X_1, \dots, X_n \sim \mathrm{Bernoulli}(p)$ with parameter $p$. Estimate of $p$: $\hat p_n = \frac{1}{n} \sum_{i=1}^n X_i$.

(3) $\hat p_n$ is unbiased because $E_p(\hat p_n) = \frac{1}{n} \sum_{i=1}^n E_p(X_i) = p$.
Its standard error is $\mathrm{se} = \sqrt{\mathrm{Var}_p(\hat p_n)} = \sqrt{p(1-p)/n}$. Since $p$ is unknown, we can estimate it by $\widehat{\mathrm{se}} = \sqrt{\hat p_n (1 - \hat p_n)/n}$.
Example. Estimate the mean weight of babies. Measure the weight of 6 babies and estimate the mean, e.g. 2900 g. Do it a second and third time with other babies: e.g. 2950 g, 2890 g. The standard error is the standard deviation of these estimates.
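
The Bernoulli example can be sketched in a few lines of Python. This is only an illustration: the function name `bernoulli_estimate` and the coin-flip data are made up, and the standard error is the plug-in estimate $\sqrt{\hat p_n(1-\hat p_n)/n}$.

```python
import math

def bernoulli_estimate(flips):
    """Return (p_hat, se_hat) for a list of 0/1 observations.

    p_hat is the sample mean; se_hat plugs p_hat into sqrt(p(1-p)/n).
    """
    n = len(flips)
    p_hat = sum(flips) / n
    se_hat = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, se_hat

# Hypothetical data: 10 coin flips, 7 of them heads.
flips = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
p_hat, se_hat = bernoulli_estimate(flips)
print(p_hat, se_hat)
```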

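(4) Def. The mean squared error (MSE) of an estimate $\hat\theta_n$ is the quantity $\mathrm{MSE} = E_\theta\big( (\hat\theta_n - \theta)^2 \big)$.
Theorem (bias-variance decomposition). $\mathrm{MSE} = \mathrm{bias}^2(\hat\theta_n) + \mathrm{Var}_\theta(\hat\theta_n)$.
Proof. Write $\bar\theta_n = E_\theta(\hat\theta_n)$, which is deterministic. Then
$E_\theta\big((\hat\theta_n - \theta)^2\big) = E_\theta\big((\hat\theta_n - \bar\theta_n + \bar\theta_n - \theta)^2\big)$
$= E_\theta\big((\hat\theta_n - \bar\theta_n)^2\big) + 2 (\bar\theta_n - \theta)\, E_\theta(\hat\theta_n - \bar\theta_n) + (\bar\theta_n - \theta)^2$
$= \mathrm{Var}_\theta(\hat\theta_n) + 0 + \mathrm{bias}^2(\hat\theta_n)$,
because $\bar\theta_n - \theta$ is deterministic and $E_\theta(\hat\theta_n - \bar\theta_n) = 0$.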
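
The decomposition MSE = bias² + variance can be checked exactly on a small example. The sketch below is an illustration under assumptions of my own: the data is Binomial(n, p), and the estimator `shrunk` (which maps k successes to (k+1)/(n+2)) is a hypothetical biased estimator chosen only so that the bias term is non-zero.

```python
from math import comb

def exact_moments(n, p, estimator):
    """Exact bias, variance and MSE of estimator(k, n) when k ~ Binomial(n, p)."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    values = [estimator(k, n) for k in range(n + 1)]
    mean = sum(w * v for w, v in zip(pmf, values))
    var = sum(w * (v - mean) ** 2 for w, v in zip(pmf, values))
    mse = sum(w * (v - p) ** 2 for w, v in zip(pmf, values))
    return mean - p, var, mse

def shrunk(k, n):
    # Hypothetical biased estimator of p, used only for illustration.
    return (k + 1) / (n + 2)

bias, var, mse = exact_moments(20, 0.3, shrunk)
print(bias, var, mse, bias**2 + var)
```

The last two printed numbers agree up to floating-point rounding, as the theorem states.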

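(5) Example. Model: $F = \{ N(\mu, \sigma^2) \}$ with unknown $\mu$, $\sigma^2$. Sample: $X_1, \dots, X_n \sim N(\mu, \sigma^2)$ iid.
Let's compute bias, variance and MSE of the variance estimate $\hat\sigma_n^2 = \frac{1}{n} \sum_{i=1}^n (X_i - \bar X_n)^2$.
We have $E(\hat\sigma_n^2) = \frac{n-1}{n} \sigma^2$, so the bias is $-\sigma^2/n$. The variance is $\mathrm{Var}(\hat\sigma_n^2) = \frac{2(n-1)}{n^2} \sigma^4$. Hence $\mathrm{MSE} = \mathrm{bias}^2 + \mathrm{var} = \frac{\sigma^4}{n^2} + \frac{2(n-1)}{n^2}\sigma^4 = \frac{2n-1}{n^2} \sigma^4$.
In contrast, $S_n^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar X_n)^2$ is an unbiased estimate, with $\mathrm{MSE} = \mathrm{Var}(S_n^2) = \frac{2 \sigma^4}{n-1}$.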

(6) Def. A point estimator $\hat\theta_n$ is consistent if $\hat\theta_n \to \theta$ in probability as $n \to \infty$. It is strongly consistent if $\hat\theta_n \to \theta$ a.s.
Theorem. If $\mathrm{bias} \to 0$ and $\mathrm{se} \to 0$ as $n \to \infty$, then the estimate is consistent.

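(7) Confidence sets.
Def. A $1-\alpha$ confidence interval for a parameter $\theta \in \mathbb{R}$ is an interval $C_n = (a_n, b_n)$, where $a_n = a(X_1, \dots, X_n)$ and $b_n = b(X_1, \dots, X_n)$ are functions of the sample, such that $P_\theta(\theta \in C_n) \ge 1 - \alpha$ for all $\theta \in \Theta$.
Note: the interval $C_n$ is random, the parameter $\theta$ is deterministic.
Illustration: in a first experiment we draw $X_1, \dots, X_n$ from the true distribution and compute an interval, in a second experiment we draw a new sample and compute a new interval, and so on. In at least a fraction $1-\alpha$ of the repetitions the true $\mu$ is inside the (red) interval.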

(8) Example: coin flips. Observe $X_1, \dots, X_n \sim \mathrm{Bernoulli}(p)$ with $p = P(X = 1)$ unknown. Want to estimate it with a CI at confidence level $1-\alpha$.
Let $\hat p_n = \frac{1}{n} \sum_i X_i$. Choose $\varepsilon_n$ with $\varepsilon_n^2 = \frac{\log(2/\alpha)}{2n}$ and define $C_n = (\hat p_n - \varepsilon_n, \hat p_n + \varepsilon_n)$. Then $C_n$ is a CI with coverage $1-\alpha$.
Proof. By Hoeffding's inequality, $P(|\hat p_n - p| > t) \le 2 \exp(-2 n t^2)$ for any $t > 0$. With $t = \varepsilon_n$ we have $2 \exp(-2 n \varepsilon_n^2) = 2 \exp(-\log(2/\alpha)) = \alpha$, thus $P(p \notin C_n) \le \alpha$.
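
A minimal sketch of the Hoeffding-based confidence interval for the coin example. The function name `hoeffding_interval` and the data are hypothetical; clipping the interval to [0, 1] is an extra convenience that does not change the coverage, since p always lies in [0, 1].

```python
import math

def hoeffding_interval(flips, alpha=0.05):
    """Confidence interval (p_hat - eps, p_hat + eps) with eps^2 = log(2/alpha) / (2n)."""
    n = len(flips)
    p_hat = sum(flips) / n
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))
    # Clip to [0, 1]: an added convenience, p cannot lie outside this range.
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)

# Hypothetical data: 100 flips, 60 heads.
flips = [1] * 60 + [0] * 40
lo, hi = hoeffding_interval(flips, alpha=0.05)
print(lo, hi)
```

The interval width shrinks like $1/\sqrt{n}$ and grows only logarithmically as $\alpha$ decreases.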

(9) Maximum likelihood estimator.
Example. $F$ = all symmetric adjacency matrices $A$ (graphs). A random walk of length $l$ on the graph produces a sequence of vertices $X_1, X_2, \dots$. Observe $k$ random walks. Goal: reconstruct (estimate) $A$. Idea: among all adjacency matrices $A$, select the one that has the highest likelihood to have produced the observed random walks.
Maximum likelihood approach. Parametric family $F = \{ f_\theta : \theta \in \Theta \}$, iid points $X_1, \dots, X_n$. The likelihood of the data given the parameter $\theta$ is
$L(\theta) = P(X_1, \dots, X_n \mid \theta) = \prod_{i=1}^n P(X_i \mid \theta)$.

(10) To estimate the true parameter $\theta$ we select $\hat\theta$ such that this likelihood is maximized:
$\hat\theta = \mathrm{argmax}_{\theta \in \Theta} P(X_1, \dots, X_n \mid \theta) = \mathrm{argmax}_\theta \prod_{i=1}^n P(X_i \mid \theta)$.
This is equivalent to the problem $\mathrm{argmax}_\theta \log \prod_i P(X_i \mid \theta) = \mathrm{argmax}_\theta \sum_i \log P(X_i \mid \theta)$, which is equivalent to minimizing the negative log likelihood: $\mathrm{argmin}_\theta \, -\sum_i \log P(X_i \mid \theta)$. This is the maximum likelihood approach.
Sometimes this optimization problem is easy and it might be possible to solve it analytically (rare). If you are lucky it is convex. Most typically, it is not convex.

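(11) Example (analytic). $X \sim \mathrm{Poisson}(\lambda)$; this means that $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$, and it has $E(X) = \mathrm{Var}(X) = \lambda$.
Observe $X_1, \dots, X_n \sim \mathrm{Poisson}(\lambda)$, want to construct the ML estimator.
Likelihood: $L(\lambda) = \prod_{i=1}^n \frac{\lambda^{X_i} e^{-\lambda}}{X_i!}$.
Log likelihood: $\ell(\lambda) = \sum_{i=1}^n \big( X_i \log \lambda - \lambda - \log(X_i!) \big)$.
Now want to optimize for $\lambda$. Take the derivative: $\frac{d\ell}{d\lambda} = \frac{1}{\lambda} \sum_i X_i - n = 0$.
So the ML estimate of $\lambda$ is $\hat\lambda = \frac{1}{n} \sum_{i=1}^n X_i$.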
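
As a numerical sanity check of the Poisson example, the sketch below evaluates the log likelihood and compares the closed-form estimate (the sample mean) against nearby values of $\lambda$. The function names and the data are hypothetical.

```python
import math

def poisson_log_likelihood(lam, xs):
    """Sum over i of x_i*log(lam) - lam - log(x_i!), using lgamma(x+1) = log(x!)."""
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in xs)

def poisson_mle(xs):
    """Closed-form ML estimate of the Poisson rate: the sample mean."""
    return sum(xs) / len(xs)

xs = [2, 4, 3, 0, 5, 1, 3]  # hypothetical counts
lam_hat = poisson_mle(xs)
best = poisson_log_likelihood(lam_hat, xs)
# The log likelihood is lower at values of lambda on either side of the sample mean.
print(lam_hat, best,
      poisson_log_likelihood(lam_hat - 0.1, xs),
      poisson_log_likelihood(lam_hat + 0.1, xs))
```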

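(12) MLE. From the theory side, the MLE often (but not always) has nice properties.
1. If $F$ consists of nice functions, then the MLE based on an iid sample from the model is consistent.
2. If $F$ consists of nice functions, the MLE is asymptotically normal: $\frac{\hat\theta_n - \theta}{\mathrm{se}} \to N(0, 1)$ in distribution, and also $\frac{\hat\theta_n - \theta}{\widehat{\mathrm{se}}} \to N(0, 1)$ in distribution.
3. This can be used to construct confidence intervals: $C_n = (\hat\theta_n - z_{\alpha/2}\, \widehat{\mathrm{se}},\ \hat\theta_n + z_{\alpha/2}\, \widehat{\mathrm{se}})$, where $z_{\alpha/2} = \Phi^{-1}(1 - \alpha/2)$ and $\Phi$ is the cdf of $N(0,1)$. (Sketch: the standard normal density has mass $1 - \alpha$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and mass $\alpha/2$ in each tail.)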

(13) $C_n$ is an approximate CI in the sense that $P_\theta(\theta \in C_n) \to 1 - \alpha$ as $n \to \infty$.

(14) Sufficiency, identifiability.
Sufficiency (intuition). Given a sample $X_1, \dots, X_n \sim f_\theta$. We typically compute a statistic $T(X_1, \dots, X_n)$ of the sample, in the extreme case just one number. Question: when can we recover the true parameter $\theta$ from the statistic as well as from the sample?
Intuition: $T$ is sufficient if we can infer the same about $\theta$ when we know $T(X_1, \dots, X_n)$ as when we observe the whole sample $X_1, \dots, X_n$: when we know $T(X_1, \dots, X_n)$, then we can calculate the likelihood of the data. The formal definition is technical.
Identifiability. Sometimes families of distributions can be described in different ways, with different sets of parameters.
Def. A parameter $\theta$ for a family $F = \{ f_\theta \}$ is identifiable if distinct values of $\theta$ correspond to distinct pdfs in $F$.

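(15) Mixture distributions.
$F = \{ \sum_i a_i N(\mu_i, \sigma_i^2) \}$ with $\sum_i a_i = 1$.
Example. Female: $N(\mu_f, \sigma_f^2)$, male: $N(\mu_m, \sigma_m^2)$. You observe samples from the whole population. One way to model the data is as a mixture of male and female: $0.5\, N(\mu_m, \sigma_m^2) + 0.5\, N(\mu_f, \sigma_f^2)$.
Identifiability: a mixture such as $0.3\, N(\mu_1, \sigma_1^2) + 0.7\, N(\mu_2, \sigma_2^2)$ can be written with its components in either order, so the parameters are determined only up to the order of the components.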

(16) Hypothesis testing.
Example. Two drugs $D_1$, $D_2$. We measure the number of days to recovery: $X_1, \dots, X_n$ for patients treated with drug 1, $Y_1, \dots, Y_m$ for patients treated with drug 2. Question: is drug 2 better than drug 1, that is, does the distribution of days of recovery under drug 2 have a smaller average than under drug 1?
Example. We want to test whether a coin is fair. Null hypothesis $H_0$: coin is fair. Alternative hypothesis $H_1$: coin is unfair. Sample $n$ coin flips and estimate $\hat p_n = \frac{1}{n} \sum_i X_i$.

(17) Question: if $\hat p_n$ is far away from 0.5, we want to reject the null hypothesis. How far is far away? Look at the distribution of $\hat p_n$ under $H_0$ (like for a confidence set): find $t$ such that $P_{0.5}(|\hat p_n - 0.5| > t) \le \alpha$. If $|\hat p_n - 0.5| > t$, reject the null hypothesis; otherwise retain it.
More formally. Statistical model $F = \{ f_\theta : \theta \in \Theta \}$. Sample data $X_1, \dots, X_n$ from the unknown $f_\theta$. We want to test $H_0: \theta \in \Theta_0$ (null hyp.) against $H_1: \theta \in \Theta_1$ (alternative hyp.). Now construct a test statistic $T(X_1, \dots, X_n)$ and compute a rejection region $R_n$ such that:
$T(X_1, \dots, X_n) \in R_n$: reject $H_0$;
$T(X_1, \dots, X_n) \notin R_n$: retain $H_0$.

(18) Typical hypotheses are of the form $H_0: \theta = \theta_0$ vs. $H_1: \theta \ne \theta_0$, or $H_0: \theta \le \theta_0$ vs. $H_1: \theta > \theta_0$.
Two types of error can occur. Type I error: $H_0$ is true, but we reject it. Type II error: $H_1$ is true, but we retain $H_0$.
Def. The power function of a test with rejection region $R$ is $\beta(\theta) = P_\theta(T(X) \in R)$.
Ideally: if $\theta \in \Theta_0$, then we should not end up in $R$. For such $\theta$, $\beta(\theta) = P(\text{Type I error})$ should be small. If $\theta \in \Theta_1$, then we hope that $T(X) \in R$, so $\beta(\theta) = 1 - P(\text{Type II error})$ is large.
Def. We say that a test is of level $\alpha$ if $\sup_{\theta \in \Theta_0} \beta(\theta) \le \alpha$. Intuition: worst case guarantee. Whatever $\theta \in \Theta_0$ we pick, the type I error is not larger than $\alpha$.

(19) Intuition to remember (standard procedure): we fix the level $\alpha$ of the type I error in advance, for example 0.05 or 0.01. Then we can also look at the type II error. For example, among several tests of level $\alpha$, choose the one that has the smallest type II error.
Remark: the power of a test is typically evaluated when we test $H_0$ against a concrete hypothesis $\theta \in \Theta_1$.
Def. Let $\mathcal{T}$ be a set of tests of level $\alpha$ for testing $H_0: \theta \in \Theta_0$ against $H_1: \theta \in \Theta_1$. A test in $\mathcal{T}$ with power function $\beta(\theta)$ is uniformly most powerful (UMP) if $\beta(\theta) \ge \beta'(\theta)$ for all $\theta \in \Theta_1$ and for all power functions $\beta'$ of other tests in $\mathcal{T}$.
In practice it is often impossible to find a UMP test.

(20) Theorem (Neyman-Pearson). Suppose we test $H_0: \theta = \theta_0$ against $H_1: \theta = \theta_1$. Consider the likelihood ratio $T = \frac{L(\theta_1)}{L(\theta_0)}$. Assume we reject $H_0$ if $T > k$ for some $k$, and we choose $k$ such that $P_{\theta_0}(T > k) = \alpha$. Then this is the most powerful test of level $\alpha$.
Likelihood ratio test. Consider a parameter space $\Theta$ with $\Theta_0 \subset \Theta$ and $H_0: \theta \in \Theta_0$. Then the test statistic is
$T = \frac{\sup_{\theta \in \Theta} L(\theta)}{\sup_{\theta \in \Theta_0} L(\theta)}$, or even simpler $T = \frac{L(\hat\theta)}{L(\hat\theta_0)}$.

(21) The rejection region is of the form $R = \{ T > t \}$, and we determine the parameter $t$ such that the test has level $\alpha$. In theory this is simple; the difficulties in practice are to compute the suprema and to fix $t$ such that the rejection region has level $\alpha$.

(22) p-values. Consider a test at level $\alpha$ and denote its rejection region as $R_\alpha$. Recall: the smaller we choose $\alpha$ (the bound on $P(\text{type I error})$), the more difficult it gets to reject $H_0$. We often have that $R_\alpha \subset R_{\alpha'}$ for $\alpha < \alpha'$.
Def. The p-value is defined as $p = \inf \{ \alpha : T(X_1, \dots, X_n) \in R_\alpha \}$, i.e. the smallest level $\alpha$ for which the test would reject the null hypothesis.
Intuition: a smaller p-value means more evidence for rejecting the null hypothesis.
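
For the fair-coin test with rejection regions of the form $|\hat p_n - 0.5| > t$, the p-value can be computed exactly from the binomial distribution. The sketch below is one way to do this; the function name `coin_p_value` is hypothetical, and the p-value is taken as the probability under $H_0$ of a deviation at least as large as the observed one.

```python
from math import comb

def coin_p_value(heads, n):
    """Exact two-sided p-value for H0: p = 0.5, given `heads` successes in n flips.

    Sums the Binomial(n, 0.5) probabilities of all outcomes k whose
    deviation |k - n/2| is at least as large as the observed deviation.
    """
    observed_dev = abs(heads - n / 2)
    favourable = sum(comb(n, k) for k in range(n + 1)
                     if abs(k - n / 2) >= observed_dev)
    return favourable / 2**n

print(coin_p_value(5, 10))   # no deviation from 0.5 at all
print(coin_p_value(8, 10))   # moderate deviation
print(coin_p_value(10, 10))  # extreme deviation
```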

(23) Example: weight of baby boys and girls. Sample many baby boys and many baby girls. Mean weights: $\hat\mu_b = 3000$ g, $\hat\mu_g = 2995$ g. For large $n$, the test will find a statistically significant difference (small p-value), even though the difference itself is small.

(24) Multiple testing.
Example: gene expression data. Patients with cancer ($n_1 = 20$) and a control group ($n_2 = 20$); for each person we measure gene 1, gene 2, ..., gene $m$. Assume we run a test for each gene, so we have tests $t_1, \dots, t_m$, each of level $\alpha$ (e.g. 5%): $P(\text{test } i \text{ makes a type I error}) = \alpha$.
Now: $P(\text{at least one type I error}) = 1 - P(\text{no error in } t_1 \text{ and no error in } t_2 \text{ and } \dots \text{ and no error in } t_m)$
$= 1 - \prod_{i=1}^m P(t_i \text{ makes no error})$ (if the tests are independent)
$= 1 - (1 - \alpha)^m \to 1$ as $m \to \infty$.

(25) For $\alpha = 0.05$: $m = 10$ gives 0.40, $m = 50$ gives 0.92.
Bonferroni correction: controlling the FWER.
Def. Consider a family of tests. A type I error in the family occurs if at least one test makes a type I error. The family-wise error rate (FWER) is the probability that at least one type I error occurs in the family.
Assume we run $m$ tests and we want to achieve $\mathrm{FWER} \le \alpha$ (e.g. $\alpha = 0.05$). Then we run the individual tests with level $\alpha_{single} = \alpha / m$.
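
The numbers above are easy to reproduce. The sketch below computes the probability of at least one type I error among m independent tests, and the Bonferroni per-test level; the function names are hypothetical.

```python
def fwer_independent(alpha_single, m):
    """P(at least one type I error) for m independent tests at level alpha_single."""
    return 1.0 - (1.0 - alpha_single) ** m

def bonferroni_level(alpha, m):
    """Per-test level that keeps the family-wise error rate at most alpha."""
    return alpha / m

for m in (10, 50):
    uncorrected = fwer_independent(0.05, m)
    corrected = fwer_independent(bonferroni_level(0.05, m), m)
    print(m, round(uncorrected, 2), round(corrected, 4))
```

With the Bonferroni level the family-wise error stays below 0.05 in both cases, at the price of much stricter individual tests.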

(26) Then $\mathrm{FWER} = P(\text{at least one type I error}) = P\big(\bigcup_{i=1}^m \{ t_i \text{ makes error} \}\big) \le \sum_{i=1}^m P(t_i \text{ makes error}) = m\, \alpha_{single} = \alpha$.
Advantage: simple, correct. Disadvantage: too conservative with respect to the type I error; low power, the test barely discovers anything.
Benjamini-Hochberg: controlling the FDR.
Def. Assume we run a family of $m$ tests. $\mathrm{FDR} = E\left( \frac{\#\text{false rejections}}{\#\text{all rejections}} \right)$. We call it the false discovery rate.

(27) Procedure.
Fix the FDR level $\alpha$ in advance.
Run the $m$ individual tests and evaluate their p-values.
Sort the p-values increasingly: $p_{(1)} \le p_{(2)} \le \dots \le p_{(m)}$.
Define thresholds $l_i = \frac{i\, \alpha}{m}$.
Find the largest index $i_0$ such that $p_{(i_0)} \le l_{i_0}$ (the last sorted p-value below the red line).
Reject the hypotheses with index $i \le i_0$, retain all others.
Theorem (Benjamini, Hochberg). If the tests are independent, then the procedure controls $\mathrm{FDR} \le \alpha$, regardless of how many null hypotheses are true and regardless of the distribution of the p-values when $H_0$ is false.
Remark: a similar FDR approach also works without independence.
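
The Benjamini-Hochberg steps can be sketched as follows. The function name `benjamini_hochberg` and the example p-values are hypothetical. Note that the procedure looks for the largest index that passes its threshold, so a hypothesis can be rejected even if its own sorted p-value lies above its threshold.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices of the hypotheses, sorted by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    i0 = 0  # largest rank whose p-value is below its threshold rank*alpha/m
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            i0 = rank
    reject = [False] * m
    for idx in order[:i0]:
        reject[idx] = True
    return reject

# Thresholds for m = 4, alpha = 0.05: 0.0125, 0.025, 0.0375, 0.05.
# Sorted p-values: 0.01 (passes), 0.03 (fails), 0.035 (passes), 0.5 (fails).
print(benjamini_hochberg([0.5, 0.03, 0.01, 0.035], alpha=0.05))
```

In this example $i_0 = 3$, so the hypothesis with p-value 0.03 is rejected as well.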

(28) Many modifications exist.
Intuition. Under the null hypothesis, the p-values have a uniform distribution on $[0, 1]$ (density of p-values under $H_0$). If some $H_0$ and some $H_1$ are true, the density of p-values would look like a mixture: flat under $H_0$, with a peak close to 0 under $H_1$.
Goal: we want to set a threshold $t$ such that below it we have hopefully many of the $H_1$'s. But we always also have some of the $H_0$'s there. Ideally, $t$ is chosen such that the FDR satisfies what we want.

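(29) [Figure: density of p-values with threshold $t$; blue area = p-values from $H_0$ below $t$, pink area = p-values from $H_1$ below $t$.]
The expected number of p-values below $t$ that come from $H_0$ corresponds to the blue area, the expected number that come from $H_1$ to the pink area (the corresponding integral from 0 to $t$). If the pink area is large compared to the blue area, the FDR is small. By moving $t$ we control the FDR.
BH leads to larger power than Bonferroni. BH controls the FDR, not the FWER (overall type I error).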

(30) BH works best in the sparse regime, where only few tests reject the null. BH gives guarantees on the FDR, but in general does not minimize it. When all the $H_0$ are true, the FDR equals the FWER, so in this case BH gives the same kind of guarantee as Bonferroni.

(31) Non-parametric tests.
Standard parametric scenario: statistical model $F = \{ f_\theta : \theta \in \Theta \}$. Observe data, compute the test statistic $T_n$ (for example the mean of the samples). Need to know the distribution of $T_n$ under the null distribution. Construct the rejection region (threshold); reject the null hyp. if the observed $T_n$ is in this region.

(32) Goodness-of-fit tests.
Goal is to test whether a dataset comes from a particular distribution $F_0$. Let $F$ be the true distribution that generated the data. $H_0: F = F_0$, $H_1: F \ne F_0$.
We consider the cdf $F_0$ of the given distribution and the empirical cdf $F_n$ of the data, and compute
$D_n = \sup_{x \in \mathbb{R}} |F_n(x) - F_0(x)|$
(this is the Kolmogorov-Smirnov statistic). By the Glivenko-Cantelli theorem, we know that under the null hypothesis $F_n \to F_0$ uniformly, a.s.
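
The statistic $D_n$ is straightforward to compute, because the empirical cdf only jumps at the data points. A minimal sketch, with a hypothetical function name `ks_statistic` and the uniform distribution on [0, 1] assumed as $F_0$ for the example:

```python
def ks_statistic(sample, cdf0):
    """sup_x |F_n(x) - F_0(x)| for the empirical cdf F_n of `sample`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f0 = cdf0(x)
        # At x the empirical cdf jumps from i/n to (i+1)/n, so the supremum
        # is attained just before or at one of the data points.
        d = max(d, (i + 1) / n - f0, f0 - i / n)
    return d

def uniform_cdf(x):
    """cdf of the uniform distribution on [0, 1] (assumed F_0 for this example)."""
    return min(1.0, max(0.0, x))

d_n = ks_statistic([0.1, 0.4, 0.7], uniform_cdf)
print(d_n)
```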

(33) It is possible to compute the distribution of $D_n$ under the null hypothesis, and it is independent of $F_0$: it depends just on $n$. From this we can compute rejection thresholds and design a test: if $D_n$ is above the threshold, we conclude that the data does not come from $F_0$.
Wilcoxon-Mann-Whitney test (two-sample test).
$X_1, \dots, X_n$: first sample, distributed according to $F_1$. $Y_1, \dots, Y_m$: second sample, distributed according to $F_2$.
Question: $H_0: F_1 = F_2$ vs. $H_1: F_1 \ne F_2$.

(34) Test:
Pool the samples $X_1, \dots, X_n, Y_1, \dots, Y_m$.
Sort the pooled sample in increasing order and retrieve the rank of all points (rank 1, rank 2, ...).
Compute the rank sums for both groups: $W_{red} = \sum_{i \in \text{red population}} \mathrm{rank}(i)$, $W_{blue} = \sum_{i \in \text{blue population}} \mathrm{rank}(i)$.
If $W_{red}$ is very small or very large, we reject $H_0$; otherwise we retain $H_0$.
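
The rank sums of the two groups can be computed as in the following sketch. The function name `rank_sums` and the data are hypothetical, and the code assumes there are no ties in the pooled sample (ties would need averaged ranks).

```python
def rank_sums(xs, ys):
    """Rank sums (W_x, W_y) of two samples in the pooled, sorted sample.

    Assumes all values are distinct, so ranks 1..n+m are unambiguous.
    """
    pooled = sorted([(v, "x") for v in xs] + [(v, "y") for v in ys])
    w = {"x": 0, "y": 0}
    for rank, (_, group) in enumerate(pooled, start=1):
        w[group] += rank
    return w["x"], w["y"]

# Hypothetical data; the first sample tends to be smaller than the second.
w_x, w_y = rank_sums([1.2, 3.4, 2.2], [5.0, 4.1, 6.3, 2.9])
print(w_x, w_y)
```

The two rank sums always add up to $N(N+1)/2$ for $N = n + m$, so it is enough to look at one of them.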

(35) Extension to a multivariate setting: nearest neighbors.
Two samples (red and blue); we pool them. For each point, we look at the colors of the $k$ nearest neighbors (e.g. 5-NN). Under the null hypothesis we expect that the number of red neighbors is about the number of blue neighbors (for samples of equal size).

(36) Permutation tests (randomization tests).
Sample $X_1, \dots, X_n$ (red group) and $Y_1, \dots, Y_m$ (blue group).
Compute the observed statistic $T_{observed}$, e.g. $T = |\text{mean(red)} - \text{mean(blue)}|$.
Pool the sample. For $k = 1, \dots, 1000$: shuffle the group colors and compute $T_k = |\text{mean(red)} - \text{mean(blue)}|$ on the shuffled data.
$T_1, \dots, T_{1000}$ give the distribution of the statistic under the null hypothesis. Find the quantile of the $T_k$ to determine the rejection threshold.
Check whether $T_{observed}$, computed on the true data, is above the threshold.
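
A minimal sketch of such a permutation test for the difference of means. The function name, the default of 1000 shuffles, the fixed random seed and the example data are assumptions for illustration; the rejection threshold is taken as the empirical $(1-\alpha)$ quantile of the shuffled statistics.

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(xs, ys, alpha=0.05, n_shuffles=1000, seed=0):
    """Return (t_observed, threshold, reject) for the statistic |mean(xs) - mean(ys)|."""
    rng = random.Random(seed)
    t_observed = abs(mean(xs) - mean(ys))
    pooled = list(xs) + list(ys)
    shuffled_stats = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign the group labels at random
        t = abs(mean(pooled[:len(xs)]) - mean(pooled[len(xs):]))
        shuffled_stats.append(t)
    shuffled_stats.sort()
    # Empirical (1 - alpha) quantile of the statistic under the null hypothesis.
    idx = min(n_shuffles - 1, math.ceil((1 - alpha) * n_shuffles) - 1)
    threshold = shuffled_stats[idx]
    return t_observed, threshold, t_observed > threshold

# Hypothetical, clearly separated groups.
red = [1, 2, 3, 4, 5, 6, 7, 8]
blue = [101, 102, 103, 104, 105, 106, 107, 108]
print(permutation_test(red, blue))
```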

(37) Bootstrap.
Motivation: $X_1, \dots, X_n \sim F$; we want to estimate a parameter $\theta$ by an estimate $\hat\theta = g(X_1, \dots, X_n)$. The first thing you want to know is how reliable the estimate is, i.e. its standard error se.
If we can, we compute the standard error analytically (rare).
We could also try to obtain many samples $X_1, \dots, X_n \sim F$, compute $\hat\theta$ on each of them, and then estimate the distribution of $\hat\theta$ (and its standard error). Problem: $F$ is unknown (we have no knowledge of it and would have to make assumptions), and we do not have that many samples.

(38) Idea of the bootstrap: given the sample $X_1, \dots, X_n$, draw a subsample $X_1^*, \dots, X_n^*$ of the same size from the original sample, with replacement. Compute the estimate $\hat\theta^*$ on it. Repeat.
Hope: the histogram of the estimates $\hat\theta^*$ based on the resampled data is close to the histogram of $\hat\theta$ under $F$.
Example: estimate the standard error. Algorithm in pseudocode:
Input: $X_1, \dots, X_n$ (the $n$ original sample points), $B$ (number of bootstrap replications).
Estimate the parameter: $\hat\theta = g(X_1, \dots, X_n)$, the original estimate.
For $b = 1, \dots, B$: draw $X_1^*, \dots, X_n^*$ with replacement from the original sample and compute the bootstrap replicate $\hat\theta_b^* = g(X_1^*, \dots, X_n^*)$.
Estimate the standard error: the standard dev. of the bootstrap replicates $\hat\theta_1^*, \dots, \hat\theta_B^*$ gives us an estimate of the standard error.
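
The bootstrap estimate of the standard error can be written in runnable form roughly as follows. The function name `bootstrap_se`, the default number of replications, the fixed seed and the example data are assumptions for illustration.

```python
import random
import statistics

def bootstrap_se(sample, estimator, n_replications=1000, seed=0):
    """Bootstrap estimate of the standard error of estimator(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(n_replications):
        # Draw n points with replacement from the original sample.
        resample = [rng.choice(sample) for _ in range(n)]
        replicates.append(estimator(resample))
    # Standard deviation of the bootstrap replicates.
    return statistics.stdev(replicates)

sample = [float(v) for v in range(1, 21)]  # hypothetical data
se_mean = bootstrap_se(sample, statistics.mean)
print(se_mean)
```

For the sample mean, the result can be compared with the analytic plug-in value $\sqrt{\hat\sigma^2 / n}$, which is about 1.29 for this data.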

(39) Does it always work?
Theorem (consistency of the bootstrap estimate of the standard error). Assume that $X_1, \dots, X_n \sim F$ iid with $E\|X_1\|^2 < \infty$. Let $\theta = g(E X_1)$ be the parameter that we estimate, by $\hat\theta_n = g(\bar X_n)$. Assume that $g$ is continuously differentiable in a neighborhood of $E X_1$, with non-zero gradient. Then the bootstrap estimate of the standard error is strongly consistent.
Problematic: estimating extreme values.
Example. $X_1, \dots, X_n \sim \mathrm{Uniform}[0, \theta]$, where $\theta$ is unknown. Want to estimate $\theta$. The ML estimate of $\theta$ is simply the largest member of the sample, $\hat\theta = \max_i X_i$. Estimating the standard error of $\hat\theta$ by bootstrap is going to fail: estimating tails or extreme values by bootstrap is problematic.

(40) Confidence sets by bootstrap.
Method: bootstrap percentile.
Given the sample $X_1, \dots, X_n$, compute the estimate $\hat\theta$.
Generate bootstrap replicates $\hat\theta_1^*, \dots, \hat\theta_B^*$.
Look at the histogram of the replicates and take $a$ = its $\alpha/2$ quantile and $b$ = its $1 - \alpha/2$ quantile.
$\mathrm{CI} = [a, b]$. It has approximately coverage $1 - \alpha$.
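
A minimal sketch of the bootstrap percentile interval. The function name, the defaults, the fixed seed and the example data are assumptions; the quantiles are read off the sorted replicates by simple index arithmetic, which is one of several common conventions.

```python
import random
import statistics

def bootstrap_percentile_interval(sample, estimator, alpha=0.05,
                                  n_replications=2000, seed=0):
    """Approximate (1 - alpha) confidence interval from bootstrap quantiles."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = sorted(
        estimator([rng.choice(sample) for _ in range(n)])
        for _ in range(n_replications)
    )
    # Simple empirical alpha/2 and 1 - alpha/2 quantiles of the replicates.
    lo_idx = int((alpha / 2) * n_replications)
    hi_idx = max(lo_idx, int((1 - alpha / 2) * n_replications) - 1)
    return replicates[lo_idx], replicates[hi_idx]

sample = [float(v) for v in range(1, 21)]  # hypothetical data, mean 10.5
a, b = bootstrap_percentile_interval(sample, statistics.mean)
print(a, b)
```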

(41) Subsequently, you can construct bootstrap tests in the obvious way. For example, to test $H_0: \theta = 0$ against $H_1: \theta \ne 0$, check whether 0 lies inside the bootstrap confidence interval $[a, b]$.

(42) Bayesian statistics.
Frequentist statistics: probability is a limiting frequency. Parameters $\theta$ are constants; we cannot assign probabilities to them. Statistics should behave well when repeated often.
Bayesians: probability is a degree of belief. Parameters do have probabilities. We have a prior belief about the world and update it based on observed data.
Bayesian approach. Assume a statistical model $F = \{ f(x \mid \theta) : \theta \in \Theta \}$. We can interpret the density $f(x \mid \theta)$ as the likelihood of the data given the parameter $\theta$. Goal: investigate $\theta$.

(43) We assume that we have a prior belief about the parameters: the prior distribution $f(\theta)$.
Observe data $X_1, \dots, X_n$ iid. Now we update our belief: we compute the posterior using Bayes' rule,
$f(\theta \mid X_1, \dots, X_n) = \frac{f(X_1, \dots, X_n \mid \theta)\, f(\theta)}{\int f(X_1, \dots, X_n \mid \theta)\, f(\theta)\, d\theta}$,
i.e. posterior $\propto$ likelihood $\times$ prior. The denominator is a normalizing constant; it does not depend on $\theta$ anymore.
The posterior is a distribution. Now you can make statements based on the posterior. If you want to return a best guess for $\theta$, you could use the mean of the posterior, or the mode of the posterior (MAP).
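
For a one-dimensional parameter, Bayes' rule can be illustrated numerically on a grid. The sketch below is an illustration under assumptions of my own: the data are coin flips with unknown probability $\theta$, the prior is uniform on [0, 1], and the integral in the denominator is replaced by a sum over 101 grid points. All names are hypothetical.

```python
def grid_posterior(flips, prior, grid):
    """Posterior weights on `grid` for Bernoulli data: likelihood * prior, normalized."""
    heads = sum(flips)
    tails = len(flips) - heads
    unnormalized = [(t ** heads) * ((1 - t) ** tails) * prior(t) for t in grid]
    z = sum(unnormalized)  # normalizing constant, does not depend on theta
    return [u / z for u in unnormalized]

grid = [i / 100 for i in range(101)]
flips = [1, 1, 0, 1, 1, 0, 1, 1]  # hypothetical data: 6 heads, 2 tails
posterior = grid_posterior(flips, lambda t: 1.0, grid)  # uniform prior (assumed)

posterior_mean = sum(t * w for t, w in zip(grid, posterior))
map_estimate = grid[max(range(len(grid)), key=lambda i: posterior[i])]
print(posterior_mean, map_estimate)
```

With a uniform prior the MAP estimate coincides with the maximum likelihood estimate 6/8, while the posterior mean is pulled slightly towards 0.5.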

(44) You can construct confidence intervals: find $a$, $b$ such that $P(\theta \in [a, b] \mid X_1, \dots, X_n) = 0.95$.
Advantages: easy to interpret; natural way to incorporate prior knowledge.
Disadvantages: analytic solutions are rare, typically you need to solve computationally hard problems; need to choose a prior.
