Statistical Programming Languages – Day 1 SVN-revision: 50


Academic year: 2021


(1)

SVN-revision: 50

Uwe Ziegenhagen

Institut für Statistik und Ökonometrie
Humboldt-Universität zu Berlin
http://www.uweziegenhagen.de

(2)

About this Course

Uwe Ziegenhagen (ziegenhagen@wiwi.hu-berlin.de)
Consultation hours: by appointment

80% exam (90 minutes)
20% presentation (15-20 minutes)

Presentation topic: practical data analysis
Moodle registration is required
Dates as announced in Moodle

(3)

What is R?

Based on the S language, developed by Becker and Chambers in 1984
GNU implementation of S, started in 1992 by R. Ihaka and R. Gentleman in New Zealand
Great variety of packages covering all fields of statistics
Versions for Win32, Linux/Unix, Mac OS

(4)

Course Overview

Introduction
R as a calculator
Exploratory Data Analysis
Graphics
Regression and Testing
Programming R

(5)
(6)
(7)

Emacs and ESS

If you have no idea what Emacs is, skip this.

Otherwise:
download ESS and extract it to the Emacs home directory,
add the Lisp code below to .emacs,
run R with M-x R.

(load "c:/emacs-22.1/ess-5.3.7/lisp/ess-site")
(setq inferior-R-program-name "c:/Programme/R/R

(8)

Getting Help...

help()                       # help on help() itself
help(sin)                    # help for sin()
?sin                         # help for sin
help.start()                 # HTML help
library()                    # show installed libraries
library(help = "<package>")  # overview of a package
help.search("sin")           # search the help pages for "sin"

(9)

Basic R stuff

setwd("x:/myR")       # set working directory
getwd()               # get working directory
save.image()          # saves workspace to .RData
savehistory()         # save command history
load(".RData")        # loads workspace
loadhistory()         # loads command history
source("myfile.r")    # read commands from file
sink("output.txt")    # write output to file
sink()                # output to screen again

(10)

Installing new packages

(may require administrator rights)

# download and install
install.packages("multtest")
# update local packages
update.packages()
# show installed libraries
library()
# load library
library(name, lib.loc = "<location>")
# load library inside functions
require(name)   # returns TRUE or FALSE
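
Because require() returns TRUE or FALSE instead of stopping with an error, it can be used to guard code inside a function. The following is a minimal sketch; the helper name use_pkg and the package name "noSuchPackage12345" are made up for illustration.

```r
# hypothetical helper: try to attach a package, report success as TRUE/FALSE
use_pkg <- function(pkg) {
  # character.only = TRUE: pkg is a string holding the package name
  ok <- require(pkg, character.only = TRUE, quietly = TRUE)
  isTRUE(ok)
}

has_stats   <- use_pkg("stats")                                 # ships with R
has_missing <- suppressWarnings(use_pkg("noSuchPackage12345"))  # assumed not installed
```

library() would stop with an error for the missing package, which is why require() is the usual choice when the function should decide itself how to react.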

(11)

Customizing R

# show global options
options()
# set option
options(prompt = ":-) ")
# get specific option
getOption("prompt")
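
options() returns the previous values of the options it changes, so a setting can be changed temporarily and restored afterwards. A small sketch, using the digits option as an arbitrary example:

```r
old <- options(digits = 4)       # change the option, keep the old value
during <- getOption("digits")    # 4
print(pi)                        # printed with 4 significant digits
options(old)                     # restore the previous setting
after <- getOption("digits")
```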

(12)

Customizing R

Unix/Linux: local .Rprofile
Windows: global Rprofile.site

Look & feel changes can be made in Rconsole.
Unix/Linux allows different .Rprofile files.

(13)

Basic Calculations with R

1 + 2
1 * 2
1 / 2
1 - 2
5 %/% 2   # 2; integer division
5 %% 2    # 1; modulo division

(14)

Basic Calculations with R

<     # smaller
<=    # smaller or equal
>     # bigger
>=    # bigger or equal
!=    # unequal
==    # equal
&     # logical AND (vectorized)
|     # logical OR (vectorized)
&&    # logical AND (not vectorized)
||    # logical OR (not vectorized)
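
A short sketch of the difference between the vectorized and the non-vectorized logical operators; the vectors x and y are made up for illustration.

```r
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

# vectorized: one result per position
and_vec <- x & y    # TRUE FALSE FALSE
or_vec  <- x | y    # TRUE TRUE TRUE

# non-vectorized: meant for single values, e.g. in if() conditions
and_scalar <- x[1] && y[1]   # TRUE
or_scalar  <- x[2] || y[3]   # FALSE
```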

(15)

Basic Calculations with R

2^2
sqrt(2)
sin(pi)    # also: cos, tan
acos(0)    # also: asin, atan, atan2

(16)

Basic Calculations with R

c <- 1:3 * pi
c            # [1] 3.141593 6.283185 9.424778
floor(c)     # [1] 3 6 9
ceiling(c)   # [1] 4 7 10
trunc(pi)    # 3
trunc(-pi)   # -3
floor(-pi)   # -4
round(pi)    # 3
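
The four rounding functions differ mainly for negative numbers and for values ending in .5. A small sketch on a made-up vector v:

```r
v <- c(-2.7, -0.5, 0.5, 2.7)

down    <- floor(v)     # towards minus infinity: -3 -1  0  2
up      <- ceiling(v)   # towards plus infinity:  -2  0  1  3
cut     <- trunc(v)     # towards zero:           -2  0  0  2
nearest <- round(v)     # to nearest, .5 goes to the even neighbour: -3 0 0 3
```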

(17)

Variable names

Order is always alphabetic first, then numeric: a name starts with a letter.
Valid names: a, b2, abc.de, a1, a1_23
Names are case sensitive: 'a123' is not equal to 'A123'.
pi is a predefined constant; do not use it as a variable name (an assignment would mask the constant).
print(x) prints the content of x.

(18)

Overview of Objects I

# show loaded packages and data
search()
ls(2)             # show functions for a specific package
ls()              # list objects
objects()         # like ls()
rm(name)          # removes name from workspace
rm(list = ls())   # removes all objects

(19)

R main structures

vectors: just vectors of length m, one type
matrices: m×n arrays, one type
data frames: usually read from files, different types

(20)

R classes for vectors

Use class(object) to get the type.

character: vector of strings
numeric: vector of real numbers
integer: vector of signed integers
logical: vector of TRUE or FALSE
complex: vector of complex numbers
list: vector of R objects
factor: sets of labelled observations, pre-defined set of labels
NA: not available, missing value
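
To see these classes at the prompt, class() can be applied to one small example of each kind. The values below are arbitrary illustrations.

```r
classes <- c(
  class("text"),                           # "character"
  class(3.14),                             # "numeric"
  class(2L),                               # "integer"
  class(TRUE),                             # "logical"
  class(1 + 2i),                           # "complex"
  class(list(1, "a")),                     # "list"
  class(factor(c("low", "high", "low")))   # "factor"
)

# NA marks a missing value inside a vector of any of these types
missing_flags <- is.na(c(1, NA, 3))        # FALSE TRUE FALSE
```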

(21)

R main structures

matrix('vector') converts to a matrix of m×1
matrix('vector', ncol=1) does the same
matrix('vector', nrow=1) converts to a matrix of 1×n
as.data.frame('matrix') converts to a data frame
as.matrix('data frame') converts to a matrix
as.vector('matrix') converts to vector, if matrix has only one
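
The conversions listed above can be tried on a short vector. This is a minimal sketch; the object names are chosen only for illustration.

```r
v <- 1:3

m_col <- matrix(v)             # 3 x 1 matrix
m_row <- matrix(v, nrow = 1)   # 1 x 3 matrix

df     <- as.data.frame(m_col) # data frame with one column
m_back <- as.matrix(df)        # back to a 3 x 1 matrix
v_back <- as.vector(m_col)     # back to a plain vector

dim(m_col)
dim(m_row)
```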

(22)

Variables, Vectors and Matrices

a <- 2             # double number, a = 2
b <- 1:3           # integer vector, b = [1 2 3]
c <- 1:pi          # integer vector, c = [1 2 3]
d <- c(1,2,3,4)    # double vector, [1] 1 2 3 4
t(d)               # returns d as row vector (transposes d)

(23)

Variables, Vectors and Matrices

a = 1:3
b = 2:4
c(a, b)            # [1] 1 2 3 2 3 4
c(1, 1:3)          # [1] 1 1 2 3
seq(1, 3)          # [1] 1 2 3
seq(3)             # [1] 1 2 3
seq(1, 2, by=0.1)  # [1] 1.0 1.1 1.2 1.3 1.4 1.5 ...
seq(1, 3, 0.5)     # [1] 1.0 1.5 2.0 2.5 3.0

(24)

Variables, Vectors and Matrices

a <- letters[1:3]
a
b <- LETTERS[1:3]
b
c <- month.abb[1:6]
c
d <- month.name[1:12]
d

(25)

Variables, Vectors and Matrices

> matrix(1:12, nrow=3)
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> matrix(1:12, nrow=3, byrow=TRUE)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12

(26)

Variables, Vectors and Matrices

> matrix(1:12, 3, 4)
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

(27)

Variables, Vectors and Matrices

> matrix(0, nrow=5, ncol=5)
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0    0    0
[5,]    0    0    0    0    0

(28)

Variables, Vectors and Matrices

# Concatenation
> x = 1:3
> y = 4:6
> rbind(x, y)
  [,1] [,2] [,3]
x    1    2    3
y    4    5    6
> cbind(x, y)
     x y
[1,] 1 4
[2,] 2 5
[3,] 3 6

(29)

Variables, Vectors and Matrices

x <- matrix(1:12, 3, 4)
# extract the diagonal of a matrix
x[row(x) == col(x)]
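
The logical index row(x) == col(x) picks exactly the elements whose row and column numbers agree. As a cross-check, the built-in diag() extracts the same elements when it is given a matrix:

```r
x <- matrix(1:12, 3, 4)

by_index <- x[row(x) == col(x)]   # elements x[1,1], x[2,2], x[3,3]
by_diag  <- diag(x)               # built-in extraction of the diagonal

by_index   # [1] 1 5 9
```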

(30)

Variables, Vectors and Matrices

> k = matrix(1:10, 2, 5)
> k
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
> k[1:2, 3:4]
     [,1] [,2]
[1,]    5    7
[2,]    6    8

(31)

Variables, Vectors and Matrices

diag(5)         # diagonal 5x5 matrix of '1'
diag(5, 7, 8)   # 7x8 matrix with 5 on the diagonal

(32)

Dimension names (matrices and arrays)

> b <- matrix(1:20, 4, 5)
> dimnames(b) <- list(letters[1:4], letters[1:5])
> b["b", "b"]
[1] 6

(33)

Matrix Size

x <- matrix(1:10, 2, 5)
dim(x)         # size of matrix x
col(x)         # column indices of ALL elements
row(x)         # row indices of ALL elements
x[<i>, <j>]    # extract the element in the i-th row and j-th column

(34)

Inf, NaN

x = 1:3
is.finite(x)
is.infinite(x)
Inf
NaN
is.nan(x)
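
The test functions become more informative on a vector that actually contains special values. A small sketch with a made-up vector s; note that NA (missing) and NaN (not a number) are treated differently:

```r
s <- c(1, 1/0, -1/0, 0/0, NA)   # 1, Inf, -Inf, NaN, NA

fin <- is.finite(s)     # TRUE  FALSE FALSE FALSE FALSE
inf <- is.infinite(s)   # FALSE TRUE  TRUE  FALSE FALSE
nan <- is.nan(s)        # FALSE FALSE FALSE TRUE  FALSE
mis <- is.na(s)         # FALSE FALSE FALSE TRUE  TRUE
```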

(35)

Sums and Products

> x = matrix(1:20, 4, 5)
> sum(x)
[1] 210
> prod(x)
[1] 2.432902e+18
matrix(1:10, nrow=2) -> a
colSums(a)
rowSums(a)

(36)

Sums and Products

> cumsum(1:10)
[1] 1 3 6 10 15 21 28 36 45 55
> cumprod(1:5)
[1] 1 2 6 24 120
> cummin(c(3:1, 2:0, 4:2))
[1] 3 2 1 1 1 0 0 0 0
> cummax(c(3:1, 2:0, 4:2))
[1] 3 3 3 3 3 3 4 4 4

(37)

Sums and Products

> a = c(3:1, 2:0, 4:2)
> a
[1] 3 2 1 2 1 0 4 3 2
> cummin(a)
[1] 3 2 1 1 1 0 0 0 0
> cummax(a)
[1] 3 3 3 3 3 3 4 4 4

(38)

Replacing values

> x = 1:10
> x
[1] 1 2 3 4 5 6 7 8 9 10
> replace(x, x<2, 3)
[1] 3 2 3 4 5 6 7 8 9 10

(39)

Matrix Calculation

If x, y are n×m matrices:
$(x + y)[i,j] = x[i,j] + y[i,j]$
$(x - y)[i,j] = x[i,j] - y[i,j]$

If x is n×m and y is m×p, then
$(x \,\%*\%\, y)[i,k] = \sum_{j=1}^{m} x[i,j] \cdot y[j,k]$
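
The sum in the definition of the matrix product can be checked by recomputing a single entry by hand. A minimal sketch with made-up matrices (n = 2, m = 3, p = 4):

```r
x <- matrix(1:6, nrow = 2)     # 2 x 3
y <- matrix(1:12, nrow = 3)    # 3 x 4

z <- x %*% y                   # 2 x 4 matrix product

# entry (i, k) from the definition: sum over j of x[i,j] * y[j,k]
i <- 2
k <- 3
entry <- sum(x[i, ] * y[, k])  # 2*7 + 4*8 + 6*9 = 100
```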

(40)

Matrix Calculation

In expressions involving a matrix and a vector, the vector is interpreted as a row or column vector such that the multiplication works.

If x is a vector of length m and y is an m×p matrix, x %*% y is a vector of length p.

If x is an n×m matrix and y is a vector of length m, x %*% y is a vector of length n.

If x and y are vectors of length m, x %*% y is a scalar (i.e. a vector of length 1), representing the inner product $\sum_{i=1}^{m} x[i] \cdot y[i]$.
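
The three cases can be verified by looking at the dimensions of the results. This is a small sketch with made-up objects (n = 2, m = 3, p = 4); note that R returns each result as a matrix with one row or one column rather than as a plain vector.

```r
v <- 1:3                       # vector of length m = 3
y <- matrix(1:12, nrow = 3)    # m x p = 3 x 4
x <- matrix(1:6, nrow = 2)     # n x m = 2 x 3

d_vy <- dim(v %*% y)           # 1 x 4: v is used as a row vector
d_xv <- dim(x %*% v)           # 2 x 1: v is used as a column vector
inner <- v %*% v               # 1 x 1: inner product, 1 + 4 + 9 = 14
```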

(41)

Matrix Inversion

# if a is an n x n matrix
solve(a)   # inverse of a
a^-1       # elementwise inverse
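
The two operations give different results, which is easy to see on a small invertible matrix. A minimal sketch; the matrix a is made up for illustration.

```r
a <- matrix(c(2, 1, 1, 3), nrow = 2)

a_inv  <- solve(a)       # matrix inverse
check  <- a %*% a_inv    # identity matrix, up to rounding error
a_elem <- a^-1           # 1 / a[i,j] for every element, not an inverse
```

Multiplying a by a_elem does not give the identity matrix, so a^-1 should not be mistaken for the inverse.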

(42)

Lists in R

> a <- c(3,2,1)
> b <- c(6,5,4)
> f <- list(a, b)
> f[1]
[[1]]
[1] 3 2 1
> f[[1]]
[1] 3 2 1
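
Single brackets return a sub-list, double brackets return the element itself. A short sketch of the difference; the named list at the end is an added illustration and not part of the slide.

```r
f <- list(c(3, 2, 1), c(6, 5, 4))

sub  <- f[1]      # still a list, of length 1
elem <- f[[1]]    # the numeric vector 3 2 1 itself

# with names, elements can also be reached by name
named  <- list(a = c(3, 2, 1), b = c(6, 5, 4))
by_dlr <- named$a
by_str <- named[["b"]]
```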

(43)

Lists in R

> a <- c(3,2,1)
> b <- c(6,5,4)
> f <- list(a, b)
> d <- c("a","b")
> e <- list(a, b, d)
> unlist(f)
> unlist(e)
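
unlist() flattens a list into one vector, and since a vector has only one type, mixed content is coerced to a common type. A small sketch of what the two calls above produce:

```r
a <- c(3, 2, 1)
b <- c(6, 5, 4)
d <- c("a", "b")

u_num <- unlist(list(a, b))      # numeric vector of length 6
u_chr <- unlist(list(a, b, d))   # character vector of length 8: numbers become strings
```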

(44)

Finding Help

Google (or your other favorite search engine)
R-help mailing list
R-announce, R-package, R-devel (for developer-specific topics)
R-sig-* special interest groups (r-sig-finance)

See http://r-project.org/mail.html for details

(45)

For Further Reading

An Introduction to R, R-intro.pdf
R Data Import/Export, R-data.pdf
The R Reference Index, fullrefman.pdf
Introductory Statistics with R, Peter Dalgaard
Data Analysis and Graphics Using R, Maindonald/Braun
R Graphics, Murrell
