Machine Learning


(1)

Machine Learning

Clustering, Self-Organizing Maps

(2)

Clustering

The task: partition a set of objects into "meaningful" subsets (clusters). The objects in a subset should be "similar".

Notations:

Set of clusters $\mathcal{K} = \{1, \dots, K\}$, set of indices $I = \{1, \dots, N\}$, feature vectors $x_i \in \mathbb{R}^d$ for $i \in I$.

Partitioning: the index set is split into subsets $I_k \subseteq I$ with $\bigcup_k I_k = I$ and $I_k \cap I_{k'} = \emptyset$ for $k \neq k'$.

(3)

Clustering

Let $|\mathcal{K}| = K$ and let each cluster have a "representative" (center) $y_k \in \mathbb{R}^d$.

The task reads:

$\sum_k \sum_{i \in I_k} \|x_i - y_k\|^2 \to \min_{\{I_k\}, \{y_k\}}$

An alternative variant is to consider the clustering as a mapping $f: I \to \mathcal{K}$ that assigns a cluster number to each feature vector.

(4)

K-Means Algorithm

Initialize centers randomly, repeat until convergence:

1. Classify: $k_i = \arg\min_k \|x_i - y_k\|^2$ for each $i$

2. Update centers: $y_k = \frac{1}{|I_k|} \sum_{i \in I_k} x_i$

• The task is NP-hard

• The algorithm converges to a local optimum (which depends on the initialization); a minimal sketch in code follows below
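A minimal NumPy sketch of the two alternating steps (the initialization from random training points and the convergence test are assumptions, not prescribed by the slide):

```python
import numpy as np

def k_means(X, K, n_iter=100, rng=None):
    """K-Means sketch: X is an (N, d) float array, K the number of clusters."""
    rng = np.random.default_rng(rng)
    # Initialize centers randomly (here: K distinct training points)
    y = X[rng.choice(len(X), size=K, replace=False)].copy()
    for _ in range(n_iter):
        # 1. Classify: assign each x_i to its nearest center
        dist = ((X[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)  # (N, K)
        k = dist.argmin(axis=1)
        # 2. Update centers: mean of the assigned feature vectors
        y_new = np.array([X[k == j].mean(axis=0) if np.any(k == j) else y[j]
                          for j in range(K)])
        if np.allclose(y_new, y):  # converged: centers stopped moving
            break
        y = y_new
    return y, k
```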

(5)

EM vs. Clustering

EM – in the Expectation step, posteriors $p(k|x_i)$ are computed (soft assignment)

K-Means – objects are classified (hard assignment)

(6)

EM vs. Clustering

For example, compare the updating rules for the centers $y_k$.

K-Means:

$y_k = \frac{1}{|I_k|} \sum_{i \in I_k} x_i$

EM for the Gaussian Mixture Model (remember the corresponding seminar):

$y_k = \arg\min_y \sum_i p(k|x_i) \cdot \|x_i - y\|^2 = \frac{\sum_i p(k|x_i) \cdot x_i}{\sum_i p(k|x_i)}$
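The two rules differ only in the weights. A short sketch of both updates (array names and shapes are illustrative, not from the slides):

```python
import numpy as np

def hard_update(X, k, K):
    """K-Means update: mean of the assigned vectors,
    i.e. the weighted mean with p(k|x_i) restricted to {0, 1}."""
    return np.array([X[k == j].mean(axis=0) for j in range(K)])

def soft_update(X, posteriors):
    """EM update: y_k = sum_i p(k|x_i) x_i / sum_i p(k|x_i)."""
    w = posteriors / posteriors.sum(axis=0, keepdims=True)  # (N, K), columns sum to 1
    return w.T @ X                                          # (K, d) weighted means
```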

(7)

Sequential K-Means

Repeat infinitely:

1. Choose a feature vector $x$ randomly from the training data

2. Classify it: $k = \arg\min_{k'} \|x - y_{k'}\|^2$

3. Update the $k$-th center: $y_k \leftarrow y_k + \gamma \cdot (x - y_k)$ with a decreasing step $\gamma$

• converges to the same solution as the parallel version

• is a special case of the Robbins-Monro algorithm
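A sketch of the loop (the $\gamma_t = 1/t$ schedule is one common Robbins-Monro choice, not fixed by the slide):

```python
import numpy as np

def sequential_k_means(X, y, n_steps=10_000, rng=None):
    """Sequential K-Means: y is a (K, d) float array of initial centers."""
    rng = np.random.default_rng(rng)
    for t in range(1, n_steps + 1):
        x = X[rng.integers(len(X))]               # 1. pick a random feature vector
        k = ((y - x) ** 2).sum(axis=1).argmin()   # 2. classify it
        y[k] += (x - y[k]) / t                    # 3. move the k-th center, step 1/t
    return y
```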

(8)

Some variants

Other distances, e.g. the Euclidean distance $\|x_i - y\|$ instead of the squared one $\|x_i - y\|^2$.

In the K-Means algorithm the classification step remains the same; in the update step the center becomes the geometric median of the cluster (a bit more complicated to compute than the average ☹).

Another problem: features may be not additive (the mean $\bar{x}$ does not exist). Solution: the K-Medoid algorithm (each center $y_k$ is a feature vector from the training set); a minimal medoid update is sketched below.
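The medoid update needs only pairwise distances, no averaging. A sketch (the precomputed distance matrix D is an assumption):

```python
import numpy as np

def medoid(D, idx):
    """Medoid of a cluster: the member with the smallest total
    distance to all other members. D is an (N, N) distance matrix,
    idx an integer array with the indices of the cluster members."""
    sub = D[np.ix_(idx, idx)]              # pairwise distances within the cluster
    return idx[sub.sum(axis=1).argmin()]
```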

(9)

A "generalization"

Observe (for the squared distance): if $y_k$ is the mean of cluster $k$, then

$\sum_{i \in I_k} \|x_i - y_k\|^2 = \frac{1}{2|I_k|} \sum_{i, j \in I_k} \|x_i - x_j\|^2$

i.e. the objective can be expressed through pairwise distances alone. In what follows:

$\sum_k \frac{1}{2|I_k|} \sum_{i, j \in I_k} d_{ij} \to \min$

with a distance matrix $D$ that can be defined in very different ways.

Example: objects are nodes of a weighted graph, $d_{ij}$ is the length of the shortest path from $i$ to $j$.

Distances for "other" objects (non-vectors):

• Edit (Levenshtein) distance between two symbolic sequences (sketched below)

• For graphs – distances based on graph isomorphism etc.
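For completeness, a compact dynamic-programming sketch of the edit distance:

```python
def levenshtein(a, b):
    """Edit distance between two symbolic sequences."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution / match
        prev = curr
    return prev[-1]

assert levenshtein("kitten", "sitting") == 3
```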

(10)

An application – color reduction

Objects are pixels, features are RGB-values.

Partition the RGB-space into "characteristic" colors.

(Figure: the example image reduced to 8 colors)
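The reduction itself is plain K-Means on the pixel colors; a sketch reusing the k_means function from above (image layout and dtype handling are assumptions):

```python
import numpy as np

def reduce_colors(image, K=8, rng=None):
    """Replace every pixel color by the center of its cluster."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(float)   # objects = pixels, features = RGB
    centers, labels = k_means(pixels, K, rng=rng)
    return centers[labels].reshape(h, w, 3).astype(image.dtype)
```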

(11)

Another application – superpixel segmentation

(12)

Kohonen Networks, Self-Organizing Maps

The task is to "approximate" a dataset by a neural network of a certain topology.

An example – stereo in "flatland": the input space is 3- (or more) dimensional, but the set of points is essentially a two-dimensional surface.

(13)

Radial Basis Functions

Another type of neuron: its output depends on the distance to a center, e.g. $f(x) = \exp(-\|x - c\|^2 / \sigma^2)$; the corresponding classifier is "inside/outside a ball", $\|x - c\|^2 \lessgtr \theta$.

Note: RBF-neurons can be represented as "usual" (linear) ones in an appropriately chosen feature space.
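To see this, expand the squared distance (a standard identity):

$\|x - c\|^2 \le \theta \;\Leftrightarrow\; 2\langle c, x \rangle - \|x\|^2 + (\theta - \|c\|^2) \ge 0 \;\Leftrightarrow\; \langle w, \phi(x) \rangle + b \ge 0$

with the feature map $\phi(x) = (x, \|x\|^2)$, weights $w = (2c, -1)$ and bias $b = \theta - \|c\|^2$, i.e. a linear classifier in the extended feature space.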

(14)

Self-Organizing Maps

SOMs (usually) consist of RBF-neurons $r$, each one represents (covers) a "part" of the input space (specified by its center $c_r$).

The network topology is given by means of a distance $d(r, r')$ between neurons. Example – neurons are nodes of a weighted graph, distances are shortest paths. For the "flatland" example the graph is a 2D-grid with the unit weight for all edges.

(15)

Self-Organizing Maps, sequential algorithm

1. Choose a feature vector $x$ randomly from the training data (white)

2. Compute the "winner" neuron $r^*$, the one closest to $x$ (dark yellow)

3. Compute the neighborhood of $r^*$ in the network (yellow)

4. Update the weights of all neurons, moving them toward $x$ with a strength that decays with the network distance to the winner (see the sketch below)
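One update step in code (the Gaussian neighborhood and the decay schedules are assumptions; the slides only require a step that decreases with time and network distance):

```python
import numpy as np

def som_step(X, C, grid_dist, t, gamma0=0.5, sigma0=2.0, rng=None):
    """One sequential SOM update. C: (M, d) float array of neuron centers,
    grid_dist: (M, M) matrix of network (shortest-path) distances."""
    rng = np.random.default_rng(rng)
    x = X[rng.integers(len(X))]                    # 1. random feature vector
    winner = ((C - x) ** 2).sum(axis=1).argmin()   # 2. the "winner" neuron
    sigma = sigma0 / (1 + 0.001 * t)               # neighborhood radius shrinks in time
    h = np.exp(-grid_dist[winner] ** 2 / (2 * sigma ** 2))  # 3. neighborhood weights
    gamma = gamma0 / (1 + 0.001 * t)               # decreasing step size
    C += gamma * h[:, None] * (x - C)              # 4. move neurons toward x
    return C
```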

(16)

Self-Organizing Maps, algorithms

The step size $\gamma(t, d)$ is monotonically decreasing with respect to $t$ (time) and to the network distance $d$ to the winner. Without step 3) the procedure is the sequential K-Means.

Parallel variants:

Go through all feature vectors, sum up the gradients, apply.

Example: the network fits into the data distribution (unfolds).
