
9. Applications

9.7. Durcheinander


Figure 9.44.: Video stills from the presentation of a prototype of Durcheinander at Animax, Bonn in late 2007. The corresponding video is part of the accompanying DVD.


Figure 9.45.: A 2D plot of a two-dimensional artificial data set (a) and its corresponding dendrogram (b), drawn over a distance axis. The red line marks the cluster threshold, i.e. a specific clustering that defines the shape of the data items in the scatterplot.

change its configuration and how does it change?

2. What happens when data items are in a special configuration?

3. What are the differences between the various distance metrics?

4. What are the differences between the various cluster metrics?

Durcheinander’s purpose is to help answer these questions by means of the TAI paradigm. It provides the opportunity to physically grasp the data and, at the same time, allows auditory exploration of the effect of different clustering parameters. Durcheinander’s tangible objects are laid out on a table, and the sound is delivered in a spatial sound environment. Learners have turned out to benefit particularly from this collaborative multi-user nature of the system; it invites them to discuss the results of Agglomerative Clustering during co-operative exploration rather than only before and after it. Furthermore, its interactive programming approach allows researchers to experiment with different Sonification methods during interaction.

Usage scenario

The following usage scenario describes a typical situation in which Durcheinander may be used:

Learners stand around the Durcheinander surface. A teacher configures a specific data set by laying it out with objects. Everyone then tests the stability of that layout by moving objects while listening explicitly for changes in the configuration. After a period of trial and discussion, the teacher switches to a different cluster metric and lets the students explore the resulting differences in the algorithm’s behaviour.

In the interface, the current state of the cluster algorithm is conveyed through the Auditory Display, whose behaviour depends on the cluster metric in use. The clear separation of input data (objects) and processing layer (auditory part) can be used to explicitly convey the otherwise invisible clustering information and to highlight the importance of interpreting the cluster algorithm’s output.


9.7.1. Agglomerative Clustering

Clustering can help to unveil hidden structures of a specific kind in possibly high-dimensional data sets. It is especially suitable for structures that are compact with respect to the distance metric in use. Agglomerative Clustering is a particular clustering approach; it produces so-called dendrograms of inter-cluster distances by applying the following rules [ELL01b]:

1. Initially, all data items $x_i$ are considered to be clusters $c_i$, so that $\forall x_i \in X : x_i = c_i$.

2. Compute distances between all pairs of clusters and find the smallest distance:

\[ \mathrm{minpair} = \arg\min_{i \neq j} d(c_i, c_j) \tag{9.15} \]

\[ \mathrm{mindist} = \min_{i \neq j} d(c_i, c_j) \tag{9.16} \]

3. Join $c_i$ and $c_j$ at the distance $\mathrm{mindist}$. This join $\langle c_i, c_j \rangle$ represents the new cluster $c_k$.

4. Add $c_k$ to the list of clusters, remove $c_i$, $c_j$ from this list.

5. If more than one cluster is in the list of clusters GOTO 2, else END.
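The five rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in Durcheinander: the function names `agglomerate` and `single_linkage`, the representation of clusters as tuples of points, and the choice of single linkage with Euclidean distance as the default are assumptions made for the example.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)


def single_linkage(ci, cj):
    """Smallest pairwise distance between the points of two clusters."""
    return min(dist(x, y) for x in ci for y in cj)


def agglomerate(points, cluster_dist=single_linkage):
    """Return the joins of the dendrogram as (left, right, distance) triples.

    Hypothetical helper: clusters are tuples of points, and the joins are
    listed in the order in which they are performed.
    """
    # Rule 1: every data item starts out as its own cluster.
    clusters = [(tuple(p),) for p in points]
    merges = []
    # Rule 5: repeat while more than one cluster is left.
    while len(clusters) > 1:
        # Rule 2: find the pair of clusters with the smallest distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        mindist = cluster_dist(clusters[i], clusters[j])
        # Rule 3: join the two clusters at that distance.
        ck = clusters[i] + clusters[j]
        merges.append((clusters[i], clusters[j], mindist))
        # Rule 4: remove c_i and c_j (j > i, so pop j first), then add c_k.
        clusters.pop(j)
        clusters.pop(i)
        clusters.append(ck)
    return merges
```

Calling `agglomerate([(0, 0), (0, 1), (5, 0)])` first joins the two nearby points at distance 1.0 and then joins the result with the remaining point at distance 5.0. The exhaustive pair search in every round is chosen for readability, not for efficiency.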

A cut at a specific distance in the resulting dendrogram represents one possible clustering of the given data set. For example, applying Agglomerative Clustering to the data set shown in Figure 9.45 (a) results in the dendrogram shown in Figure 9.45 (b). The red line represents a possible cut.
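To make the notion of a cut concrete, the following Python sketch derives a clustering from a dendrogram and a threshold. The dendrogram encoding (a list of joins, each given as left members, right members and join distance) and the name `cut_dendrogram` are hypothetical. The sketch also assumes that join distances never decrease from the leaves towards the root, which holds for single-linkage, complete-linkage and average distance clustering.

```python
def cut_dendrogram(items, merges, threshold):
    """Return the clusters obtained by cutting the dendrogram at `threshold`.

    `merges` is a list of (left_members, right_members, distance) triples.
    Every join at or below the threshold is applied; joins above it are not.
    """
    # Before any join, every item forms a cluster of its own.
    cluster_of = {item: frozenset([item]) for item in items}
    for left, right, distance in merges:
        if distance > threshold:
            continue  # this join lies above the cut
        joined = frozenset(left) | frozenset(right)
        for item in joined:
            cluster_of[item] = joined
    return set(cluster_of.values())


# Hypothetical dendrogram over four items.
example_merges = [
    (("a",), ("b",), 0.2),
    (("c",), ("d",), 0.5),
    (("a", "b"), ("c", "d"), 1.2),
]
print(cut_dendrogram("abcd", example_merges, 0.6))
```

With a threshold of 0.6 the example yields the two clusters {a, b} and {c, d}; a threshold below 0.2 leaves four singletons, and a threshold of 1.2 or more yields one cluster containing all items.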

Although it seems natural to use the standard Euclidean metric to measure object distances, it is also possible to use other metrics that may better fit the domain of the given data set. The choice of the inter-object metric and the choice of how to determine cluster distances both heavily affect the structure of the Agglomerative Clustering outcome and therefore the resulting dendrogram. The latter choice differentiates Agglomerative Clustering into, e.g., single-linkage, complete-linkage, or average distance clustering:

Single Linkage:

\[ d(c_i, c_j) = \min_{x \in c_i,\, y \in c_j} d(x, y) \tag{9.17} \]

Complete Linkage:

\[ d(c_i, c_j) = \max_{x \in c_i,\, y \in c_j} d(x, y) \tag{9.18} \]

Average Distance:

\[ d(c_i, c_j) = \operatorname*{avg}_{x \in c_i,\, y \in c_j} d(x, y) \tag{9.19} \]

Although it is relatively easy to understand the general global behaviour of the clustering algorithm, it is difficult to understand the way in which local variations such as the exact position of data items affect the algorithm’s output. This is particularly interesting since Agglomerative Clustering is usually applied to data that incorporates measuring errors, which cause variations in data item locations.
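The three cluster metrics, and their different sensitivity to a small local variation, can be illustrated with a short Python sketch. The function names and the example coordinates are made up for illustration; the sketch assumes Euclidean point distances and reads "avg" as the arithmetic mean over all point pairs.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def pair_distances(ci, cj):
    """All distances d(x, y) with x taken from c_i and y taken from c_j."""
    return [dist(x, y) for x in ci for y in cj]


def single_linkage(ci, cj):
    return min(pair_distances(ci, cj))


def complete_linkage(ci, cj):
    return max(pair_distances(ci, cj))


def average_distance(ci, cj):
    distances = pair_distances(ci, cj)
    return sum(distances) / len(distances)


# Two small clusters on a line, and a variant in which one item of the
# first cluster has been moved by 0.5 towards the second cluster.
ci = [(0.0, 0.0), (1.0, 0.0)]
cj = [(3.0, 0.0), (5.0, 0.0)]
ci_moved = [(0.0, 0.0), (1.5, 0.0)]

for linkage in (single_linkage, complete_linkage, average_distance):
    print(linkage.__name__, linkage(ci, cj), linkage(ci_moved, cj))
```

In this example the displacement lowers the single-linkage distance from 2.0 to 1.5 and the average distance from 3.5 to 3.25, while the complete-linkage distance stays at 5.0, because the moved item is not part of the farthest pair.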

A dynamically changing structure may not necessarily be best represented in the form of a visual dendrogram; Sonification allows us to explore its recursive (re-)configuration without a projection onto the plane of geometry.


9.7.2. Implementation

Hardware

As a basis for Durcheinander, we use the tDesk, a tabletop tangible computing environment designed and built in the interaction laboratory at Bielefeld University (see Section 5.5). By design, the dimensions of the surface allow groups of people to work on tangible applications and give each member direct access to the physical objects. We use a digital camera below the tDesk to capture the 2D positions of the objects used as the data set in our system. This placement prevents users from visually occluding objects, so that all 20 objects remain recognisable by the vision engine at all times. A blob recognition algorithm then detects the number and positions of the objects; these are fed into the actual clustering algorithm, which in turn computes the dendrogram.

Clustering

The dendrogram structure is translated into a corresponding sound synthesis graph, which may be triggered externally by knocking on the surface of the tDesk. The resulting sound is rendered to the users in real time by the multi-channel audio system surrounding the table. Each physical data item produces a sound that is spatially related to its position on the surface; every object sound in turn consists of sub-sounds determined by other nodes of the dendrogram.^20 The graph structure is continually updated, and whenever its configuration differs substantially from its predecessor, the system generates a trigger that propagates through the synthesis graph; a series of reconfigurations can thus be heard as a series of differing sounds in context.
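The text does not specify when two configurations count as substantially different. One plausible criterion, shown here purely as an assumption, is to compare the sets of clusters that occur in two consecutive dendrograms and to fire a trigger whenever these sets differ, i.e. whenever the topology of the dendrogram changes. The function names and the dendrogram encoding (a list of joins given as left members, right members and distance) are hypothetical.

```python
def dendrogram_signature(merges):
    """Set of all clusters (as frozensets of object ids) formed by the joins."""
    return {frozenset(left) | frozenset(right) for left, right, _ in merges}


def should_trigger(previous_merges, current_merges):
    """True if the grouping of objects changed between two dendrograms.

    Join distances are deliberately ignored: moving an object slightly
    changes distances but not necessarily which objects are grouped.
    """
    return dendrogram_signature(previous_merges) != dendrogram_signature(current_merges)


before = [(("a",), ("b",), 0.20), (("a", "b"), ("c",), 1.00)]
nudged = [(("a",), ("b",), 0.25), (("a", "b"), ("c",), 0.90)]   # same grouping
regrouped = [(("b",), ("c",), 0.30), (("b", "c"), ("a",), 0.80)]  # new grouping

print(should_trigger(before, nudged))     # False
print(should_trigger(before, regrouped))  # True
```

A criterion of this kind stays silent while objects are nudged within a stable layout and reacts only when an object changes its cluster membership.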

Sound synthesis

The Sonification algorithm constructs a computation graph in which each node (representing a cluster $c_i$) takes an n-tuple of streams as input, provided by its enclosing cluster. In addition, a variable number of arguments allows parametric control and triggering of each node:

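    // Synthesis function for one dendrogram node. `freq` is not declared in
    // this excerpt; presumably it is bound by the enclosing code.
    {|in, trig, dist, id, lagTime|
        // smooth frequency changes (lag time of at least 0.05 s)
        freq = freq.lag(lagTime.max(0.05));
        // scale the frequency by 3 ** dist
        freq = freq * (3 ** dist);
        [
            // sclang evaluates binary operators left to right:
            // (in + decaying trigger envelope) * sine oscillator
            in + Decay2.ar(trig, 0.01, 2.5, 0.1) * SinOsc.ar(
                freq,
                // slow sine (1 to 4 Hz) modulating the phase around a random offset
                SinOsc.kr(Rand(1, 4), 0, 0.05, Rand(0, pi))
            ),
            trig,
            freq
        ];
    }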

Each object is acoustically represented by a node that passes its own frequency response (freq) resulting from a trigger (trig) on to the next node’s input (in). For this, each node passes an n-tuple of streams to both of its two adjacent nodes. The algorithm defining this flow graph can conveniently be rewritten at runtime, so that different synthesis techniques can be tested online.
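The frequency scaling in the listing above can be traced with a short Python sketch. It assumes that the root of the dendrogram receives a base frequency, that every inner node multiplies the incoming frequency by 3 ** dist (as in the listing) and hands the result to both of its children, and that leaves simply receive the frequency of their parent. The tree encoding, the base frequency of 100 Hz and the function name are hypothetical, and the smoothing by `lag` is ignored.

```python
def leaf_frequencies(node, freq):
    """Propagate a frequency from the root of a dendrogram to its leaves.

    `node` is either a leaf id or a tuple (dist, left, right) describing an
    inner node and the value of its `dist` argument.
    """
    if not isinstance(node, tuple):
        return {node: freq}  # leaf: keep the frequency handed down
    dist, left, right = node
    scaled = freq * (3 ** dist)  # mirrors `freq = freq * (3 ** dist)`
    frequencies = {}
    frequencies.update(leaf_frequencies(left, scaled))
    frequencies.update(leaf_frequencies(right, scaled))
    return frequencies


# Hypothetical dendrogram: a and b are joined at dist 0.5, then c at dist 1.0.
tree = (1.0, (0.5, "a", "b"), "c")
print(leaf_frequencies(tree, 100.0))
```

Under these assumptions, objects that share more enclosing clusters also share more scaling steps: a and b end up at the same frequency, while c, which joins only at the root, receives 300 Hz.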

^20 In order to realise such a framework, we implemented a modular sound architecture in SuperCollider, a higher-level programming language that is particularly suited to real-time sound rendering [McC02].


9.7.3. Conclusion

Durcheinander uses sound as a tool to represent the structure and dynamics of Agglomerative Clustering algorithms. Its educational purpose is underlined by the separation of user-controllable data input and auditorily represented clustering results. A change in the cluster configuration triggers sonic events that indicate the momentary hierarchical configuration of the dendrogram on which the clustering process is based. Durcheinander provides an additional perspective on clustering techniques by focusing on aspects other than the common visual representations. These aspects particularly include the spatial correspondence of clusters and how it changes under noise induced in the data input or under a change of the cluster metric.

Lessons learned

In late 2007, we presented Durcheinander at a workshop for children at the Animax in Bonn.^21 There, we had the chance to work extensively with visitors to adjust Durcheinander’s Auditory Display (see Figure 9.44). At ICAD 2008, we presented a live demonstration of Durcheinander. In the light of these presentations, we can report that even inexperienced users tend to grasp the objects and start exploring without hesitation. Users tend to forget the technical system and manipulate the sounds directly, having the interface ready-to-hand. We view this as a valuable feature for systems dedicated to exploration and learning. These observations match those we made with AudioDB (see Section 9.4).

However, we also realised that Durcheinander needs further development to be useful for actual didactic purposes. Nevertheless, its current state clearly demonstrates that TAIs provide valuable methods for educational applications.

^21 This workshop was part of the DFG-funded research project Artistic Interactivity in Hybrid Networks of the German Jahr der Geisteswissenschaften 2007.