4. Visual Analysis of Two-Dimensional Time-Dependent Data

4.2. Background

This section presents the main techniques for the visual analysis of two-dimensional time series that are relevant to the tasks presented in the introduction (see Subsection 4.1.1). We first define two-dimensional time series with regard to both individual and group dynamics. We then give an overview of algorithmic techniques for the analysis of two-dimensional dynamic data, followed by visualization techniques for this data type. Finally, we present visual analysis techniques that combine interactive visualization with algorithmic analysis of this data.

4.2.1. Definitions

An individual two-dimensional time series (i.e., a 2D point time series or a dynamic 2D point) can be regarded as an instance of a trajectory, the term used mainly in the geographic domain. In this respect, “A trajectory is the path made by the moving entity throughout the space where it moves. The path is never made instantly but requires a certain amount of time.” That is, “Trajectory = space-time path” connecting individually measured time-space points [AAPS08], where linear movement of the points with constant speed between time points is assumed.

Along these lines, we define the trajectory $T_k$ of an entity $k$ as:

$T_k = \{t_0^k, t_1^k, \dots, t_n^k\}$, an ordered set of points, where

$t_i^k = [x_i^k, y_i^k]$ is the 2D position of the entity $k$ at time point $i$, where $i \in I$, $I = \{0, \dots, n\}$.

In this example, $t_0^k$ is the starting point and $t_n^k$ is the end point of the trajectory $T_k$ (see also Figure 4.3 for an illustration).

The time points $i \in I$ can be equidistant (e.g., once an hour or once a day) or irregular (e.g., each time an object passes a sensor). In our work, we concentrate on trajectories measured at uniform time intervals.
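
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: it assumes a trajectory is stored as a plain list of (x, y) tuples sampled at uniform time steps, and the helper name `position_at` is hypothetical. The helper applies the linear, constant-speed movement between consecutive time points that is assumed above.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]  # T_k as an ordered list; the list index is the time point i


def position_at(trajectory: Trajectory, time: float) -> Point:
    """Return the position at a (possibly fractional) time index.

    Assumes uniform time steps and linear movement with constant speed
    between consecutive time points.
    """
    if not trajectory:
        raise ValueError("trajectory must contain at least one point")
    last = len(trajectory) - 1
    if not 0 <= time <= last:
        raise ValueError("time must lie within [0, n]")
    if last == 0:
        return trajectory[0]
    # Index of the segment start; clamp so that time == n uses the last segment.
    i = min(int(time), last - 1)
    frac = time - i
    (x0, y0), (x1, y1) = trajectory[i], trajectory[i + 1]
    return (x0 + frac * (x1 - x0), y0 + frac * (y1 - y0))


# Hypothetical example trajectory with n = 3 (four time points).
t_k = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (4.0, 2.0)]
start_point, end_point = t_k[0], t_k[-1]
halfway = position_at(t_k, 1.5)  # between time points 1 and 2
```

Storing only the sampled points keeps the representation close to the set notation; the interpolation is needed only when a position between two time points is requested.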

[Figure 4.3: plot of two example trajectories, $T_1$ and $T_2$, in the x-y plane. Each trajectory point is $t_i^k = [x_i^k, y_i^k]$, where $i$ is the time index and $k$ is the object index.]

Figure 4.3.: Example of two-dimensional time series (trajectories). The trajectory points $t_i^k$ are connected by a line forming a trajectory path.

In addition, we can define a trajectory fragment [AAPS08] as a part of the whole trajectory:

$TF_k = \{t_{i_0}^k, \dots, t_{i_n}^k\}$, $TF_k \subset T_k$, where

$I_F = \{i_0, \dots, i_n\}$, $I_F \neq \emptyset$, is a nonempty consecutive part of the original time space $I$, $I_F \subset I$.

A fragment of a trajectory is usually created by selecting a specific time subset from the whole time period. Analogously, the fragmentation (i.e., partitioning) of a trajectory into individual fragments is done by dividing the whole time set $I$ into disjoint time subsets

$I_{F_0}, \dots, I_{F_l}$, where $\bigcup_{j=0}^{l} I_{F_j} = I$ and $I_{F_j} \cap I_{F_h} = \emptyset$, $\forall j, h \in \{0, \dots, l\}$, $j \neq h$.

In our work, we concentrate on the fragmentation (partitioning) of a trajectory into a set of disjoint trajectory fragments (subtrajectories) with an equal number of time points in each fragment.
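
As a rough sketch of such a fragmentation, the following Python function splits a trajectory into consecutive, non-overlapping fragments of equal length. The function name `partition_trajectory` and the decision to reject trajectories whose length is not a multiple of the fragment length are assumptions made for this illustration; the text above does not say how a remainder is handled.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def partition_trajectory(
    trajectory: Sequence[Point], fragment_len: int
) -> List[List[Point]]:
    """Split a trajectory into disjoint, consecutive fragments of equal size.

    The fragments cover the whole time set I and do not share time points.
    Assumption of this sketch: the number of time points must be a multiple
    of fragment_len, otherwise a ValueError is raised.
    """
    if fragment_len <= 0:
        raise ValueError("fragment_len must be positive")
    if len(trajectory) % fragment_len != 0:
        raise ValueError("trajectory length must be a multiple of fragment_len")
    return [
        list(trajectory[start:start + fragment_len])
        for start in range(0, len(trajectory), fragment_len)
    ]


# Hypothetical trajectory with six time points, split into fragments of three.
t_k = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (5, 2)]
fragments = partition_trajectory(t_k, 3)
```

Concatenating the fragments in order reproduces the original trajectory, which mirrors the two conditions on the time subsets: their union is $I$ and they are pairwise disjoint.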

If the entities (also referred to as objects or items) for which we define the trajectories are grouped (e.g., by a specific attribute such as country of origin; see Figure 4.4), then the time development of each group of entities, $T_G$, forms a complex composition of the movements of the individual group members $g \in G$ (see Figure 4.5 for an illustration). This is stated in [AAPS08] as: “The collective movement behavior of a population of entities over a time period is a complex configuration built from movement characteristics of all entities at all time moments, which has no arrangement with respect to the population of entities and has a continuous linear arrangement with respect to the time.” Formally:

$T_G = \bigcup_{g \in G} T_g$

Figure 4.4.: Illustration of point grouping using convex hull.
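
To make the grouping concrete, the sketch below computes a convex hull around the group members at each time point, which yields a trace of hulls over time. It is an illustration under stated assumptions rather than the method used for the figures: the hull is computed with Andrew's monotone chain algorithm, and the names `convex_hull` and `hull_trace` as well as the dictionary layout of the group are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def hull_trace(group: Dict[str, List[Point]]) -> List[List[Point]]:
    """One convex hull per time point over all trajectories T_g of the group."""
    n_steps = min(len(t) for t in group.values())
    return [convex_hull([t[i] for t in group.values()]) for i in range(n_steps)]


# Hypothetical group G of three entities observed at two time points.
group = {
    "a": [(0, 0), (1, 1)],
    "b": [(2, 0), (3, 1)],
    "c": [(1, 2), (2, 3)],
}
trace = hull_trace(group)
```

The hull describes only the outline of the group at each time point; it carries no information about how the members are distributed inside it.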

4.2.2. Algorithmic Analysis of Two-Dimensional Time Series

The research on analysis of two-dimensional time series (i.e., trajectory mining) considers analysis and description of important properties in trajectory data. Of primary concern are methods to define appropriate similarity functions to query, compare, and cluster trajectories [NP06, PKM07, COO05], and to support the detection of interesting patterns [PBKA08b].

Figure 4.5.: Illustration of group movements. Left: Trajectories of individual points in the group. Right: Trajectories of individual points in the group together with the trace of the convex hulls around the group in each time point.

In this section, we concentrate on clustering methods used for the analysis of two-dimensional time-dependent data.

Recently, clustering of trajectory data has received considerable attention in applications in geo-spatial and related research areas. However, finding an appropriate clustering method for trajectory data is a challenging task, as stated by Nanni and Pedreschi [NP06]: “Spatio-temporal trajectory data introduce new dimensions and, correspondingly, novel issues in performing the clustering task. Clustering moving object trajectories, for example, requires finding out both a proper spatial granularity level and significant temporal sub-domains. Moreover, it is not obvious to identify the most promising approach to the clustering task among the many in the literature of data mining and statistics research; neither it is obvious to choose among the various options to represent a trajectory of a moving object and to formalize the notion of (dis)similarity (or distance) among trajectories.”

In clustering, the definition of similarity (or distance) between entities plays an important role. The clustering of trajectories mainly uses either the Euclidean distance between the trajectories directly or a transformation of the trajectories into feature vectors, thereby defining the distance between trajectories as the distance between their feature vectors. For a more detailed discussion of measuring similarity between trajectories, we refer to Section 4.6.2.
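
The two options can be contrasted with a small Python sketch. It is illustrative only: the direct distance treats two equally long trajectories as flattened coordinate vectors, and the three features used here (net displacement in x and y, and path length) are a hypothetical choice, not the features used in this work (see Section 4.6.2 for the actual discussion).

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def pointwise_euclidean(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Euclidean distance between two trajectories sampled at the same time points."""
    if len(a) != len(b):
        raise ValueError("trajectories must have the same number of time points")
    return math.sqrt(
        sum((xa - xb) ** 2 + (ya - yb) ** 2 for (xa, ya), (xb, yb) in zip(a, b))
    )


def feature_vector(t: Sequence[Point]) -> List[float]:
    """Hypothetical features: net displacement in x, in y, and total path length."""
    dx = t[-1][0] - t[0][0]
    dy = t[-1][1] - t[0][1]
    length = sum(math.dist(p, q) for p, q in zip(t, t[1:]))
    return [dx, dy, length]


def feature_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Distance between trajectories defined as distance between feature vectors."""
    return math.dist(feature_vector(a), feature_vector(b))


# Two hypothetical trajectories with the same shape, shifted by one unit in y.
t_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
t_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
direct = pointwise_euclidean(t_a, t_b)
by_features = feature_distance(t_a, t_b)
```

In this example the direct distance is positive because the trajectories lie at different positions, while the feature-based distance is zero because the chosen features describe only the shape of the movement. Which behavior is desirable depends on the analysis task.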

A variety of clustering methods for trajectories have been introduced; we give an overview of them below. They can be divided according to the type of clustering used, the usage of trajectory fragments, and the consideration of groupings of trajectories. We first discuss various clustering methods for full trajectories without grouping, then present a method for clustering trajectory fragments, and finally mention a method for discovering moving clusters (groups of objects).

The clustering methods for trajectories include probabilistic clustering based on a mixture of regression models, k-means clustering, hierarchical agglomerative clustering, density-based clustering, and SOM-based clustering.

Probabilistic clustering using a mixture of regression models was proposed by Gaffney and Smyth [GS99]. They apply unsupervised learning using the EM algorithm. This approach was extended in later works [CGS00, Gaf03, Gaf04, CGMS03] using random effects regression mixtures. Alternatively, a mixture of hidden Markov models was used in [Smy97, ASKP03]. These rather sophisticated methods require the assumption of an underlying model of the data.

Direct distance-based clustering using k-means and hierarchical agglomerative clustering was presented by Nanni [Nan02]. In further work, Nanni and Pedreschi [NP06] propose an adaptation of a density-based clustering algorithm, OPTICS in particular, to trajectory data. Their temporal focusing approach chooses the partial time intervals that are best suited for trajectory clustering. However, this approach is suitable mainly for geographic (in particular, traffic route) problems, where the entities mainly follow a few main roads and many small roads leading to them. The method finds the main roads but is not well suited to abstract data with no particular routes. This work was extended in 2008 with interactive selection of distance functions, progressive cluster refinement, and interactive visualization [RPN08] (see Section 4.2.4 for more information).

Density-based clustering was also used to discover interesting places in trajectories by Tietbohl et al. [PBKA08a]. They concentrate on discovering stops in long trajectories of moving objects in geographic space. They do not focus on the similarity of trajectories.

Lee et al. [LH07] extend density-based clustering with trajectory partitioning in order to find clusters based on sub-trajectories. Their algorithm partitions a trajectory into a set of line segments and afterwards groups similar line segments into clusters. They apply the minimum description length (MDL) principle for trajectory partitioning and density-based line-segment clustering for the grouping. The results are similar to those of the method of Nanni and Pedreschi [NP06].

Trajectory partitioning with subsequent clustering of the resulting fragments for abstract (financial) data was presented by Schreck et al. [STFK07]. They use a self-organizing map (SOM) to obtain an overview of the trajectory patterns, to group them by similarity, and to visualize the results. This approach has been extended in subsequent work (see Section 4.2.4).

An approach to discovering moving clusters (i.e., “sets of objects that move close to each other for a long time interval”) was presented by Kalnis et al. [KMB05]. Their algorithm performs spatial clustering at each time point and combines the results into a set of moving clusters. Two means of accelerating the algorithm are proposed as well. The main difference to the above-mentioned approaches is that the set of objects in a cluster may vary over time.
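
The two steps described above (spatial clustering at each time point, then combining the results) can be sketched as follows. This is a simplified illustration and not the algorithm of [KMB05]: the per-time-point clustering is replaced by a plain distance-threshold grouping, and linking clusters of consecutive time points by an overlap ratio with a threshold `theta` is an assumption of this sketch. All function and parameter names are hypothetical, and none of the acceleration techniques are included.

```python
import math
from typing import Dict, FrozenSet, List, Tuple

Point = Tuple[float, float]
Cluster = FrozenSet[str]


def spatial_clusters(positions: Dict[str, Point], eps: float) -> List[Cluster]:
    """Group objects that are chained by distances <= eps (singletons are dropped).

    Simple stand-in for the spatial clustering performed at each time point.
    """
    unvisited = set(positions)
    clusters: List[Cluster] = []
    while unvisited:
        seed = unvisited.pop()
        members, frontier = {seed}, [seed]
        while frontier:
            cur = frontier.pop()
            near = {
                o for o in unvisited
                if math.dist(positions[cur], positions[o]) <= eps
            }
            unvisited -= near
            members |= near
            frontier.extend(near)
        if len(members) > 1:
            clusters.append(frozenset(members))
    return clusters


def moving_clusters(
    snapshots: List[Dict[str, Point]], eps: float, theta: float
) -> List[List[Cluster]]:
    """Chain clusters of consecutive time points with |c & d| / |c | d| >= theta."""
    finished: List[List[Cluster]] = []
    active: List[List[Cluster]] = []
    for positions in snapshots:
        current = spatial_clusters(positions, eps)
        extended, used = [], set()
        for chain in active:
            prev = chain[-1]
            matches = [c for c in current if len(prev & c) / len(prev | c) >= theta]
            for c in matches:
                extended.append(chain + [c])
                used.add(c)
            if not matches:
                finished.append(chain)
        # Clusters that continue no existing chain start a new one.
        extended.extend([c] for c in current if c not in used)
        active = extended
    finished.extend(active)
    return [chain for chain in finished if len(chain) > 1]


# Hypothetical data: object "c" leaves the group of "a" and "b" and joins "d".
snapshots = [
    {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (10, 10)},
    {"a": (1, 1), "b": (2, 1), "c": (9, 9), "d": (10, 9)},
    {"a": (2, 2), "b": (3, 2), "c": (9, 8), "d": (10, 8)},
]
chains = moving_clusters(snapshots, eps=1.5, theta=0.5)
```

In the example, the cluster containing "a", "b", and "c" continues as a cluster of "a" and "b" only, which illustrates that the membership of a moving cluster may change over time.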

Please note that, similarly to clustering, trajectory aggregation can be used for the abstraction of trajectories. Many such approaches exist; they are summarized in [AA08].

4.2.3. Visualization of Two-Dimensional Time Series

The visualization of time series, in general, is a broad area within information visualization. A complete overview of the literature on visualization techniques for presenting time-dependent data would exceed the scope of this chapter. Nonetheless, in Section 2.3.3, a brief overview of visualization techniques for dynamic data is provided.

Additionally, surveys of systems specializing in time series can be found in [AMM08] and [SC00]. However, these surveys focus on the representation of one-dimensional time series.

In the following, we concentrate on techniques specializing in the visualization of two-dimensional time-dependent data. The visualization of this type of data in a dynamic display extends the static visualization of two-dimensional points with the time dimension. We consider both the static and the dynamic case. We first discuss techniques that disregard the grouping of entities and then those techniques that include the grouping information.

4.2.3.1. Visualization of Two-Dimensional Time Series Disregarding Grouping of Entities

The visualization of two-dimensional data in the static case usually employs scatterplots. The techniques for dynamic data therefore often enhance scatterplots with a visualization of the time dimension of the data. They employ animation of points in 2D [CK03, AAG00] or visualization of trajectories in 2D [NFA01] and 3D [Kra03]. Some systems combine both approaches [Gap, TK07]. Evaluations of the techniques were presented in [RFF08] and in [TK07]. The results show that animation is more suitable for presentation and overview examination, while trajectories are more suitable for the detailed analysis of the data.

In the geographic domain, aggregated views (such as spatio-temporal histograms) and T-T (time-time) plots have also been introduced. An overview of geo-based visualization techniques is provided in [AAK08]. Additionally, Willems et al. [WvdWvW09] presented a technique based on density fields displayed as colored height maps. It supports the exploration of vessel movements and their speed variations.

4.2.3.2. Visualization of Two-Dimensional Time Series Including Grouping of Entities

Two-dimensional data (i.e., 2D points) can be grouped according to a selected criterion into so-called 2D point clouds. Such point clouds, in the static case (or at each time point), may be represented by solid shapes, using various geometric constructs (so-called “hulls”), or using distance fields. In [SP07], a comparison of various hulls was shown. The hull types include minimum bounding discs, boxes, and convex hulls. In [SSZW08], an algorithm for the construction of compact, enclosing shapes was presented. Recently, so-called “bubble sets”, continuous isocontours connecting group members, were presented [CPC09]. Distance fields allow for the representation of point sets by the smooth formation of visual areas, using appropriate transfer functions [KTSZ08].

On the efficiency side, the visualization of massive point cloud data sets may be accelerated by appropriate data structures as presented in [HE03]. The common challenge of these approaches is a compact representation of the cloud revealing the shape and distribution of the points while avoiding the overlapping of the clouds.

In analogy to the visualization of 2D dynamic points described above, the visualization of 2D point clouds over time can also employ animation, trajectories of clouds, or a combination of both. The hulls around trajectories are, however, suitable mainly for points that move closely together, and they disregard the inner distribution of the entities.

For the spatio-temporal analysis of multiple entities in the geographic domain, several aggregation-based approaches have been introduced, which are surveyed in [AAK08]. These include aggregating movement data into a surface by computing the total number of person-minutes spent in each cell of a regular grid, a transition matrix counting the number of entities moving between each pair of locations, and discrete or continuous flow maps. However, these approaches suffer either from a loss of spatial context or from overplotting issues.
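
Two of these aggregations can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation of any surveyed system: locations are taken to be the cells of a regular grid, each sample is assumed to represent a fixed number of minutes, and the transition counter counts every move between cells (so an entity moving twice between the same pair of cells is counted twice). The names `cell_of`, `presence_surface`, and `transition_counts` are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def cell_of(p: Point, cell_size: float) -> Cell:
    """Index of the regular grid cell that contains point p."""
    return (int(p[0] // cell_size), int(p[1] // cell_size))


def presence_surface(
    trajectories: List[List[Point]], cell_size: float, minutes_per_step: float
) -> Counter:
    """Total person-minutes per grid cell, assuming uniformly sampled trajectories."""
    surface: Counter = Counter()
    for trajectory in trajectories:
        for p in trajectory:
            surface[cell_of(p, cell_size)] += minutes_per_step
    return surface


def transition_counts(trajectories: List[List[Point]], cell_size: float) -> Counter:
    """Number of moves between each ordered pair of distinct grid cells."""
    moves: Counter = Counter()
    for trajectory in trajectories:
        cells = [cell_of(p, cell_size) for p in trajectory]
        for src, dst in zip(cells, cells[1:]):
            if src != dst:
                moves[(src, dst)] += 1
    return moves


# Hypothetical data: two entities sampled every 10 minutes on a grid of unit cells.
trajectories = [
    [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5)],
    [(0.1, 0.9), (1.2, 0.3), (1.8, 0.4)],
]
surface = presence_surface(trajectories, cell_size=1.0, minutes_per_step=10.0)
moves = transition_counts(trajectories, cell_size=1.0)
```

Both aggregates discard the identity of the individual entities, which keeps the result compact but also removes the detail of individual paths.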

4.2.4. Visual Analysis of Two-Dimensional Time Series

Related work on the visual analysis of two-dimensional time series includes approaches that combine algorithmic analysis of the data with interactive visual exploration. In analogy to the previous subsection, we first introduce approaches that disregard groupings of entities and then concentrate on those that include groupings in their method.

4.2.4.1. Visual Analysis of Two-Dimensional Time Series Disregarding Grouping of Entities

The visual analysis of trajectories is a relatively new topic dealt with mainly in geographic applications. The study of Andrienko et al. [AAK08] presents an overview of Visual Analytics techniques for the detection of patterns in movement data. They discuss both the algorithmic and the visualization techniques for individual and group movements and state challenges for future research in the geographic spatio-temporal research area.

In the following, we present selected recent studies on the visual analysis of movement patterns.

Wren et al. [CZQ08] visually analyze facility monitoring data. The dataset includes sensor information on the movements of people and camera pictures without direct entity identification (i.e., they are not able to identify the persons passing the sensors). They offer interactive visualization of movements, the sensor occupancy distribution, and path queries. From the algorithmic analysis point of view, a new track recovery algorithm is proposed. However, given that we consider trajectories of identified entities, the latter part of their approach is not relevant to the work presented in this thesis.

Spatio-temporal visual analysis of individual movements is presented by Andrienko et al. [AA08], who propose spatio-temporal aggregation for the visual analysis of trajectories. The spatial aggregation algorithm is combined with an interactive visualization of movements showing the main data flows.

Recently, Andrienko et al. [AA07] presented a system which supports visual analysis of car movements using a combination of interactive visualization and clustering and aggregation of routes. The visualization is geo-based in 2D and 3D with options to select trips and time periods. The clustering of trips according to, for example, start and end points employs the OPTICS algorithm. Clustering results can be interactively visually explored.

This work has been extended using progressive density-based clustering with interactive selection of distance functions by Rinzivillo et al. [RPN08]. For example, they first cluster trajectories by their starting and finishing points and then refine the clusters according to route parameters. They provide an interactive exploration of the results using state-of-the-art geographic visualization techniques. This is an extension of a previous paper [NP06] with an interactive selection of parameters and a visualization of results (see Subsection 4.2.2). This study is the most similar to our work.

Visual analysis of trajectories using SOM clustering and interactive exploration of the results was also presented earlier, in 2007, by Schreck et al. [STFK07]. This study was extended in 2008 with interactive visual monitoring of the clustering process and steering of the initialization [SBTK08].

4.2.4.2. Visual Analysis of Two-Dimensional Time Series Including Grouping of Entities

The visualization of static one-class point data extended with statistical analysis of the data was explored in [WAG05] and [WAG06]. Statistical indicators of point cloud shape and point distribution are used to propose interesting two-dimensional projections of the originally multi-dimensional data for further visual inspection. This approach is, however, limited to static data without entity groupings (only one class of data).

In the dynamic case, the ESDA toolkit [BK04] offers the possibility to visualize hulls around trajectories and calculate and visualize the central tendency and dispersion of the group movement. This approach does not include further analysis of the data or other visual abstractions and is mainly suitable for entities moving together.
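
As an illustration of what such group summaries can look like, the sketch below computes a central tendency and a dispersion value for a group at each time point. It does not reproduce the ESDA toolkit: using the centroid as the central tendency and the mean distance to the centroid as the dispersion is an assumption of this sketch, and the function name `group_summary` is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def group_summary(group: List[List[Point]]) -> List[Tuple[Point, float]]:
    """Per time point: (centroid of the group, mean distance to the centroid).

    Assumes all trajectories are sampled at the same uniform time points;
    only the time points shared by all trajectories are summarized.
    """
    n_steps = min(len(t) for t in group)
    summary = []
    for i in range(n_steps):
        pts = [t[i] for t in group]
        cx = sum(x for x, _ in pts) / len(pts)
        cy = sum(y for _, y in pts) / len(pts)
        spread = sum(math.dist((cx, cy), p) for p in pts) / len(pts)
        summary.append(((cx, cy), spread))
    return summary


# Hypothetical group of two entities that drift apart while moving along x.
group = [
    [(0.0, 0.0), (2.0, 0.0)],
    [(2.0, 0.0), (6.0, 0.0)],
]
summary = group_summary(group)
```

The sequence of centroids forms a single summary trajectory for the group, while the dispersion values indicate whether the members stay together, which is the case in which such abstractions are most meaningful.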

4.2.5. Summary

The algorithmic analysis of two-dimensional time-dependent data is mainly concerned with the similarity and the clustering of trajectories. The similarity measures used are mainly based on either a direct trajectory representation or a feature vector representation. The clustering algorithms used include, for example, k-means, density-based, and probabilistic approaches. In 2007, we presented a system for the visual representation of clusters of trajectories using SOM. Additionally, there are approaches for trajectory aggregation and for the clustering of sub-trajectories.

The visualization approaches for two-dimensional dynamic data include 2D and 3D animation and trajectory techniques based on the scatterplot framework.

The visual analysis of two-dimensional time-dependent data without grouping information combines the above-mentioned algorithmic analysis with interactive visualizations for exploration of the data space.

Approaches for groups of data in the dynamic case are rare. In the static case, statistical analysis of a point cloud has been used for finding interesting views on the data. In the dynamic case, visual abstractions (e.g., hulls, mid-points) are used for visualization suitable for data exploration.

In summary, the tight integration of algorithmic analysis with interactive, user-steered visualization for the analysis of trajectories, in both the individual and the group case, has not been extensively explored. The individual fields include a variety of methods, which are mainly applied separately. Moreover, there are only a few approaches for the visual analysis of this type of data.