
Relational Implementation of the Multidimensional Data Model

8.1 Visual Analysis Framework

8.1.1 Related Work on Visualization for OLAP

The work related to the topic of this chapter can be subdivided into two major groups: visual analysis systems and advanced visualization techniques for OLAP.

VISUAL ANALYSIS SYSTEMS

First proposals to use visualization for exploring multidimensional data were not tailored towards OLAP applications, but rather addressed the generic problem of visual querying of large datasets stored in a database.

Keim and Kriegel [75] proposed VisDB, a visualization system based on a new query paradigm. In VisDB, users are prompted to specify an initial query. Thereafter, guided by visual feedback, they dynamically adjust the query, e.g., by using sliders for specifying range predicates on single attributes. Retrieved records are mapped to the pixels of the rectangular display area, colored according to the degree of their conformity to the specified set of selection predicates, and positioned according to a grouping or ordering directive.
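The idea of grading records by their degree of conformity to a set of range predicates can be illustrated with a minimal sketch. The distance function, the averaging step, and all names below (`RangePredicate`, `conformity`, `rank_records`) are assumptions made for illustration only; they are not taken from VisDB [75], whose actual relevance computation and pixel arrangement are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangePredicate:
    """Hypothetical range predicate on a single numeric attribute."""
    attribute: str
    low: float
    high: float

    def distance(self, value):
        """Return 0.0 inside the range, else the gap to the nearest bound scaled by the range width."""
        width = (self.high - self.low) or 1.0
        if value < self.low:
            return (self.low - value) / width
        if value > self.high:
            return (value - self.high) / width
        return 0.0


def conformity(record, predicates):
    """Score in (0, 1]; 1.0 means that every predicate is satisfied (assumed scoring rule)."""
    mean_distance = sum(p.distance(record[p.attribute]) for p in predicates) / len(predicates)
    return 1.0 / (1.0 + mean_distance)


def rank_records(records, predicates):
    """Order records from best to worst conformity, e.g., to assign colors or display positions."""
    return sorted(records, key=lambda r: conformity(r, predicates), reverse=True)
```

In such a scheme, moving a slider would amount to changing `low` or `high` of one predicate and recomputing the scores, so that approximate matches remain visible instead of being discarded.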


Another example of an early work related to multidimensional data exploration can be found in [44], where an intelligent visual interface CoDecide for cooperative analysis of spreadsheet data is proposed. CoDecide links multiple views of a data cube in a multi-perspective and multi-user mode and uses “tape” representations for visualizing the problem dimensions.

State-of-the-art OLAP tools provide the classical pivot table interface along with a set of popular business visualization techniques, such as charts and time series, as well as more sophisticated layouts, such as scatterplots, maps, graphs, cartograms, matrices, and grids, and proprietary visualizations (e.g., ProClarity Decomposition Tree [141] and Fractal Map [22]). Given the abundance of existing OLAP tools, we limit ourselves to naming a few products that offer distinctive features.

Tableau Software [169] and other established OLAP vendors deliberately restrict the set of supported visualizations to the popular and proven ones, such as tables, charts, maps, and time series, doubting the general utility of exotic visual metaphors [55]. Polaris, a visual tool for multidimensional analysis developed by the research team of Pat Hanrahan at Stanford University [167], is a predecessor of Tableau Software. Polaris inherits the basic idea of the classical pivot table interface that maps aggregates onto a grid defined by dimension categories assigned to the grid's rows and columns. However, Polaris uses embedded graphical marks rather than textual numbers in the table cells. The types of supported graphics are arranged into a taxonomy, comprising rectangle, circle, glyph, text, Gantt bar, line, polygon, and image layouts.
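The pivot table principle of mapping aggregates onto a grid of dimension categories, with a graphical mark in place of a number, can be sketched as follows. This is a simplified illustration under stated assumptions, not the Polaris implementation: the function names `pivot` and `bar_mark`, the use of SUM as the aggregate, and the text-based bar are all hypothetical.

```python
from collections import defaultdict


def pivot(facts, row_dim, col_dim, measure):
    """Sum a measure into grid cells keyed by (row category, column category)."""
    cells = defaultdict(float)
    for fact in facts:
        cells[(fact[row_dim], fact[col_dim])] += fact[measure]
    rows = sorted({row for row, _ in cells})
    cols = sorted({col for _, col in cells})
    return rows, cols, dict(cells)


def bar_mark(value, max_value, width=10):
    """Render an aggregate as a bar of '#' characters instead of a textual number."""
    filled = round(width * value / max_value) if max_value else 0
    return "#" * filled


if __name__ == "__main__":
    facts = [
        {"region": "EU", "year": 2020, "sales": 10.0},
        {"region": "EU", "year": 2020, "sales": 5.0},
        {"region": "US", "year": 2021, "sales": 30.0},
    ]
    rows, cols, cells = pivot(facts, "region", "year", "sales")
    largest = max(cells.values())
    for row in rows:
        # Empty cells are drawn as empty marks.
        print(row, [bar_mark(cells.get((row, col), 0.0), largest) for col in cols])
```

Swapping `bar_mark` for another rendering function changes the mark type without touching the grid construction, which is the separation that a taxonomy of mark types relies on.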

The Advizor system [39] implements a technique that organizes data into three perspectives. A perspective is a set of linked visual components displayed together on the same screen. Each perspective focuses on a particular type of analytical task, such as i) a single measure view using a 3D multiscape layout, ii) multiple measures arranged into a scatterplot, and iii) anchored measures presented using techniques from multidimensional visualization (e.g., Box Plots [177] or Parallel Coordinates [69]). A more recent survey [40] investigates common visual metaphors and associated interaction techniques with improved visual scalability, i.e., the capability to effectively display large volumes of multidimensional data, and describes how the proposed techniques were implemented in the Advizor system.

ProClarity Analytics [150] is famous for innovative visualization tools, such as the Decomposition Tree, Perspective View, and Performance Map. In 2006, ProClarity became a Microsoft subsidiary and, as of 2007, ProClarity Analytics was released as a part of the Microsoft Office PerformancePoint Server [34].

Another noteworthy tool is Report Portal, a web client reporting solution for Microsoft Analysis Services released by XMLA Consulting [188]. Report Portal in its current version 2.2 offers interactive OLAP and data mining reports based on visualization techniques, such as GIS maps, chart trees, TreeMaps, dashboards, and animated scatterplots (moving bubbles).

VISUALIZATION TECHNIQUES

A pioneering and fundamental work on automating visualization of relational data was carried out by Jock Mackinlay [105], who proposed to define visual presentations in terms of graphical languages. Graphical languages encode syntactic and semantic properties of graphical presentations in the form of sentences, similar to other formal languages. Expressiveness and effectiveness criteria are used to assess the quality of a graphical encoding and to compare various presentation alternatives with one another. By formalizing graphical presentation as a collection of graphical languages, Mackinlay's approach provides an abstraction for automatic synthesis of effective visual designs for a variety of data sets, focusing on two-dimensional static representations, such as bar-charts, scatterplots, and connected graphs.

Besides the classical visualization techniques, such as the pivot table and 2-dimensional plots and charts familiar to any data analyst, a wealth of more comprehensive visual frameworks for incremental exploration of and navigation in large multidimensional data volumes have emerged. Visualization techniques applicable in the OLAP context can be roughly grouped into the following categories (see [140] for further details):

170 Chapter 8 : Interactive Exploration of OLAP Aggregates

Geometric (Scatterplots, Landscapes, Hyperslices, Parallel Coordinates)
Icon-based (Chernoff Faces, Stick Figures, Color Icons, TileBars)
Pixel-oriented (Recursive Pattern, Circle Segments)
Hierarchical (Dimensional Stacking, Worlds-within-Worlds, TreeMap, Cone Trees, InfoCube)
Graph-based (Straight-, Poly- and Curved-Line, DAG, Symmetric, Cluster)
Hybrid techniques, which arbitrarily combine any of the above.

Applicability of a particular technique or a meaningful combination of techniques depends largely on the analysis needs and the level of user expertise.

Figure 8.1 presents a structured overview of visualization techniques for OLAP arranged into four quadrants according to the layout (simple vs. hybrid) and granularity (uniform vs. mixed). The techniques in each quadrant are sorted upwards in increasing order of the maximum number of dimensions they can support.

Visual metaphors capable of displaying multiple measure fields are shown with an orange background. This enumeration is by no means exhaustive and contains only the major techniques provided by existing OLAP tools or proposed in the research literature. Descriptions of most of the enumerated techniques may be found in standard literature on information visualization [20, 176], as well as in industrial and research publications [119, 168, 172, 178].

Any OLAP tool implements just a small subset of the visualizations listed in Figure 8.1, mostly from the upper-left quadrant of simple layouts. However, there is a trend towards adopting novel and more complex layouts to support a wider spectrum of analysis tasks. This trend raises the issue of assisting the user in choosing a “good” visualization. In data warehouse systems, the issue of assessing the aptitude of a particular visualization approach for solving different types of analysis tasks is rather neglected. Typically, the user has to find an appropriate solution manually by experimenting with different layout options. As a result, users often come up with inefficient and even misleading visualizations. Apparently, a successful visual OLAP framework needs to be based on a comprehensive taxonomy of domains, tasks, and visualizations.

Figure 8.1: Segmentation of visualization techniques for OLAP by layout and granularity (figure not reproduced; surviving labels: layout columns Simple and Hybrid/Composite; techniques (Floating) Bar Chart, High-Low-Close Graph, Multiscape, Decomposition Tree with Space-filling Bars, Matrix of Charts)


To support a large set of diverse visualization techniques and to enable dynamic switching from one technique to another, an abstraction layer has to be defined for specifying the relationships between the data and its visual presentation. Maniatis et al. [111] propose an abstraction layer solution, called the Cube Presentation Model (CPM), which distinguishes between two layers: the logical layer deals with data modeling and retrieval, whereas the presentation layer provides a generic model for representing the data visually (normally, on a 2D screen). The entities of the presentation layer include points, axes, multicubes, slices, tapes, cross-joins, and content functions. The authors demonstrate how CPM constructs can be mapped to advanced visual layouts using the example of the Table Lens, a technique based on a cross-tabular paradigm with support for multiple zoomable windows of focus.

While many OLAP vendors restrict the set of supported visualizations to the popular and proven ones and doubt the general utility of exotic visual metaphors, some research works suggest enriching the visual OLAP framework by extending basic charting techniques or employing novel and lesser-known visualization techniques to take full advantage of multidimensional and hierarchical properties of the data [95, 164, 170, 172].

Tegarden [172] formulates the general requirements of business information visualization and gives an overview of advanced visual metaphors for multivariate data, such as Kiviat diagrams and Parallel Coordinates for visualizing data sets of high dimensionality, as well as 3D techniques, such as 3D Scattergrams, 3D line graphs, floors and walls, and 3D map-based bar-charts.

Another branch of visualization research for OLAP concentrates on developing multiscale visualization techniques capable of presenting the data at different levels of aggregation. Stolte et al. describe their implementation of multiscale visualizations within the framework of the Polaris system [168]. The underlying visual abstraction is that of a zoom graph that supports multiple zooming paths, where zooming actions may be tied to dimension axes or triggered by a different type of interaction.

Lee and Ong propose a multidimensional visualization technique that adopts and modifies the Parallel Coordinates method for knowledge discovery in OLAP [95]. The main advantage of this technique is its scalability to virtually any number of dimensions. Each dimension is represented by a vertical axis and the aggregates are aligned along each axis in the form of a bar-chart. The other side of the axis may be used for generating a bar-chart at a higher level of detail. Polygon lines adopted from the original Parallel Coordinates technique are used for indicating relationships among the aggregates computed along various dimensions (a relationship exists if the underlying sets of fact entries overlap in both aggregates).
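The relationship criterion stated in parentheses, namely that two aggregates are linked if their underlying sets of fact entries overlap, can be expressed as a short sketch. The function names `fact_sets` and `links` and the dictionary-based fact representation are assumptions for illustration; the data structures used in [95] are not described here.

```python
from collections import defaultdict
from itertools import product


def fact_sets(facts, dimension):
    """Map each member of a dimension to the set of fact identifiers it aggregates."""
    groups = defaultdict(set)
    for fact_id, fact in enumerate(facts):
        groups[fact[dimension]].add(fact_id)
    return dict(groups)


def links(facts, dim_a, dim_b):
    """Return the member pairs (one per axis) whose underlying fact sets overlap."""
    sets_a = fact_sets(facts, dim_a)
    sets_b = fact_sets(facts, dim_b)
    return sorted(
        (member_a, member_b)
        for (member_a, ids_a), (member_b, ids_b) in product(sets_a.items(), sets_b.items())
        if ids_a & ids_b  # a non-empty intersection means that a polygon line would be drawn
    )
```

Each returned pair corresponds to one line segment between two neighboring axes; pairs with no shared fact entries produce no line.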

Sifer [164] presents a multiscale visualization technique for OLAP based on coordinated views of dimension hierarchies. Each dimension hierarchy, with qualifying fact entries attached as the bottom-level nodes, is presented using a space-filling nested tree layout. Drilling down and rolling up are performed implicitly by zooming within each dimension view. Filtering is realized by (de-)selecting the values of interest at any level of the dimension hierarchies, resulting either in highlighting the qualifying fact entries in all dimension views (global context coordination) or in eliminating the disqualified entries from the display (result only coordination). A similar interactive visualization technique, called the Hierarchical Dynamic Dimensional Visualization (HDDV), is proposed in [170]. Dimension hierarchies are shown as hierarchically aligned barsticks. A barstick is partitioned into rectangles that represent portions of the aggregated measure value associated with the respective member of the dimension. Color intensity is used to mark the density of records satisfying a specified range condition. Unlike in [164], dimension level bars are not explicitly linked to each other, which allows the same aggregate to be split along multiple dimensions and, thus, preserves the execution order of the disaggregation steps.
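The difference between the two coordination modes described for [164] can be made concrete with a minimal sketch. The function name `coordinate`, the mode strings, and the flat fact representation are assumptions for illustration and do not reflect the actual implementation in [164]; dimension hierarchies are omitted for brevity.

```python
def coordinate(facts, selection, mode):
    """Apply a selection to a list of facts under one of two coordination modes.

    selection maps a dimension name to the set of selected members.
    Returns (fact, highlighted) pairs for display.
    """
    def qualifies(fact):
        return all(fact[dim] in members for dim, members in selection.items())

    if mode == "global_context":
        # Keep every fact on display and flag only the qualifying ones.
        return [(fact, qualifies(fact)) for fact in facts]
    if mode == "result_only":
        # Remove disqualified facts from the display altogether.
        return [(fact, True) for fact in facts if qualifies(fact)]
    raise ValueError(f"unknown coordination mode: {mode}")
```

Global context coordination preserves the overall distribution as a backdrop for the selection, whereas result only coordination frees display space for the qualifying entries.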

One of the major visualization challenges for OLAP is the ability to present a large number of dimensions on a display. An additional visual attribute for mapping a dimension could be animation, as found in the Gapminder software for interactive data exploration using animated scatterplots, in which animation is used to show the evolution of values along the timeline [157]. A well-structured classification of visualization and interaction techniques with respect to the type and the dimensionality of the data is produced in [77].


A technique for finding an appropriate ordering of the aggregates along dimensional axes, proposed by Wei Choong et al. in [26], may help to improve the analytical quality of any visualization. By default, the ordering of the measures is imposed by the lexical ordering of dimension members. To make patterns more obvious, the user has to rearrange the ordering manually. The proposed algorithm automates the ordering of measures in a representation so as to best reveal the patterns (e.g., trends, similarity) in a data set.
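The effect of the ordering on pattern visibility can be illustrated with a deliberately simple stand-in. The sketch below is not the algorithm of [26], which is not reproduced here; it merely contrasts the default lexical ordering with a value-based ordering, and the function name `order_members` and its `by` parameter are hypothetical.

```python
def order_members(aggregates, by="lexical"):
    """Order dimension members for display along an axis.

    aggregates maps a dimension member to its aggregated measure value.
    """
    if by == "lexical":
        # Default behavior: members appear in alphabetical order.
        return sorted(aggregates)
    if by == "value":
        # Stand-in heuristic: ascending measure value, ties broken by member name.
        return sorted(aggregates, key=lambda member: (aggregates[member], member))
    raise ValueError(f"unknown ordering: {by}")


if __name__ == "__main__":
    sales = {"Alpha": 40, "Beta": 10, "Delta": 30, "Gamma": 20}
    for ordering in ("lexical", "value"):
        members = order_members(sales, by=ordering)
        print(ordering, [(member, sales[member]) for member in members])
```

With the lexical ordering the values along the axis appear as 40, 10, 30, 20, whereas the value-based ordering turns the same aggregates into a monotone sequence that is easier to read.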

Whenever a data cube contains spatio-temporal characteristics, the analysis may benefit from specialized exploration techniques for space-time patterns. Rivest et al. [155] propose SOLAP (spatial OLAP) as a visual platform for spatio-temporal analysis using cartographic and general displays. The authors also define different types of spatial dimensions and measures as well as a set of specialized geometry-aware OLAP operators. A synopsis of techniques for spatio-temporal exploration arranged according to the data and task types is produced in [6]. Kuchar et al. [88] point out that the time dimension is not an ordinary data attribute and that, therefore, to ensure satisfactory analysis, interaction and visualization techniques have to incorporate explicit awareness of temporal characteristics.