

8.1 Visual Analysis Framework

8.1.2 Components of a Visual OLAP Tool

Comprehensive analysis includes a variety of tasks, such as examining the data from multiple perspectives, extracting useful information, verifying hypotheses, recognizing trends, revealing patterns, gaining insight, and discovering new knowledge from arbitrarily large and/or complex data volumes. In addition to the conventional operations of analytical processing, i.e., drill-down, roll-up, slice-and-dice, pivoting, and ranking, OLAP frontends support further interactive data manipulation techniques, such as zooming and panning, filtering, brushing, collapsing, and distorting.

OLAP tools account for the diversity of potential analytical tasks by providing a comprehensive framework for the interactive generation of the desired visual presentations. The overall query specification cycle proceeds by i) selecting a data source of interest, ii) choosing a desired visual layout (e.g., a scatterplot or a pivot table), and iii) mapping various data attributes to the structural elements of the chosen layout (e.g., the horizontal and the vertical axis of a plot) as well as to other visual attributes, such as color, shape, and size.

The entire exploration framework can be regarded as composed of an input area for specifying queries and an output area for presenting query results. The input component takes the form of a navigation interface for visual querying of data sources, which presents data cubes as browsable structures. The output area presents the results of user interactions in a selected visual format and enables interactive exploration by providing a taxonomy of available visual layouts and attributes along with a toolkit of interaction techniques for dynamic refinement of the queried data subset and its visual representation. A unified framework is obtained by designing an abstraction layer for each element and providing mapping routines (e.g., metadata to a navigation hierarchy, navigation events to database queries, and query results to a visual layout) that implement the interaction between the layers.

VISUAL QUERY SPECIFICATION

Visual OLAP relieves the end-user of composing queries in the “raw” database syntax (e.g., SQL or MDX). Instead, queries are specified visually. Multidimensional data is represented as a browsable structure whose elements can be queried by “pointing-and-clicking” and “dragging-and-dropping”. The visual interface does not trade advanced functionality for simplicity; rather, it facilitates the process of specifying ad hoc queries of arbitrary complexity.

While analytical queries aggregate over detailed data, visual exploration evolves in the inverse direction, i.e., “descending” from coarsely grained views towards more detailed ones via a stepwise decomposition into subaggregates along selected dimensions. This prevailing drill-down direction is reflected in the structure of a typical OLAP data navigation: each data cube is presented as a hierarchy of its dimensions and each dimension is a recursive top-down nesting of granularity levels, i.e., with the coarsest granularity at the top and the finest at the bottom. Users proceed by specifying the measure(s) (both the data field and the aggregate function), choosing the dimensions to be used as decomposition axes, filtering the selected data subset, and manipulating the visual representation of the result. These query steps are performed irrespective of the query type and the chosen visualization technique. Therefore, we see a great potential for improving the usability in designing a common uniform navigation framework for satisfying any type of analytical query.
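The nesting described above can be sketched as a small mapping routine from cube metadata to a navigation tree. This is only an illustrative sketch: the `Node` class, the `build_navigation` and `render` helpers, and the `Order Date` dimension with its levels are hypothetical and not taken from any particular OLAP tool; only the cube `Superstore Sales` and the measure `Discount` appear in the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry of the browsable navigation tree (hypothetical structure)."""
    label: str
    kind: str  # "cube", "measure", "dimension", or "level"
    children: list[Node] = field(default_factory=list)


def build_navigation(cube_name, measures, dimensions):
    """Map cube metadata to a navigation tree.

    `dimensions` maps a dimension name to its granularity levels,
    ordered from the coarsest to the finest.
    """
    root = Node(cube_name, "cube")
    for measure in measures:
        root.children.append(Node(measure, "measure"))
    for dim_name, levels in dimensions.items():
        dim = Node(dim_name, "dimension")
        parent = dim
        for level in levels:
            # Each finer level is nested under the next coarser one.
            child = Node(level, "level")
            parent.children.append(child)
            parent = child
        root.children.append(dim)
    return root


def render(node, depth=0):
    """Return the tree as indented text lines, coarsest level first."""
    lines = ["  " * depth + f"{node.label} [{node.kind}]"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


nav = build_navigation(
    "Superstore Sales",
    measures=["Discount"],
    # Assumed dimension and levels, for illustration only.
    dimensions={"Order Date": ["Year", "Quarter", "Month"]},
)
print("\n".join(render(nav)))
```

Drilling down then corresponds to expanding a node and moving one nesting step deeper in the tree.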

Navigation events, such as dragging and clicking, are translated into valid queries and executed instantaneously. From the user’s point of view, querying is therefore done implicitly by populating the visualization with data and incrementally refining the data view. The first step is to instantiate an empty visualization template with data by dragging the elements of interest (measures, dimensions) into the respective layout areas. Figure 8.2 shows an example of instantiating a visualization in Tableau: an empty pivot table template prompts the user to drop data fields from the navigation (left) into the column, the row, or the cell area.
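A minimal sketch of how drop events on such a template might be turned into queries is given below. It is not Tableau's actual mechanism: the `PivotTemplate` class, its `drop` method, and the `region` column are hypothetical, and the generated SQL covers only the simplest aggregate form.

```python
from dataclasses import dataclass, field


@dataclass
class PivotTemplate:
    """Hypothetical model of a pivot-table template with three drop areas."""
    cube: str
    rows: list = field(default_factory=list)     # dimension columns
    columns: list = field(default_factory=list)  # dimension columns
    cells: list = field(default_factory=list)    # (aggregate, measure) pairs

    def drop(self, item, area):
        """Handle one drop event and return the query for the refreshed view."""
        if area not in ("rows", "columns", "cells"):
            raise ValueError(f"unknown drop area: {area}")
        getattr(self, area).append(item)
        return self.to_sql()

    def to_sql(self):
        if not self.cells:
            return None  # no measure yet, so there is nothing to aggregate
        dims = self.rows + self.columns
        measures = [f"{agg}({col})" for agg, col in self.cells]
        sql = f"SELECT {', '.join(dims + measures)} FROM {self.cube}"
        if dims:
            sql += f" GROUP BY {', '.join(dims)}"
        return sql


view = PivotTemplate("superstore_sales")
# Dropping a measure into the cell area yields a grand total.
print(view.drop(("SUM", "discount"), "cells"))
# Dropping a dimension into the row area refines the view (assumed column name).
print(view.drop("region", "rows"))
```

Each event produces a complete query, so the view is always re-populated from the current state of the template rather than patched incrementally.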

Figure 8.2: Mapping data fields to a visual layout in Tableau

Any OLAP query follows the same scheme, i.e., consists of the same query clauses, some of which are optional. In ROLAP systems, database queries are expressed in SQL and structured into the following sequence of clauses (optional clauses and elements are placed in square brackets):

174 Chapter 8 : Interactive Exploration of OLAP Aggregates

SELECT [ dimension_list, ] measure_list
FROM table_list
[ WHERE predicate_list ]
[ GROUP BY [ ROLLUP | CUBE ] dimension_list ]
[ HAVING measure_predicate_list ]
[ ORDER BY attribute_list [ sort_direction ] ]

The elements measure_list and table_list are obligatory and must be populated with at least one measure and one fact table, respectively. Thus, the simplest possible query, which instantiates a visualization with a grand total value, is generated by picking a data cube and a measure field in it. For example, picking the field Discount from the cube Superstore Sales in Figure 8.2 would generate the following SQL query:

SELECT SUM(discount) FROM superstore_sales
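The clause scheme and its obligatory elements can be sketched as a small query builder. The function `build_query` and its parameter names are hypothetical. Joining predicates with AND and writing the grouping as `ROLLUP(...)` or `CUBE(...)` are assumptions of this sketch, since the exact syntax varies between database systems.

```python
def build_query(measures, tables, dimensions=(), where=(), grouping=None,
                having=(), order_by=(), sort_direction=None):
    """Assemble an aggregate query following the clause scheme above."""
    # measure_list and table_list are the only obligatory elements.
    if not measures or not tables:
        raise ValueError("at least one measure and one fact table are required")
    if grouping not in (None, "ROLLUP", "CUBE"):
        raise ValueError(f"unsupported grouping: {grouping}")

    lines = [
        "SELECT " + ", ".join([*dimensions, *measures]),
        "FROM " + ", ".join(tables),
    ]
    if where:
        # Assumption: predicates are combined conjunctively.
        lines.append("WHERE " + " AND ".join(where))
    if dimensions:
        dims = ", ".join(dimensions)
        lines.append(f"GROUP BY {grouping}({dims})" if grouping
                     else f"GROUP BY {dims}")
    if having:
        lines.append("HAVING " + " AND ".join(having))
    if order_by:
        clause = "ORDER BY " + ", ".join(order_by)
        if sort_direction:
            clause += " " + sort_direction
        lines.append(clause)
    return "\n".join(lines)


# Grand total: only the obligatory elements are populated.
print(build_query(["SUM(discount)"], ["superstore_sales"]))
```

With only a measure and a fact table supplied, the builder reduces to the grand-total query; every further argument adds exactly one optional clause.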

Further clauses refine the initial query: i) the WHERE and HAVING clauses specify selection conditions on arbitrary attributes and on aggregated measure fields, respectively, ii) GROUP BY contains the dimension categories to aggregate along, and iii) ORDER BY sorts the output. These clauses are populated by invoking the corresponding OLAP operations described in the next section.
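The effect of the individual clauses can be tried out on a toy table. The following sketch uses Python's built-in `sqlite3` module with an in-memory database. The columns `region` and `category` and all inserted values are invented for illustration and do not reflect the actual Superstore Sales data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed toy schema; only the table and the discount field occur in the text.
conn.execute(
    "CREATE TABLE superstore_sales (region TEXT, category TEXT, discount REAL)"
)
conn.executemany(
    "INSERT INTO superstore_sales VALUES (?, ?, ?)",
    [
        ("East", "Furniture", 10.0),
        ("East", "Technology", 5.0),
        ("West", "Furniture", 2.0),
        ("West", "Technology", 1.0),
        ("South", "Furniture", 20.0),
    ],
)

# Initial query: grand total of the measure.
grand_total = conn.execute(
    "SELECT SUM(discount) FROM superstore_sales"
).fetchall()

# Refined query: each optional clause narrows or restructures the result.
refined = conn.execute(
    """
    SELECT region, SUM(discount) AS total_discount
    FROM superstore_sales
    WHERE category = 'Furniture'   -- condition on an attribute
    GROUP BY region                -- dimension category to aggregate along
    HAVING SUM(discount) > 5       -- condition on the aggregated measure
    ORDER BY total_discount DESC   -- sort the output
    """
).fetchall()

print(grand_total)
print(refined)
```

Note that WHERE filters the detailed rows before aggregation, whereas HAVING filters the groups afterwards, which is why only the latter can refer to the aggregated measure.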

VISUALIZATION OF QUERY RESULTS

In the context of OLAP, visualization refers to the mapping of the data returned by a query or a series of queries to a visual layout. The output of any OLAP query is a data cube. Visual presentation is generated by assigning the cube’s elements – measures and dimensions – to visual variables of the display. A visualization technique is defined by its graphical primitives, such as line or circle segments, points, curves, etc., which in combination determine the layout template. Further visual variables, such as color, position, length, and area, are used for encoding various properties of the data set into its visual presentation.

Users analyze the visual presentation by extracting the quantitative information encoded into the graphics in the form of perceptual tasks. Visual analysis tasks are quite different from those encountered in classical data analysis. The former include recognizing shapes, discerning color, judging sizes and distances, tracing motion, etc. Obviously, the various tasks differ in their accuracy and ease of interpretation. Cleveland and McGill [28] propose the notion of elementary perceptual tasks, or elementary graphical encodings, to describe the basic ways of encoding data into a visualization. The authors also provide an empirically verified ranking of those tasks according to the accuracy of quantitative perception. Mackinlay [105] extended the set of considered tasks by addressing the issue of encoding non-quantitative information and provided a ranking of perceptual tasks according to the data type (quantitative, ordinal, nominal).

The commonly recognized perceptual tasks in descending order of accuracy for quantitative data domains according to [28, 105] (except for animation, which was not evaluated in those studies) are the following ones:

1. Position (e.g., a coordinate of a point along an axis)
2. Length (distance) (e.g., length of a bar in a bar-chart)
3. Angle (e.g., angle of a segment in a pie-chart)
4. Slope (e.g., slope of a line in a line-chart)
5. Direction (orientation) (e.g., direction of an edge in a graph)
6. Area (size) (e.g., a rectangular area of a node in a TreeMap)
7. Volume (e.g., volume of a 3-D shape)
8. Curvature (e.g., curve-difference charts)
9. Density (darkness) (e.g., greyscale colormap)
10. Color saturation (brightness) (e.g., fading effect in the animation)
11. Color hue (e.g., assigning distinct colors to the segments of a pie-chart)
12. Texture (shading) (e.g., assigning different filling patterns to shapes)
13. Connection (e.g., connecting related nodes in a graph by an edge)
14. Containment (e.g., nesting child nodes within a parent node in a TreeMap)
15. Shape or symbol (e.g., using marks with different shapes or borders in a scatterplot)
16. Animation (motion) (e.g., animating the evolution of a value along the time axis in a scatterplot)

Figure 8.3: Elementary perceptual tasks in visual data analysis

Figure 8.4: Ranking of perceptual tasks with respect to the data type

Figure 8.3 shows simple pictorial symbols that describe the main idea of each elementary perception task.

A visual layout typically employs a combination of different tasks to encode multiple characteristics of the data items. For example, bar-charts use position and length, whereas stacked bar-charts additionally make use of containment and color.

Since OLAP is concerned with the analysis of quantitative information, the accuracy of the utilized visual elements is a crucial requirement. However, the adequacy of the perceptual tasks for non-quantitative data types is also an important issue since dimensional characteristics may be of different data types – numeric, ordinal, or nominal. Ranking of perceptual tasks according to the encoded data type, proposed by Mackinlay [105], is shown in Figure 8.4. This ranking is related to relational data in general. Data sets retrieved by OLAP queries can be considered a special case of relational data consisting of two types of attributes, namely, numeric measures and descriptive dimensions. Therefore, ranking of quantitative tasks is especially relevant for encoding measures whereas ordinal and nominal ranks should be considered for mapping dimensions.
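How such rankings could guide the mapping of measures and dimensions to visual variables is sketched below. The greedy procedure `assign_encodings`, the `capacity` parameter, and the attribute names are hypothetical. The quantitative list is a prefix of the ranking enumerated earlier in this section, while the ordinal and nominal lists are abridged from Mackinlay's ranking as recalled here and should be checked against Figure 8.4.

```python
# Abridged rankings, most accurate task first (assumption: see lead-in).
RANKING = {
    "quantitative": ["position", "length", "angle", "slope", "direction", "area"],
    "ordinal": ["position", "density", "color saturation", "color hue", "texture"],
    "nominal": ["position", "color hue", "texture", "connection", "containment"],
}


def assign_encodings(attributes, ranking=RANKING, capacity=None):
    """Greedily give each attribute the best-ranked task that is still free.

    `attributes` is a list of (name, data_type) pairs in priority order.
    `capacity` states how often a task may be used; by default position is
    available twice (two axes) and every other task once.
    """
    capacity = dict(capacity or {"position": 2})
    used = {}
    result = {}
    for name, dtype in attributes:
        for task in ranking[dtype]:
            if used.get(task, 0) < capacity.get(task, 1):
                used[task] = used.get(task, 0) + 1
                result[name] = task
                break
        else:
            raise ValueError(f"no free encoding left for {name!r}")
    return result


# Measures are listed first so that they receive the most accurate tasks.
attributes = [
    ("discount", "quantitative"),    # measure
    ("region", "nominal"),           # dimension (assumed name)
    ("category", "nominal"),         # dimension (assumed name)
    ("order_priority", "ordinal"),   # dimension (assumed name)
]
print(assign_encodings(attributes))
```

In this example the measure and the first dimension occupy the two position axes, the second nominal dimension falls back to color hue, and the ordinal dimension to density, which resembles a grouped bar-chart with shading.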


Prevailing visual layouts and default visualization metaphors in the area of OLAP generally adhere to the proposed rankings: popular presentations are business charts and scatterplots, which map measures to length, angle, volume, area, or color, encoding their dimensional characteristics into position, density, or color hue.

Intuitively, the more perceptual tasks a particular visualization combines, the more characteristics of a data set can be presented. Simple bar-charts and pie-charts are capable of showing just a single measure grouped along a single dimension category, scatterplots support two dimensions, and pivot tables can display multiple measures and nest multiple dimensions in their rows and/or columns. Comprehensive analysis tasks may require more expressive visualization techniques. Popular approaches to increasing the dimensionality are to extend a 2-dimensional layout to 3-D (e.g., 3D charts and maps in Miner3D [124]), to use hierarchical layouts (e.g., Chart Trees in Report Portal [188]), to split the view into multiple perspectives (e.g., perspectives in Advizor [39]), or to arrange it into a grid of small multiples (e.g., visual tables in Tableau [169]).

Another emerging trend is to adopt specialized multidimensional visualization techniques, such as Parallel Coordinates [69], which scale to a higher number of dimensions.