Data Warehousing
& Mining Techniques
Wolf-Tilo Balke Silviu Homoceanu
Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de
2. Architecture
2.1 Basics
2.2 Storage structures
2.3 Tier architectures
2.4 Distributed DW
2.5 Middleware
Data Warehousing & OLAP – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 2
2. Architecture
• Architecture of a DW
–Data is stored in a predefined database
–Maintenance of the database is performed by a DBMS, as in OLTP
–The usual functionality of the database is ensured
•Storage, Update, Delete, Locate
2.1 Basics
[Figure: source systems feeding organizationally, departmentally, and individually structured data]
• Databases & DBMS
2.1 Basics: Databases & DBMS
[Figure: an application sends the query "SELECT id FROM revenues WHERE val > 50 000" to the DBMS, which reads the table (ID, VAL) = (1, 37 000), (2, 67 000), (3, 45 000), … stored on disks 1 to N]
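As a minimal sketch of the division of labor between application and DBMS, the query from the figure can be run against an in-memory SQLite database. The use of SQLite and Python's sqlite3 module is an assumption made for illustration; the slides do not name a DBMS.

```python
import sqlite3

# Hypothetical in-memory copy of the revenues table shown in the figure.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenues (id INTEGER PRIMARY KEY, val INTEGER)")
conn.executemany("INSERT INTO revenues VALUES (?, ?)",
                 [(1, 37000), (2, 67000), (3, 45000)])

# The application only states what it wants; the DBMS decides how to read storage.
ids = [row[0] for row in conn.execute("SELECT id FROM revenues WHERE val > 50000")]
print(ids)
```

Only the record with ID 2 exceeds the threshold, so it is the only identifier returned.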
• A DW is characterized by
–A large volume of information
–Mostly read access; updates and deletes rarely occur outside the ETL phase
• These characteristics make indexes a must-have in a DW, so let’s recap indexes
2.1 Basics: Indexing
• Indexes are additional data structures which help locating records in a DB
–Creation of indexes is part of the physical tuning task of the DB administrators
–Indexes can influence the actual location of storage for a record
•Sequential storage, or via a hash function
–If the location is determined by the index, not all attributes can be directly indexed (primary vs. secondary indexes)
2.1 Basics: Indexing
• Indexes are useful for speeding up access to the data
• They are ordered by an indexing field (search key)
–The search key is the attribute used to look up records in a file
• An index file consists of index entries
–Records of the form (search key, location)
2.1 Basics: Indexing
• Primary indexes
–Order the data by some unique attribute used as indexing field (primary key) and store the database records in this order
–An index record contains a pointer to the respective storage place
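A primary index lookup can be sketched as follows. The block size of two records, the sample keys, and the choice of one index entry per block (a sparse index) are assumptions made for illustration, not details given on the slide.

```python
from bisect import bisect_right

# Records stored in primary-key order, grouped into blocks (assumed: 2 records per block).
blocks = [[(1, "a"), (3, "b")], [(5, "c"), (8, "d")], [(9, "e"), (12, "f")]]

# One index entry per block: the first key of the block; the list position is the pointer.
index_keys = [block[0][0] for block in blocks]

def lookup(key):
    """Return the record with the given primary key, or None if it does not exist."""
    pos = bisect_right(index_keys, key) - 1   # last block whose first key <= key
    if pos < 0:
        return None
    for k, payload in blocks[pos]:
        if k == key:
            return (k, payload)
    return None

print(lookup(8))
```

Because the file is physically ordered by the key, a binary search over the index narrows the search to a single block.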
2.1 Primary Index
• Secondary indexes point to the locations of records with respect to a non-ordering attribute
–Indexing does not affect the storage order
–There can be multiple secondary indexes for the same DB file
–Secondary indexes are usually dense
2.1 Secondary Index
• Characteristics of secondary indexes
–They speed up retrieval: if no secondary index exists on the searched attribute, the entire file has to be searched linearly
–They use more time and space, because they are dense
–They provide a logical ordering
•Accessing records in this order might not be the most efficient way regarding block accesses
• In DW, due to the large amount of data, multi-level ordered indexes are used
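A dense secondary index can be sketched as a mapping from attribute value to record positions. The sample records and the attribute name city are hypothetical.

```python
from collections import defaultdict

# Records stored in primary-key (id) order; "city" is a non-ordering attribute.
records = [(1, "Berlin"), (2, "Hamburg"), (3, "Berlin"), (4, "Munich")]

# Dense secondary index: every record appears in it, keyed by its city.
city_index = defaultdict(list)
for position, (_id, city) in enumerate(records):
    city_index[city].append(position)

def find_by_city(city):
    """Follow the index pointers instead of scanning the whole file."""
    return [records[pos] for pos in city_index.get(city, [])]

print(find_by_city("Berlin"))
```

The storage order of the records is untouched; only the index imposes a logical order on the city values.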
2.1 Secondary Index
• Here’s a great idea: Why not index every attribute?
–Have a physical index on the primary key, and logical indexes on every other attribute
• This results in good read efficiency, but really terrible write/update efficiency
–But Data Warehouses only need good read efficiency?
2.1 Indexes
• Whenever a DB is modified, most of the indexes have to be updated
–This results in a large amount of overhead on operations like insert, delete or update
•If the indexes are multi-level, every level has to be updated
• Why should we care? We have a DW, not an OLTP system
–The majority of the operations in a DW are reads
–But… remember ETL?
–We should use considerably more indexes than in OLTP, but loading data into the DW should not take forever
2.1 Indexes
• In DW the underlying technology has to support
–Creation and loading of new indexes
–Efficient access to the indexes
• Efficient access can be accomplished in different ways
–Using bitmaps
–Having multi-level indexes
–Storing all or parts of an index in main memory
–Compacting the index entries when the order of the data being indexed allows such compaction
–Creating selective indexes and range indexes
2.1 Indexes in DW
• Recommended index structures are:
–B-tree indexes, on high cardinality attribute columns (due to the bushy nature of B-Trees)
–Bitmap indexes on all medium and low cardinality attributes
2.1 Indexes in DW
• Recap from Rel. DB 2: basic structure of a B-Tree node
–A node contains key values and the respective data (block) pointers to the actual data records
–Additionally, there are node pointers for the intervals to the left and to the right of each key value
2.1 B-Trees
[Figure: a B-Tree node consisting of key values, data pointers, and node pointers]
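The node layout described above can be sketched as a small search routine. The class name Node, the sample keys, and the block labels are made up for illustration; insertion and rebalancing are deliberately left out.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, data, children=None):
        self.keys = keys                 # sorted key values
        self.data = data                 # one data (block) pointer per key
        self.children = children or []   # len(keys) + 1 node pointers, empty for a leaf

def search(node, key):
    """Return the data pointer stored for key, or None if the key is absent."""
    while node is not None:
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node.data[i]
        # Descend into the interval left of keys[i] (or right of the last key).
        node = node.children[i] if node.children else None
    return None

root = Node([10, 20], ["blk7", "blk2"], [
    Node([3, 7], ["blk1", "blk9"]),
    Node([15], ["blk4"]),
    Node([25, 30], ["blk5", "blk3"]),
])
print(search(root, 15))
```

Each step either finds the key in the current node or follows exactly one node pointer, so the number of visited nodes is bounded by the height of the tree.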
• Bitmap indexes:
–Work well with small number of distinct values
•E.g., gender data
–Have a significant space and performance advantage over other structures for this type of data
–Useful in DW for joining a large fact table to smaller dimension tables
2.1 Bitmap indexes
Identifier  Gender       Bitmap F  Bitmap M
1           Female       1         0
2           Female       1         0
3           Male         0         1
4           Unspecified  0         0
5           Male         0         1
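The two bitmaps from the table can be built and combined with bitwise operations. This is a minimal sketch that uses a Python integer as the bit vector; real systems typically store compressed bitmaps.

```python
genders = ["Female", "Female", "Male", "Unspecified", "Male"]  # rows 1..5 of the table

def build_bitmap(values, wanted):
    """Bit i is set if row i holds the wanted value."""
    bits = 0
    for i, value in enumerate(values):
        if value == wanted:
            bits |= 1 << i
    return bits

bitmap_f = build_bitmap(genders, "Female")
bitmap_m = build_bitmap(genders, "Male")

def rows(bits):
    """Translate a bitmap back into 1-based row identifiers."""
    return [i + 1 for i in range(len(genders)) if (bits >> i) & 1]

# Rows that are neither F nor M: complement of the OR, masked to the table length.
mask = (1 << len(genders)) - 1
unspecified = rows(~(bitmap_f | bitmap_m) & mask)
print(rows(bitmap_f), rows(bitmap_m), unspecified)
```

Predicates on low-cardinality attributes thus reduce to AND, OR, and NOT over bit vectors, which is what makes this structure compact and fast for such columns.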
• Architecture of a DW
2.1 Basic architecture
[Figure: basic DW architecture. Data sources (operational systems, flat files) feed the staging area; the warehouse holds metadata, summary data, and raw data; data marts (purchasing, sales, inventory) serve users for analysis, reporting, and mining]
• The Data Staging Area
–Is both a storage and a process area (the ETL process)
–It represents everything that happens between the operational source system and the data presentation area
–The key architectural requirement for the data staging area is that it is off-limits to business users and does not provide query and presentation services
2.1 Basic architecture
[Figure: data sources, staging area, warehouse, data marts]
• Customers aren’t invited to visit the kitchen…
–Similar to a restaurant’s kitchen, the data staging area should be accessible only to skilled professionals
2.1 Basic architecture
• The Data Presentation Area
–Is where data is organized, stored and made available for queries, report writers, and other analytical processing
–This area is the Warehouse as far as the business community is concerned
2.1 Basic architecture
[Figure: data sources, staging area, warehouse, data marts]
• Storage structure
–After extraction from the operational data, in DW information is stored in databases
–The databases are operated by a DBMS
–Different database structures can be used for a DW:
•Relational model (RDB) operated by an RDBMS
•Multidimensional model (MDB) operated by an MDBMS
2.2 Storage structures
• RDB and MDB are complementary and do not have to exclude each other
–An RDBMS can be used in the staging area; however, it must be off-limits to user queries for performance reasons
–By default, normalized databases are excluded from the presentation area, which should be strictly multidimensional (MDBMS)
2.2 Storage structures
• DB in relational model
–A database is seen as a collection of predicates over a finite set of variables
–The content of the DB is modeled as a set of relations in which all predicates are satisfied
2.2 Relational DB
Books(Title, ISBN (PK), Price, Publisher (FK), Category (FK))
BookCategory(Cat_ID (PK), Description)
Publisher(Name, ID (PK))
• A relation is defined as a set of tuples that have the same attributes
–It is usually described as a table
2.2 Relational DB
[Figure: a table annotated with the terms relation, attribute, and tuple]
• A Multidimensional DB (MDB) is optimized for DW and OLAP applications
–MDBs are created using input from the staging area
–Their purpose is to answer questions like “How many Nokia 5800 have we sold so far this year in Braunschweig?”
–MDBs are RDBS optimized for OLAP queries
2.2 Multidimensional DB
• MDB are…
–Designed for efficient and convenient storage and retrieval of large volumes of data
–Stored, viewed and analyzed from different perspectives called dimensions
2.2 Multidimensional DB
• MDB example
–An automobile manufacturer wants to increase sales volumes
•The evaluation requires viewing historical sales volume figures from multiple dimensions
•Sales volume by model, by color, by dealer, over time
2.2 Multidimensional DB
• A relational structure of the given evaluation would be
2.2 Multidimensional DB
Model         Color  Sales volume
Mini VAN      Blue   324
Mini VAN      Black  113
Mini VAN      Red    18
Sedan         Black  160
Sedan         Blue   115
Sedan         Red    6
Sports coupe  Red    16
Sports coupe  Black  16
Sports coupe  Blue   12
2.2 Multidimensional structure
            Black  Blue  Red  Total (*)
Mini VAN      113   324   18        455
Sedan         160   115    6        281
Coupe          16    12   16         44
Total (*)     289   451   40        780
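The step from the relational rows to the two-dimensional structure with its totals can be sketched as follows. The data is the sales table from the example; the variable names are chosen for illustration.

```python
from collections import defaultdict

# (model, color, sales volume) rows of the relational structure.
rows = [
    ("Mini VAN", "Blue", 324), ("Mini VAN", "Black", 113), ("Mini VAN", "Red", 18),
    ("Sedan", "Black", 160), ("Sedan", "Blue", 115), ("Sedan", "Red", 6),
    ("Sports coupe", "Red", 16), ("Sports coupe", "Black", 16), ("Sports coupe", "Blue", 12),
]

# Two-dimensional cube: each cell is addressed directly by (model, color).
cube = {(model, color): volume for model, color, volume in rows}

# Totals along each dimension (the "*" positions) and the grand total.
by_model, by_color = defaultdict(int), defaultdict(int)
for (model, color), volume in cube.items():
    by_model[model] += volume
    by_color[color] += volume
grand_total = sum(cube.values())

print(dict(by_model), dict(by_color), grand_total)
```

A cell such as cube[("Sedan", "Blue")] is reached by its coordinates rather than by scanning rows, which is the property the following slides build on.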
• The complexity grows quickly with the number of dimensions and the number of positions
–Example: 3 dimensions with 10 values each and no indexes
–Viewing this information in an RDB results in a worst case of 10^3 = 1000 records
2.2 Multidimensional DB
• Now, if we consider performance
–For responding to a query when car type = Sedan, color = Blue, and dealer = Berg
•The RDBMS has to search through up to 1000 records to find the right record
•The MDB has more knowledge about where the data lies
•In the MDB case, at most 30 positions have to be searched (10 along each of the 3 dimensions)
•Average case: 18 vs. 501 positions
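The worst-case figures follow from simple arithmetic. This is a sketch assuming a linear scan in both cases: over all records in the relational table, and over the positions of each dimension separately in the MDB.

```python
dimensions, positions = 3, 10

# Relational table without indexes: one record per combination, scanned linearly.
rdb_worst = positions ** dimensions

# Multidimensional structure: find the position along each dimension separately.
mdb_worst = positions * dimensions

print(rdb_worst, mdb_worst)
```

The relational cost grows exponentially with the number of dimensions, while the multidimensional cost grows only linearly.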
2.2 Multidimensional DB
• If the query is more relaxed
–Total sales across all dealers for all colors when car type = sedan
•RDBMS still has to go through the 1000 records
•The MDB, however, only goes through a slice of 10 x 10 cells
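The slice access can be sketched with a nested list standing in for the cube. The sales figures and the position of "sedan" on the model axis are made up for illustration.

```python
N = 10
# Hypothetical cube[model][color][dealer] filled with made-up sales figures.
cube = [[[(m + c + d) % 5 for d in range(N)] for c in range(N)] for m in range(N)]

sedan = 3                       # assumed position of "sedan" on the model axis
sedan_slice = cube[sedan]       # 10 x 10 slice: all colors, all dealers
cells_read = sum(len(row) for row in sedan_slice)
total_sales = sum(sum(row) for row in sedan_slice)

# Relational equivalent: scan all (model, color, dealer, sales) rows and filter.
table = [(m, c, d, cube[m][c][d]) for m in range(N) for c in range(N) for d in range(N)]
total_scan = sum(s for m, c, d, s in table if m == sedan)

print(cells_read, len(table), total_sales == total_scan)
```

Both approaches produce the same total, but the slice touches 100 cells while the scan touches all 1000 rows.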
2.2 Multidimensional DB
• Performance advantages
–For this kind of query, MDBs can be an order of magnitude faster than RDBMSs
–The performance benefits are greatest for queries that generate cross-tab views of the data (the typical case in DW)
• Conclusion
–The performance advantages offered by MDBs facilitate the development of interactive decision support applications like OLAP, which can be impractical in a relational environment
2.2 Multidimensional DB
• Any database manipulation is possible with both technologies
• MDBs however offer some advantages in the context of DW:
–Ease of data presentation
–Ease of maintenance
–Performance
2.2 RDB vs. MDB
• Ease of data presentation
–Data views are natural output of the MDBs
–Obtaining the same views in RDB requires a complex query
•Example with Walmart and Sybase:
–select sum(sales.quantity_sold)
from sales, products, product_categories, manufacturers, stores, cities
where manufacturer_name = 'Colgate'
and product_category_name = 'toothpaste'
and cities.population < 40000
and trunc(sales.date_time_of_sale) = trunc(sysdate-1)
and sales.product_id = products.product_id
and sales.store_id = stores.store_id
and products.product_category_id = product_categories.product_category_id
and products.manufacturer_id = manufacturers.manufacturer_id
and stores.city_id = cities.city_id
2.2 RDB vs. MDB
• Ease of data presentation
–Top k queries cannot be expressed well in SQL
•Find the five cheapest hotels in Frankfurt
–SELECT * FROM hotels h WHERE h.city = 'Frankfurt' AND 5 >
(SELECT count(*) FROM hotels h1 WHERE h1.city = 'Frankfurt' AND h1.price < h.price);
•Some RDBMSs extended SQL with STOP AFTER functionality
–SELECT * FROM hotels WHERE city = 'Frankfurt' ORDER BY price STOP AFTER 5;
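Both formulations can be compared on a small example. This sketch uses SQLite through Python's sqlite3 module, which is an assumption made for illustration; SQLite has no STOP AFTER clause, so its LIMIT clause is used in its place. The hotel names and prices are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotels (name TEXT, city TEXT, price INTEGER)")
conn.executemany("INSERT INTO hotels VALUES (?, ?, ?)", [
    ("A", "Frankfurt", 90), ("B", "Frankfurt", 70), ("C", "Frankfurt", 120),
    ("D", "Frankfurt", 60), ("E", "Frankfurt", 150), ("F", "Frankfurt", 80),
    ("G", "Frankfurt", 200), ("H", "Berlin", 40),
])

# Correlated subquery: keep a hotel if fewer than 5 hotels in the city are cheaper.
nested = conn.execute("""
    SELECT name FROM hotels h
    WHERE h.city = 'Frankfurt'
      AND 5 > (SELECT count(*) FROM hotels h1
               WHERE h1.city = 'Frankfurt' AND h1.price < h.price)
    ORDER BY price
""").fetchall()

# LIMIT plays the role of STOP AFTER here.
limited = conn.execute(
    "SELECT name FROM hotels WHERE city = 'Frankfurt' ORDER BY price LIMIT 5"
).fetchall()

print([name for (name,) in limited])
```

With distinct prices both queries return the same five hotels; with ties the subquery version may return more than five rows, which is one reason it is an awkward way to express top-k.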
2.2 RDB vs. MDB
• Ease of maintenance
–No additional overhead to translate user queries into requests for data
•Data is stored as it is viewed
–RDBs use indexes and sophisticated joins, which require significant maintenance and storage to provide the same intuitiveness
2.2 RDB vs. MDB
• Performance
–Performance of MDBs can be matched by RDBs through database tuning
–Not possible to tune the database for all possible ad-hoc queries
–Aggregate navigators are helping RDBs to catch up with MDBs as far as aggregation queries are concerned
2.2 RDB vs. MDB
• When are MDBs inappropriate?
–If the dataset types are not highly related, using an MDB results in a sparse representation
2.2 MDB
[Figure: the dense model-by-color sales cube (Mini VAN, Sedan, Coupe by Blue, Red, Black) next to a sparse cube over weakly related data (Smith, Fox, James) in which only a few cells are filled]
• When are MDBs appropriate?
–In the case of highly interrelated dataset types, MDBs are recommended for the greatest ease of access and analysis
–Examples of applications
•Financial Analysis and Reporting
•Budgeting
•Promotion Tracking
•Quality Assurance and Quality Control
•Product Profitability
2.2 MDB
• Popular DW architectures
–Generic Two-Tier Architecture
–Independent Data Mart
–Dependent Data Mart and Operational Data Store
–Logical Data Mart and Active Warehouse
–Three-Tier Architecture
• Other
–One-Tier Architecture
–N-Tier Architecture
–Web-based Architecture
2.3 Tier architectures
• Generic Two-Tier Architecture
–Data is not completely current in the DW
–Periodic extraction
2.3 Layered architectures
• Data analysis comes in two flavors
–Depending on where the analysis is executed
•Thin Client
–Analytics are executed on the server
–The client just displays the results
–This architecture fits well for Internet/Intranet DW access
2.3 Layered architectures
[Figure: thin client. The server holds the data storage and the analysis; the client connects via HTTP or IIOP]
•Fat Client
–The server just delivers the data
–Analytics are executed on the client
–Communication between client and server must be able to sustain large data transfers
2.3 Layered architectures
[Figure: fat client. The server holds the data storage; the analysis runs on the client, which connects via ODBC, JDBC, or NFS]
• Independent Data Mart
–Mini warehouses, limited in scope
–Separate ETL for each independent Data Mart
–High complexity of access across Data Marts
2.3 Layered architectures
• Dependent Data Mart and Operational Data Store
–Single ETL for the DW
–Data Marts are loaded from the DW
–Simpler data access than in the previous case
2.3 Layered architectures
• Logical Data Mart and Active Warehouse
–The ETL is near real-time
–Data Marts are not separate databases, but logical views of the DW
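A logical data mart can be sketched as a database view. The use of SQLite, the table dw_sales, and the view mart_europe are hypothetical choices made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dw_sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany("INSERT INTO dw_sales VALUES (?, ?, ?)",
                 [("Europe", "A", 10), ("Asia", "A", 7), ("Europe", "B", 5)])

# The data mart is only a view: no separate database and no separate load step.
conn.execute("""CREATE VIEW mart_europe AS
                SELECT product, amount FROM dw_sales WHERE region = 'Europe'""")

before = conn.execute("SELECT sum(amount) FROM mart_europe").fetchone()[0]

# New data arriving in the DW is immediately visible through the mart.
conn.execute("INSERT INTO dw_sales VALUES ('Europe', 'C', 20)")
after = conn.execute("SELECT sum(amount) FROM mart_europe").fetchone()[0]

print(before, after)
```

Because the view is evaluated against the DW at query time, the mart is always as current as the warehouse itself.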
2.3 Layered architectures
2.3 DW vs. Data Marts
Aspect                 DW                                  Data Marts
Scope                  Application independent             Specific DSS application
                       Centralized                         Decentralized by user area
                       Planned                             Organic, possibly not planned
Data                   Historical, detailed, summarized    Some history, detailed, summarized
                       Lightly denormalized                Highly denormalized
Subjects               Multiple subjects                   One central subject
Sources                Many internal and external sources  Few internal and external sources
Other characteristics  Flexible                            Restrictive
                       Data-oriented                       Project oriented
                       Long life                           Short life
                       Large                               Start small, becomes large
                       Single complex structure            Multiple, semi-complex structure, together complex
• Generic Three-Tier Architecture
–Derived data
•Data that has been selected, formatted, and aggregated for DSS support
–Reconciled data
•Detailed, current data intended to be the single, authoritative source for all decision support
2.3 Layered architectures
[Figure: three tiers, each with its own metadata. Derived data (data mart, data mart metadata); reconciled data (DW and ODS, DW metadata); operational data (operational metadata)]
• One-Tier Architecture
–Theoretically possible
–Might be interesting for mobile applications
• N-Tier Architecture
–Higher tier architecture is also possible
•But the complexity grows with the number of tier-interfaces
• Web-based Architecture
–Advantages:
•Usage of existing software, reduction of costs, platform independence
–Disadvantages:
•Security issues: data encryption/user access and identification
2.3 Layered architectures
• In most cases the economics and technology greatly favor a single centralized DW
• But in some cases, distributed DWs make sense
• Types of distributed DW
–Geographically distributed
•Local DW/global DW
–Technologically distributed DW
•Logically one DW, physically several DWs
–Independently evolving distributed DW
•Uncontrolled growth
2.4 Distributed DW
• Geographically distributed
–In the case of corporations spread around the world
•Information is needed both locally and globally
–A distributed DW makes sense
•When much processing occurs at the local level
•Even though local branches report to the same balance sheet, the local organizations are their own companies
2.4 Distributed DW
[Figure: local DWs at the USA headquarters, Europe (site A), and Asia (site B), each fed by local operational processing and connected to a global DW; platform labels: All IBM, IBM/Teradata, Sybase]
• Technologically distributed DW
–Placing the DW on the distributed technology of a vendor
–Advantages
•The entry cost is cheap – large centralized hardware is expensive
•No theoretical limit to how much data can be placed in the DW – we can add new servers to the network
2.4 Distributed DW
–As the DW starts to expand, network data communication starts playing an important role
•Example: for simplicity, assume 4 nodes, each holding the data of one of the last 4 years
•Now consider a query that needs to access the data of all 4 years: such a query raises the issue of transporting large amounts of data between processors
2.4 Distributed DW
[Figure: four nodes holding the data of 2005, 2006, 2007, and 2008]
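The communication issue can be made concrete with a toy model of the four nodes. The rows are made up, and local pre-aggregation is shown only as one possible way to reduce traffic; it is an assumption for illustration and works for aggregates such as sums, not for arbitrary queries.

```python
# Hypothetical nodes, one per year, each holding (product, amount) rows.
nodes = {
    2005: [("A", 10), ("B", 5)],
    2006: [("A", 7)],
    2007: [("B", 3), ("A", 1)],
    2008: [("A", 4)],
}

def ship_rows(nodes):
    """Naive plan: every node sends all its rows to a coordinator, which sums them."""
    shipped = [row for rows in nodes.values() for row in rows]
    return sum(amount for _, amount in shipped), len(shipped)

def ship_partials(nodes):
    """Alternative plan: every node sums locally and sends a single number."""
    partials = [sum(amount for _, amount in rows) for rows in nodes.values()]
    return sum(partials), len(partials)

print(ship_rows(nodes), ship_partials(nodes))
```

Both plans return the same total, but the second one transfers one value per node instead of every row, and the gap widens as the nodes grow.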
• Independently evolving distributed DW
–In practice there are many cases in which independent DWs are developed concurrently and in an uncontrolled way in the same organization
•The first step many corporations take is to build a DW for finance or marketing
•Once it is successfully set up, other parts of the organization follow the process independently, resulting in the coexistence of several independent DWs in the same organization
•This problem will be addressed later
2.4 Distributed DW
• Middleware-Systems
– Provide an inter-connectivity layer between heterogeneous platforms and the applications that come on top
2.5 Middleware
[Figure: applications access the middleware system through APIs; the middleware system sits on platform interfaces on top of each platform (hardware, operating system)]
• Middleware in DW?
–DW usually implies
•Heterogeneous hardware, databases, operating systems, networks and applications
–Middleware serves both users and developers
•It shields both users and developers from differences in services and resources used by applications
–Without middleware…
•Changes at the lower layers could imply propagating changes by updating the higher layers
2.5 Middleware
• Roles of Middleware
–Assist the developer in ETL and populating the DW
–Assist DW users in accessing the DW
–It is therefore needed at different points in the life cycle
• Types:
–Copy management: data extraction, transformation,…
–Gateways: DB and independent gateways
–Program to program: RPC, ORBs
–Message oriented
2.5 Middleware
• Most common middleware technologies
–CORBA (Common Object Request Broker Architecture)
–DCOM (Distributed Component Object Model)
–J2EE in DW
2.5 Middleware
• CORBA
–Mechanism for normalizing method-call semantics between application objects on the same host or on a remote host
2.5 Middleware
[Figure: CORBA request path. Client side: main(), object reference, generated stub code, ORB. Server side: ORB, generated skeleton code, object implementation, main(). Both ORBs communicate over the network. Legend: ORB vendor code, ORB vendor-tool generated code, user-defined application code]
• ORB
–Is a middleware technology that manages communication and data exchange between objects in object-oriented programming and databases
2.5 Middleware
[Figure: the ORB establishes the connection between the client application and the object implementation (service); client and service then communicate]
• Client
–The application program that invokes a method or operation on an object implementation
• Stub
–Precompiled interface between the client and the ORB, generated by the ORB tool
• ORB
–An interface containing help functions and APIs that can be used by a client or an object implementation
• BOA (Basic Object Adapter)
–Refers to the part of the ORB responsible for managing server-side operations
–Replaced by the POA (Portable Object Adapter)
• Skeleton
–The server-side analogue of stubs
• Implementation
–Called a service or method in object-oriented terminology; it implements the operations of an interface declared in the interface definition language (IDL)
2.5 Middleware
1. The client makes the call through the stub to the ORB
2. The ORB dispatches the call to the BOA, which does the object activation
3. The implementation registers itself, if necessary, and declares itself ready
4. The BOA, now signaled ready, invokes the implementation via the skeleton generated from the IDL
5. A response or exception propagates up to the client caller
2.5 How CORBA works
[Figure: call sequence through Client, Stub, ORB, BOA, Skeleton, and Implementation, with steps 1 to 5 marked]
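The five steps can be mimicked in a single process with plain Python classes. This is a toy model of the roles only: it does not use any CORBA library, there is no network or IDL compiler involved, and all class names, the method total_sales, and its data are hypothetical.

```python
class Implementation:
    """User-defined object implementation (the service)."""
    def total_sales(self, year):
        return {2008: 42}.get(year, 0)    # made-up data

class Skeleton:
    """Server-side counterpart of the stub: unpacks a request and calls the implementation."""
    def __init__(self, impl):
        self.impl = impl
    def dispatch(self, method, args):
        return getattr(self.impl, method)(*args)

class ObjectAdapter:
    """Plays the BOA role: activates the object on first use, then invokes it."""
    def __init__(self, factory):
        self.factory = factory
        self.skeleton = None
    def invoke(self, method, args):
        if self.skeleton is None:                     # steps 2 and 3: activation, ready
            self.skeleton = Skeleton(self.factory())
        return self.skeleton.dispatch(method, args)   # step 4: call via the skeleton

class Orb:
    """Routes requests from stubs to the adapter registered under a name."""
    def __init__(self):
        self.adapters = {}
    def register(self, name, adapter):
        self.adapters[name] = adapter
    def request(self, name, method, args):
        return self.adapters[name].invoke(method, args)

class Stub:
    """Client-side proxy: turns a local call into a request to the ORB (step 1)."""
    def __init__(self, orb, name):
        self.orb, self.name = orb, name
    def total_sales(self, year):
        # Step 5: the result (or an exception) travels back to the caller.
        return self.orb.request(self.name, "total_sales", (year,))

orb = Orb()
orb.register("sales", ObjectAdapter(Implementation))
stub = Stub(orb, "sales")
print(stub.total_sales(2008))
```

The client code only ever talks to the stub, which is why the location and activation state of the implementation stay hidden from it.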
• CORBA in DW
–Query Service
•Supports SQL and OQL
–Object Transaction Service
•Ensures the correct state of transactional objects
–Distributed commit/rollback
–Guarantees ACID properties
–It is able to send copies of multidimensional data
2.5 Middleware
• DCOM
–Microsoft's competitor to CORBA
–Can access distributed stored data through ADO (ActiveX Data Objects)
•For the actual database access, ADO uses OLE DB (Object Linking and Embedding, Database) and ODBC (Open Database Connectivity)
–There is also a multidimensional ADO – ADO MD
•It contains objects for communication of data cubes
2.5 Middleware
• J2EE in DW
–Not suited for storing and analyzing a multidimensional DB
–JOLAP offers a programming interface for analytical access to the DW
•A Java community initiative, backed by Sun and Oracle
•Lack of effective support
–OLAP4J
•Simply put, a multidimensional JDBC
2.5 Middleware
• So why is middleware important?
–Heterogeneous
•Hardware, data sources, data targets, platforms, operating systems, communication protocols
–Connectivity
–Platform and application independence
–Support of standard protocols and interfaces
2.5 Middleware
• Modeling
–Basics of data modeling
–Data models in DW