
Distributed Data Management Sheet 12 (until 29.07.2010 – Two weeks)




Exercises for Distributed Data Management

Institut für Informationssysteme – TU Braunschweig - http://www.ifis.cs.tu-bs.de

Wolf-Tilo Balke, Christoph Lofi


This exercise is optional and provides bonus points only. You may hand it in via email to lofi@ifis.cs.tu-bs.de.

(35 points bonus)

Using the HBase and Hadoop appliance from the last exercise, develop a Java program that does the following:

- Import the following CSV file into a corresponding HBase table: http://www.ifis.cs.tu-bs.de/webfm_send/517

o The file is an excerpt from IMDB; the columns are as follows: Movie Name, Movie Year, Average IMDB Rating (0-10), Number of Rating Votes, Name of Director

- Create a Hadoop MapReduce job that generates a statistic with one row per year, containing the year and the average rating of all movies released in that year that have more than 200 rating votes, e.g. ("2011","7.0"); ("2012","6.1"); …

- Copy and paste the statistic into your solution, and attach the Java program.
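The filtering and averaging logic behind the statistic can be tried out locally before wiring it into Hadoop and HBase. The following is a minimal sketch in plain Java, not the Hadoop job itself: the class name `YearlyRatingSketch`, its method names, and the sample rows are all hypothetical, and it assumes each CSV line has exactly five comma-separated fields with no quoting or embedded commas, which may not hold for the real IMDB excerpt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Local, Hadoop-free sketch of the map and reduce logic (all names are hypothetical).
class YearlyRatingSketch {

    // The sheet asks for movies with "more than 200 rating votes".
    static final int MIN_VOTES = 200;

    // "Map" step: turn one CSV line into a (year, rating) pair, or nothing if it is filtered out.
    // Assumption: exactly five comma-separated fields, no quoting or embedded commas.
    static Optional<Map.Entry<String, Double>> map(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length != 5) {
            return Optional.empty(); // skip lines that do not match the assumed layout
        }
        try {
            String year = fields[1].trim();
            double rating = Double.parseDouble(fields[2].trim());
            int votes = Integer.parseInt(fields[3].trim());
            if (votes <= MIN_VOTES) {
                return Optional.empty(); // keep only movies with more than 200 votes
            }
            return Optional.of(Map.entry(year, rating));
        } catch (NumberFormatException e) {
            return Optional.empty(); // skip lines with non-numeric rating or vote count
        }
    }

    // "Reduce" step: average all ratings collected for one year.
    static double reduce(List<Double> ratings) {
        double sum = 0.0;
        for (double r : ratings) {
            sum += r;
        }
        return sum / ratings.size();
    }

    // Stand-in for the shuffle phase: group mapped pairs by year, then reduce each group.
    static Map<String, String> yearlyAverages(List<String> lines) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (String line : lines) {
            map(line).ifPresent(e ->
                grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue()));
        }
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<Double>> e : grouped.entrySet()) {
            result.put(e.getKey(), String.format(Locale.ROOT, "%.1f", reduce(e.getValue())));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows for illustration only; they are not taken from the IMDB file.
        List<String> sample = List.of(
            "Movie A,2011,7.5,1000,Director X",
            "Movie B,2011,6.5,500,Director Y",
            "Movie C,2011,1.0,150,Director Z",
            "Movie D,2012,6.0,201,Director X");
        yearlyAverages(sample).forEach((year, avg) ->
            System.out.println("(\"" + year + "\",\"" + avg + "\")"));
    }
}
```

In an actual Hadoop job, the body of `map` would presumably move into a mapper that emits the year as key and the rating as value, and `reduce` into a reducer that receives all ratings for one year; reading the rows from the HBase table instead of a list of strings is left out of this sketch.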

Refer to the HBase, Hadoop and Cloudera help and tutorial web pages for assistance.

You may use the pre-configured Eclipse installation delivered by the Cloudera VM.
