Congress of Catalan Archivists 2015 Thomas Risse 25/11/15
Challenges and Approaches for Web Archive Creation and Usage
Thomas Risse
L3S Research Center
Congress of Catalan Archivists 2015 Lleida, 29. May 2015
The Web is a core part of our daily life
World Wide Web = 50 billion pages + 1 billion users
The Web and the Social Web play a crucial role
– Information and services for all domains
– Allows contributions by every citizen
– Gives room for the articulation of a multitude of stakeholders
– Reflects all types of events, opinions, and developments within society, science, politics, environment, business, …
Spam Attack on Copts
Gun running from Sudan
Are we losing the past of the Web?
The Web is Changing and Forgetting
The Web is a quickly changing, ever-growing information space [1]
– It is growing by >8% per week
– After 1 year only 40% of the pages are still accessible, while 60% of the pages are new
A Web archive as a collective memory is a cultural necessity for the future
But “Archive and Store Everything” is not a practical approach
[1] A. Ntoulas, J. Cho, and C. Olston. What's new on the web?: the evolution of the web from a search engine perspective. In Proceedings of the 13th international conference on World Wide Web (WWW '04)
Where are we now?
• A number of tools are available (e.g. Heritrix)
• Crawl descriptions are currently lists of URLs
• >42 Web archive initiatives worldwide with different scopes [2]
• A lot of manual effort is still necessary, but only some 100 people are involved worldwide [2]
• Increasing interest from Digital Humanities, Journalism, etc.
More support is necessary
– Crawl by Events, Topics and Entities
– Using the “Wisdom of the Crowds” for selection and appraisal
[2] D. Gomes, J. Miranda, and M. Costa. A survey on web archiving initiatives. In Proceedings of the 1st International Conference on Theory and Practice of Digital Libraries 2011 (TPDL 2011)
Agenda
• Motivation
• The Crawl Process and its Challenges
• Process Overview
• Dynamic Pages
• Basic Crawl Strategies
• Termination of Crawls
• Next Generation Web Archiving
• Requirements
• Topical Crawling
• The ARCOMEM Approach
• iCrawl - Integrated Crawling
• Web Archive Access and Usage Examples
• Current methods
• The ARCOMEM Approach
• Time and Topic aware diversification
• Language Evolution
• Conclusions
Web Archiving and some challenges
[Figure: the Web archiving process between archivist and user. Stages: Selection (Preparation, Discovery, Filtering); Capture (Link Extraction, Fetching); Archiving (Storage, Index); Access; Quality Review]
Challenges annotated in the figure: Noise Filtering; Dynamic Pages (JavaScript, Flash); Multimedia Content; Temporal Coherence of Crawls; Data Volume; Temporal Aspects; Deep Web; Content Selection; Content Appraisal; Social Media; API Crawling
[Figure: the Web archiving process and its challenges, repeated with the contributions of the European projects highlighted]
Dynamic Pages
Follow the links...
<a href="http://www.gnu.org/philosophy/free-sw.html">free software</a>
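For static pages like this one, following the links only requires reading the href attributes. A minimal sketch using Python's standard library is shown here; the class name LinkExtractor, the helper extract_links and the base URL are illustrative assumptions, not part of any crawler mentioned in this talk.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the targets of all <a href="..."> tags in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


page = (
    '<p>See <a href="http://www.gnu.org/philosophy/free-sw.html">free software</a> '
    'and <a href="/about.htm">about</a>.</p>'
)
print(extract_links(page, "http://www.news.de/"))
```

This works only because the link is present in the HTML source; links that a script creates at run time never reach the parser.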
But what about these pages?
(function() {
  // Domains that count as internal referrers: no redirect when arriving from them.
  var fcb_referrers = [
    "fcbarcelona.com", "fcbarcelona.cat", "fcbarcelona.es",
    "fcbarab.com", "fcbarcelona.fr", "fcbarcelona.jp",
    "fcbarcelona.cn", "fcbarcelona.qq.com"
  ];
  var current_language = "en";
  // Browser language code -> site to redirect to.
  var routes = {"ca":"http://www.fcbarcelona.cat"};
  var navigator_language = navigator.language || navigator.userLanguage;
  var language = (navigator_language || "").split("-")[0];
  // Pattern matching any referrer URL on one of the domains above.
  var s = "^https?:\/\/[^\/]*(" + fcb_referrers.join("|").replace(/\./g, '\\.') + ")";
  var fcb_referrer = document.referrer.match(new RegExp(s));
  var redirect_url = routes[language];
  // The target URL only becomes a navigation when the script is executed.
  if (language && language != current_language && !fcb_referrer && redirect_url) {
    window.location = redirect_url;
  }
})();
“Endless” Pages
Automatically reload content while browsing the page
• Requires execution of JavaScript
• When can the fetching be stopped?
• Examples: Facebook, Twitter, Tumblr, etc.
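One possible answer to "When can the fetching be stopped?" is a combination of budgets and a check for rounds that bring nothing new. The sketch below simulates this; fetch_endless_page, make_fake_feed and all parameter names are hypothetical, and the fake feed merely stands in for a page whose script loads more content on scrolling.

```python
def fetch_endless_page(load_more, max_rounds=50, max_items=1000, patience=2):
    """Call load_more() repeatedly, as a browser does while the user scrolls.

    Stops when the round budget or item budget is reached, or when
    `patience` consecutive rounds bring no new items.
    """
    items = []
    known = set()
    empty_rounds = 0
    for _ in range(max_rounds):
        added = 0
        for item in load_more():
            if item not in known:
                known.add(item)
                items.append(item)
                added += 1
        empty_rounds = 0 if added else empty_rounds + 1
        if empty_rounds >= patience or len(items) >= max_items:
            break
    return items


def make_fake_feed(total, batch_size):
    """Hypothetical stand-in for a scrolling page that runs dry after `total` posts."""
    state = {"next": 0}

    def load_more():
        start = state["next"]
        end = min(start + batch_size, total)
        state["next"] = end
        return ["post-%d" % i for i in range(start, end)]

    return load_more


posts = fetch_endless_page(make_fake_feed(total=25, batch_size=10))
print(len(posts))
```

On a feed that never runs dry, only the budgets end the fetch, so the capture is incomplete by construction.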
Embedded Social Media
Embedded Social Media Content
• JavaScript embeds
• Dynamically integrated content on the browser side
• Often standard scripts → parameters can be extracted
• Content fetching requires an API crawler
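Because embeds often use standard markup, the parameters an API crawler needs can be read with a simple pattern. The sketch below assumes an embedded tweet that contains a link of the form twitter.com/ACCOUNT/status/ID; the sample markup and the function name extract_tweet_refs are illustrative assumptions, and no API is called.

```python
import re

# Assumed link shape inside an embedded tweet: twitter.com/<account>/status/<id>
TWEET_LINK = re.compile(r"https?://twitter\.com/(\w+)/status(?:es)?/(\d+)")


def extract_tweet_refs(html):
    """Return (account, tweet id) pairs found in embed markup, without duplicates."""
    refs = []
    for account, tweet_id in TWEET_LINK.findall(html):
        if (account, tweet_id) not in refs:
            refs.append((account, tweet_id))
    return refs


# Illustrative markup, not copied from a real page.
embed = (
    '<blockquote class="twitter-tweet"><p>Example text</p>'
    '<a href="https://twitter.com/whitehouse/status/1234567890">29 May 2015</a>'
    "</blockquote>"
)
print(extract_tweet_refs(embed))
```

The extracted pairs would then be handed to the API crawler, which fetches the content itself.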
Handling of Dynamic Pages
The Problem
• Embedded “programs”
  • Adobe Flash is impossible to handle
  • JavaScript is readable
• Dynamic creation of links and content
Approaches
• Guessing of links
  • “Guessing” by assembling any fragments that look like links into URLs
  • Can be very noisy: lots of wrong URLs
• Extraction of parameters from program code
  • Applicable for known code libraries, e.g. Facebook, Twitter
• Execution of JavaScript
  • Simulate user activities: “press” the links and see what comes out
  • Execute the code in a browser engine (e.g. WebKit, Firefox)
  • Extract links from the resulting DOM tree
Status
• Some implementations exist (e.g. LiWA Service, Browser Monkey)
• API crawlers exist
• No integration into standard Web crawlers
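The "guessing" approach can be sketched as a scan over the string literals of a script. The patterns and the function name guess_links are assumptions chosen for illustration; real implementations use more elaborate heuristics.

```python
import re

# String literals in single or double quotes.
STRING_LITERAL = re.compile(r"""(["'])(.*?)\1""")
ABSOLUTE_URL = re.compile(r"^https?://\S+$")
BARE_DOMAIN = re.compile(r"^[\w-]+(\.[\w-]+)+$")


def guess_links(script):
    """Return URL candidates found in the string literals of a script."""
    candidates = []
    for _quote, literal in STRING_LITERAL.findall(script):
        if ABSOLUTE_URL.match(literal):
            candidates.append(literal)
        elif BARE_DOMAIN.match(literal):
            # A guess: the fragment may be a host name, or no link at all.
            candidates.append("http://" + literal)
    return candidates


script = '''
var routes = {"ca":"http://www.fcbarcelona.cat"};
var fcb_referrers = ["fcbarcelona.com", "fcbarcelona.es"];
var version = "1.2.0";
'''
for url in guess_links(script):
    print(url)
```

The version string "1.2.0" is reported as a link as well, which illustrates why guessing is noisy.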
Basic Crawl Strategies
Crawl Strategy
[Figure: link graph of the example site www.news.de with the pages about.htm, journalists.htm, thomas.htm, joe.htm, ukraine_crisis.htm, ukraine0105.htm, ukraine0205.htm, sports.htm, ukraine_sports0105.htm, spain_sports0205.htm]
Crawl Strategy: Depth-first
[Figure: the www.news.de link graph traversed depth-first; the crawler follows one chain of links to its end before returning to the remaining links]
Crawl Strategy: Breadth-first
[Figure: the www.news.de link graph traversed breadth-first; the crawler fetches all pages at one link distance from the seed before moving one level deeper]
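The two strategies differ only in how the waiting queue is used. The sketch below runs both on a small graph; the link structure in SITE is an assumption reconstructed from the page names on the slides, and crawl_order is a hypothetical helper.

```python
from collections import deque

# Assumed link structure of the example site (reconstructed, not given in the slides).
SITE = {
    "www.news.de": ["about.htm", "journalists.htm", "ukraine_crisis.htm", "sports.htm"],
    "journalists.htm": ["thomas.htm", "joe.htm"],
    "ukraine_crisis.htm": ["ukraine0105.htm", "ukraine0205.htm"],
    "sports.htm": ["ukraine_sports0105.htm", "spain_sports0205.htm"],
}


def crawl_order(graph, seed, strategy):
    """Return the pages in the order in which the given strategy fetches them."""
    frontier = deque([seed])
    visited = []
    while frontier:
        # Breadth-first takes the oldest entry, depth-first the newest.
        page = frontier.popleft() if strategy == "breadth-first" else frontier.pop()
        if page in visited:
            continue
        visited.append(page)
        links = graph.get(page, [])
        if strategy == "breadth-first":
            frontier.extend(links)
        else:
            # Push in reverse so that the first link on a page is followed first.
            frontier.extend(reversed(links))
    return visited


print(crawl_order(SITE, "www.news.de", "depth-first"))
print(crawl_order(SITE, "www.news.de", "breadth-first"))
```

Depth-first reaches thomas.htm and joe.htm before it has seen ukraine_crisis.htm, while breadth-first covers all section pages first.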
Termination of a Crawl
When is the harvesting of Web pages complete?
Problem: the total number of pages is unknown
Typical approaches
• Crawler queue is empty (only for very focused crawls)
• Termination after a given number of pages or amount of data
• Termination after a given time
Crawls are therefore most often incomplete, and what is missing depends on the strategy
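The three termination criteria can be expressed as checks at the top of the crawl loop. This is a minimal sketch with a hypothetical crawl function; the graph SITE is an assumed example and get_links stands in for fetching a page and extracting its links.

```python
import time
from collections import deque


def crawl(seed, get_links, max_pages=None, max_seconds=None):
    """Breadth-first crawl that also reports why it stopped."""
    started = time.monotonic()
    queue = deque([seed])
    seen = {seed}
    fetched = []
    while True:
        if not queue:
            return fetched, "queue empty"
        if max_pages is not None and len(fetched) >= max_pages:
            return fetched, "page limit"
        if max_seconds is not None and time.monotonic() - started >= max_seconds:
            return fetched, "time limit"
        page = queue.popleft()
        fetched.append(page)
        for link in get_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)


# Assumed example graph; a real crawler would fetch the pages instead.
SITE = {
    "www.news.de": ["about.htm", "journalists.htm", "sports.htm"],
    "journalists.htm": ["thomas.htm", "joe.htm"],
    "sports.htm": ["spain_sports0205.htm"],
}


def site_links(page):
    return SITE.get(page, [])


print(crawl("www.news.de", site_links))
print(crawl("www.news.de", site_links, max_pages=4))
```

With the page limit, the journalist pages and the sports article are never fetched, although they were already waiting in the queue.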
Possible Consequences after stopping the Crawl
[Figure: the www.news.de link graph after an early stop, shown side by side for depth-first and breadth-first crawling; each strategy leaves a different part of the site uncaptured]
Alternative Crawl Strategies
- Selection by popularity
  - Ranking of the waiting queue according to the popularity of the content
  - Requires knowledge about the Web graph
  - For example: PageRank
  - Mainly useful to optimize regular crawls
- Content-based selection (discussed later)
  - Topics, Events, Entities
  - Requires a semantic crawl specification
- The selection of the right strategy depends on the crawl intention
[Figure: the www.news.de link graph]
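Selection by popularity can be sketched as a PageRank computation over the part of the Web graph known from earlier crawls, followed by a re-ordering of the waiting queue. The graph KNOWN_GRAPH and the helper names are assumptions for illustration; the damping factor 0.85 is the commonly used default.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power iteration over a graph given as page -> list of linked pages."""
    pages = set(graph)
    for targets in graph.values():
        pages.update(targets)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page in pages:
            targets = graph.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without out-links spreads its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / len(pages)
        rank = new_rank
    return rank


def rank_queue(queue, rank):
    """Order the waiting queue so that the most popular pages are fetched first."""
    return sorted(queue, key=lambda url: rank.get(url, 0.0), reverse=True)


# Hypothetical link graph known from a previous crawl.
KNOWN_GRAPH = {
    "a.htm": ["hub.htm"],
    "b.htm": ["hub.htm"],
    "c.htm": ["hub.htm", "a.htm"],
    "hub.htm": ["a.htm"],
}
ranks = pagerank(KNOWN_GRAPH)
print(rank_queue(["b.htm", "a.htm", "hub.htm"], ranks))
```

The ranking needs a graph from a previous crawl, which is why this strategy mainly helps with regular, repeated crawls.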
Growing Scientific Interest in Web Archive Content (1/2)
Historians
- Official publications (e.g. government)
- Journalistic resources
- Important topics and events with high media coverage
- Multi-cultural or controversial topics
- Optimal: continuous observation of topics on the Web
Social Sciences
- Observations of topics and events on major sites are good starting points
- Identified topic
- Official publications, journalistic and social media sources
- Changes in the topic should be identified
- Metadata / context (e.g. author, organizations and their interests, gender, location)
- Demographic information about social sites
- Provenance: transparent and detailed documentation of content selection
Growing Scientific Interest in Web Archive Content (2/2)
Law
- Research is based on official publications and protocols of parliaments or comments released by publishers
- Social media (especially blogs) are increasingly used
- Only used as background information
- Reason: missing citability and authenticity of resources
- Genesis of laws
- Used to understand the original intention of laws
- A democratic system requires a complete documentation of the law genesis
- Currently different degrees of documentation
- Official publications: parliament and committee meetings
- Public discourse
Derived Requirements
Topical Dimension
- Crawl intentions are mainly focused around events and rarely around entities
- What is the intention of the researcher?
- Easy monitoring by the researcher and the possibility to correct
Flexible Crawling Strategies
- Shallow observation crawls
- Focused crawls with prioritization (e.g. PageRank and/or semantics)
Social Web Crawling
- General interest with different media focus
- Integrated with the Web crawler to capture the full context
Authenticity
- See a web page as the user saw it (e.g. including ads and tweets at that point in time)
Context and Provenance
- Demographics of sites
- Documentation of crawl specification and history
Topical / Event Crawling
[Figure: the www.news.de link graph, annotated with the crawl specification]
Crawl Specification
- Terms
  - Ukraine
  - Crisis
  - …
- Seed List
  - www.news.de
  - …
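A crawl specification of terms plus seed list can drive a focused crawl in which each fetched page is scored against the terms and only relevant pages are archived and expanded. The scoring function, the threshold and the page texts below are assumptions made for illustration; they are not the relevance model of any system described in this talk.

```python
import heapq


def relevance(text, terms):
    """Fraction of the specification terms that occur in the text."""
    words = set(text.lower().split())
    hits = sum(1 for term in terms if term.lower() in words)
    return hits / len(terms)


def focused_crawl(seeds, pages, terms, threshold=0.5):
    """pages maps URL -> (text, outgoing links) and stands in for fetching."""
    frontier = [(-1.0, seed) for seed in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    archived = []
    while frontier:
        _, url = heapq.heappop(frontier)
        text, links = pages.get(url, ("", []))
        score = relevance(text, terms)
        if url not in seeds and score < threshold:
            continue  # Off-topic page: neither archived nor expanded.
        archived.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                # Links inherit the score of the page they were found on.
                heapq.heappush(frontier, (-score, link))
    return archived


# Hypothetical page texts and links for the example site.
PAGES = {
    "www.news.de": ("news ukraine sports", ["ukraine_crisis.htm", "sports.htm", "about.htm"]),
    "ukraine_crisis.htm": ("ukraine crisis overview", ["ukraine0105.htm", "ukraine0205.htm"]),
    "ukraine0105.htm": ("ukraine crisis talks", []),
    "ukraine0205.htm": ("crisis in ukraine deepens", []),
    "sports.htm": ("sports results", ["spain_sports0205.htm", "ukraine_sports0105.htm"]),
    "about.htm": ("about this site", []),
}
print(focused_crawl(["www.news.de"], PAGES, ["Ukraine", "Crisis"]))
```

Because sports.htm is judged off-topic, ukraine_sports0105.htm behind it is never reached; this is the price of a highly focused crawl.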
ARCOMEM Crawling Phases
[Figure: ARCOMEM crawling phases. Labels: Internet; Crawling; Online Processing; Offline Processing; Cross Crawl Processing; Appraisal; Selection; ARCOMEM Storage; Archive; SARA for Broadcaster, Parliaments]
Example crawl specification shown in the figure
- Entities: Obama, Romney, Biden, Ryan, Republicans, Democrats
- Keywords: US Election, CommitToMitt, Teaparty, Budget deficit
- Social Media Seedlist: https://twitter.com/whitehouse, https://twitter.com/blog44, https://twitter.com/BarackObama, ...
- Seedlist: http://news.bbc.co.uk/, http://telegraph.co.uk/, ...
Architecture Overview
[Figure: ARCOMEM architecture. Labels recovered from the figure:
- Levels: Crawler, Online Processing, Offline Processing, Cross Crawl Analysis
- Crawler: Queue Management, Application-Aware Helper, Resource Selection & Prioritization, Resource Fetching
- Analysis modules: GATE Online Analysis, GATE Offline Analysis, Social Web Analysis, Relevance Analysis & Prioritization, Image/Video Analysis, Consolidation, Enrichment, Named Entity Evolution Recognition, Twitter Dynamics, Extracted Social Web Information
- Crawler Cockpit with Intelligent Crawl Definition
- ARCOMEM Storage (HBase, H2RDF), URLs
- Applications: WARC Export (WARC files), SARA with SOLR index (Broadcaster, Parliament)]
Crawler Cockpit
[Figure: relevance distribution over time]
New Strategies lead to …
Example: German Elections Crawl 2013
• Completely user-defined crawl (German broadcasters)
  • Many focused terms and entities in German
  • Many full names, e.g. “Angela Merkel”
  • A few less focused keywords in English
• 1st phase
  • Many English pages with no relation to the German elections
• 2nd phase
  • Refinement of the crawl specification
    • Focused English terms
    • Last names instead of full names
  • Finding a good cut-off point for the archive threshold
  • Smaller result set with higher focus
… new Experiences …
A good specification of the crawl intention becomes critical
• Crawl specifications are rather complex compared to seed lists
• More experience, observation and analysis are necessary from users and developers
• Room for more sophisticated guidance in the 1st phase
Finding the right cut-off point
• Depends on the user requirements
• Highly focused → smaller archives but a higher risk of losing interesting information
• Less focused → lower risk of missing information but larger archives
• Translation into scoring of individual crawls
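The trade-off behind the cut-off point can be made visible by counting, for several thresholds, how many pages are kept and how many pages a user considers relevant are lost. All scores and relevance judgements below are made-up illustration data, and the function names are hypothetical.

```python
def archive_at(scored_pages, threshold):
    """Pages whose relevance score reaches the cut-off."""
    return [url for url, score in scored_pages if score >= threshold]


def cutoff_report(scored_pages, relevant, thresholds):
    """For each threshold: (threshold, archive size, relevant pages missed)."""
    rows = []
    for threshold in thresholds:
        kept = archive_at(scored_pages, threshold)
        missed = [url for url in relevant if url not in kept]
        rows.append((threshold, len(kept), len(missed)))
    return rows


# Made-up scores and user judgements, for illustration only.
SCORED = [("p1", 0.9), ("p2", 0.7), ("p3", 0.4), ("p4", 0.35), ("p5", 0.1), ("p6", 0.05)]
RELEVANT = {"p1", "p2", "p4"}

for threshold, size, missed in cutoff_report(SCORED, RELEVANT, [0.2, 0.5, 0.8]):
    print(threshold, size, missed)
```

Raising the threshold shrinks the archive and increases the number of missed pages; which row is acceptable depends on the user requirements.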
… and new Limitations
Social Media and Web crawling are separate systems
• Process
  1. Crawling of Social Media content
  2. Extraction of links
  3. Crawling of Web pages
• Static integration of Social Media
• Uni-directional path: Social Media → Web content
• Missing path: Web content → Social Media
In addition
• Complex system
• Required many changes in the Heritrix queue handling
Integrated crawling with the L3S iCrawl System
Apache Nutch based Web archive crawler (under development)
• Learning the intention of the crawl
• Integration of Web and Social Media crawling
• Content-based monitoring of the crawl process
[Figure: iCrawl workflow in three phases: Crawl Preparation, Crawl Execution, Crawl Finalization. Labels: Initial Seedlist; Learning the Crawl Specification; Semantic Crawl Description; Crawl Specification; Scheduler; Web Crawler; API Crawler; Crawl Monitor; Crawl Analysis & Enrichment; Specification Refinement; Archive Creation & Cataloguing; Provenance; Web Archive]
iCrawl Wizard
Example for Integrated Crawling
[Figure: a Twitter #Ukraine feed and the Web pages linked from it feed one crawler queue. Legend: high, medium and low page relevance; Web link; extracted URL]
Crawler Queue
ID | Batch | URL | Priority
UK1 | 1 | http://www.foxnews.com/world/2014/11/07/ukraine-accuses-russia-sending-in-dozens-tanks-other-heavy-weapons-into-rebel/ | 1.00
UK2 | 1 | http://missilethreat.com/media-ukraine-may-buy-french-exocet-anti-ship-missiles/ | 1.00
UK3 | x | http://missilethreat.com/us-led-strikes-hit-group-oil-sites-2nd-day/ | 0.40
UK4 | y | http://missilethreat.com/turkey-missile-talks-france-china-disagreements-erdogan/ | 0.05
… | … | … | …
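A shared queue of this kind can be sketched as a priority queue that accepts URLs from both sources and hands them out in batches. The class CrawlerQueue, its methods and the URL http://example.org/new-report are hypothetical; the other URLs and priorities are taken from the table above.

```python
import heapq
import itertools


class CrawlerQueue:
    """One queue for URLs from the social media feed and from Web pages."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps insertion order among equal priorities
        self._known = set()

    def add(self, url, priority):
        """Queue a URL once; returns False if it is already known."""
        if url in self._known:
            return False
        self._known.add(url)
        heapq.heappush(self._heap, (-priority, next(self._counter), url))
        return True

    def next_batch(self, size):
        """Remove and return up to `size` URLs with the highest priority."""
        batch = []
        while self._heap and len(batch) < size:
            neg_priority, _, url = heapq.heappop(self._heap)
            batch.append((url, -neg_priority))
        return batch


UK1 = "http://www.foxnews.com/world/2014/11/07/ukraine-accuses-russia-sending-in-dozens-tanks-other-heavy-weapons-into-rebel/"
UK2 = "http://missilethreat.com/media-ukraine-may-buy-french-exocet-anti-ship-missiles/"
UK3 = "http://missilethreat.com/us-led-strikes-hit-group-oil-sites-2nd-day/"
UK4 = "http://missilethreat.com/turkey-missile-talks-france-china-disagreements-erdogan/"

queue = CrawlerQueue()
for url, priority in [(UK1, 1.00), (UK2, 1.00), (UK3, 0.40), (UK4, 0.05)]:
    queue.add(url, priority)

first_batch = queue.next_batch(2)
# A URL arriving later from the feed overtakes the low-priority Web links.
queue.add("http://example.org/new-report", 1.00)
second_batch = queue.next_batch(2)
print(first_batch)
print(second_batch)
```

Keeping both sources in one queue is what lets fresh social media links overtake older, less relevant Web links.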
Wayback Machine
• Basic tool to access Web archives
• Also used in combination with other tools
• Only URL-based access
• Provides
  • Capture overviews
  • Details about crawl dates
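URL-based access with a time dimension amounts to a lookup of the capture closest to a requested date. The sketch below shows this lookup on a sorted list of capture dates; the function closest_capture and the dates are illustrative assumptions and do not describe the internals of any particular tool.

```python
from bisect import bisect_left
from datetime import datetime


def closest_capture(captures, wanted):
    """captures: sorted list of datetime objects for one URL."""
    if not captures:
        return None
    i = bisect_left(captures, wanted)
    if i == 0:
        return captures[0]
    if i == len(captures):
        return captures[-1]
    before, after = captures[i - 1], captures[i]
    # Prefer the earlier capture when both are equally far away.
    return before if wanted - before <= after - wanted else after


# Hypothetical capture dates of a single URL.
CAPTURES = [datetime(2013, 1, 1), datetime(2014, 6, 1), datetime(2015, 5, 29)]
print(closest_capture(CAPTURES, datetime(2014, 1, 1)))
```

A lookup like this answers "what did this URL look like around that date", but it cannot answer a topical query, which is the limitation of URL-only access.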
Searching the Web Archive
Other Ways to Explore Web Archives
Web Archive Access
Different user groups with different needs
• Not the typical Web Search Engine user
Explorative Search
• Web Archive Search ≠ Web Search
• Time Dimension
• Many versions of the same page
• Comparing versions
Large scale analysis across time
• Information extraction (Entities, Events, Topics, Opinions)
• Interlinking
• Visualizations
The ARCOMEM Approach to Web Archive Access
Historical Search on News Archives*
Consider a journalist interested in the history of ... Rudolph Giuliani
• News articles encode history as it happens.
• Aspects are diverse across time.
• Time windows can be diverse in aspects.
[Figure: number of documents over time, annotated with the aspects Mayoral Campaign (three times), Mayor, 9/11, Senate, Cancer, Allegations, and Post-politics endeavours]
*Jaspreet Singh, Avishek Anand; Historical Search on News Archives; L3S Report 2015
Large Scale Data Analytics Example: Named Entity Evolution Recognition
Language changes over time!
Our language is DYNAMIC and changes with our culture, politics, technology, social media, etc.
Different spellings over time
New words are introduced
Words change their meanings
This causes problems in long-term digital archives!
[Timeline: St. Petersburg (until 1914), Petrograd (1914 to 1924), Leningrad (1924 to 1991), St. Petersburg (1991 to today)]
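For a search interface, the timeline translates into a table of names with validity periods, which can be used to pick the name valid in a given year or to expand a query to all names of the same place. The structure and function names are illustrative assumptions; the rename years are treated as the first year of the new name, which is a simplification.

```python
# (first year, year of the next rename, name); None marks an open end.
CITY_NAMES = [
    (None, 1914, "St. Petersburg"),
    (1914, 1924, "Petrograd"),
    (1924, 1991, "Leningrad"),
    (1991, None, "St. Petersburg"),
]


def name_in_year(periods, year):
    """Name that was in use in the given year."""
    for start, end, name in periods:
        if (start is None or year >= start) and (end is None or year < end):
            return name
    return None


def expand_query(periods, term):
    """All names used for the same entity, if the term is one of them."""
    names = []
    for _start, _end, name in periods:
        if name not in names:
            names.append(name)
    return names if term in names else [term]


print(name_in_year(CITY_NAMES, 1950))
print(expand_query(CITY_NAMES, "Leningrad"))
```

Without such an expansion, a search for "St. Petersburg" misses every document written while the city carried another name.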
Problem: Finding Documents
A scholar writing a thesis about “Pope Benedict”:
“I want to know more about Pope Benedict”
[Figure: the query leads to question marks, because relevant documents use other names]
Named Entity Evolution
Named Entities (NE): people, places, companies...
Characteristics of Named Entity Evolution (NEE)
Same thing but different terms over time
Change occurs over short periods of time (change period)
Small or no concept shift
Announced to the public repeatedly
Goal: find a method for named entity evolution recognition that is independent of external knowledge sources
[Figure: names used for the same person around a change period: Joseph Ratzinger, Joseph Aloisius Ratzinger, Cardinal Ratzinger, Cardinal Joseph Ratzinger, Pope Benedict, Pope Benedict XVI, Benedict XVI, Pope emeritus Benedict XVI]
Named Entity Evolution Recognizer (NEER)
[Figure: NEER pipeline with the steps Identifying Change Periods (Burst Detection), Extract Text, NLP Processing, Context Creation, Finding Temporal Co-references, and Filtering Co-References]
Sample text shown in the figure:
In his latest address to American bishops visiting Rome, Pope Benedict XVI stressed that Catholic educators should remain true to the faith -- a reminder issued just in time for another tense season of commencement addresses. No, the pope did not mention Georgetown University by name when discussing the Catholic campus culture wars.
Co-reference lists shown in the figure:
1. Pope Benedict XVI 2. Pope Benedict 3. Benedict XVI 4. Cardinal Ratzinger 5. Pope 6. Benedict
Benedict XVI, Joseph Ratzinger, Cardinal Ratzinger
Evaluation Results
• Burst detection found 73% of all change periods in total
• High recall for an unsupervised method
• Machine learning boosts precision
• Data set: http://www.l3s.de/neer-dataset/
Example results
• Barack Obama: Senator; State Senator Barack Obama; Senator-elect Barack Obama; Senator Barack Obama; Illinois Democrat
• Vladimir Putin: President-elect Vladimir V Putin; Minister Vladimir Putin; Acting President Vladimir V Putin; President Vladimir V Putin
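The first step of the pipeline, identifying change periods, relies on burst detection. The sketch below uses a deliberately simple stand-in: a period counts as a burst when its mention count exceeds the mean by a multiple of the standard deviation. This is not necessarily the burst detection method used in NEER, and the counts are made-up illustration data.

```python
from statistics import mean, pstdev


def detect_bursts(counts, factor=1.5):
    """Return the indices of periods whose count exceeds mean + factor * stdev."""
    values = list(counts)
    threshold = mean(values) + factor * pstdev(values)
    return [i for i, value in enumerate(values) if value > threshold]


# Made-up number of documents per month that mention an entity.
MENTIONS = [4, 5, 3, 6, 40, 12, 5, 4]
print(detect_bursts(MENTIONS))
```

The month flagged as a burst is a candidate change period; the later steps then look for name variants in the documents of that period.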
ALEXANDRIA
Foundations for Temporal Retrieval, Exploration and Analytics in Web Archives
Motivation
• Optimal access to Web archives taking into account
  • the temporal dimension of Web archives
  • structured semantic information available on the Web
  • social media and network information
Objectives
• Evolution-Aware Entity-Based Enrichment and Indexing
• Aggregating Social Networks and Streams
• Temporal Retrieval and Ranking
• Collaborative Exploration and Analytics
Testbeds
• Temporal Wikipedia
• Academic Web Archive
• Politics on the Web
More information: http://alexandria-project.eu/
Conclusions
Web Crawling
• Standard Web crawlers are mature, but many challenges remain (e.g. Social Media, dynamic pages)
• Manual interventions will always be necessary
• New user communities with different interests and requirements
• Additional crawling strategies are necessary
Web Archive Access
• Current approaches are far behind the state of the art
• Increasing interest in Web archives will lead to increasing user expectations
• Different usage categories: explorative search vs. Big Data analysis
• Legal aspects are still a big issue
Thank You!
Dr. Thomas Risse
Forschungszentrum L3S
Leibniz Universität Hannover
Appelstrasse 9a
30167 Hannover, Germany
E-Mail: risse@L3S.de
Phone: +49-511-762 17764
Fax: +49-511-762 17779