Visualisation of Semantic Enrichment
Alexa Schlegel¹, Ralf Heese¹, Annika Hinze²
¹Freie Universitaet Berlin, Germany, {aschle, heese}@inf.fu-berlin.de
²University of Waikato, New Zealand, hinze@cs.waikato.ac.nz
Abstract: Automatically creating semantic enrichments for text may lead to annotations that allow for excellent recall but poor precision. Manual enrichment is potentially more targeted, leading to greater precision. We aim to support non-experts in manually enriching texts with semantic annotations. Neither the visualisation of semantic enrichment nor the process of manually enriching texts has been evaluated before. This paper presents the results of our user study on visualisation of text enrichment during the annotation process. We performed extensive analysis of work related to the visualisation of semantic annotations. In a prototype implementation, we then explored two layout alternatives for visualising semantic annotations and their linkage to the text atoms. Here we summarise and discuss our results and their design implications for tools creating semantic annotations.
1 Introduction
With semantic technologies, annotations are no longer about the content (as in Web 2.0 tagging) but become part of the content. Such semantic enrichment typically consists of an annotated text passage (text atom) and related information (annotation). Clear visualisation of enrichment, that is, indication of text atom and linkage to annotation, is important for both the definition of annotations and the reading of enriched text.
Furthermore, it is essential for the acceptance of semantic technologies that non-experts are enabled to produce and consume semantically enriched content.
Here we focus on the visualisation of enrichment during the annotation process. To the best of our knowledge, neither the visualisation of semantic enrichment nor the process of manually enriching texts has been evaluated before. Additionally, research on presentation of semantically annotated documents typically targets the rather passive reception aspects of data visualisation. Our research is motivated by experiences in two projects, TIP and loomp, addressing the annotation of content by non-experts. TIP is a mobile tourist information system that provides different information depending on a user’s interests [Hin09]. Textual information in TIP has to undergo a semantic enrichment process to be prepared for this interest-based filtering [Hsi08]. Loomp is a tool for the management of semantic enrichment, which we applied to texts (predominantly) relating to museums and their exhibitions [Luc09]. In loomp, users enrich texts semantically by linking text passages to concepts, thus forming annotations with additional, structured information. While atoms typically consist of only a few words in loomp, in TIP, users assign categories to longer text passages. In both cases, the annotations may be used later for generating rich content and recommendations for tourist information systems. They open new ways of locating, accessing and reusing existing content as well as the generation of new semantic links.
In this paper, we report the results of our user study on visualisation of text enrichment during annotation. We implemented a system for light-weight semantic enrichment¹ that allowed users to create text annotations by referring to categories (e.g., history, architecture). We explored two layout alternatives for the visualisation of text atoms and their linkage to annotations along four characteristics of visual feedback. Here we discuss our results and design implications for tools creating semantic annotations.
¹ Available at https://github.com/aschle/OverlappingAnnotations
The remainder of the paper is structured as follows: Section 2 discusses characteristics of annotations. In Section 3 we present the results of our analysis of current approaches for highlighting annotations. We then describe the methodology and set-up of our user study and the implementation of our highlighting approaches in Section 4. The results of our user study are discussed in Section 5. Finally, we summarize the contributions of this research and give an overview of future work in Section 6.
2 Characteristics of Annotations
The process of semantic enrichment links additional information into the main content of a text. We refer to an atom as a continuous portion of a text linked to an annotation. Atoms and annotations do not need to be represented in the same data format.
We here define a list of properties for characterizing (relationships between) annotations. All of these characteristics directly influence the visualization of atoms and annotations.
Cardinality. We differentiate between 1:1, 1:n, n:1, and n:m relationships between atoms and annotations. For example, n:1 means that an arbitrary number of atoms may refer to the same annotation. Commenting in word processors results mainly in 1:1 or 1:n relationships. Semantic annotations often have n:1 relationships.
Granularity. An atom could be defined as character, word, phrase, sentence, or whole document. If a document is the smallest atom, annotations are referred to as metadata.
Positioning. We distinguish between overlapping and adjacent atoms (cf. Fig. 1). Overlapping annotations additionally may be the special cases of inclusion or identity.
Figure 1: Positioning of annotations: (a) Overlapping, (b) inclusion, (c) identity, and (d) adjacent
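The four positioning cases of Figure 1 can be illustrated with a minimal sketch. It assumes that atoms are stored as half-open character ranges and that "adjacent" means the two ranges touch directly; the paper does not prescribe either. The names `Atom` and `positioning` and the extra label "disjoint" are hypothetical and do not come from the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """A continuous portion of text as a half-open character range [start, end).

    Assumption: atoms are non-empty, i.e. start < end.
    """
    start: int
    end: int


def positioning(a: Atom, b: Atom) -> str:
    """Classify the relative position of two atoms (cases of Figure 1)."""
    # Identity: both atoms cover exactly the same text.
    if (a.start, a.end) == (b.start, b.end):
        return "identity"
    # Inclusion: one atom lies completely inside the other.
    if (a.start <= b.start and b.end <= a.end) or (b.start <= a.start and a.end <= b.end):
        return "inclusion"
    # Overlapping: the atoms share some text, but neither contains the other.
    if a.start < b.end and b.start < a.end:
        return "overlapping"
    # Adjacent (assumed definition): one atom ends exactly where the other starts.
    if a.end == b.start or b.end == a.start:
        return "adjacent"
    # Not a case from Figure 1; returned when the atoms are unrelated.
    return "disjoint"
```

For example, `positioning(Atom(0, 5), Atom(3, 8))` yields "overlapping", while `positioning(Atom(0, 5), Atom(5, 9))` yields "adjacent". A tool that treats whitespace between two atoms as still adjacent would need a looser test in the last branch.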
Highlighting atoms and annotations. Typical approaches to visually distinguish atoms within a text are modification of text styles (e.g., underline or bold), change of background colour, and use of graphical elements (e.g., boxes or icons).
Position of annotations. Options for positioning annotations are as an overlay near the corresponding atom (sometimes freely movable), in the left or right margin of the text next to the atom, or below the document.
Visualising overlapping atoms. To indicate overlapping atoms, different strategies are employed: mix of colours, stack view, stripes, and vertical lines in the margin.
Figure 2: Overlapping annotations: mixture of colours and stack view
Connections. To indicate the relationship between atoms and annotations, the same background colour may be assigned to both atoms and annotations, annotations may be positioned nearby or mouse-over effects are used.
3 Related Work
We evaluated work related to the visualisation of semantic annotations. Not much research has been performed in this area and most available annotation tools are available on the Web for annotating ebooks or other texts. We used the characteristics introduced in Section 2 to analyse the visualisation supported by available tools.
The summarised results of our evaluation are listed in Figure 3. In the table in Figure 3, the supported alternatives between options in a characteristic are indicated by an x (otherwise left blank). For overlapping annotations, a system’s visualisation may not support all four options. Each option that is fully supported is marked with +; o indicates that a visualisation is supported but has design limitations, and – indicates that although this type of overlapping may occur in the text, the software does not provide a specific visualization.
We now discuss each of the analysed systems and tools in turn. Some support both manual and automatic annotations (e.g., Gate, OpenCalais, rdfquery), others provide options for manual annotation only. Most of the systems analysed support free-text annotation, some support categories (as done in our approach).
Figure 3: Tools for creating annotations and their characteristics
Booktate (www.booktate.com) is a system for annotating eBooks. An annotation is typically set at word level; however, annotations on section level are also possible. Users can create several annotations for a single atom (1:n relationships between atoms and annotations). Atoms are highlighted by assigning a background colour or by underlining. An annotation is placed in the margin directly beside the paragraph containing the corresponding atoms and has the same background colour as the atom. If there is not enough space for the annotations next to a paragraph, a larger empty space may be created between two paragraphs. To visualize the overlapping of two atoms a subtractive mixture of colours is used (similar to Figure 2, top). If users hover with their mouse over such an annotation then the original colour is restored.
A.nnotate (a.nnotate.com) is a Web application for annotating PDF files. Example activities that are supported are creating comments, striking text, and inserting characters. The background colour of text atoms can be changed to one of seven colours.
Users can select one of three visualizations for annotations: overlay, right margin, and bottom margin (overlay and margin layout shown in Figure 4), and also choose their background colour. Depending on the visualization, the link between atom and annotation is indicated differently. In case of an overlay the application places annotations near the atom over the text. In the other cases it uses the annotated text as title of the annotations. If the annotation is placed on the right margin then it is positioned at the same height as the atom. If it is located at the bottom then the atom is shown within its context. The order of annotations corresponds to the order of atoms. The application supports all types of overlapping annotations except identity. However, later annotations may cover earlier ones, leading to problems accessing older annotations. If users assign the same or no background colour to the atoms, they would not be able to distinguish overlapping atoms without selecting one of them (i.e., the system does not support a special strategy for overlapping annotations).
Figure 4: A.nnotate overlay options
Crocodoc (crocodoc.com) is a Web application that converts office documents and PDF into HTML5 format and supports annotation of these documents. Users can select one of four background colours for highlighting atoms. Annotations appear in the right-hand margin at the same height as the corresponding atom. Additionally, the system draws a line between atom and its annotation. If users hover with the mouse pointer over an annotation the corresponding atom is indicated and the annotation is additionally shown as an overlay near the atom.
Although users can select the background colour of an atom, all annotations have the same background colour (see Figure 5). The system supports two kinds of annotations: highlight and comment. Overlapping annotations are supported, but older annotated atoms are covered by newer ones and the overlapping part of two atoms cannot be selected. In case of identity, none of the overlapping atoms can be selected.
Figure 5: Crocodoc annotations
Using diigo (www.diigo.com) users can add annotations to web-pages. Similar to Crocodoc, users can select one of four background colours for highlighting atoms. The application allows users to add comments to atoms. The presence of such an annotation is indicated by a speech bubble showing the number of comments linked to an atom. When opening a document, all annotations are initially hidden and are only displayed as a pop-up on mouse-over. Only identical overlapping annotations are supported.
Bible+ is an iPhone app for Bible reading. Users can take notes, highlight, and bookmark passages as well as search in the annotated text. Annotations can be synchronised between different texts and different devices. Among the analysed systems, Bible+ offers the most versatile support of overlapping annotations (see Figure 6).
Figure 6: Bible+ overlapping annotations (here on iPad)
rdfquery (code.google.com/p/rdfquery) is a JavaScript library for parsing, querying, and generating RDFa elements. It uses frames for highlighting atoms in the text and displays corresponding annotations (e.g., facts about named entities) in the left-hand margin. Independently of their semantics, all frames have the same colour. The library supports inclusions as overlapping annotations; adjacent atoms are well distinguishable.
Gate (gate.ac.uk) is a toolkit for analyzing and processing texts. Annotation types (e.g., categories) are displayed in the right-hand margin (see area (1) in Figure 7); atoms are highlighted in the same background colour as the corresponding annotation. All types of overlapping atoms are supported using a mixture of their background colours. Additionally, users can open a stack view showing the atoms as horizontal bars having the corresponding background colour. In case of overlapping atoms the bars are on different levels (see (2) and (3) in Figure 7).
Figure 7: Gate annotations including the stack view
Atlas.ti (www.atlasti.com) is a tool for the evaluation of textual data as conducted in social science [Muh94]. The authors identify the support for both annotations on character level and an arbitrary number of overlapping annotations as the main features of their software. In contrast to other annotation software, atoms are not highlighted in the text but are indicated by vertical bars in the right-hand margin. In the margin, the system also shows the assigned annotations and displays the content of an atom as an overlay on mouse-over near the corresponding bar. The related atom is highlighted in the text only when users select a bar. The bars are arranged into columns, using one column for each available colour.
veeeb (www.veeeb.com) is a Web-based tool for analyzing texts and generating semantic annotations. All recognized entities are highlighted using the same orange colour. The tool implements a special technique for indicating overlapping atoms: the more atoms overlap, the darker the orange colour (see Figure 8).
Figure 8: veeeb annotations
OpenCalais (viewer.opencalais.com) is a service for analyzing and enriching texts. The OpenCalais viewer supports only two types of annotations: “entities” and “events & facts”. The atoms are underlined or assigned a background colour, respectively. The system assigns different colours to different annotations. Overlapping atoms can only be clearly identified between atoms of different types. For atoms of the same type, users need to hover the mouse over an atom to see the complete atom.
TIP is a mobile tourist information system [Hin09]. Its information import service supports users in the semantic mark-up of texts. The text snippets within each annotation category are then stored separately to be available should a user be interested in tourist information in this category. In the import service, overlapping annotations are indicated by background patterns (see Figure 9). Because the use of the final annotation is no longer linked to the complete text, TIP avoids some of the display issues faced by other tools. The interface was evaluated in a simple paper prototype study.
Figure 9: TIP import service – text annotations (left) and concept structure (right)
loomp is a Web-based editor for creating semantic annotations in texts [Luc09]. It uses a single background colour for highlighting annotations and annotations are displayed on the right margin. Overlapping atoms are currently not supported (except adjacent ones).
Summary
We examined several tools for annotating texts. Almost all tools highlight atoms by assigning a background colour. Only few change the font style or add graphical elements (e.g., icons). In contrast, no common approach for indicating overlapping atoms could be identified; the most frequent techniques are mixed background colours and mixed font styles. However, only few tools provide a clear visualization of overlapping atoms. The main problem is distinguishing overlapping atoms of the same category of annotation (as they typically use the same style).
The examined tools typically apply similar visualizations to annotation and corresponding atoms. Additionally, a mouse-over effect may highlight corresponding annotations and atoms. All tools position annotations near the related atoms (where possible). To the best of our knowledge, none of the tools and annotation interfaces has been evaluated for their ease of use (beyond a simple study reported in [Hsi08]).
4 Implementation and Study Setup
The implemented system was purpose-built to explore two layout alternatives for the visualisation of text atoms and their linkage to annotations. The following design features were included to make the system suitable for annotations of tourism-related texts such as the ones used in TIP and loomp. The main usage scenario is annotation of texts (not multi-media objects), where annotations may span over a few words or some lines. The system needs support for categories, that is, be able to explicitly distinguish annotation sets (such as expressed by different ontologies). Only a limited number of categories had to be considered; however, each category may have several (up to ten) subcategories.
Figure 10: Bar layout
Figure 11: Border Layout
We explored the two alternatives of bar layout and border layout. Both layouts were implemented as simple prototypes using HTML and JavaScript. In the bar layout, each atom within the text is indicated by a vertical bar in the left margin (Figure 10). The colour of the bar reflects the annotation concept (e.g., orange=architecture, purple=history). The bars are ordered by length and order in the text. Atoms in the text are highlighted by a mouse-over of the corresponding bar and the annotations appear as a speech bubble near the atom. The border layout highlights annotations by enclosing an atom in a coloured frame (Figure 11), where the colour corresponds to the selected category of the annotation. The background colour of an atom changes on mouse-over and the annotation appears as a speech bubble. Both layouts allow for many-to-many relationships between atoms and annotations, and for atoms to span several lines. Atoms may overlap or be adjacent (see Section 2). The number of atoms overlapping the same portion of text was restricted to three and the number of categories to four.
Semantic annotations are meant to be created by non-experts with respect to semantic technologies. We therefore observed 12 non-expert participants (P1 to P12) interacting with both interfaces (alternately starting with bar or border layout). During a learning phase, participants familiarized themselves with the system using a short practice text. During the application phase, they had to execute a number of annotation tasks on a longer text. The participants were encouraged to think out loud as they were making decisions in interaction with the prototype, instead of asking for the ‘correct’ procedure. Each study concluded with a guided interview.
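Two rules of the prototype, the ordering of bars by length and text order and the limit of three atoms over the same portion of text, can be illustrated with a short sketch. This is not the prototype's code (which was written in HTML and JavaScript); it assumes atoms are half-open ranges of positions, and the names `order_bars`, `max_overlap`, `within_limit`, and the `longest_first` switch are hypothetical. The paper does not state which direction of length ordering the prototype used, so both are supported.

```python
from typing import List, Tuple

# Assumption: an atom is a half-open range (start, end) of text positions.
Span = Tuple[int, int]

# Limit stated in the study setup: at most three atoms over the same text.
MAX_OVERLAP = 3


def order_bars(atoms: List[Span], longest_first: bool = True) -> List[Span]:
    """Order bars by length, breaking ties by position in the text."""
    sign = -1 if longest_first else 1
    return sorted(atoms, key=lambda a: (sign * (a[1] - a[0]), a[0]))


def max_overlap(atoms: List[Span]) -> int:
    """Return the largest number of atoms covering the same portion of text."""
    events = []
    for start, end in atoms:
        events.append((start, 1))   # an atom begins
        events.append((end, -1))    # an atom ends
    # At equal positions, ends (-1) sort before starts (+1),
    # so atoms that merely touch are not counted as overlapping.
    events.sort()
    depth = best = 0
    for _, delta in events:
        depth += delta
        best = max(best, depth)
    return best


def within_limit(atoms: List[Span]) -> bool:
    """Check the restriction on overlapping atoms."""
    return max_overlap(atoms) <= MAX_OVERLAP
```

For instance, `order_bars([(0, 3), (2, 12), (20, 24)])` puts the longest atom `(2, 12)` first, and `within_limit` rejects a set of four atoms that all cover the same position.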
5 Study Results and Discussion
We here briefly summarise the main findings of our user study.
Atom definition: All 12 participants found it easy to select text atoms for annotation. Three participants noted that it was not possible to select letters or parts of words. In particular, P3 wished to select ‘Libeskind’ (as part of Libeskind-Bau), and P5 aimed to annotate ‘Beton’ (Engl: concrete) within ‘Betonstehlen’ with category material. Restricting selections to whole words was supposed to make it easier to create meaningful atoms but in these cases prevented the creation of what would have been acceptable atoms. Two participants (P2, P10) wished to correlate atoms with each other (e.g., the name ‘Daniel Libeskind’ and the profession ‘architect’). A similar desire to create cross-references between atoms was also observed in another study we performed on semantic annotations.
Layout and ordering of bars: 11 of 12 participants liked the bars’ position in the left-hand margin (P5: “Needs to be definitely on the left-hand side!”). P4 suggested placing some bars to the left and others to the right of the text (depending on the position of the atom within each line). She additionally suggested placing bars left or right depending on length or category. All participants preferred the bars to be ordered by length. Seven suggested ordering largest to smallest (Figure 12, top), four suggested from smallest to largest (Figure 12, bottom); one was indecisive. People who preferred ordering largest to smallest argued that it would be easier to identify the lines of text (atoms) belonging to the smaller bars. The other group felt the design was clearer when the longer bars were close to the text.
Figure 12: Ordering of bars
Interaction with bars and borders: All participants interacted successfully with the bar annotations. During annotating, they used the mouse-over to identify which parts of the text were already annotated. P8 remarked: “I do not like that I always have to look to the left to find out what was already annotated and what not.” Using the border layout, four of the 12 participants felt it was very easy to identify the category of an annotation. Five thought it was not quite easy, as the atom’s space may be very small and it is “hard to click the right border” (P9). P10 expected that he could extend an atom by dragging its borders. Three were undecided.
Clarity of layout: On a 5-point Likert scale, 2 participants completely agreed during the interview, 6 mainly agreed and 4 agreed partially with the statement “the bar layout is clearly arranged.” (1/2/3 started with the bar layout; 1/4/1 with the border layout). On the equivalent question about the clarity of the border layout, 6 completely agreed, 4 mainly agreed and 2 partially agreed (4/1/1 started with bar layout; 2/3/1 with border layout). This indicates that participants seem to prefer the border layout.
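The clarity ratings reported above can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the per-group counts from the text; the variable and function names are hypothetical, and combining "completely" and "mainly" agreed into one share is our own summary, not a measure used in the study.

```python
# Counts per answer: (completely agreed, mainly agreed, partially agreed).
# Keys name the layout being rated; values are taken from the text.
started_with_bar = {"bar": (1, 2, 3), "border": (4, 1, 1)}
started_with_border = {"bar": (1, 4, 1), "border": (2, 3, 1)}


def totals(layout: str) -> tuple:
    """Add the counts of both starting groups for one rated layout."""
    return tuple(
        x + y for x, y in zip(started_with_bar[layout], started_with_border[layout])
    )


def agreeing(layout: str) -> int:
    """Participants who completely or mainly agreed (our own grouping)."""
    completely, mainly, _partially = totals(layout)
    return completely + mainly


bar_totals = totals("bar")        # (2, 6, 4), matching the reported 2/6/4
border_totals = totals("border")  # (6, 4, 2), matching the reported 6/4/2
```

Both totals sum to the 12 participants, with six participants in each starting group. Under this grouping, 8 of 12 participants completely or mainly agreed for the bar layout and 10 of 12 for the border layout, which is in line with the stated preference.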
During the study it was noted that overlapping annotations constitute a considerable proportion of all created annotations (used by 8 of 12 participants; up to 30% of all annotations). In the guided interviews, we observed that the participants saw the bar layout to be more suitable for annotating larger text passages because many (small) bars on the left side potentially make the interface less clear. Participants also found that the bar layout was somewhat imprecise as atoms are only identified by line but not by position in each line. However, the bar layout was found to be well suited for reading and annotating since texts themselves do not contain any highlighting.
Participants found the border layout to be more suited for annotating short text passages because they could easily recognize the atoms, and the relationship between atoms and annotations was clear. However, participants noted that users may get confused by the borders if they are confronted with too many atoms.
6 Conclusion and Future Work
The success and rapid uptake of Web 2.0 concepts was largely due to and driven by the availability of applications for non-expert users (i.e., users with little knowledge about Web technologies). We believe that the success of the Semantic Web similarly depends on the availability of applications for non-expert users (i.e., users with little knowledge about semantic concepts and technologies). Many semantic web researchers focus on creating applications for producing and consuming semantically enriched content. However, only a few ensure the usability of their user interfaces for the large group of non-expert users.
In this paper we present our analysis of visual tools for creating annotations and describe the results of an initial user study on the highlighting of annotations. The results of our study form a first step towards formulating recommendations and best-practice examples for the design of annotation systems with manual components.
The indication of overlapping annotations was identified as the main issue for visualisation of annotations. None of the tools and annotation interfaces had been previously evaluated for their ease of use. In our user study, we explored two layout alternatives for the visualisation of text atoms and their linkage to annotations. Our user study confirmed that overlapping annotations constitute a considerable proportion of all created annotations. They were identified as part of a typical annotation process and should not be treated as special cases. The border layout supports clear identification of overlapping annotations, whereas their identification is more complicated in the bar layout. We also found that the bar layout is more suitable for annotating larger text passages whereas the border layout is more suitable for annotating words and short passages. We therefore recommend that systems implement two views on annotated texts: one view for unhindered reading, a quick overview of the text and locating atoms and annotations at a glance (e.g., bar layout) and another one for creating annotations in the text and retrieving detailed information about the annotated text passages (e.g., border layout).
The work presented in this paper considered mainly the visualisation of (simple semantic) annotations (e.g., assigning a category). However, full semantic mark-up requires the additional assignment of semantic identifiers. The understanding of complex semantic annotations (e.g., assigning and interpreting the linkage to resources) by non-expert users is more complicated and needs to be explored further. Moreover, so far only annotations created by single users were analysed. The concurrent annotation of texts by a group of users (e.g., in a crowd-sourcing approach) will most likely lead to more overlapping and potentially contradicting annotations. Appropriate resolution of these cases still needs further research.
References
[Hin09] Annika Hinze, Agnès Voisard, George Buchanan. Tip: Personalizing Information Delivery in a Tourist Information System. Journal of IT & Tourism 11(3): 247-264, 2009.
[Hsi08] Ping-Ju Hsieh. Administration Service for the Tourist Information System. Master’s thesis, Computer Science Department, The University of Waikato, June 2008. Available online at http://researchcommons.waikato.ac.nz/handle/10289/2478.
[Luc09] Markus Luczak-Rösch and Ralf Heese. Linked data authoring for non-experts. In Proceedings of the Linked Data on the Web Workshop (co-located to WWW’2009). LNCS, March 2009.
[Muh94] Thomas Muhr. ATLAS.ti: Ein Werkzeug für die Textinterpretation. In Schriften zur Informationswissenschaft, pp 317-324. Univ.-Verlag Konstanz, 1994.