Digital History of Concepts: Sense Clustering over Time

Academic year: 2022

Inga Kempfert¹, Saba Anwar¹, Alexander Friedrich², Chris Biemann¹

¹ Universität Hamburg, ² Technische Universität Darmstadt

{5kempfer, anwar, biemann}@informatik.uni-hamburg.de, friedrich@phil.tu-darmstadt.de

We present a tool for tracking word senses over time to enable a digital, data-driven history of concepts, developed in an ongoing collaboration between computational linguistics and philosophy [FB 2016]. Research in the history of concepts deals with the historical semantics of terms, with a special focus on the evolution or change of the scientific or political concepts related to them. Special interest is given to the (often polysemous) meaning of basic concepts that are still crucial for understanding contemporary cultural, political or scientific world- and self-descriptions (such as “freedom”, “power”, “life”, or “crisis”). A fundamental question here is how the relevant contexts and periods of such changes can be identified. Our exploration tool allows users to visualize the semantics of conceptual terms and their change over time, based on digital text analysis. The visualization is intended to help conceptual historians identify and track significant changes of conceptual terms within and across different periods of time in different text corpora, as well as to narrow down relevant contexts and source documents for further study.

Distributional semantics and word sense induction enable a data-driven approach to tracking word senses over time. Distributional semantics represents the meaning of a word by its global contexts [MC 1991], allowing us to compute word similarity over large text corpora, such as the Google Books collection, using the graph-based JobimText framework [BR 2013].
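
The idea of comparing words by their contexts can be illustrated with a minimal sketch. It is not the JobimText implementation: the function names, the one-word context window and the toy corpus are assumptions made purely for illustration, and similarity is taken to be the number of shared context words.

```python
from collections import defaultdict


def context_features(sentences, window=1):
    """Map each word to the set of words seen within `window` positions of it."""
    features = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    features[word].add(tokens[j])
    return features


def similarity(features, a, b):
    """Similarity of two words = number of context features they share."""
    return len(features[a] & features[b])


# Toy corpus (hypothetical); a real setting would use a large collection.
corpus = [
    "the crisis of power",
    "the abuse of power",
    "the crisis of freedom",
    "a bank of sand",
]
feats = context_features(corpus)
print(similarity(feats, "crisis", "abuse"))  # shares "the" and "of"
print(similarity(feats, "crisis", "bank"))   # shares only "of"
```

On realistic corpora, raw counts of this kind would typically be replaced by a ranking of the most salient context features, so that frequent function words do not dominate the score.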

Word sense induction creates data-driven hypotheses of coherent paradigms of target words, forming clusters that reflect different word senses. On time slices of time-stamped text corpora, we can access the formation, change and demise of word meanings [MMM+ 2014].

Our contribution consists of SCoT (Sense Clustering over Time), a web interface to access the different senses of a word as they change over time. The paradigms of the target word are displayed as a graph, where word nodes are connected with edges indicating their similarity.
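
A similarity graph of this kind, and a grouping of its nodes into candidate senses, might be sketched as follows. The abstract does not name the clustering algorithm used in SCoT, so connected components serve here only as a simple stand-in; the similarity scores, the threshold and the function names are hypothetical.

```python
def build_graph(similarities, threshold):
    """Undirected graph over words; keep an edge if its score reaches the threshold."""
    graph = {}
    for (a, b), score in similarities.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def sense_clusters(graph):
    """Group nodes into connected components (a stand-in for graph clustering)."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical similarity scores among paradigms of the target word "crisis".
similarities = {
    ("recession", "depression"): 0.8,
    ("recession", "inflation"): 0.7,
    ("illness", "fever"): 0.9,
    ("fever", "depression"): 0.2,
}
graph = build_graph(similarities, threshold=0.5)
clusters = sense_clusters(graph)
print(clusters)
```

With the threshold at 0.5, the weak link between "fever" and "depression" is dropped, so the economic and the medical paradigms end up in separate clusters.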

The interface allows for parameterization of the graph creation and display, as well as for setting time intervals of interest. For the formation of concepts, graph clustering provides an automatic initialization of colour-coded sense clusters, which can be labelled and post-edited by the user, since clustering is known to produce distinctions that are highly correlated with, but not necessarily congruent to, the user’s needs. For the visualization of differences, colour coding is employed to show, for selected time intervals, which paradigms or senses are stable, are added or fall out of use. This mode also allows for stepping through the time intervals to visually analyse the dynamics of change. Across both modes, it is possible to pin nodes onto the visualization canvas to ensure visual continuity of senses across time slices. The tool is implemented as a freely accessible web interface that allows its current state to be saved and loaded locally, enabling the interruption and sharing of sessions, and will be demonstrated live.
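
The distinction between stable, added and disappearing paradigms can be expressed with plain set operations. The following is a minimal sketch under assumed data: the function name, the category labels and the two example time slices are hypothetical and are not taken from the tool.

```python
def compare_intervals(earlier, later):
    """Classify paradigms of a target word across two time intervals."""
    earlier, later = set(earlier), set(later)
    return {
        "stable": sorted(earlier & later),   # present in both intervals
        "added": sorted(later - earlier),    # new in the later interval
        "dropped": sorted(earlier - later),  # fallen out of use
    }


# Hypothetical paradigms of the target word "crisis" in two time slices.
slice_a = ["illness", "fever", "recession"]
slice_b = ["recession", "inflation", "climate"]

diff = compare_intervals(slice_a, slice_b)
for status, words in diff.items():
    print(status, words)
```

Each of the three categories could then be mapped to its own colour when the graph is drawn.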

References:
Biemann, C. & M. Riedl (2013). Text: Now in 2D! A Framework for Lexical Expansion with Contextual Similarity. J Lang Mod 1(1):55-95.
Friedrich, A. & C. Biemann (2016). Digitale Begriffsgeschichte? Methodologische Überlegungen und exemplarische Versuche am Beispiel moderner Netzsemantik. Forum Interdisziplinäre Begriffsgeschichte 5(2):78-96.
Miller, G. & W. Charles (1991). Contextual Correlates of Semantic Similarity. Language and Cognitive Processes 6(1):1-28.
Mitra, S., R. Mitra, S. Maity et al. (2015). An automatic approach to identify word sense changes in text media across timescales. J Nat Lang Eng 21(05):773-798.
