A Multi-Layered Corpus of Namibian English


Academic year: 2022


A MULTI-LAYERED CORPUS OF NAMIBIAN ENGLISH

FREDERIC ZÄHRES

BIELEFELD UNIVERSITY

For contact information, a detailed bibliography as well as a digital version of the abstract and poster, please scan the QR code.

Open Questions & Issues

Ethical concerns (e.g. consent of inactive accounts?)

How well does this corpus address sociolinguistic issues?

Good ways for ‘geo-searching’ both videos and comments?

What is recommendable software for this approach?

Are there comparable social media corpora?

Varying audio quality of videos affecting acoustic analysis

World Englishes: Namibia pt. i

Since independence in 1990, English has been Namibia’s sole official language and medium of instruction from, at least, secondary school – despite a strong Bantu-speaking majority. English had little (e.g. colonial) history in Namibia and only 3% of all Namibians use it as their home language. This (and the near-absence of post-independence linguistic research) is why Namibia presents an interesting case in the context of World Englishes.

World Englishes: Namibia pt. ii

Due to the shared history with South Africa, Afrikaans is still a strong linguistic influence and serves as a lingua franca in parts of the country, which is why World Englishes handbooks classify Namibian English as an offshoot of South African English varieties. Recent research has shown that this is not the whole truth, however: linguistic features suggesting ongoing nativization have been identified among younger speakers.

Morpho-Syntactical & Lexical Analyses

On the one hand, the corpus contains transcripts of the spoken data from the videos, created by refining YouTube's automatically generated captions.

On the other hand, the corpus contains several sources of written data. Written data is compiled on the following levels and tagged for part of speech (POS):

a. YouTube video title, captions & description

b. YouTube comments

c. Data from YouTube channel page

d. Data from further linked social media such as Twitter or Facebook
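The layering described above could be represented along the following lines. This is only a minimal sketch: the class `VlogRecord`, the layer identifiers, and `placeholder_tagger` are hypothetical names introduced for illustration, not the schema or the tagger actually used for the corpus.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A tagged token is a (word form, POS tag) pair.
TaggedToken = Tuple[str, str]

# Layer names mirror levels a-d above; the identifiers are illustrative only.
WRITTEN_LAYERS = ("video_text", "comments", "channel_page", "linked_social_media")


@dataclass
class VlogRecord:
    """One video with its spoken transcript and written layers (hypothetical schema)."""

    video_id: str
    transcript: str = ""  # refined version of the auto-generated captions
    written: Dict[str, List[str]] = field(
        default_factory=lambda: {name: [] for name in WRITTEN_LAYERS}
    )

    def add_text(self, layer: str, text: str) -> None:
        """Store one text (e.g. a single comment) under a known written layer."""
        if layer not in self.written:
            raise KeyError(f"unknown layer: {layer}")
        self.written[layer].append(text)

    def tag_layer(
        self, layer: str, tagger: Callable[[str], List[TaggedToken]]
    ) -> List[List[TaggedToken]]:
        """Apply a POS tagger to every text stored in one layer."""
        return [tagger(text) for text in self.written[layer]]


def placeholder_tagger(text: str) -> List[TaggedToken]:
    """Stand-in for a real POS tagger: splits on whitespace, tags every token 'X'."""
    return [(token, "X") for token in text.split()]
```

Keeping each layer separate while tying it to one video ID would make it possible to query a single register (for example, comments only) or to compare registers for the same content creator.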

This vlog and CMC corpus complements the already existing Corpus of Namibian Online Newspapers by Kautzsch (in prep.) with further registers and can be used to compare seemingly unique NamE morphosyntactic and lexical constructions.

6TH CONFERENCE ON CMC & SOCIAL MEDIA CORPORA | UNIVERSITY OF ANTWERP, BELGIUM | 17 & 18 SEPTEMBER 2018

Namibian YouTubers

According to Schneider’s (2016) taxonomy, natural videos constitute the vast majority of the available data. With basic search terms, over 50 unique non-professional content creators with at least three videos have been found who mainly use English; some are inactive, others upload weekly.

[wɔ˞ld]

Acoustic Analyses

The audio data allows for acoustic and auditory analyses of segmental variation based on recent observations on NamE, which include the realization of a TRAP-DRESS(-NURSE) vowel merger as well as vowel splits of the KIT and NURSE vowels (Kautzsch, Schröder & Zähres 2017).

Preliminary results: Analyses are in agreement with previous studies of phonological phenomena, especially regarding the NURSE-WORK split.
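One simple way to put a number on a suspected merger or split is to compare the mean formant positions of two lexical sets. The sketch below assumes that F1 and F2 values in Hz have already been measured per vowel token; the function names are hypothetical, and the centroid distance is only one possible measure, not necessarily the one used in the analyses reported here. It also applies no speaker normalization.

```python
import math
from statistics import mean
from typing import Iterable, Tuple

# One vowel token as (F1, F2) in Hz.
Formants = Tuple[float, float]


def centroid(tokens: Iterable[Formants]) -> Formants:
    """Mean F1 and mean F2 over all tokens of one lexical set."""
    tokens = list(tokens)
    if not tokens:
        raise ValueError("need at least one vowel token")
    return (mean(f1 for f1, _ in tokens), mean(f2 for _, f2 in tokens))


def centroid_distance(set_a: Iterable[Formants], set_b: Iterable[Formants]) -> float:
    """Euclidean distance in the F1/F2 plane between two lexical-set centroids."""
    (a1, a2), (b1, b2) = centroid(set_a), centroid(set_b)
    return math.hypot(a1 - b1, a2 - b2)
```

A small distance between, say, TRAP and DRESS tokens would be compatible with a merger, while a large distance between two subsets of NURSE tokens would be compatible with a split. Any threshold for "small" or "large" would have to be justified separately.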
