Automatic Extraction of Quranic Lexis Representing Two Different Notions of Linguistic Salience: Keyness and Prosodic Prominence

Bibliographic Details
Published in: Journal of Semitic studies
Authors: Brierley, Claire (Author) ; Atwell, Eric (Author) ; Dickins, James 1957- (Author) ; Islam, Tajul (Author) ; Sawalha, Majdi (Author)
Format: Electronic Article
Language: English
Published: Oxford University Press [2018]
In: Journal of Semitic studies
Online Access: Presumably Free Access
Full text (publisher)
Full text (DOI)
Description
Summary: This paper presents two sets of lexical items automatically extracted from the Arabic Qur’ān, and denoting two different notions of linguistic salience: keyness and prosodic prominence. Our novel hypothesis investigates a possible correlation between them. Our novel findings discover distributionally significant keywords that also occur strategically in phrase-final position so as to maximise their prominence, and thus meaningfulness, for reader, reciter, and aural recipient. Our methodology first computes Quranic keywords via the Corpus Linguistics technique of Keyword Extraction, and maps them to major Quranic themes in Islamic scholarship. Next, we implement a bespoke algorithm for rule-based capture of words annotated with madd or prolongation, a specific type of prosodic highlighting in Quranic recitation rules or tajwīd. We find it especially interesting that the concept of final syllable lengthening (madd before pause) is encoded in tajwīd and effectively demarcates phrase boundaries in the Qur’ān. We concentrate on nominal keywords (i.e. nouns and adjectives) since these are more likely to be aligned with phrase edges and to bear the hallmarks of pre-boundary lengthening. This correlation between keyness and prominence occurs 43.29% of the time in our data, since 526 keywords appear in our extracted subset of 1215 nominal types tagged with madd before pause: ((526/1215)*100). Finally, we identify which Quranic keywords are most likely to be annotated with enhanced prolongation in the final syllable before pause, using an easy-to-interpret, single-value metric: the Laplace Point Estimate. Keywords that emerge as semantically weighted in terms of both distributional and prosodic significance are most likely to reflect the Quranic themes of God, Nature, and Eschatology.
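The two figures mentioned in the summary can be illustrated with a minimal Python sketch. The overlap percentage uses the counts quoted above (526 and 1215). The Laplace Point Estimate is shown in its standard add-one form, (x + 1) / (n + 2); how the authors apply it in detail is not stated in this record, so the function names and the per-word counts below are hypothetical and serve only as illustration.

```python
def overlap_percentage(shared: int, total: int) -> float:
    """Percentage of `total` items that also belong to the shared set."""
    if total <= 0:
        raise ValueError("total must be positive")
    return shared / total * 100


def laplace_point_estimate(successes: int, trials: int) -> float:
    """Standard Laplace (add-one) estimate of a proportion: (x + 1) / (n + 2)."""
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return (successes + 1) / (trials + 2)


# Counts quoted in the summary: 526 keywords among 1215 nominal types
# tagged with madd before pause.
print(f"{overlap_percentage(526, 1215):.2f}%")  # 43.29%

# HYPOTHETICAL counts (not from the paper): for each word,
# (occurrences with madd before pause, total occurrences).
hypothetical_counts = {
    "word_a": (9, 10),
    "word_b": (1, 1),
    "word_c": (40, 80),
}

# Rank words by the smoothed estimate rather than the raw proportion.
ranked = sorted(
    hypothetical_counts,
    key=lambda w: laplace_point_estimate(*hypothetical_counts[w]),
    reverse=True,
)
for word in ranked:
    x, n = hypothetical_counts[word]
    print(word, round(laplace_point_estimate(x, n), 3))
```

In this sketch, word_b has a raw proportion of 100% from a single occurrence, yet it ranks below word_a once the estimate is smoothed. This is one reason a single-value smoothed metric can be easier to interpret than raw proportions when some words are rare.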
ISSN: 1477-8556
Contained in: Journal of Semitic studies
Persistent identifiers: DOI: 10.1093/jss/fgy005