Abstract
Artificial Intelligence continues to transform our lives every day, with Large Language Models (LLMs) such as OpenAI’s ChatGPT and Google Gemini driving most of this progress. This research contributes to that progress by focusing on an overlooked pragmatic feature of language, ‘Illocutionary Force’, which refers to the speaker’s communicative intent within an utterance. We further investigate its impact on improving Authorship Identification, advancing efforts in both Natural Language Processing and Computational Linguistics. Leveraging the Argument Interchange Format (AIF) collection of datasets, this research develops models for classifying illocutionary forces in individual utterances and in utterance pairs whose coherence is modelled through Inference Anchoring Theory.
In the first experiment, we introduce a novel Dynamic Class Weighting method for classifying illocutionary forces across four AIF corpora: United States 2016 (US2016), Question Time 30 (QT30), Murder Mystery 123 (MM123), and Moral Maze 2012 (MM2012). We compare the performance of the new method with four state-of-the-art approaches, comprising Dialogue Act Classification models and commercial systems, on two Dialogue Act Classification benchmark datasets (Switchboard Dialogue Act and Meeting Recorder Dialogue Act). Our method outperformed the baseline methods, showing up to 11% improvements in macro F1 scores.
In the second experiment, we extend the illocutionary force classification task by modelling coherent illocutionary forces (Arguing and Disagreeing) in relation to the argument relations of Support and Attack that hold between utterances. Our fine-tuned DeBERTa-v3-large achieved higher F1 scores than RoBERTa-large (the current benchmark), establishing a new baseline for modelling illocutionary forces in utterance transitions.
Finally, we demonstrate that these two illocutionary forces in dialogue structures are significant for the Authorship Identification task set by the Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection (PAN) group in 2019. We observed an improvement in accuracy on 4 of the 5 English problem sets and an improvement in macro F1 on 3 of the 5, indicating the usefulness and potential of introducing illocutionary forces into an Authorship Identification pipeline.
| Date of Award | 2025 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Michael Crabb (Supervisor), Jacky Visser (Supervisor), Chris Reed (Supervisor) & Brian Pluss (Supervisor) |
Keywords
- Authorship Identification
- Argument Mining
- Dialogue Structure
- Deep Learning