Improving DRS-to-Text Generation Through Delexicalization and Data Augmentation
Contribution in conference proceedings
Publication date:
2024
Abstract:
Text generation from Discourse Representation Structure (DRS) is a complex logic-to-text generation task in which lexical information, in the form of logical concepts, is translated into its corresponding textual representation. Delexicalization is the process of removing lexical information from the data; it helps the model produce textual sequences more robustly by focusing on the semantic structure of the input rather than its exact lexical content. Delexicalization is even harder to implement in the DRS-to-Text generation task, where lexical entities are anchored using WordNet synsets and thematic roles are sourced from VerbNet. In this paper, we introduce novel procedures to selectively delexicalize proper nouns and common nouns. For data transformations, we propose two types of lexical abstraction: (1) a WordNet supersense-based, contextually categorized abstraction; and (2) an abstraction based on the lexical category associated with named entities and nouns. We present a range of experiments evaluating these delexicalization hypotheses in the DRS-to-Text generation task using state-of-the-art neural sequence-to-sequence models. Furthermore, we explore data augmentation through delexicalization, evaluating on test sets built with different abstraction methodologies, i.e., with and without supersenses. Our experimental results show that delexicalization improves model generalizability compared with fully lexicalized DRS-to-Text generation. Delexicalization improved translation quality, with a significant increase in evaluation scores.
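As a rough illustration of the two abstraction types named in the abstract, the following minimal sketch delexicalizes a toy DRS. It is not the authors' procedure: the clause tuples, the SUPERSENSE lookup table, the placeholder naming scheme, and the delexicalize function are all hypothetical and chosen only for this example.

```python
# Toy lookup: WordNet-style synset id -> supersense label.
# These entries are illustrative assumptions, not data from the paper.
SUPERSENSE = {
    "dog.n.01": "noun.animal",
    "city.n.01": "noun.location",
}


def delexicalize(clauses, use_supersense=True):
    """Replace names and noun concepts with indexed placeholders.

    clauses: list of (kind, value, variables) tuples (a simplified,
    hypothetical stand-in for DRS clauses).
    Returns the abstracted clauses plus a placeholder -> original mapping
    that could later be used to relexicalize generated text.
    """
    mapping = {}
    counters = {}
    abstracted = []
    for kind, value, variables in clauses:
        parts = value.split(".")
        if kind == "name":
            label = "NAME"
        elif kind == "concept" and len(parts) == 3 and parts[1] == "n":
            # Supersense-based abstraction, or a plain lexical category.
            label = SUPERSENSE.get(value, "NOUN") if use_supersense else "NOUN"
        else:
            # Verbs, thematic roles, etc. are left lexicalized.
            abstracted.append((kind, value, variables))
            continue
        counters[label] = counters.get(label, 0) + 1
        placeholder = f"{label}_{counters[label]}"
        mapping[placeholder] = value
        abstracted.append((kind, placeholder, variables))
    return abstracted, mapping


toy_drs = [
    ("name", "Rex", "x1"),
    ("concept", "dog.n.01", "x1"),
    ("concept", "bark.v.01", "e1"),
    ("role", "Agent", "e1 x1"),
]
with_supersense, mapping = delexicalize(toy_drs, use_supersense=True)
without_supersense, _ = delexicalize(toy_drs, use_supersense=False)
```

In this sketch, training on both the original and the abstracted versions of each example would be one simple way to realize augmentation through delexicalization; whether that matches the paper's setup is an assumption.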
CRIS type:
04A-Conference paper in volume
Keywords:
Delexicalization, Data augmentation, Discourse representation structure, Formal meaning representation, Neural DRS-to-Text generation, Super senses
Authors:
Amin, Muhammad Saad; Anselma, Luca; Mazzei, Alessandro
Book title:
Natural Language Processing and Information Systems
Published in: