UNI-FIND

Improving DRS-to-Text Generation Through Delexicalization and Data Augmentation

Contribution in conference proceedings
Publication date:
2024
Abstract:
Text generation from Discourse Representation Structure (DRS) is a complex logic-to-text generation task in which lexical information, in the form of logical concepts, is translated into its corresponding textual representation. Delexicalization is the process of removing lexical information from the data, which helps the model produce textual sequences more robustly by focusing on the semantic structure of the input rather than its exact lexical content. Delexicalization is harder to implement in DRS-to-Text generation, where lexical entities are anchored to WordNet synsets and thematic roles are sourced from VerbNet. In this paper, we introduce novel procedures to selectively delexicalize proper nouns and common nouns. For data transformation, we propose two types of lexical abstraction: (1) a WordNet supersense-based, contextually categorized abstraction; and (2) an abstraction based on the lexical category associated with named entities and nouns. We present a series of experiments that evaluate delexicalization hypotheses in the DRS-to-Text generation task using state-of-the-art neural sequence-to-sequence models. We also explore data augmentation through delexicalization, evaluating on test sets built with different abstraction methodologies, i.e., with and without supersenses. Our experimental results show that delexicalization improves model generalizability compared with fully lexicalized DRS-to-Text generation, yielding improved translation quality with a significant increase in evaluation scores.
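As an illustration of the general idea of delexicalization described in the abstract, the following is a minimal sketch in Python. It is not the authors' procedure: the token list is a toy stand-in rather than the real DRS serialization, the `categories` lookup is a hand-written assumption (a real system might derive it from WordNet supersenses or a named-entity tagger), and the function names `delexicalize` and `relexicalize` are hypothetical.

```python
from typing import Dict, List, Tuple


def delexicalize(
    tokens: List[str], categories: Dict[str, str]
) -> Tuple[List[str], Dict[str, str]]:
    """Replace every token that has a known category with an indexed placeholder.

    Returns the abstracted token list and a placeholder-to-original mapping
    that can be used to restore the lexical items after generation.
    """
    counters: Dict[str, int] = {}   # category -> number of placeholders issued
    assigned: Dict[str, str] = {}   # original token -> placeholder
    restore: Dict[str, str] = {}    # placeholder -> original token
    output: List[str] = []
    for token in tokens:
        category = categories.get(token)
        if category is None:
            # Tokens without a category (roles, verbs, ...) stay lexicalized.
            output.append(token)
            continue
        if token not in assigned:
            counters[category] = counters.get(category, 0) + 1
            placeholder = f"{category}_{counters[category]}"
            assigned[token] = placeholder
            restore[placeholder] = token
        output.append(assigned[token])
    return output, restore


def relexicalize(tokens: List[str], restore: Dict[str, str]) -> List[str]:
    """Put the original lexical items back in place of their placeholders."""
    return [restore.get(token, token) for token in tokens]


# Toy input (an assumption for illustration, not the actual DRS format).
tokens = ["male.n.02", "Name", '"Tom"', "buy.v.01", "Agent", "car.n.01"]
# Hand-written category lookup: one named entity and one common noun.
categories = {'"Tom"': "PERSON", "car.n.01": "ARTIFACT"}

abstracted, restore = delexicalize(tokens, categories)
restored = relexicalize(abstracted, restore)
```

In this sketch only the tokens listed in `categories` are abstracted, which mirrors the selective treatment of proper nouns and common nouns mentioned above; the `restore` mapping is what would let placeholders in the generated text be replaced with the original words.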
CRIS type:
04A-Conference paper in volume
Keywords:
Delexicalization, Data augmentation, Discourse representation structure, Formal meaning representation, Neural DRS-to-Text generation, Super senses
Authors:
Amin, Muhammad Saad; Anselma, Luca; Mazzei, Alessandro
University authors:
ANSELMA Luca
MAZZEI Alessandro
Link to the full record:
https://iris.unito.it/handle/2318/2014270
Link to the full text:
https://iris.unito.it/retrieve/handle/2318/2014270/1376388/Delexicalization_for_DRS_to_Text_Generation__NLDB_2024_.pdf
Book title:
Natural Language Processing and Information Systems
Published in:
LECTURE NOTES IN COMPUTER SCIENCE (Journal, Series)

Research Areas

Sectors (12)


PE6_7 - Artificial intelligence, intelligent systems, natural language processing - (2024)

FOOD, AGRICULTURE and LIVESTOCK - Veterinary Pharmacology

CULTURE, ART and CREATIVITY - Modern Cultures

COMPUTER SCIENCE, AUTOMATION and ARTIFICIAL INTELLIGENCE - Digitalization of Culture and Creativity

COMPUTER SCIENCE, AUTOMATION and ARTIFICIAL INTELLIGENCE - Digitalization of Society and Public Administration

COMPUTER SCIENCE, AUTOMATION and ARTIFICIAL INTELLIGENCE - Health and Computer Science

LANGUAGES and LITERATURE - English and Anglo-American Studies

LANGUAGES and LITERATURE - French Studies

PLANET EARTH, ENVIRONMENT, CLIMATE, ENERGY and SUSTAINABILITY - Environmental Law

PLANET EARTH, ENVIRONMENT, CLIMATE, ENERGY and SUSTAINABILITY - Computer Science and Environment

MATHEMATICAL, CHEMICAL and PHYSICAL SCIENCES - Particle and Nuclear Physics

MATHEMATICAL, CHEMICAL and PHYSICAL SCIENCES - Innovative Laboratories, Instrumentation and Physical Modelling