Publication Date:
2022
Abstract:
Data augmentation is becoming very popular in Natural Language Generation (NLG). Different approaches have been used in NLP and NLG to augment data and increase the number of training examples for neural models, yet no studies have performed augmentation on logical input, i.e., Discourse Representation Structures (DRS). We present data augmentation of DRSs, taken from the Parallel Meaning Bank (PMB) corpus, for the DRS-to-text generation task. We conducted our experiments on a standard bi-LSTM-based sequence-to-sequence model, thus creating an end-to-end neural approach for generating English sentences from DRSs. We evaluated the output generated by word-level and character-level decoders using reference-based evaluation metrics such as BLEU, ROUGE, METEOR, NIST, and CIDEr. Training on augmented DRSs achieved better results than training on DRSs without augmentation. To assess the significance of our results, we conducted statistical significance tests, i.e., the Shapiro-Wilk Test (to check data normality) and the Wilcoxon Test (to test model significance). The Wilcoxon results show that our model is significantly better, with p-value = 2.37e-05 for the character-level model and p-value = 7.78e-07 for the word-level model.
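As a rough illustration of the significance-testing procedure mentioned in the abstract (not the authors' code), the sketch below applies the Shapiro-Wilk and Wilcoxon tests to paired per-sentence scores with scipy.stats; the score lists are hypothetical placeholders, not the paper's data.

```python
# Minimal sketch of the significance-testing step, assuming paired
# per-sentence scores for the baseline and the augmented model.
from scipy.stats import shapiro, wilcoxon

# Hypothetical per-sentence scores on the same test sentences.
baseline_scores  = [0.41, 0.38, 0.52, 0.47, 0.33, 0.45, 0.50, 0.36]
augmented_scores = [0.46, 0.42, 0.55, 0.49, 0.37, 0.48, 0.56, 0.40]

# Shapiro-Wilk test: checks whether the paired score differences are
# normally distributed; non-normality motivates a non-parametric test.
diffs = [a - b for a, b in zip(augmented_scores, baseline_scores)]
sw_stat, sw_p = shapiro(diffs)
print(f"Shapiro-Wilk: W={sw_stat:.3f}, p={sw_p:.3g}")

# Wilcoxon signed-rank test: non-parametric paired test of whether the
# augmented model's scores differ significantly from the baseline's.
w_stat, w_p = wilcoxon(augmented_scores, baseline_scores)
print(f"Wilcoxon: W={w_stat:.3f}, p={w_p:.3g}")
```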
CRIS Type:
04A-Conference paper in volume
Keywords:
Bi-LSTM; Data Augmentation; DRS-to-Text Generation; Neural Network; Parallel Meaning Bank (PMB); Shapiro-Wilk Test; Statistical Significance Test; Wilcoxon Test
Authors:
Amin M.S.; Mazzei A.; Anselma L.
Book Title:
Proceedings of the Sixth Workshop on Natural Language for Artificial Intelligence (NL4AI 2022) co-located with the 21st International Conference of the Italian Association for Artificial Intelligence (AI*IA 2022)