Data di Pubblicazione:
2024
Abstract:
Works in perspectivism and human label variation have emphasized the need to collect and leverage various voices and points of view in the whole Natural Language Processing pipeline. PERSEID places itself in this line of work. We consider the task of irony detection from short social media conversations in Italian collected from Twitter (X) and Reddit. To do so, we leverage data from MultiPICO, a recent multilingual dataset with disaggregated annotations and annotators' metadata, containing 1000 Post, Reply pairs with five annotations each on average. We aim to evaluate whether prompting LLMs with additional annotators' demographic information (namely gender only, age only, and the combination of the two) results in improved performance compared to a baseline in which only the input text is provided. The evaluation is zero-shot; and we evaluate the results on the disaggregated annotations using f1.
Tipologia CRIS:
04A-Conference paper in volume
Keywords:
Evaluation; Irony Detection; Perspectivism
Elenco autori:
Basile V.; Casola S.; Frenda S.; Lo S.M.
Link alla scheda completa:
Titolo del libro:
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024), Pisa, Italy, December 4-6, 2024
Pubblicato in: