AlBERTo: Italian BERT language understanding model for NLP challenging tasks based on tweets
Contribution in conference proceedings
Publication Date:
2019
Abstract:
Recent scientific studies on natural language processing (NLP) report the outstanding effectiveness of context-dependent and task-free language understanding models such as ELMo, GPT, and BERT. Specifically, these models have been shown to achieve state-of-the-art performance on numerous complex NLP tasks, such as question answering and sentiment analysis, in the English language. Following the great popularity and effectiveness that these models are gaining in the scientific community, we trained a BERT language understanding model for the Italian language (AlBERTo). In particular, AlBERTo is focused on the language used on social networks, specifically Twitter. To demonstrate its robustness, we evaluated AlBERTo on the EVALITA 2016 SENTIPOLC (SENTIment POLarity Classification) task, obtaining state-of-the-art results in subjectivity, polarity, and irony detection on Italian tweets. The pre-trained AlBERTo model will be publicly distributed through the GitHub platform at https://github.com/marcopoli/AlBERTo-it in order to facilitate future research.
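The abstract describes a pre-trained BERT-style checkpoint applied to tweet classification tasks such as SENTIPOLC subjectivity detection. As a minimal, hypothetical sketch (not the authors' original pipeline), the Python snippet below shows how such a checkpoint could be loaded for binary classification with the Hugging Face transformers library; the checkpoint path, the label count, and the example tweet are illustrative assumptions. See the GitHub repository above for the actual distribution of the pre-trained weights.

    # Hypothetical sketch: inference with an AlBERTo-like checkpoint using the
    # Hugging Face transformers library. The checkpoint path is an assumption;
    # obtain the released weights from https://github.com/marcopoli/AlBERTo-it
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    CHECKPOINT = "path/to/alberto-checkpoint"  # assumption: local path to the released model

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    # num_labels=2 matches a binary task such as SENTIPOLC subjectivity detection
    model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)
    model.eval()

    tweet = "Che bella giornata a Bari!"  # example Italian tweet
    inputs = tokenizer(tweet, return_tensors="pt", truncation=True, max_length=128)

    with torch.no_grad():
        logits = model(**inputs).logits
    print("predicted class:", logits.argmax(dim=-1).item())

A classification head on top of the pre-trained encoder would normally be fine-tuned on task data before such predictions are meaningful; the snippet only illustrates the loading and tokenization flow.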
CRIS Type:
04A-Conference paper in volume
Authors:
Polignano M.; Basile P.; de Gemmis M.; Semeraro G.; Basile V.
Book Title:
Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)
Published in: