UNI-FIND

Analysis and classification of privacy-sensitive content in social media posts

Article
Publication date:
2022
Abstract:
User-generated content often contains private information, even when it is shared publicly on social media and on the web in general. Although many filtering and natural language approaches for automatically detecting obscenities or hate speech have been proposed, determining whether a shared post contains sensitive information is still an open issue. The problem has been addressed by assuming, for instance, that sensitive content is published anonymously, on anonymous social media platforms or with more restrictive privacy settings, but these assumptions are far from being realistic, since the authors of posts often underestimate or overlook their actual exposure to privacy risks. Hence, in this paper, we address the problem of content sensitivity analysis directly, by presenting and characterizing a new annotated corpus with around ten thousand posts, each one annotated as sensitive or non-sensitive by a pool of experts. We characterize our data with respect to the closely-related problem of self-disclosure, pointing out the main differences between the two tasks. We also present the results of several deep neural network models that outperform previous naive attempts at classifying social media posts according to their sensitivity, and show that state-of-the-art approaches based on anonymity and lexical analysis do not work in realistic application scenarios.
CRIS type:
03A-Journal article
Keywords:
Privacy, Text classification, Content analysis
Author list:
Bioglio, Livio; Pensa, Ruggero G.
University authors:
BIOGLIO Livio
PENSA Ruggero Gaetano
Link to the full record:
https://iris.unito.it/handle/2318/1845548
Link to the full text:
https://iris.unito.it/retrieve/handle/2318/1845548/951389/epds2022_open.pdf
Published in:
EPJ DATA SCIENCE
Journal

General Data

URL

https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-022-00324-y

Research Areas

Sectors (12)


PE6_11 - Machine learning, statistical data processing and applications using signal processing (e.g. speech, image, video) - (2024)

FOOD, AGRICULTURE and LIVESTOCK - Veterinary Pharmacology

CULTURE, ART and CREATIVITY - Modern Cultures

COMPUTER SCIENCE, AUTOMATION and ARTIFICIAL INTELLIGENCE - Digitalization of Culture and Creativity

COMPUTER SCIENCE, AUTOMATION and ARTIFICIAL INTELLIGENCE - Digitalization of Society and Public Administration

COMPUTER SCIENCE, AUTOMATION and ARTIFICIAL INTELLIGENCE - Industry X.0

COMPUTER SCIENCE, AUTOMATION and ARTIFICIAL INTELLIGENCE - Health and Computer Science

LANGUAGES and LITERATURE - Linguistics

PLANET EARTH, ENVIRONMENT, CLIMATE, ENERGY and SUSTAINABILITY - Environmental Law

PLANET EARTH, ENVIRONMENT, CLIMATE, ENERGY and SUSTAINABILITY - Computer Science and Environment

LIFE SCIENCES and PHARMACOLOGY - Pharmaceutical and Cosmetic Technologies

MATHEMATICAL, CHEMICAL and PHYSICAL SCIENCES - Mathematical Theories and Models