DP-DILCA: Learning Differentially Private Context-based Distances for Categorical Data (Discussion Paper)
Contribution in conference proceedings
Publication date:
2021
Abstract:
Distance-based machine learning methods have limited applicability to categorical data, since they do not capture the complexity of the relationships among different values of a categorical attribute. Nonetheless, categorical attributes are common in many application scenarios, including clinical and health records, census and survey data. Although distance learning algorithms exist for categorical data, they may disclose private information about individual records if applied to a secret dataset. To address this problem, we introduce a differentially private algorithm for learning distances between any pair of values of a categorical attribute according to the way they are co-distributed with the values of other categorical attributes forming the so-called context. We show empirically that our approach consumes little privacy budget while providing accurate distances.
CRIS type:
04A-Conference paper in volume
Keywords:
differential privacy, metric learning, categorical attributes, distance-based methods
Authors:
Elena Battaglia; Ruggero G. Pensa
Link to the full record:
Link to the full text:
Book title:
Proceedings of the 29th Italian Symposium on Advanced Database Systems (SEBD 2021)
Published in: