Gaussian Mixture Models for Classification and Hypothesis Tests Under Differential Privacy
dc.contributor.author | Tong, Xiaosu | |
dc.contributor.author | Xi, Bowei | |
dc.contributor.author | Kantarcioglu, Murat | |
dc.contributor.author | Inan, Ali | |
dc.date.accessioned | 2025-01-06T17:37:30Z | |
dc.date.available | 2025-01-06T17:37:30Z | |
dc.date.issued | 2017 | |
dc.description | 31st Annual IFIP WG 11.3 Conference on Data and Applications Security and Privacy (DBSec) -- JUL 19-21, 2017 -- Philadelphia, PA | |
dc.description.abstract | Many statistical models are constructed using very basic statistics: mean vectors, variances, and covariances. Gaussian mixture models are such models. When a data set contains sensitive information and cannot be directly released to users, such models can still be easily constructed from noise-added query responses, and they provide preliminary results to users. Although the queried basic statistics meet the differential privacy guarantee, the complex models constructed from these statistics may not; however, it is up to the users to decide how to query a database and how to further utilize the queried results. In this article, our goal is to understand the impact of the differential privacy mechanism on Gaussian mixture models. Our approach is to query basic statistics from a database under differential privacy protection and use the noise-added responses to build classifiers and perform hypothesis tests. We discover that adding Laplace noise may have a non-negligible effect on model outputs. For example, the variance-covariance matrix after noise addition may no longer be positive definite. We propose a heuristic algorithm to repair the noise-added variance-covariance matrix. We then examine the classification error when using the noise-added responses, through experiments with both simulated data and real-life data, and demonstrate under which conditions the impact of the added noise can be reduced. We compute the exact type I and type II errors under differential privacy for the one-sample z test, the one-sample t test, and the two-sample t test with equal variances. We then show under which conditions a hypothesis test returns a reliable result given differentially private means, variances, and covariances. | |
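The covariance-repair step described in the abstract can be sketched as follows. This is an illustrative assumption-laden sketch, not the paper's actual heuristic: it adds Laplace noise (scale = sensitivity / epsilon) to a covariance matrix and then restores positive definiteness by clipping negative eigenvalues to a small positive floor; the function names, the clipping approach, and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_noise(shape, sensitivity, epsilon, rng):
    # Laplace mechanism: noise scale b = sensitivity / epsilon
    return rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=shape)

def repair_covariance(sigma_noisy, floor=1e-6):
    # Heuristic repair (eigenvalue clipping; an assumption, not
    # necessarily the authors' algorithm): symmetrize the matrix,
    # then raise any non-positive eigenvalue to a small floor.
    sym = (sigma_noisy + sigma_noisy.T) / 2.0
    vals, vecs = np.linalg.eigh(sym)
    vals = np.clip(vals, floor, None)
    return vecs @ np.diag(vals) @ vecs.T

# Example: a true 2x2 covariance matrix perturbed by Laplace noise
sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
noisy = sigma + laplace_noise(sigma.shape, sensitivity=1.0, epsilon=0.5, rng=rng)
repaired = repair_covariance(noisy)
assert np.all(np.linalg.eigvalsh(repaired) > 0)  # positive definite again
```

With small epsilon (strong privacy), the Laplace noise scale grows and the perturbed matrix is more likely to lose positive definiteness, which is why a repair step of this kind is needed before the matrix can be used in a Gaussian mixture model.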
dc.description.sponsorship | IFIP WG 11 3 | |
dc.identifier.doi | 10.1007/978-3-319-61176-1_7 | |
dc.identifier.endpage | 141 | |
dc.identifier.isbn | 978-3-319-61176-1 | |
dc.identifier.isbn | 978-3-319-61175-4 | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.issn | 1611-3349 | |
dc.identifier.scopus | 2-s2.0-85022091269 | |
dc.identifier.scopusquality | Q3 | |
dc.identifier.startpage | 123 | |
dc.identifier.uri | https://doi.org/10.1007/978-3-319-61176-1_7 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14669/2237 | |
dc.identifier.volume | 10359 | |
dc.identifier.wos | WOS:000463615900007 | |
dc.identifier.wosquality | N/A | |
dc.indekslendigikaynak | Web of Science | |
dc.indekslendigikaynak | Scopus | |
dc.language.iso | en | |
dc.publisher | Springer International Publishing AG | |
dc.relation.ispartof | Data and Applications Security and Privacy XXXI, DBSec 2017 | |
dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | |
dc.rights | info:eu-repo/semantics/closedAccess | |
dc.snmz | KA_20241211 | |
dc.subject | Differential privacy | |
dc.subject | Statistical database | |
dc.subject | Mixture model | |
dc.subject | Classification | |
dc.subject | Hypothesis test | |
dc.title | Gaussian Mixture Models for Classification and Hypothesis Tests Under Differential Privacy | |
dc.type | Conference Object |