Data privacy is an increasingly important aspect of the analysis of big data for many real-world tasks. Privacy enhancing transformations of data can help unlocking the potential in data sources containing sensitive information, but finding the right balance between privacy and utility is often a tricky trade-off. In this work, we study how adversarial representation learning can be used to ensure the privacy of users, and to obfuscate sensitive attributes in existing datasets. While previous methods using this kind of approach only aim at obfuscating the sensitive information, we find that adding new information in its place strengthens the provided privacy. We propose a two step data privatization method that builds on generative adversarial networks: in the first step, sensitive data is removed from the representation, and in the second step, a sample which is independent of the input data is inserted in its place. The result is an approach that can provide stronger privatization on image data, and yet be preserving both the domain and the utility of the inputs.
John Martinsson, Edvin Listo Zec, Daniel Gillblad, Olof Mogren