Learned representations and what they encode

Learned continuous embeddings for language units were among the first tentative steps toward making neural networks useful for natural language processing (NLP), and they promised a future of semantically rich representations for downstream applications. NLP has since seen some of the progress that happened earlier in image processing: increased computing power and better algorithms have allowed people to train larger models that perform better than ever. Such models also make transfer learning possible for language tasks, leveraging large, widely available datasets.

In 2016, Bolukbasi et al. presented their paper "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings", shedding light on some of the gender bias present in trained word embeddings at the time. Datasets encode the social bias that surrounds us, and models trained on that data may expose the bias in their decisions. Similarly, learned representations may encode sensitive details about individuals in the datasets, allowing such information to be disclosed through distributed models or their outputs. All of these aspects are crucial in many application areas, not least in the processing of medical texts.

Some solutions have been proposed to limit the expression of social bias in NLP systems. These include techniques such as data augmentation, representation calibration, and adversarial learning. Similar approaches may also be relevant for privacy and disentangled representations. In this talk, we'll discuss some of these issues, and go through some of the solutions that have been proposed recently to limit bias and to enhance privacy in various settings.
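
As a rough illustration of what calibrating a representation can look like, the sketch below removes the component of a word vector that lies along an estimated gender direction, in the spirit of the neutralize step described by Bolukbasi et al. The toy vectors, the function name `neutralize`, and the use of a single word pair to estimate the direction are all assumptions made for illustration; they are not taken from the talk or from any trained model.

```python
import numpy as np


def neutralize(vec, bias_direction):
    """Remove the component of `vec` that lies along `bias_direction`."""
    unit = bias_direction / np.linalg.norm(bias_direction)
    return vec - np.dot(vec, unit) * unit


# Toy 3-dimensional "embeddings", invented for illustration only.
he = np.array([1.0, 0.2, 0.1])
she = np.array([-1.0, 0.2, 0.1])
programmer = np.array([0.4, 0.9, 0.3])

# Assumption: estimate the bias direction from one definitional word pair.
gender_direction = he - she

debiased = neutralize(programmer, gender_direction)

# The debiased vector has no remaining component along the direction.
print(np.dot(programmer, gender_direction))
print(np.dot(debiased, gender_direction))
```

A projection like this removes only a single linear component, so it should be read as a starting point rather than a complete remedy.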


  • Kiela & Bottou, EMNLP 2014, Learning Image Embeddings using Convolutional Neural Networks for Improved Multi-Modal Semantics
  • Kågebäck, Mogren, Tahmasebi, Dubhashi, 2014, Extractive summarization using continuous vector space models
  • Bolukbasi et al., NeurIPS 2016, Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
  • Caliskan, A., Bryson, J.J., and Narayanan, A. 2017. Semantics derived automatically from language corpora contain human-like biases. Science 356(6334):183–186
  • Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang (EMNLP 2017) Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints
  • Zhao et al., EMNLP 2018, Learning Gender-Neutral Word Embeddings
  • Sahlgren & Ohlsson, 2018, Gender Bias in Pretrained Swedish Embeddings
  • Zhao et al., NAACL 2018, Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods
  • Rudinger et al., NAACL 2018, Gender Bias in Coreference Resolution
  • Zhang et al., AIES 2018, Mitigating Unwanted Biases with Adversarial Learning
  • Sato et al., ACL 2019, Effective Adversarial Regularization for Neural Machine Translation
  • Wang et al., ICML 2019, Improving Neural Language Modeling via Adversarial Training
  • Sheng, Chang, Natarajan, Peng (EMNLP 2019) The Woman Worked as a Babysitter: On Biases in Language Generation
  • Friedrich, M., Köhn, A., Wiedemann, G., & Biemann, C. (ACL 2019). Adversarial Learning of Privacy-Preserving Text Representations for De-Identification of Medical Records. arXiv preprint arXiv:1906.05000
  • Jieyu Zhao, Tianlu Wang, Mark Yatskar, Ryan Cotterell, Vicente Ordonez, and Kai-Wei Chang (NAACL 2019) Gender bias in contextualized word embeddings.
  • Yi Chern Tan and L. Elisa Celis. (NeurIPS 2019) Assessing social and intersectional biases in contextualized word representations
  • Vig, J., Gehrmann, S., Belinkov, Y., Qian, S., Nevo, D., Singer, Y., & Shieber, S. (NeurIPS 2020). Investigating gender bias in language models using causal mediation analysis.
  • Shokri, R., & Shmatikov, V. (2015, October). Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC conference on computer and communications security (pp. 1310-1321).
  • Shokri et al., Membership inference attacks against machine learning models
  • Wang et al., Beyond inferring class representatives: User-level privacy leakage from federated learning
  • Truex et al., A Hybrid Approach to Privacy-Preserving Federated Learning
  • Bagdasaryan et al., How To Backdoor Federated Learning
  • Stealing Machine Learning Models via Prediction APIs, USENIX Security, 2016
  • Krishna, K., Tomar, G.S., Parikh, A.P., Papernot, N., Iyyer, M. (ICLR 2020), Thieves on Sesame Street! Model Extraction of BERT-based APIs
  • Geiping, Bauermeister, Dröge, Moeller (2020) Inverting Gradients - How easy is it to break privacy in federated learning?
  • Martinsson, J., Listo Zec, E., Gillblad, D., & Mogren, O. (2020). Adversarial representation learning for synthetic replacement of private attributes.

Slides (PDF)

CLASP Seminar, Gothenburg University, 2021-01-20
Olof Mogren

Olof Mogren, PhD, RISE Research Institutes of Sweden.