I am a senior research scientist at RISE Research Institutes of Sweden, heading the Deep Learning Research Group in Gothenburg. I have a PhD from Chalmers University of Technology, and I am the organizer of the RISE Learning Machines Seminars.
I work on problems in applied AI where privacy, fairness, and efficiency are central. This includes work on federated learning, privacy-preserving representation learning, and generative adversarial networks, across data modalities such as natural language, vision, and speech.
Some of our ongoing projects include The Federated Learning Testbed, The Swedish Medical Data Lab, AI Driven Financial Risk Assessment of Circular Business Models, and Smart Fire Detection.
Read more about me, or about my research group.
Arxiv 2020
Federated learning using a mixture of experts: Federated learning has received attention for its efficiency and privacy benefits in settings where data is distributed among devices. Although federated learning shows significant promise as a key approach when data cannot be shared or centralized, current incarnations show limited privacy properties and have shortcomings when applied to common real-world scenarios. One such scenario is heterogeneous data among devices, where data may come from different generating distributions. In this paper, we propose a federated learning framework using a mixture of experts to balance the specialist nature of a locally trained model with the generalist knowledge of a global model in a federated learning setting. Our results show that the mixture of experts model is better suited as a personalized model for devices when data is heterogeneous, outperforming both global and local models. Furthermore, our framework gives strict privacy guarantees, which allows clients to select parts of their data that may be excluded from the federation. The evaluation shows that the proposed solution is robust to the setting where some users require a strict privacy setting and do not disclose their models to a central server at all, opting out from the federation partially or entirely. The proposed framework is general enough to include any kind of machine learning models, and can even use combinations of different kinds.
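As a minimal sketch of the central idea (the module names and the gating form here are illustrative, not the paper's exact architecture), a per-client model could mix a local specialist with the federated global model through a learned gate:

```python
import torch
import torch.nn as nn

class PersonalizedMoE(nn.Module):
    """Illustrative sketch: mix a locally trained expert with the
    federated global model through a small gating network."""

    def __init__(self, local_expert: nn.Module, global_model: nn.Module, in_dim: int):
        super().__init__()
        self.local_expert = local_expert    # trained only on this client's data
        self.global_model = global_model    # weights produced by the federation
        self.gate = nn.Sequential(nn.Linear(in_dim, 1), nn.Sigmoid())

    def forward(self, x):
        w = self.gate(x.flatten(1))                  # per-example mixing weight in (0, 1)
        return w * self.local_expert(x) + (1 - w) * self.global_model(x)
```

Because the gate is learned per client, a device whose data matches the global distribution can lean on the generalist, while a device with unusual data can favor its specialist.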
Arxiv 2020
Adversarial representation learning for private speech generation: As more and more data is collected in various settings across organizations, companies, and countries, there has been an increasing demand for user privacy. Developing privacy-preserving methods for data analytics is thus an important area of research. In this work we present a model based on generative adversarial networks (GANs) that learns to obfuscate specific sensitive attributes in speech data. We train a model that learns to hide sensitive information in the data, while preserving the meaning of the utterance. The model is trained in two steps: first to filter sensitive information in the spectrogram domain, and then to generate new and private information independent of the filtered one. The model is based on a U-Net CNN that takes mel-spectrograms as input. A MelGAN is used to invert the spectrograms back to raw audio waveforms. We show that it is possible to hide sensitive information such as gender by generating new data, trained adversarially to maintain utility and realism.
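A hedged sketch of what one training step of the filtering stage could look like (the function and network names are placeholders, not the paper's implementation; the U-Net and the sensitive-attribute classifier are passed in as modules):

```python
import torch.nn.functional as F

def filter_step(filter_net, adv_clf, mel, gender, opt_f, opt_a, lam=1.0):
    """One adversarial training step (illustrative). `filter_net` stands in
    for the U-Net over mel-spectrograms, `adv_clf` for a classifier that
    tries to recover the sensitive attribute (here, gender)."""
    # 1) Train the adversary to recover gender from filtered spectrograms.
    opt_a.zero_grad()
    adv_loss = F.cross_entropy(adv_clf(filter_net(mel).detach()), gender)
    adv_loss.backward()
    opt_a.step()

    # 2) Train the filter to stay close to the input (preserving content)
    #    while maximizing the adversary's loss (hiding the attribute).
    opt_f.zero_grad()
    filtered = filter_net(mel)
    loss = F.l1_loss(filtered, mel) - lam * F.cross_entropy(adv_clf(filtered), gender)
    loss.backward()
    opt_f.step()
```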
Arxiv 2020
Adversarial representation learning for synthetic replacement of private attributes: Data privacy is an increasingly important aspect of the analysis of big data for many real-world tasks. Privacy-enhancing transformations of data can help unlock the potential in data sources containing sensitive information, but finding the right balance between privacy and utility is often a tricky trade-off. In this work, we study how adversarial representation learning can be used to ensure the privacy of users, and to obfuscate sensitive attributes in existing datasets. While previous methods using this kind of approach only aim at obfuscating the sensitive information, we find that adding new information in its place strengthens the provided privacy. We propose a two-step data privatization method that builds on generative adversarial networks: in the first step, sensitive data is removed from the representation, and in the second step, a sample which is independent of the input data is inserted in its place. The result is an approach that can provide stronger privatization on image data while preserving both the domain and the utility of the inputs.
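A high-level sketch of the two-step scheme at inference time, with hypothetical filter_net and generator modules (the latent_dim attribute, the generator signature, and the binary attribute sampling are all illustrative assumptions):

```python
import torch

def privatize(filter_net, generator, x):
    """Two-step privatization (sketch): remove the sensitive attribute,
    then synthesize a replacement that is independent of the input."""
    x_filtered = filter_net(x)                        # step 1: obfuscate
    z = torch.randn(x.size(0), generator.latent_dim)  # fresh noise, independent of x
    fake_attr = torch.randint(0, 2, (x.size(0),))     # sample a synthetic attribute label
    return generator(x_filtered, z, fake_attr)        # step 2: insert the new attribute
```

The key point is that the inserted attribute is sampled independently of the input, so an attacker who detects it learns nothing about the original.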
JHIR
Blood glucose prediction with variance estimation using recurrent neural networks: Many factors affect blood glucose levels in type 1 diabetics, several of which vary largely both in magnitude and delay of the effect. Modern rapid-acting insulins generally have a peak time after 60–90 min, while carbohydrate intake can affect blood glucose levels more rapidly for high glycemic index foods, or slower for other carbohydrate sources. It is important to have good estimates of the development of glucose levels in the near future both for diabetic patients managing their insulin distribution manually, as well as for closed-loop systems making decisions about the distribution. Modern continuous glucose monitoring systems provide excellent sources of data to train machine learning models to predict future glucose levels. In this paper, we present an approach for predicting blood glucose levels for diabetics up to 1 h into the future. The approach is based on recurrent neural networks trained in an end-to-end fashion, requiring nothing but the glucose level history for the patient. Our approach obtains results that are comparable to the state of the art on the Ohio T1DM dataset for blood glucose level prediction. In addition to predicting the future glucose value, our model provides an estimate of its certainty, helping users to interpret the predicted levels. This is realized by training the recurrent neural network to parameterize a univariate Gaussian distribution over the output. The approach needs no feature engineering or data preprocessing and is computationally inexpensive. We evaluate our method using the standard root-mean-squared error (RMSE) metric, along with a blood glucose-specific metric called the surveillance error grid (SEG). We further study the properties of the distribution that is learned by the model, using experiments that determine the nature of the certainty estimate that the model is able to capture.
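A minimal sketch of such a network, assuming a PyTorch-style LSTM over the glucose history (layer sizes and architecture details are illustrative, not the exact model from the paper):

```python
import torch
import torch.nn as nn

class GlucoseRNN(nn.Module):
    """Sketch: an RNN that parameterizes a univariate Gaussian
    (mean and variance) over the future glucose value."""

    def __init__(self, hidden=64):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)       # -> (mean, log-variance)

    def forward(self, history):                # history: (batch, time, 1)
        _, (h, _) = self.rnn(history)
        mu, log_var = self.head(h[-1]).chunk(2, dim=-1)
        return mu, log_var

def gaussian_nll(mu, log_var, target):
    # Negative log-likelihood of target under N(mu, exp(log_var)),
    # up to an additive constant.
    return 0.5 * (log_var + (target - mu) ** 2 / log_var.exp()).mean()
```

Training with this loss makes the predicted variance grow exactly where the point estimate is unreliable, which is what gives users an interpretable certainty signal.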
CVCREATIVE 2019
Semantic segmentation of fashion images using feature pyramid networks: We approach fashion image analysis through semantic segmentation of fashion images, using both textural information and cues from shape and context, where target classes are clothing categories. Our main contributions are state-of-the-art semantic segmentation of fashion images with modest memory and compute requirements.
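As a rough illustration of the building blocks involved (using torchvision's FeaturePyramidNetwork; the channel counts and the number of clothing classes are chosen arbitrarily, not taken from the paper):

```python
from collections import OrderedDict
import torch
from torchvision.ops import FeaturePyramidNetwork

# Fuse multi-scale backbone features with an FPN, then predict per-pixel
# clothing classes from the highest-resolution pyramid level.
fpn = FeaturePyramidNetwork(in_channels_list=[256, 512, 1024, 2048], out_channels=256)
feats = OrderedDict(
    (name, torch.randn(1, c, s, s))
    for name, c, s in [("c2", 256, 64), ("c3", 512, 32), ("c4", 1024, 16), ("c5", 2048, 8)]
)
pyramid = fpn(feats)                      # same keys, all levels now 256 channels
seg_head = torch.nn.Conv2d(256, 21, 1)    # e.g. 21 clothing categories (illustrative)
logits = seg_head(pyramid["c2"])          # upsampled to input resolution in practice
```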
JLM 2019
Character-based recurrent neural networks for morphological relational reasoning: We present a model for predicting inflected word forms based on morphological analogies. Previous work includes rule-based algorithms that determine and copy affixes from one word to another, with limited support for varying inflectional patterns. In related tasks such as morphological reinflection, the algorithm is provided with an explicit enumeration of morphological features which may not be available in all cases. In contrast, our model is feature-free: instead of explicitly representing morphological features, the model is given a demo pair that implicitly specifies a morphological relation (such as write:writes specifying infinitive:present). Given this demo relation and a query word (e.g. watch), the model predicts the target word (e.g. watches). To address this task, we devise a character-based recurrent neural network architecture using three separate encoders and one decoder. Our experimental evaluation on five different languages shows that the exact form can be predicted with high accuracy, consistently beating the baseline methods. In particular, for English the prediction accuracy is 95.60%. The solution is not limited to copying affixes from the demo relation, but generalizes to words with varying inflectional patterns, and can abstract away from the orthographic level to the level of morphological forms.
The source code used for the experiments can be downloaded from https://github.com/olofmogren/char-rnn-wordrelations.
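For intuition, a simplified sketch of the three-encoder/one-decoder structure might look as follows (dimensions and GRU cells are illustrative; see the repository above for the actual implementation):

```python
import torch
import torch.nn as nn

class AnalogyModel(nn.Module):
    """Sketch: encode the demo pair (e.g. write -> writes) and the query
    (watch) with three encoders, then decode the target (watches)
    character by character."""

    def __init__(self, n_chars, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.enc_src = nn.GRU(emb, hidden, batch_first=True)    # demo source form
        self.enc_tgt = nn.GRU(emb, hidden, batch_first=True)    # demo target form
        self.enc_query = nn.GRU(emb, hidden, batch_first=True)  # query word
        self.decoder = nn.GRU(emb, 3 * hidden, batch_first=True)
        self.out = nn.Linear(3 * hidden, n_chars)

    def forward(self, demo_src, demo_tgt, query, prev_chars):
        # Final hidden state from each encoder, concatenated as decoder init.
        hs = [enc(self.embed(x))[1][-1] for enc, x in
              ((self.enc_src, demo_src), (self.enc_tgt, demo_tgt), (self.enc_query, query))]
        h0 = torch.cat(hs, dim=-1).unsqueeze(0)
        dec_out, _ = self.decoder(self.embed(prev_chars), h0)
        return self.out(dec_out)              # logits over next characters
```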
CML 2016
C-RNN-GAN: Continuous recurrent neural networks with adversarial training: Generative adversarial networks have been proposed as a way of efficiently training deep generative neural networks. We propose a generative adversarial model that works on continuous sequential data, and apply it by training it on a collection of classical music. We conclude that it generates music that sounds better and better as the model is trained, report statistics on generated music, and let the reader judge the quality by downloading the generated songs.
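A minimal sketch of the generator/discriminator pairing on continuous sequences (layer sizes and the four-valued event representation, e.g. tone, duration, intensity, and timing, are illustrative simplifications of the paper's setup):

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Recurrent generator emitting continuous-valued events from noise."""
    def __init__(self, noise_dim=32, hidden=128, out_dim=4):
        super().__init__()
        self.rnn = nn.LSTM(noise_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, z):                    # z: (batch, time, noise_dim)
        h, _ = self.rnn(z)
        return self.out(h)                   # continuous sequence, no softmax

class Discriminator(nn.Module):
    """Bidirectional recurrent discriminator scoring whole sequences."""
    def __init__(self, in_dim=4, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 1)

    def forward(self, x):
        h, _ = self.rnn(x)
        return self.out(h.mean(dim=1))       # one realism score per sequence
```

Because the generator outputs real-valued vectors rather than a softmax over tokens, the adversarial gradient flows end to end without any discrete sampling step.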
2021-01-20
Learned representations and what they encode: Learned continuous embeddings for language units were among the first tentative steps toward making neural networks useful for natural language processing (NLP), and promised a future with semantically rich representations for downstream solutions. NLP has now seen some of the progress that previously happened in image processing: the availability of increased computing power and the development of algorithms have allowed people to train larger models that perform better than ever. Such models also make it possible to use transfer learning for language tasks, thus leveraging large widely available datasets.
In 2016, Bolukbasi et al. presented their paper "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings", shedding light on some of the gender bias present in trained word embeddings at the time. Datasets inevitably encode the social bias that surrounds us, and models trained on that data may expose the bias in their decisions. Similarly, learned representations may encode sensitive details about individuals in the datasets, allowing the disclosure of such information through distributed models or their outputs. All of these aspects are crucial in many application areas, not least in the processing of medical texts.
Some solutions have been proposed to limit the expression of social bias in NLP systems. These include techniques such as data augmentation, representation calibration, and adversarial learning. Similar approaches may also be relevant for privacy and disentangled representations. In this talk, we'll discuss some of these issues, and go through some of the solutions that have been proposed recently to limit bias and to enhance privacy in various settings.
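As one concrete example of representation calibration, here is a simplified take on hard debiasing in the spirit of Bolukbasi et al. (the original method uses PCA over several definitional pairs and only neutralizes gender-neutral words; this sketch collapses that to a single projection, and the word lists are illustrative):

```python
import numpy as np

def debias(emb: dict, pairs=(("he", "she"), ("man", "woman"))) -> dict:
    """Sketch: estimate a gender direction from definitional pairs and
    project it out of every word vector. `emb` maps words to numpy arrays."""
    g = np.mean([emb[a] - emb[b] for a, b in pairs], axis=0)
    g /= np.linalg.norm(g)                            # unit bias direction
    return {w: v - np.dot(v, g) * g for w, v in emb.items()}
```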
2020-11-27
Social bias and fairness in NLP: Learned continuous representations for language units were among the first tentative steps toward making neural networks useful for natural language processing (NLP), and promised a future with semantically rich representations for downstream solutions. NLP has now seen some of the progress that previously happened in image processing: the availability of increased computing power and the development of algorithms have allowed people to train larger models that perform better than ever. Such models also make it possible to use transfer learning for language tasks, thus leveraging large widely available datasets.
In 2016, Bolukbasi et al. presented their paper “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings”, shedding light on some of the gender bias present in trained word embeddings at the time. Datasets inevitably encode the social bias that surrounds us, and models trained on that data may expose the bias in their decisions. It is important to be aware of what information a learned system is basing its predictions on. Some solutions have been proposed to limit the expression of societal bias in NLP systems. These include techniques such as data augmentation and representation calibration. Similar approaches may also be relevant for privacy and disentangled representations. In this talk, we’ll discuss some of these issues, and go through some of the solutions that have been proposed recently.
2020-11-05
Uncertainty in deep learning: Our world is full of uncertainties: measurement errors, modeling errors, or uncertainty due to test-data being out-of-distribution are some examples. Machine learning systems are increasingly being used in crucial applications such as medical decision making and autonomous vehicle control: in these applications, mistakes due to uncertainties can be life threatening.
Deep learning has demonstrated astonishing results for many different tasks. But in general, predictions are deterministic and give only a point estimate as output. A trained model may seem confident in predictions where the uncertainty is in fact high. To cope with uncertainties, and to make decisions that are reasonable and safe under realistic circumstances, AI systems need to be developed with uncertainty strategies in mind. Machine learning approaches with uncertainty estimates can enable active learning: an acquisition function based on model uncertainty can guide data collection and labeling. Uncertainty estimates can also be used to improve sample efficiency for reinforcement learning approaches.
In this talk, we will connect deep learning with Bayesian machine learning, and go through some example approaches to coping with, and leveraging, the uncertainty in data and in modelling, to produce better AI systems in real world scenarios.
Video: YouTube
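One common recipe, given here as a hedged sketch, is Monte Carlo dropout (Gal & Ghahramani, 2016): keep dropout active at prediction time and read uncertainty off the spread of repeated forward passes (this simplification assumes dropout is the model's only mode-dependent layer, e.g. no batch normalization):

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 30):
    """Sketch: approximate predictive mean and uncertainty by sampling
    stochastic forward passes with dropout left on at test time."""
    model.train()                      # keep dropout layers stochastic
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    return preds.mean(dim=0), preds.std(dim=0)   # point estimate + uncertainty
```

The standard deviation across samples is exactly the kind of signal an active-learning acquisition function, or a safety check in a medical or autonomous-driving system, can act on.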