Federated learning using a mixture of experts: Federated learning has received attention for its efficiency and privacy benefits, in settings where data is distributed among devices. Although federated learning shows significant promise as a key approach when data cannot be shared or centralized, current incarnations show limited privacy properties and have shortcomings when applied to common real-world scenarios. One such scenario is heterogeneous data among devices, where data may come from different generating distributions. In this paper, we propose a federated learning framework using a mixture of experts to balance the specialist nature of a locally trained model with the generalist knowledge of a global model in a federated learning setting. Our results show that the mixture of experts model is better suited as a personalized model for devices when data is heterogeneous, outperforming both global and local models. Furthermore, our framework gives strict privacy guarantees, which allows clients to select parts of their data that may be excluded from the federation. The evaluation shows that the proposed solution is robust to the setting where some users require a strict privacy setting and do not disclose their models to a central server at all, opting out from the federation partially or entirely. The proposed framework is general enough to include any kind of machine learning models, and can even use combinations of different kinds.
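The core idea above can be illustrated with a minimal sketch (not the authors' code; all model and function names here are my own assumptions): a per-example gating function blends a locally trained "specialist" model with the federation's "generalist" model, so each client gets a personalized prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in linear models: in the paper these would be arbitrary learned
# models (the framework is model-agnostic).
w_local = rng.normal(size=(3, 1))   # trained only on this client's data
w_global = rng.normal(size=(3, 1))  # the federated (globally trained) model
w_gate = rng.normal(size=(3, 1))    # parameters of a small gating model

def f_local(x):
    return x @ w_local

def f_global(x):
    return x @ w_global

def gate(x):
    # Per-example mixing weight in (0, 1); in practice this is learned,
    # here it is just a fixed logistic function for illustration.
    return 1.0 / (1.0 + np.exp(-(x @ w_gate)))

def personalized_prediction(x):
    # Convex combination of the specialist and generalist outputs.
    g = gate(x)
    return g * f_local(x) + (1.0 - g) * f_global(x)

x = rng.normal(size=(5, 3))
y = personalized_prediction(x)
print(y.shape)
```

Because the gate outputs a weight in (0, 1), every prediction lies between the local and global experts' outputs, interpolating per example rather than committing to one model.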
Adversarial representation learning for private speech generation: As more and more data is collected in various settings across organizations, companies, and countries, there has been an increased demand for user privacy. Developing privacy-preserving methods for data analytics is thus an important area of research. In this work we present a model based on generative adversarial networks (GANs) that learns to obfuscate specific sensitive attributes in speech data. We train a model that learns to hide sensitive information in the data, while preserving the meaning in the utterance. The model is trained in two steps: first to filter sensitive information in the spectrogram domain, and then to generate new and private information independent of the filtered one. The model is based on a U-Net CNN that takes mel-spectrograms as input. A MelGAN is used to invert the spectrograms back to raw audio waveforms. We show that it is possible to hide sensitive information such as gender by generating new data, trained adversarially to maintain utility and realism.
Adversarial representation learning for synthetic replacement of private attributes: Data privacy is an increasingly important aspect of the analysis of big data for many real-world tasks. Privacy-enhancing transformations of data can help unlock the potential in data sources containing sensitive information, but finding the right balance between privacy and utility is often a tricky trade-off. In this work, we study how adversarial representation learning can be used to ensure the privacy of users, and to obfuscate sensitive attributes in existing datasets. While previous methods using this kind of approach only aim at obfuscating the sensitive information, we find that adding new information in its place strengthens the provided privacy. We propose a two-step data privatization method that builds on generative adversarial networks: in the first step, sensitive data is removed from the representation, and in the second step, a sample which is independent of the input data is inserted in its place. The result is an approach that can provide stronger privatization on image data, while preserving both the domain and the utility of the inputs.
Semantic segmentation of fashion images using feature pyramid networks: We approach fashion image analysis through semantic segmentation of fashion images, using both textural information and cues from shape and context, where target classes are clothing categories. Our main contributions are state-of-the-art semantic segmentation of fashion images with modest memory and compute requirements.
Generative modelling of semantic segmentation data in the fashion domain: In this work, we propose a method to generatively model the joint distribution of images and corresponding semantic segmentation maps using generative adversarial networks. We extend the Style-GAN architecture by iteratively growing the network during training, to add new output channels that model the semantic segmentation maps. We train the proposed method on a large dataset of fashion images and our experimental evaluation shows that the model produces samples that are coherent and plausible with semantic segmentation maps that closely match the semantics in the image.
Character-based recurrent neural networks for morphological relational reasoning: We present a model for predicting inflected word forms based on morphological analogies. Previous work includes rule-based algorithms that determine and copy affixes from one word to another, with limited support for varying inflectional patterns. In related tasks such as morphological reinflection, the algorithm is provided with an explicit enumeration of morphological features which may not be available in all cases. In contrast, our model is feature-free: instead of explicitly representing morphological features, the model is given a demo pair that implicitly specifies a morphological relation (such as write:writes specifying infinitive:present). Given this demo relation and a query word (e.g. watch), the model predicts the target word (e.g. watches). To address this task, we devise a character-based recurrent neural network architecture using three separate encoders and one decoder. Our experimental evaluation on five different languages shows that the exact form can be predicted with high accuracy, consistently beating the baseline methods. Particularly, for English the prediction accuracy is 95.60%. The solution is not limited to copying affixes from the demo relation, but generalizes to words with varying inflectional patterns, and can abstract away from the orthographic level to the level of morphological forms.
Preliminary version appeared in Subword & Character Level Models in NLP (SCLeM) workshop at EMNLP 2017 in Copenhagen, Denmark, September 7.
The source code used for the experiments can be downloaded from https://github.com/olofmogren/char-rnn-wordrelations.
Disentanglement by Penalizing Correlation: Deep neural networks have been tremendously successful in a number of tasks. One of the main reasons for this is their capability to automatically learn representations of data in levels of abstraction, increasingly disentangling the data as the internal transformations are applied. In this paper we propose a novel regularization method that penalizes covariance between dimensions of the hidden layers in a network, which promotes disentanglement. This makes the network learn nonlinear representations that are linearly uncorrelated, yet allows the model to obtain good results on a number of tasks, as demonstrated by our experimental evaluation. The proposed technique can be used to find the dimensionality of the underlying data, because it effectively disables dimensions that are not needed. Our approach is simple and computationally cheap, as it can be applied as a regularizer to any gradient-based learning model.
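A minimal sketch of such a covariance penalty, under my own naming assumptions (not the paper's exact formulation): the regularizer sums the squared off-diagonal entries of the batch covariance matrix of the hidden activations, which pushes the learned dimensions toward being linearly uncorrelated.

```python
import numpy as np

def covariance_penalty(h):
    """h: (batch, dim) hidden activations; returns a scalar penalty that is
    zero when the dimensions are perfectly uncorrelated on this batch."""
    h_centered = h - h.mean(axis=0, keepdims=True)
    cov = h_centered.T @ h_centered / (h.shape[0] - 1)
    # Keep only the off-diagonal entries: variances themselves are not penalized.
    off_diag = cov - np.diag(np.diag(cov))
    return np.sum(off_diag ** 2)

rng = np.random.default_rng(0)

# Two almost identical dimensions: strongly correlated, large penalty.
base = rng.normal(size=(256, 1))
h_correlated = np.hstack([base, base + 0.01 * rng.normal(size=(256, 1))])

# Two independently drawn dimensions: near-zero penalty.
h_independent = rng.normal(size=(256, 2))

print(covariance_penalty(h_correlated), covariance_penalty(h_independent))
```

Added to a task loss with a weighting coefficient, this term is differentiable and so fits any gradient-based model, matching the abstract's claim that the method is cheap and generic.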
Character-based recurrent neural networks for morphological relational reasoning: Given a demo relation (a pair of word forms) and a query word, we devise a character-based recurrent neural network architecture using three separate encoders and a decoder, trained to predict the missing second form of the query word. Our results show that the exact form can be predicted for English with an accuracy of 94.7%. For Swedish, which has a more complex morphology with more inflectional patterns for nouns and verbs, the accuracy is 89.3%.
Named entity recognition in Swedish health records with character-based deep bidirectional LSTMs: We propose an approach for named entity recognition in medical data, using a character-based deep bidirectional recurrent neural network. Such models can learn features and patterns based on the character sequence, and are not limited to a fixed vocabulary. This makes them very well suited for the NER task in the medical domain. Our experimental evaluation shows promising results, with a 60% improvement in F1 score over the baseline, and our system generalizes well between different datasets.
C-RNN-GAN: Continuous recurrent neural networks with adversarial training: Generative adversarial networks have been proposed as a way of efficiently training deep generative neural networks. We propose a generative adversarial model that works on continuous sequential data, and apply it by training it on a collection of classical music. We conclude that it generates music that sounds better and better as the model is trained, report statistics on generated music, and let the reader judge the quality by downloading the generated songs.
Assisting discussion forum users using deep recurrent neural networks: In this work, we present a discussion forum assistant based on deep recurrent neural networks (RNNs). The assistant is trained to perform three different tasks when faced with a question from a user: firstly, to recommend related posts; secondly, to recommend other users that might be able to help; and thirdly, to recommend other channels in the forum where people may discuss related topics. Our recurrent forum assistant is evaluated experimentally by prediction accuracy for the end-to-end trainable parts, as well as by performing an end-user study. We conclude that the model generalizes well, and is helpful for the users.
Extractive summarization by aggregating multiple similarities: Many existing methods for extracting summaries rely on comparing the similarity of two sentences in some way. In this paper, we present new ways of measuring this similarity, based on sentiment analysis and continuous vector space representations, and show that combining these together with similarity measures from existing methods helps to create better summaries. The finding is demonstrated with MULTSUM, a novel summarization method that uses ideas from kernel methods to combine sentence similarity measures. Submodular optimization is then used to produce summaries that take several different similarity measures into account. Our method improves over the state-of-the-art on standard benchmark datasets; it is also fast and scales to large document collections, and the results are statistically significant.
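The two ingredients described above can be sketched as follows (a simplification of my own, not MULTSUM itself): several sentence-similarity matrices are combined multiplicatively, in the spirit of kernel combinations, and a greedy algorithm then maximizes a submodular coverage objective under a summary-length budget.

```python
import numpy as np

def combine_similarities(sims):
    """Elementwise product of similarity matrices (all entries in [0, 1]),
    so a sentence pair scores high only if it is similar under every measure."""
    combined = np.ones_like(sims[0])
    for s in sims:
        combined *= s
    return combined

def coverage(selected, sim):
    """Submodular objective: each sentence is 'covered' by its most similar
    selected sentence; total coverage has diminishing returns."""
    if not selected:
        return 0.0
    return float(np.sum(np.max(sim[:, selected], axis=1)))

def greedy_summary(sim, budget):
    """Standard greedy maximization, which carries the usual (1 - 1/e)
    approximation guarantee for monotone submodular objectives."""
    selected = []
    candidates = set(range(sim.shape[0]))
    while candidates and len(selected) < budget:
        best = max(candidates, key=lambda i: coverage(selected + [i], sim))
        selected.append(best)
        candidates.remove(best)
    return selected

rng = np.random.default_rng(1)
n = 6
sim_a = rng.uniform(size=(n, n)); sim_a = (sim_a + sim_a.T) / 2
sim_b = rng.uniform(size=(n, n)); sim_b = (sim_b + sim_b.T) / 2
summary = greedy_summary(combine_similarities([sim_a, sim_b]), budget=2)
print(summary)
```

In a real system each matrix would come from a different similarity measure (e.g. sentiment similarity, vector-space similarity, word overlap) computed over the document's sentences.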
Visions and open challenges for a knowledge-based culturomics: A white paper outlining some ideas and challenges within the field of culturomics.
Editing simple graphs: Inspired by the word-co-occurrence graph from Wikipedia documents, this paper presents an FPT approach to cluster the words.
Extractive summarization using continuous vector space models: A workshop paper showing preliminary results on multi-document summarization with continuous vector space models for sentence representation. The experiments were performed on opinionated online user reviews.
Adaptive dynamics of realistic small-world networks: Continuing in the steps of the celebrated work of Jon Kleinberg and others on decentralized search in small-world networks, we conduct an experimental analysis of a dynamic algorithm that produces small-world networks. We find that the algorithm adapts robustly to a wide variety of situations in realistic geographic networks, with synthetic test data and with real-world data, even when vertices are unevenly and non-homogeneously distributed.