
Named entity recognition in Swedish health records with character-based deep bidirectional LSTMs

Biomedical NER illustration.

We propose an approach for named entity recognition in medical data, using a character-based deep bidirectional recurrent neural network. Such models can learn features and patterns based on the character sequence, and are not limited to a fixed vocabulary. This makes them very well suited for the NER task in the medical domain. Our experimental evaluation shows promising results, with a 60% improvement in F1 score over the baseline, and our system generalizes well between different datasets.
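As a rough illustration of what "character-based" means for the input and output of such a tagger, the sketch below encodes a sentence as a sequence of character ids and expands entity spans into one label per character, then collapses the labels back into spans. It is not the code used in the paper: the function names, the label names `DRUG` and `DISORDER`, the example sentence, and the one-label-per-character scheme are all assumptions made for this example. The network itself is omitted; only the data representation is shown.

```python
from typing import List, Tuple

# (start, end_exclusive, label); hypothetical span format for this sketch
Span = Tuple[int, int, str]


def chars_to_ids(text: str) -> List[int]:
    """Map each character to its Unicode code point.

    No word vocabulary is involved, so unseen or misspelled words
    still get a valid input representation.
    """
    return [ord(ch) for ch in text]


def spans_to_char_labels(text: str, spans: List[Span], outside: str = "O") -> List[str]:
    """Expand entity spans into one label per character."""
    labels = [outside] * len(text)
    for start, end, label in spans:
        for i in range(start, end):
            labels[i] = label
    return labels


def char_labels_to_spans(labels: List[str], outside: str = "O") -> List[Span]:
    """Collapse runs of identical non-outside labels back into spans.

    Limitation of this simple scheme: two adjacent entities with the
    same label are merged into a single span.
    """
    spans: List[Span] = []
    start = None
    # A trailing sentinel closes a span that runs to the end of the text.
    for i, label in enumerate(labels + [outside]):
        if start is not None and label != labels[start]:
            spans.append((start, i, labels[start]))
            start = None
        if start is None and label != outside:
            start = i
    return spans


# Hypothetical example sentence and annotations (not from the dataset).
text = "Patienten fick paracetamol mot huvudvärk."
drug_start = text.index("paracetamol")
disorder_start = text.index("huvudvärk")
gold_spans = [
    (drug_start, drug_start + len("paracetamol"), "DRUG"),
    (disorder_start, disorder_start + len("huvudvärk"), "DISORDER"),
]

char_ids = chars_to_ids(text)                       # model input
char_labels = spans_to_char_labels(text, gold_spans)  # training target
recovered = char_labels_to_spans(char_labels)       # decoded prediction format
```

In a setup like this, a bidirectional recurrent network would read `char_ids` in both directions and emit one label per position, and `char_labels_to_spans` would turn its predictions into entity mentions for scoring.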


The dataset presented in this paper can be downloaded from It can be freely used, but please cite our paper. See “bibtex” below.

Source code

The source code used for the experiments can be downloaded from

Simon Almgren, Sean Pavlov, Olof Mogren

Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM) at COLING
PDF Fulltext

Olof Mogren, PhD, RISE Research Institutes of Sweden.