TEXT

Neural Attention Models

In artificial neural networks, attention models allow the system to focus on certain parts of the input. This has shown to improve model accuracy in a number of applications. In image caption generation, attention models help to guide the model towards the parts of the image currently of interest. In neural machine translation, the attention mechanism gives the model an alignment of the words between the source sequence and the target sequence. In this talk, we'll go through the basic ideas and workings of attention models, both for recurrent networks and for convolutional networks. In conclusion, we will see some recent papers that applies attention mechanisms to solve different tasks in natural language processing and computer vision.

Mentioned papers

Other related papers

Links

Interview with Ilya Sutskever

Slides (PDF)

Talk, Chalmers Machine Learning Seminars, 2016-02-18
Olof Mogren

Olof Mogren, PhD, RISE Research institutes of Sweden. Follow me on Bluesky.