In artificial neural networks, attention models allow the system to focus on certain parts of the input. This has been shown to improve model accuracy in a number of applications. In image caption generation, attention models help guide the model towards the parts of the image that are currently of interest. In neural machine translation, the attention mechanism gives the model an alignment between the words of the source sequence and the words of the target sequence. In this talk, we'll go through the basic ideas and workings of attention models, both for recurrent networks and for convolutional networks. To conclude, we will look at some recent papers that apply attention mechanisms to solve different tasks in natural language processing and computer vision.
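As a rough illustration of the alignment idea mentioned above, the following is a minimal NumPy sketch of soft attention. It assumes scaled dot-product scoring, which is only one of several possible scoring functions and not necessarily the one covered in the talk. The names `dot_product_attention`, `source_states`, and `target_states` are hypothetical, and the random vectors stand in for encoder and decoder states.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def dot_product_attention(queries, keys, values):
    # Similarity score between every query (target position)
    # and every key (source position). Scaling is an assumed design choice.
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])
    # Each row is a distribution over source positions: a soft alignment.
    weights = softmax(scores, axis=-1)
    # Context vector per query: weighted average of the values.
    context = weights @ values
    return context, weights


# Hypothetical stand-ins for encoder and decoder hidden states.
rng = np.random.default_rng(0)
source_states = rng.normal(size=(5, 8))  # 5 source words, 8-dim states
target_states = rng.normal(size=(3, 8))  # 3 target words, 8-dim states

context, weights = dot_product_attention(
    target_states, source_states, source_states
)
```

Here each row of `weights` sums to one and can be read as how strongly a target position attends to each source position, which is the sense in which attention provides an alignment.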
Talk, Chalmers Machine Learning Seminars