Layered Multistep Bidirectional Long Short-Term Memory Networks for Biomedical Word Sense Disambiguation

Daniel Biś, Canlin Zhang, Xiuwen Liu, Zhe He

December 2018


Abstract

In this paper, we propose a novel deep neural network architecture for supervised medical word sense disambiguation. Our architecture is based on a layered bidirectional LSTM network, on top of which max-pooling along multiple time steps is performed to create a dense representation of the context. In addition, we introduce four different adjustments to the output of the LSTM in order to find the most suitable input form for the max-pooling layer. Results show that the best model outperforms the current state-of-the-art model on the MSH WSD dataset. Moreover, we also train a “universal” network to disambiguate all the target ambiguous words together. In the universal network, we concatenate the embedding of the ambiguous word to the max-pooled vector as a “hint” layer. Results show that our universal network achieves a test accuracy of nearly 90 percent.
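
The pooling and "hint" steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the bidirectional LSTM outputs are already available as one vector per time step, that pooling takes the element-wise maximum over all time steps, and that the hint is a plain concatenation. The function names, the toy values, and the vector sizes are hypothetical, and the four output adjustments mentioned in the abstract are not modeled because the abstract does not specify them.

```python
from typing import List

Vector = List[float]


def max_pool_over_time(hidden_states: List[Vector]) -> Vector:
    """Element-wise maximum across time steps (assumed pooling scheme).

    hidden_states holds one vector per time step, standing in for the
    outputs of a bidirectional LSTM. The result has the same width as a
    single time-step vector, regardless of the sequence length.
    """
    if not hidden_states:
        raise ValueError("need at least one time step")
    return [max(column) for column in zip(*hidden_states)]


def add_hint(pooled: Vector, target_embedding: Vector) -> Vector:
    """Concatenate the ambiguous word's embedding to the pooled vector."""
    return pooled + target_embedding


# Toy stand-ins: 3 time steps, 4 hidden units, 2-dimensional word embedding.
hidden_states = [
    [0.1, -0.4, 0.7, 0.0],
    [0.5, 0.2, -0.3, 0.1],
    [-0.2, 0.9, 0.4, -0.6],
]
target_embedding = [1.0, -1.0]  # hypothetical embedding of the ambiguous word

pooled = max_pool_over_time(hidden_states)
features = add_hint(pooled, target_embedding)

print(pooled)    # [0.5, 0.9, 0.7, 0.1]
print(features)  # [0.5, 0.9, 0.7, 0.1, 1.0, -1.0]
```

Because the pooled vector has a fixed width, contexts of different lengths map to inputs of the same size for a classifier, and the appended embedding tells a single shared network which ambiguous word it is being asked to resolve.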

Type

Conference paper

Publication

In Source Themes Conference


Daniel Biś

Ph.D. Student

My research interests include Natural Language Processing, Deep Learning and Artificial Intelligence in general.
