In this post we describe our attempt to re-implement R-NET, a neural architecture for automated question answering developed by the Natural Language Computing Group of Microsoft Research Asia. This architecture demonstrates the best performance among single models (not ensembles) on the Stanford Question Answering Dataset (SQuAD) as of August 25, 2017. MSR researchers released a technical report describing the model but did not release the code. We tried to implement the architecture in the Keras framework and reproduce their results. This post describes the model and the challenges we faced while implementing it [View on GitHub].
A few months ago, we showed how effectively an LSTM network can perform text transliteration.
For humans, transliteration is a relatively easy and interpretable task, which makes it a good setting for examining what the network is doing and whether its approach resembles the way humans solve the same task.
In this post we’ll try to understand: What do individual neurons of the network actually learn? How are they used to make decisions?
Today we officially registered the YerevaNN scientific educational foundation, which aims to promote world-class AI research in Armenia and develop high-quality educational programs in machine learning and related disciplines. The board members of the foundation are Gor Vardanyan, founder of FimeTech, Vazgen Hakobjanyan, cofounder of Teamable, and Rouben Meschian, founder of Arminova Technologies. Hrant Khachatrian is the director of the foundation.
The success of neural word embedding models like word2vec and GloVe motivated research on representing sentences in an n-dimensional space. Michael Manukyan and Hrayr Harutyunyan reviewed several sentence representation algorithms and their applications in state-of-the-art automated question answering systems during a talk at the Armenian NLP meetup. The slides of the talk are below. Follow us on SlideShare to get the latest slides from YerevaNN.
Many languages have their own non-Latin alphabets, but the web is full of content in those languages written in Latin letters, which makes it inaccessible to various NLP tools (e.g. automatic translation). Transliteration is the process of converting the romanized text back to the original writing system. In theory every language has a strict set of romanization rules, but in practice people do not follow them, and most romanized content is hard to transliterate using rule-based algorithms. We believe this problem is solvable using state-of-the-art NLP tools, and we demonstrate a high-quality solution for Armenian based on recurrent neural networks. We invite everyone to adapt our system for more languages.
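To make the limitation of rule-based transliteration concrete, here is a minimal sketch of a greedy longest-match transliterator. The tiny `RULES` table and the `transliterate` function are hypothetical and written only for illustration; they are not taken from our system, and a real romanization table would be far larger.

```python
# Hypothetical, deliberately tiny romanization table (illustration only).
RULES = {
    "sh": "շ",
    "s": "ս",
    "h": "հ",
    "a": "ա",
    "t": "տ",
    "e": "ե",
    "b": "բ",
    "r": "ր",
    "v": "վ",
}


def transliterate(text, rules=RULES):
    """Greedy longest-match transliteration from Latin to Armenian letters."""
    max_len = max(len(key) for key in rules)
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible chunk first, then shorter ones.
        for n in range(max_len, 0, -1):
            chunk = text[i:i + n]
            if chunk in rules:
                out.append(rules[chunk])
                i += len(chunk)
                break
        else:
            # No rule matched: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(transliterate("shat"))   # the digraph "sh" is mapped to a single letter
print(transliterate("barev"))  # letter-by-letter output, not the usual spelling
```

With this table, "shat" comes out as expected, but "barev" is converted letter by letter and does not produce the usual spelling "բարև", which uses the ligature "և". The same greedy rule would also wrongly merge "s" followed by "h" in a word where they are two separate letters. Ambiguities like these depend on context, which is why a learned model is attractive here.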