Last year Hrayr used convolutional networks to identify spoken language from short audio recordings for a TopCoder contest and got 95% accuracy. After the end of the contest we decided to try recurrent neural networks and their combinations with CNNs on the same task. The best combination allowed to reach 99.24% and an ensemble of 33 models reached 99.67%. This work became Hrayr’s bachelor’s thesis.
Recently we have implemented Dynamic memory networks in Theano and trained it on Facebook’s bAbI tasks which are designed for testing basic reasoning abilities. Our implementation now solves 8 out of 20 bAbI tasks which is still behind state-of-the-art. Today we release a web application for testing and comparing several network architectures and pretrained models.
The Allen Institute for Artificial Intelligence has organized a 4 month contest in Kaggle on question answering. The aim is to create a system which can correctly answer the questions from the 8th grade science exams of US schools (biology, chemistry, physics etc.). DeepHack Lab organized a scientific school + hackathon devoted to this contest in Moscow. Our team decided to use this opportunity to explore the deep learning techniques on question answering (although they seem to be far behind traditional systems). We tried to implement Dynamic memory networks described in a paper by A. Kumar et al. Here we report some preliminary results. In the next blog post we will describe the techniques we used to get to top 5% in the contest.
Few months ago Andrej Karpathy wrote a great blog post about recurrent neural networks. He explained how these networks work and implemented a character-level RNN language model which learns to generate Paul Graham essays, Shakespeare works, Wikipedia articles, LaTeX articles and even C++ code. He also released the code of the network on Github. Lots of people did experiments, like generating recipes, Bible or Irish folk music. We decided to test it on some legal texts in Armenian.
Recently TopCoder announced a contest to identify the spoken language in audio recordings. I decided to test how well deep convolutional networks will perform on this kind of data. In short I managed to get around 95% accuracy and finished at the 10th place. This post reveals all the details.