- Researchers develop a new neural network based on a previously unexplored idea: rotations.
- It can replace existing methods on real-world problems such as text summarization, language modeling, and question answering.
A research paper is usually filled with specialized approaches and technical terminology, which makes it difficult for readers without a scientific background to understand.
Recently, scientists at MIT and the Qatar Computing Research Institute developed a new artificial intelligence (AI) model that can read scientific papers and produce a plain-English summary in a couple of sentences.
Although it yields far better results than previous techniques, it certainly cannot replace science writers and editors. It can, however, help writers scan a larger number of papers and get a sense of what they are about.
The research team was originally trying to develop neural networks to tackle certain physics problems, such as how light behaves in intricate engineered materials.
They soon realized that the same methodology could handle other complex computational tasks, such as speech recognition and natural language interpretation, far more efficiently than existing machine learning methods.
What Did They Actually Do?
In the last few years, the recurrent neural network (RNN) has become a standard artificial neural network for a wide range of tasks, from language modeling and text summarization to building chatbot systems.
Various techniques have been developed to improve an RNN's ability to correlate information across long sequences of data. The most popular are Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks. Even so, they still fail to demonstrate impressive memory capacity or efficient recall on synthetic tasks.
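To make the gated-RNN baseline concrete, here is a minimal NumPy sketch of a single GRU step, the kind of update these networks apply at every position in a sequence. The function and parameter names are illustrative, not from the paper or any particular library:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, params):
    """One step of a Gated Recurrent Unit (illustrative sketch).

    x: input vector, h: previous hidden state, params: weight matrices.
    """
    Wz, Uz, Wr, Ur, Wh, Uh = (params[k] for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh"))
    z = sigmoid(Wz @ x + Uz @ h)               # update gate: how much to renew
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate: how much past to use
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate new state
    return (1 - z) * h + z * h_tilde           # blend old state and candidate

# Run the cell over a short random sequence with random weights.
rng = np.random.default_rng(0)
d_in, d_h = 4, 8
params = {k: rng.normal(scale=0.1, size=(d_h, d_in if k[0] == "W" else d_h))
          for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
h = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):  # sequence of 5 input vectors
    h = gru_cell(x, h, params)
```

Note that every operation here is a matrix multiplication followed by a gate; there is no notion of rotation, which is exactly what the approach below changes.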
To address this, the researchers developed an alternative approach called the Rotational Unit of Memory (RUM). Unlike traditional neural networks, which are based on matrix multiplication, RUM is based on vectors rotating in multidimensional space.
It represents every word in the text as a vector in multidimensional space (a line pointing in a specific direction). Each successive word rotates the vector toward a particular direction in a theoretical space that might contain thousands of dimensions. The resulting vector (or group of vectors) is then converted back into its associated string of words.
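The geometric idea can be sketched in a few lines of NumPy: build an orthogonal matrix that rotates one vector's direction onto another's, acting only in the plane the two vectors span and leaving all other dimensions untouched. This is a generic plane-rotation construction, not the authors' implementation, and the function name is hypothetical:

```python
import numpy as np

def rotation(a, b):
    """Orthogonal matrix rotating a's direction onto b's direction.

    Acts only in the plane spanned by a and b; identity elsewhere.
    Assumes a and b are nonzero and not parallel.
    """
    u = a / np.linalg.norm(a)
    v = b - (u @ b) * u                     # component of b orthogonal to a
    v = v / np.linalg.norm(v)
    cos_t = np.clip((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)), -1.0, 1.0)
    sin_t = np.sqrt(1.0 - cos_t ** 2)
    # Identity off the plane; a standard 2-D rotation within span{u, v}.
    return (np.eye(len(a))
            + sin_t * (np.outer(v, u) - np.outer(u, v))
            + (cos_t - 1.0) * (np.outer(u, u) + np.outer(v, v)))

a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 1.0, 0.0])
R = rotation(a, b)
```

Because `R` is orthogonal, it preserves vector lengths, which is one intuition for why rotation-based updates can keep information intact over long sequences instead of letting it shrink or blow up.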
Overall, RUM does two things well: it memorizes complicated sequential dynamics, and it recalls information accurately. It also shows promising performance on character-level language modeling and question answering.
Researchers tested this system on numerous scientific papers, including their own paper describing these findings, and compared the results with traditional LSTM- and GRU-based neural networks.
Instead of just scanning abstracts, RUM reads entire papers to generate simple summaries of their content. The summaries rendered by the system contained fewer technical terms and repetitive words. Although the output wasn't elegant prose, it did hit the key points.
You can try this system on your own tasks: the code and demo are available on GitHub.