Structure V - DNN with Attention 3

2 minute read

Presenter	Papers	Paper URL	Our Slides
Tianlu	Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, ICML17 ¹	PDF + code	PDF
Jack	Reasoning with Memory Augmented Neural Networks for Language Comprehension, ICLR17 ²	PDF	PDF
Xueying	State-Frequency Memory Recurrent Neural Networks, ICML17 ³	PDF	PDF

_{^{Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, ICML16 / Most tasks in natural language processing can be cast into question answering (QA) problems over language input. We introduce the dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers. Questions trigger an iterative attention process which allows the model to condition its attention on the inputs and the result of previous iterations. These results are then reasoned over in a hierarchical recurrent sequence model to generate answers. The DMN can be trained end-to-end and obtains state-of-the-art results on several types of tasks and datasets: question answering (Facebook’s bAbI dataset), text classification for sentiment analysis (Stanford Sentiment Treebank) and sequence modeling for part-of-speech tagging (WSJ-PTB). The training for these different tasks relies exclusively on trained word vector representations and input-question-answer triplets. (high citation)}} ↩
_{^{Reasoning with Memory Augmented Neural Networks for Language Comprehension, ICLR17 / Hypothesis testing is an important cognitive process that supports human reasoning. In this paper, we introduce a computational hypothesis testing approach based on memory augmented neural networks. Our approach involves a hypothesis testing loop that reconsiders and progressively refines a previously formed hypothesis in order to generate new hypotheses to test. We apply the proposed approach to language comprehension task by using Neural Semantic Encoders (NSE). Our NSE models achieve the state-of-the-art results showing an absolute improvement of 1.2% to 2.6% accuracy over previous results obtained by single and ensemble systems on standard machine comprehension benchmarks such as the Children’s Book Test (CBT) and Who-Did-What (WDW) news article datasets.}} ↩
_{^{State-Frequency Memory Recurrent Neural Networks, ICML17/ Modeling temporal sequences plays a fundamental role in various modern applications and has drawn more and more attentions in the machine learning community. Among those efforts on improving the capability to represent temporal data, the Long Short-Term Memory (LSTM) has achieved great success in many areas. Although the LSTM can capture long-range dependency in the time domain, it does not explicitly model the pattern occurrences in the frequency domain that plays an important role in tracking and predicting data points over various time cycles. We propose the State-Frequency Memory (SFM), a novel recurrent architecture that allows to separate dynamic patterns across different frequency components and their impacts on modeling the temporal contexts of input sequences. By jointly decomposing memorized dynamics into state-frequency components, the SFM is able to offer a fine-grained analysis of temporal sequences by capturing the dependency of uncovered patterns in both time and frequency domains. Evaluations on several temporal modeling tasks demonstrate the SFM can yield competitive performances, in particular as compared with the state-of-the-art LSTM models.}} ↩

Twitter Facebook LinkedIn

Dr. Yanjun Qi

Structure V - DNN with Attention 3

You May Also Enjoy

Safety Benchmark WMDP

KV Cache and Tooling

Advanced Transformer Architectures

LLM fine tuning