Paper Explained - Expire-Span: Not All Memories are Created Equal: Learning to Forget by Expiring (Full Video Analysis)

Facebook AI (FAIR) researchers present Expire-Span, a variant of Transformer XL that dynamically assigns expiration dates to previously encountered signals. Because expired memories can be dropped, Expire-Span can handle sequences of many thousands of tokens while keeping memory and compute requirements at a manageable level. It matches or outperforms baseline systems while consuming far fewer resources. We discuss its architecture, advantages, and shortcomings.
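
To make the "expiration date" idea concrete, here is a minimal sketch in plain Python. It assumes each memory gets a span from a sigmoid-scaled linear layer, and that the span is turned into a soft attention mask with a linear ramp. The function names, the parameters `w`, `b`, `max_span`, `ramp`, and the toy values are illustrative assumptions, not taken from the official code.

```python
import math

def expire_spans(hidden, w, b, max_span):
    """Predict one span per memory: e_i = max_span * sigmoid(w . h_i + b).

    Assumed form; `w` and `b` stand in for learned parameters.
    """
    spans = []
    for h in hidden:
        score = sum(wj * hj for wj, hj in zip(w, h)) + b
        spans.append(max_span / (1.0 + math.exp(-score)))
    return spans

def expire_mask(spans, t, ramp):
    """Soft mask at query step t over memories i = 0..len(spans)-1.

    remaining = e_i - (t - i) is the lifetime left for memory i.
    The mask is 1 while remaining >= 0, decays linearly over `ramp`
    steps, and is 0 once remaining <= -ramp (the memory can be dropped).
    """
    mask = []
    for i, e in enumerate(spans):
        remaining = e - (t - i)
        mask.append(max(0.0, min(1.0, 1.0 + remaining / ramp)))
    return mask

# Toy example with hand-picked (hypothetical) spans for four memories.
spans = [1.0, 10.0, 2.0, 0.5]
mask = expire_mask(spans, t=4, ramp=2.0)
print(mask)  # [0.0, 1.0, 1.0, 0.75]
```

In this toy run the first memory has fully expired and could be removed from memory, the second is kept for a long time, and the last one is partway through its ramp. In such a setup the mask would multiply the attention weights, which keeps the span prediction differentiable.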

OUTLINE:
0:00 - Intro & Overview
2:30 - Remembering the past in sequence models
5:45 - Learning to expire past memories
8:30 - Difference to local attention
10:00 - Architecture overview
13:45 - Comparison to Transformer XL
18:50 - Predicting expiration masks
32:30 - Experimental Results
40:00 - Conclusion & Comments

Paper: [2105.06548] Not All Memories are Created Equal: Learning to Forget by Expiring
Code: https://github.com/facebookresearch/t