ICML 2021

Poolingformer: Long Document Modeling with Pooling Attention

Conference Paper Accepted Paper Artificial Intelligence · Machine Learning

Abstract

In this paper, we introduce a two-level attention schema, Poolingformer, for long document modeling. Its first level uses a smaller sliding window pattern to aggregate information from neighbors. Its second level employs a larger window to increase receptive fields with pooling attention to reduce both computational cost and memory consumption. We first evaluate Poolingformer on two long sequence QA tasks: the monolingual NQ and the multilingual TyDi QA. Experimental results show that Poolingformer sits atop three official leaderboards measured by F1, outperforming previous state-of-the-art models by 1.9 points (79.8 vs. 77.9) on NQ long answer, 1.9 points (79.5 vs. 77.6) on TyDi QA passage answer, and 1.6 points (67.6 vs. 66.0) on TyDi QA minimal answer. We further evaluate Poolingformer on a long sequence summarization task. Experimental results on the arXiv benchmark continue to demonstrate its superior performance.
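
The two-level schema described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the details, so everything here is an assumption made for illustration: the function names (`attend`, `mean_pool`, `two_level_attention`), the window radii, the use of non-overlapping mean pooling, a single head with no learned projections, and feeding the first level's output into the second. It is not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention of one query over a set of keys/values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def mean_pool(vectors, size):
    """Non-overlapping mean pooling along the sequence axis (assumed pooling op)."""
    pooled = []
    for start in range(0, len(vectors), size):
        chunk = vectors[start:start + size]
        pooled.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return pooled

def two_level_attention(x, small_radius=1, large_radius=4, pool_size=2):
    """Hypothetical two-level attention: small sliding window, then a larger
    window whose keys/values are compressed by pooling before attention."""
    n = len(x)
    # Level 1: each token attends to its neighbours in a small sliding window.
    level1 = []
    for i in range(n):
        lo, hi = max(0, i - small_radius), min(n, i + small_radius + 1)
        level1.append(attend(x[i], x[lo:hi], x[lo:hi]))
    # Level 2: each token attends to a larger window, but over pooled
    # keys/values, so it sees fewer entries than the window contains.
    level2 = []
    for i in range(n):
        lo, hi = max(0, i - large_radius), min(n, i + large_radius + 1)
        pooled = mean_pool(level1[lo:hi], pool_size)
        level2.append(attend(level1[i], pooled, pooled))
    return level2

# Toy input: 8 tokens with 2-dimensional embeddings.
tokens = [[float(i), float(i % 3)] for i in range(8)]
output = two_level_attention(tokens)
```

In this sketch, pooling with size `p` cuts the number of keys and values each query sees in the second level from the window length `w` to roughly `w / p`, which is where the saving in computation and memory would come from.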

Authors

Context

Venue
International Conference on Machine Learning
Archive span
1993-2025
Indexed papers
16471
Paper id
11457205654503474