Transformer

Variants

| Title | Year | Author | Link | Memo |
| --- | --- | --- | --- | --- |
| Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View | 2019 | Yiping Lu et al. | pdf | FFN + attention + FFN |
| Lite Transformer with Long-Short Range Attention | ICLR 2020 | Zhanghao Wu et al. | pdf | lightweight conv (short distance) + self-attention (long distance) |
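
The two memos above can be read as layer layouts. The following is a rough sketch of both, not either paper's implementation. It assumes identity Q/K/V projections, a toy ReLU feed-forward with a fixed scale, a fixed smoothing kernel for the convolution, and no layer normalization. All function names (`macaron_block`, `lsra`, and the helpers) are hypothetical and chosen only for illustration.

```python
import math


def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    # Single-head scaled dot-product attention.
    # Assumption: identity Q/K/V projections (no learned weights).
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        out.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return out


def ffn(x, scale=0.1):
    # Toy position-wise feed-forward: ReLU then a fixed scale.
    # Stand-in for the usual two learned linear layers.
    return [[scale * max(0.0, v) for v in tok] for tok in x]


def add(x, y, weight=1.0):
    # Residual connection: x + weight * y, element-wise.
    return [[a + weight * b for a, b in zip(tx, ty)] for tx, ty in zip(x, y)]


def macaron_block(x):
    # "FFN + attention + FFN": attention sandwiched between two
    # half-weighted feed-forward residual steps.
    x = add(x, ffn(x), 0.5)
    x = add(x, self_attention(x))
    x = add(x, ffn(x), 0.5)
    return x


def local_conv(x, kernel=(0.25, 0.5, 0.25)):
    # Depthwise 1D convolution along the sequence axis, zero-padded.
    # Assumption: one fixed kernel shared by every channel.
    n, d = len(x), len(x[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        tok = []
        for c in range(d):
            s = 0.0
            for k, w in enumerate(kernel):
                j = i + k - half
                if 0 <= j < n:
                    s += w * x[j][c]
            tok.append(s)
        out.append(tok)
    return out


def lsra(x):
    # "conv (short distance) + self-attention (long distance)":
    # split channels in half, send one half through attention and the
    # other through the local convolution, then concatenate.
    half = len(x[0]) // 2
    left = [tok[:half] for tok in x]
    right = [tok[half:] for tok in x]
    return [a + c for a, c in zip(self_attention(left), local_conv(right))]


if __name__ == "__main__":
    seq = [[1.0, 0.0, 2.0, -1.0], [0.5, 1.0, 0.0, 3.0], [0.0, -2.0, 1.0, 1.0]]
    print(macaron_block(seq))
    print(lsra(seq))
```

Both functions keep the input shape (sequence length by channels), so in this sketch either one could be stacked in place of a standard encoder layer.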