Improving Transformer Models by Reordering their Sublayers

by · Jul 5, 2020 · 314 views ·

ACL 2020