Improving Transformer Models by Reordering their Sublayers

Jul 5, 2020

ACL 2020