Improving Transformer Models by Reordering their Sublayers

Jul 5, 2020

ACL 2020