Improving Transformer Models by Reordering their Sublayers

July 5, 2020 · 314 views

ACL 2020