Learning Long-term Dependencies Using Cognitive Inductive Biases in Self-attention RNNs