To BERT or not to BERT: Comparing Task-specific and Task-agnostic Semi-supervised Approaches for Sequence Tagging