Outcome-Driven Reinforcement Learning via Variational Inference