Discrete Planning with End-to-end Trained Neuro-algorithmic Policies